Parkinson's Voice Initiative

Summary

This project aims to collect 10,000 sustained phonations ('aaah' vocal sounds) over telephone-quality digital audio lines, under realistic, non-lab conditions, to test the hypothesis that Parkinson's disease can be detected from these recordings. It follows up on several recent studies in which we showed that such detection is possible with lab-quality digital audio recordings of sustained phonations [1,3-8], and that the results are not noticeably degraded when the audio is passed through simulated, low-bandwidth mobile telephone audio compression with channel distortion [2]. Furthermore, we are able to accurately predict the severity of Parkinson's symptoms on a standard clinical scale, the Unified Parkinson's Disease Rating Scale (UPDRS) [3].

Methods

To detect Parkinson's from the voice, we extract a large number of dysphonia features (132 in recent studies [1-2]) from digital audio signals of sustained phonations ('aaah' sounds). These features cover a wide range of classical and novel clinical dysphonia analysis algorithms (see [3] for a comprehensive list). We then apply several feature selection algorithms (Lasso, mRMR, RELIEF, LLBFS [1]), and pass the selected features to standard supervised classifier algorithms (random forests and SVMs). When predicting symptom severity, we use random forests and SVMs in 'regression mode' [2,3], since the UPDRS is a whole-number numerical score rather than a class label. To guard against overfitting, we use cross-validation under two schemes, "leave audio samples out" and "leave subjects out", in order to approximate the true generalization performance on unknown cases [1-3].
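
The "leave subjects out" scheme can be illustrated with a minimal Python sketch. This is not the code used in the studies: the function name `leave_subjects_out_folds`, the toy subject IDs and the fold count are all assumptions made for illustration. The point it shows is that every recording from a held-out subject goes into the test set, so a model is never evaluated on a voice it has already heard during training.

```python
import random
from collections import defaultdict


def leave_subjects_out_folds(subject_ids, n_folds, seed=0):
    """Yield (train_idx, test_idx) index lists, one pair per fold.

    All samples belonging to one subject land on the same side of
    the split, so no subject appears in both train and test.
    """
    # Group sample indices by the subject they were recorded from.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Shuffle subjects (not samples) reproducibly.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    for k in range(n_folds):
        held_out = subjects[k::n_folds]  # every n_folds-th subject
        held_out_set = set(held_out)
        test_idx = [i for s in held_out for i in by_subject[s]]
        train_idx = [i for s in subjects if s not in held_out_set
                     for i in by_subject[s]]
        yield train_idx, test_idx


# Hypothetical toy data: 6 subjects with 4 recordings each.
subject_ids = [f"S{n}" for n in range(6) for _ in range(4)]
folds = list(leave_subjects_out_folds(subject_ids, n_folds=3))

for train_idx, test_idx in folds:
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    # The defining property of the scheme: no shared subjects.
    assert train_subjects.isdisjoint(test_subjects)
```

A "leave audio samples out" split would instead shuffle the individual sample indices, which lets recordings from the same person fall on both sides of the split. Libraries such as scikit-learn offer grouped splitters (for example `GroupKFold`) that serve the same purpose as this sketch.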

Data

The studies rely on two principal audio datasets: sustained phonations from people with Parkinson's recorded at home, weekly, over a 6-month period each (50 subjects, 5875 audio samples [3]), and lab-based recordings from healthy controls and people with Parkinson's (43 subjects, 263 audio samples [1,2]). Other datasets have been used in earlier studies where new dysphonia features were developed [4-8].

Results

For detecting the disease, our best result is 98.6% detection accuracy (that is, the percentage of samples correctly identified as either healthy or Parkinson's, averaged over all cross-validation runs) under lab conditions [1]. For symptom severity, the average prediction error is 3.5 points on the 176-point UPDRS scale (a mean absolute cross-validation error of approx. 2% of the scale) under simulated mobile telephony conditions [2]. Additionally, we find that detection performance appears to level out at around 10 dysphonia features, which include features that measure vocal fold oscillation irregularity, breathiness and noise, and vocal tract resonance fluctuations [1].
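
As a worked illustration of how the two headline figures are defined, the following Python sketch computes a detection accuracy and a mean absolute error expressed as a percentage of the 176-point scale. The labels and scores are made-up toy values, chosen only so that the error comes out at 3.5 points, and the function names are assumptions for this example rather than anything taken from the studies.

```python
def detection_accuracy(true_labels, predicted_labels):
    """Percentage of samples whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def mean_absolute_error(true_scores, predicted_scores):
    """Average absolute difference between predicted and true scores."""
    total = sum(abs(t - p) for t, p in zip(true_scores, predicted_scores))
    return total / len(true_scores)


def error_as_percent_of_scale(mae, scale_points=176):
    """Express an error in points as a percentage of the scale length."""
    return 100.0 * mae / scale_points


# Hypothetical toy labels: 1 = Parkinson's, 0 = healthy control.
true_labels = [1, 1, 0, 0, 1]
predicted_labels = [1, 0, 0, 0, 1]
accuracy = detection_accuracy(true_labels, predicted_labels)

# Hypothetical toy UPDRS scores (absolute errors: 3, 4, 2, 5).
true_updrs = [20, 35, 50, 12]
predicted_updrs = [23, 31, 52, 17]
mae = mean_absolute_error(true_updrs, predicted_updrs)
mae_percent = error_as_percent_of_scale(mae)

print(f"accuracy = {accuracy:.1f}%")
print(f"MAE = {mae:.1f} points = {mae_percent:.2f}% of the scale")
```

With an error of 3.5 points, the percentage works out as 3.5 / 176 ≈ 0.0199, which is the "approx. 2%" quoted above. In a cross-validation setting these quantities would be computed on each held-out fold and then averaged.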

Discussion

While these results are encouraging, they do not address the confounding factors that may arise when voice recordings are collected under non-lab conditions: factors such as environmental noise and unintended caller behaviour cannot be controlled. The potential for these confounding factors to undermine the results is what motivates this study.

References

[1] A. Tsanas, M.A. Little, P.E. McSharry, J. Spielman, L.O. Ramig (2012)
Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease
IEEE Transactions on Biomedical Engineering, 59(5):1264-1271

[2] A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig (2012)
Using the cellular mobile telephone network to remotely monitor Parkinson's disease symptom severity
IEEE Transactions on Biomedical Engineering (submitted)

[3] A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig (2010)
Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson’s disease symptom severity
Journal of the Royal Society Interface, 8(59):842-855

[4] A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig (2009)
Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests
IEEE Transactions on Biomedical Engineering, 57(4):884-893

[5] M.A. Little, P.E. McSharry, E.J. Hunter, J. Spielman, L.O. Ramig (2009)
Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease
IEEE Transactions on Biomedical Engineering, 56(4):1015-1022

[6] M.A. Little (2007)
Biomechanically informed nonlinear speech signal processing
D.Phil., Oxford University, Oxford, UK

[7] M.A. Little, P.E. McSharry, S.J. Roberts, D.A.E. Costello, I.M. Moroz (2007)
Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection
BioMedical Engineering OnLine, 6:23

[8] M. Little, P. McSharry, I. Moroz, S. Roberts (2006)
Nonlinear, biophysically-informed speech pathology detection
Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Toulouse, France, pp. II-1080-II-1083