Machine Learning Increases Accuracy and Efficiency of Trial Data Collection


URMC physicians




Improve data collection and data analysis processes in clinical trials.


Statistical analysis and machine learning.

Focus Area

Machine learning


The Opportunity

Myotonic muscular dystrophy is a genetic disorder whose symptoms include gradually worsening muscle loss and weakness. New treatments are being developed and evaluated in clinical trials. The opportunity lies in making data collection more efficient and more reliable, and in making the analysis of patients’ responses to these treatments more effective, so that each clinical trial has the best chance of success.

The Challenge

QMA is a small device that records the force on a spring over the course of a hand squeeze. A set of handcrafted features extracted from these recordings has been used to quantify the on-site symptom severity of myotonia with and without treatment. Each visit consists of multiple trials, and each trial records multiple hand squeezes. Currently, errors in the data collection process are discovered only after the patient has left the doctor’s office. In such cases, the corrupted data must be rejected and the patient has to be rescheduled for an additional visit.
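To make the idea of a handcrafted feature concrete, the following is a minimal sketch, assuming a recording is a plain list of force samples taken at a known sampling rate. The function name `relaxation_time` and the 90% and 5% thresholds are illustrative assumptions; the source does not list the features actually extracted from QMA recordings.

```python
def relaxation_time(force, sample_rate_hz, high=0.9, low=0.05):
    """Seconds for the force to fall from high*peak to low*peak after the peak.

    Hypothetical feature for illustration only. Returns None if the force
    never drops to the low threshold (for example, a truncated recording).
    """
    if not force or sample_rate_hz <= 0:
        raise ValueError("need a non-empty recording and a positive sample rate")

    peak = max(force)
    if peak <= 0:
        return None  # no squeeze present in the recording

    peak_idx = force.index(peak)
    start = None
    for i in range(peak_idx, len(force)):
        # First sample at or below the high threshold marks the start of relaxation.
        if start is None and force[i] <= high * peak:
            start = i
        # First sample at or below the low threshold marks the end of relaxation.
        if start is not None and force[i] <= low * peak:
            return (i - start) / sample_rate_hz
    return None


# Example with made-up values: 10 samples per second, one squeeze.
example = [0, 50, 100, 100, 80, 60, 40, 20, 4, 0]
example_seconds = relaxation_time(example, sample_rate_hz=10)
```

A slower return to baseline yields a larger value, which is the kind of per-squeeze number that could then be compared across squeezes or visits.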

The goal is to make data collection more efficient and data analysis more accurate, so that treatment evaluation becomes more reliable and fewer visits and less time are lost to invalid data in each trial.

“There is a desire to improve pre-processing of the signals and develop state-of-the-art feature extraction algorithms to improve disease diagnosis, especially to improve the efficiency in clinical trials.”


Signal, image, and video processing integrated with machine learning and deep learning has demonstrated success in many domains, including disease diagnosis and prognosis. Scientists at RDSC have extensive industrial research experience in developing solutions to real-world problems using these technologies.

Data Science in Action

After working closely with physicians to understand issues regarding the current data collection and data analysis, RDSC:

  1. Developed a new model for data quality verification that evaluates the quality of collected data so that any corrupted recordings can be rejected in real time during data collection;
  2. Improved the robustness of the signal preprocessing and feature extraction stages;
  3. Developed deep learning algorithms that achieved 98% classification accuracy for patients vs. healthy controls and 87% quantification accuracy of the warm-up effect between the first and fifth squeezes, and discovered a reverse warm-up effect between trials within each visit.
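The quality-verification step in item 1 can be pictured with a short sketch. This is a simple rule-based stand-in, not RDSC's model, which the source does not describe. The function name `check_recording`, the thresholds, and the force units are all hypothetical; the point is only that a recording can be screened while the patient is still present.

```python
def check_recording(force, sample_rate_hz, min_duration_s=2.0,
                    min_peak=5.0, sensor_max=500.0):
    """Return a list of problems found; an empty list means 'accept'.

    All thresholds are made-up defaults for illustration.
    """
    problems = []

    # Too few samples to contain a full squeeze and relaxation.
    if len(force) < min_duration_s * sample_rate_hz:
        problems.append("too short")
    if not force:
        return problems

    peak = max(force)
    if peak < min_peak:
        problems.append("no squeeze detected")
    if peak >= sensor_max:
        problems.append("sensor saturated")
    # Force still high at the last sample: recording stopped too early.
    if peak >= min_peak and force[-1] > 0.5 * peak:
        problems.append("recording ended before relaxation")

    return problems


# Example with made-up values at 2 samples per second.
accepted = check_recording([0, 50, 100, 60, 20, 0], sample_rate_hz=2)
rejected = check_recording([0, 50, 100, 90], sample_rate_hz=2)
```

Running a check like this immediately after each squeeze lets the operator repeat a bad measurement on the spot instead of discovering the problem after the visit.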

The Result

  1. The quality-verification module will reduce how often a trial is wasted or a patient must be brought back because of mistakes in data collection;
  2. The improved robustness of the preprocessing and processing stages allows better feature extraction in the presence of noise;
  3. The deep learning algorithms uncovered new insights that lead to more efficient data collection in clinical trials.

What’s Next?

Machine learning and deep learning approaches to time series analysis are powerful tools that can be employed in many fields, and healthcare is no exception. RDSC is working with physicians and faculty members to explore other healthcare applications, such as disease diagnosis and prognosis through analysis of data acquired with voice and wearable sensors.