Data analytics framework for sparse longitudinal structured biomedical data

Publication at Faculty of Mathematics and Physics, First Faculty of Medicine | 2023

Abstract

An increasing amount of data is stored in electronic health records originating from laboratory, imaging, and clinical examinations. However, the automated use of machine learning algorithms for clinical decision tasks is still limited for long-term structured medical data, such as observations of patients with multiple sclerosis, which include numerical laboratory results and volumes derived from brain MRI segmentation.

The main reason is the complexity of these data, caused by high dimensionality, irregular temporal nature, and incompleteness in both the time and observation dimensions. This study introduces a comprehensive automated framework designed for end-to-end analysis of longitudinal structured biomedical data. It comprises a preprocessing component, which includes several methods for regularization and missing-value imputation.
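As an illustration of what such a preprocessing step can look like, the following minimal sketch assumes that "regularization" means resampling irregularly timed observations onto a fixed time grid, and it uses forward filling as one simple imputation strategy. The function names, the 30-day step, and the sample values are hypothetical and are not taken from the framework itself.

```python
from statistics import mean


def regularize(observations, step, n_steps):
    """Average irregularly timed (time, value) pairs into fixed-width bins.

    Bins that contain no observation are returned as None.
    """
    bins = [[] for _ in range(n_steps)]
    for t, value in observations:
        idx = int(t // step)
        if 0 <= idx < n_steps:
            bins[idx].append(value)
    return [mean(b) if b else None for b in bins]


def forward_fill(series):
    """Replace each None with the most recent observed value, if any."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical scores recorded at irregular times (in days since baseline).
visits = [(3, 2.0), (10, 3.0), (95, 3.0), (200, 4.0)]

grid = regularize(visits, step=30, n_steps=8)  # eight 30-day bins
filled = forward_fill(grid)
```

Forward filling is only one option; a framework like the one described would typically offer several imputation methods, since carrying the last value forward can hide real changes between sparse visits.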

Next, a prediction component suitable for various classification and regression tasks offers a range of traditional machine learning and deep neural network models. Finally, the data visualization component, based on the Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE), identifies patterns in these complex data. The framework was evaluated on a real-world dataset of patients with multiple sclerosis, addressing tasks such as classifying a patient's disability state and predicting a patient's future disability score.

Additionally, using the data visualization techniques, the study demonstrates that even incomplete long-term medical time series data can reveal valuable insights.