
Improving deep learning for medical time series data by modeling multidimensional dependencies

Time series data are prevalent in many medical domains. Two major types of medical time series data are (a) biosignals and (b) longitudinal electronic health record (EHR) data. Biosignals are signals measured by sensors placed on or in a person's body, such as surface electrocardiograms (ECG), electroencephalograms (EEG), and intracardiac electrograms (EGM). Longitudinal EHR data, on the other hand, are patients' health records accumulated over time, including medical history, diagnoses, medications, radiology images, and laboratory test results. Biosignals and longitudinal EHR data differ in their frequency and time span. Biosignals are typically sampled at high rates (e.g., 500 Hz) and span seconds to hours, whereas longitudinal EHR data are collected at much longer intervals (e.g., one lab test per day) and span days to years. I am interested in developing deep learning methods for effectively modeling medical time series data, for four primary reasons. First, deep learning techniques have shown promising empirical success in medical imaging domains; however, there are unmet clinical needs in medical time series domains, such as predicting atrial fibrillation (AF) recurrence to improve AF patient outcomes after treatment, improving automated seizure detection algorithms to accelerate the clinical workflow, and predicting patients' risks of hospital readmission to prevent unnecessary readmissions. Second, medical time series data play key roles in many medical classification and prediction tasks. For example, EEG is the major test for diagnosing epilepsy; ECG is the most common test for diagnosing various heart arrhythmias; and longitudinal EHR data can be used to identify patients at risk of readmission. Third, medical time series data are studied much less than medical imaging data.
Lastly, existing methods for modeling medical time series data often perform poorly, which may limit their utility in real-world clinical settings. Medical time series data involve multidimensional dependencies, including spatiotemporal dependencies in biosignals, multimodal dependencies in multimodal EHR data, and similarity between patients. These multidimensional dependencies pose challenges for deep learning models. For example, how can spatiotemporal dependencies in biosignals be modeled effectively? How can multiple modalities be integrated? And how can patient similarity be leveraged to improve model performance? In this dissertation, I aim to develop deep learning methods to model multidimensional dependencies in medical time series data to improve performance on medical classification and prediction tasks. First, I will model multidimensional dependencies in biosignals and clinical data using convolutional neural networks (CNNs) and multimodal fusion, with an application to AF recurrence prediction (Chapter 2). Second, I will improve the methods for modeling multidimensional dependencies in biosignals using graph neural networks (GNNs), with applications to EEG-based seizure detection and classification, and AF recurrence prediction (Chapters 3--4). Third, I will apply my GNN-based modeling approach to model multidimensional dependencies in multimodal, longitudinal EHR data, with an application to hospital readmission prediction (Chapter 5). I claim that both CNNs and GNNs can effectively model multidimensional dependencies in medical time series data to improve performance on medical classification and prediction tasks. GNNs can outperform CNNs by capturing complex spatiotemporal dependencies in the data.
Thesis, Dissertation, English, 2023
[Stanford University], [Stanford, California], 2023
Stanford University
1 online resource
Submitted to the Department of Electrical Engineering