Dr. Yijun Zhao
Director of MS in Data Science (MSDS) Program
Computer and Information Sciences Department
Dr. Zhao’s research interests include machine learning, data mining and statistical pattern recognition.
In collaboration with medical experts, she has applied machine learning methods to various healthcare-related applications, including detecting brain
abnormalities occurring in neurological disorders such as epilepsy
and predicting disease course in multiple sclerosis patients.
I am actively looking for motivated students to
work on applications of data analysis in various domains
such as healthcare, computational biology, finance, and economics.
If you would like to explore an area of research that is driving
innovations in every sector of our daily life, please contact me to discuss.
Past and Current Projects
Predicting Disease Course of Multiple Sclerosis Patients
Multiple sclerosis (MS) is the leading medical cause of neurological disability among young people in the U.S., with an overall prevalence of 400,000. The majority of cases present with relapses involving neurological deficits such as blurred or lost vision, weakness, numbness, imbalance, or cognitive deficits. In our research, we work closely with doctors from Harvard Medical School and Brigham and Women’s Hospital (BWH) in Boston, Massachusetts, to predict the disability level of MS patients at the five-year mark using longitudinal data from their first two years. Our clinical data are collected as part of the CLIMB (Comprehensive Longitudinal Investigation of Multiple Sclerosis at Brigham and Women’s Hospital) study at BWH. CLIMB is a large-scale, long-term study of patients with MS, designed to investigate the course of the disease in the current era of treatment. The main goals of the study are to identify predictors of future disease course when patients are at the beginning of their illness and to determine the effects of treatment on disease progression and accumulation of disability.
Deep Learning for Detecting and Reducing Motion Artifacts in Brain MRI Images
Modern neuroimaging is central to the assessment of patients with epilepsy. However, in-scanner head motion degrades the quality of brain MRI and thereby reduces the utility of MRI for the detection of clinically relevant neuroanatomical abnormalities. In this project, we develop a deep learning model to detect the amount of in-scanner head motion using raw video data and resting functional MRI images. We further reduce the motion-induced artifacts using a denoising autoencoder.
Machine Learning to Monitor and Predict Lupus Disease Course
Systemic lupus erythematosus (SLE) is a heterogeneous disease associated with premature morbidity and mortality. The clinical course is characterized by disease flares, which can range from mild to life-threatening and can affect various organ systems. The objective of our research is to use readily available EHR data and machine learning methods to classify real-world SLE flares and to identify predictors of SLE flares.
Hospital Readmissions Analysis
Reducing unnecessary admissions and readmissions to acute care facilities has been a focus of healthcare quality improvement efforts. The Agency for Healthcare Research and Quality’s (AHRQ) Healthcare Cost and Utilization Project (HCUP) estimated that in 2011, there were approximately 3.3 million adult 30-day all-cause hospital readmissions in the United States. Avoidable admissions and readmissions not only cause patients prolonged illness and pain, but also burden the healthcare system with unnecessary costs. HCUP estimated that in 2011, 30-day adult all-cause readmissions were associated with about $41.3 billion in hospital costs. In our research, we collaborate with researchers from the University of Florida and apply state-of-the-art machine learning techniques to predict individual 30-day readmission probabilities of soon-to-be-discharged inpatients. In addition, we examine trends in length of stay, hospital charges, and in-hospital mortality associated with different causes, and identify patient-level risk factors associated with 30-day readmissions.