Identifying melanoma prognostic biomarkers by integrating imaging data and genetic data
I am conducting this research with Gen Li of the Department of Biostatistics at Columbia University, and with Yvonne Saenger and Robyn Gartrell of Columbia University Medical Center.
This project involved three major stages:
- Created interactive heat maps and applied cluster analysis to explore the hierarchical structure of melanoma gene expression data.
 
- Built a predictive model of recurrence using lasso-penalized logistic regression, and validated it on two separate cohorts using cross-validation with bootstrapping.
 
- Communicated the results of analyses, modeling, and tests through data visualizations, interactive apps, and reports written in R Markdown.
 
Paper in progress.
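
To illustrate the modeling stage described above, the sketch below fits a lasso-penalized logistic regression and estimates its accuracy on out-of-bag bootstrap samples. It is written in Python for illustration only and is not the project's R code. The toy data from `simulate_cohort`, the proximal gradient solver, the penalty value `lam`, and the out-of-bag scheme are all assumptions, since the exact validation procedure is not described on this page.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """Proximal gradient descent for L1-penalized logistic regression.

    The intercept is left unpenalized. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * sum(resid) / n
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return b0, beta


def predict(model, row):
    """Classify a row as 1 when the linear predictor is positive."""
    b0, beta = model
    return 1 if b0 + sum(b * x for b, x in zip(beta, row)) > 0 else 0


def oob_bootstrap_accuracy(X, y, lam, n_boot=10, seed=0):
    """Fit on each bootstrap resample, score on the rows left out of it."""
    rng = random.Random(seed)
    n = len(X)
    scores = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        in_bag = set(sample)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue  # no held-out rows in this resample
        model = fit_lasso_logistic([X[i] for i in sample], [y[i] for i in sample], lam)
        hits = sum(predict(model, X[i]) == y[i] for i in oob)
        scores.append(hits / len(oob))
    return sum(scores) / len(scores)


def simulate_cohort(n, seed):
    """Hypothetical toy data: feature 0 drives the outcome, feature 1 is noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if row[0] + 0.3 * rng.gauss(0, 1) > 0 else 0 for row in X]
    return X, y
```

Calling `oob_bootstrap_accuracy(*simulate_cohort(150, seed=1), lam=0.05)` returns the mean held-out accuracy across resamples. Raising `lam` shrinks more coefficients to exactly zero, which is how the lasso penalty performs variable selection.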
 
Estimating Influenza Incidence from Diagnostic Codes
This work is mentored by Sasikiran Kandula and Jeffrey Shaman in the Department of Environmental Health Sciences at Columbia University.
- Analyzed large datasets (~68 million) and explored machine learning methods, using MySQL and R, to track influenza incidence in real time by combining diagnostic and virologic data.
 
- Compared machine learning methods (boosting, SVM, random forest…) using cross-validation.
 
- Created clear and compelling reports, visualizations, and interactive apps for collaborators in R Markdown.
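
The cross-validated model comparison described above can be sketched as a small harness that scores every candidate on the same folds. This is an illustrative Python sketch, not the project's R or MySQL code. The two stand-in models (a majority-class baseline and a 1-nearest-neighbour classifier) and the toy data from `two_cluster_data` are assumptions chosen to keep the example self-contained; they are not the boosting, SVM, or random forest models named in the list.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_class(train_X, train_y):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda row: label


def nearest_neighbour(train_X, train_y):
    """1-nearest-neighbour classifier on squared Euclidean distance."""
    def predict(row):
        best = min(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)))
        return train_y[best]
    return predict


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean held-out accuracy of one model across k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = make_model([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k


def two_cluster_data(n, seed):
    """Hypothetical toy data: two Gaussian clusters with alternating labels."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 3.0 * label
        X.append([rng.gauss(centre, 1), rng.gauss(centre, 1)])
        y.append(label)
    return X, y
```

Passing the same `seed` to `cross_val_accuracy` for every model keeps the folds identical, so differences in score reflect the models rather than the split.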
 
Copyright © 2017 Jiayi Ji