Identifying melanoma prognostic biomarkers by integrating imaging data and genetic data

I am conducting this research with Gen Li in the Department of Biostatistics at Columbia University, and with Yvonne Saenger and Robyn Gartrell at Columbia University Medical Center.

This project involved three major stages:

  1. Created interactive heat maps and applied cluster analysis to explore the hierarchical structure of melanoma gene expression data.
  2. Built a predictive model of recurrence using lasso-penalized logistic regression, and validated it on two separate cohorts using cross-validation with bootstrapping.
  3. Communicated results of analyses, modeling, and tests through data visualization, interactive apps, and writing in R Markdown.
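
To illustrate the modeling idea in stage 2, here is a minimal sketch of lasso-penalized logistic regression fitted by proximal gradient descent. It is not the project's code: the analysis itself was done in R, and the data, the penalty values, and the names `fit_lasso_logistic` and `soft_threshold` are assumptions made for this example. It covers only the penalized fit, not the validation on separate cohorts.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal operator)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=300):
    """Minimize mean log-loss + lam * sum(|w_j|); the intercept b is unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (an assumption): only the first two of six features matter.
rng = random.Random(0)
n, p = 200, 6
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if rng.random() < sigmoid(2 * x[0] - 2 * x[1]) else 0 for x in X]

w_sparse, _ = fit_lasso_logistic(X, y, lam=0.05)
w_null, _ = fit_lasso_logistic(X, y, lam=10.0)
print("lam=0.05:", [round(v, 2) for v in w_sparse])
print("lam=10  :", w_null)  # a large enough penalty sets every coefficient to zero
```

The penalty is what makes the model usable for biomarker selection: coefficients of uninformative features are pushed to exactly zero, so the surviving features form a candidate signature. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.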

Paper in progress.

Estimating Influenza Incidence from Diagnostic Codes

This work is mentored by Sasikiran Kandula and Jeffrey Shaman in the Department of Environmental Health Sciences at Columbia University.

Slide

  1. Analyzed large datasets (~68 million) in MySQL and R, and explored machine learning methods for tracking influenza incidence in real time by combining diagnostic and virologic data.
  2. Compared machine learning methods (boosting, SVM, random forest…) using cross-validation.
  3. Created clear and compelling reports, visualizations, and interactive apps for collaborators in R Markdown.
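
The comparison in step 2 can be sketched as a small k-fold cross-validation harness. This is an illustration under stated assumptions, not the project's code: the data are synthetic, and the two models (`fit_majority` and `fit_nearest_centroid`) are simple stand-ins for boosting, SVM, and random forest, which would come from a library in practice.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy, where fit(X_train, y_train) returns predict(x)."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def fit_majority(X, y):
    """Baseline stand-in: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label


def fit_nearest_centroid(X, y):
    """Stand-in model: predict the class whose mean is closest to x."""
    centroids = {}
    for label in set(y):
        rows = [x for x, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


# Synthetic two-class data (an assumption): Gaussian clusters at (0, 0) and (3, 3).
rng = random.Random(1)
X = [[rng.gauss(3 * c, 1), rng.gauss(3 * c, 1)] for c in (0, 1) for _ in range(60)]
y = [c for c in (0, 1) for _ in range(60)]

models = {"majority baseline": fit_majority, "nearest centroid": fit_nearest_centroid}
# The same seed gives every model identical folds, so the scores are comparable.
results = {name: cross_val_accuracy(fit, X, y) for name, fit in models.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Scoring every model on the same folds keeps the comparison fair, since differences in accuracy then reflect the models rather than the split.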



Copyright © 2017 Jiayi Ji