Abstract
I present a machine-learning approach to predicting mortality in ICU patients using structured clinical data from two large, publicly available datasets: MIMIC-III and the eICU Collaborative Research Database. After harmonizing and preprocessing the variables of both datasets, I selected features using missingness thresholds, correlations, mutual information, and random forest variable importance. I developed XGBoost and feedforward neural network models, as well as an ensemble of the two. All models were trained on MIMIC-III and tested externally on eICU to assess their generalizability. The ensemble performed best on most metrics, including recall and AUC-ROC, suggesting that it could support robust mortality prediction across a wide range of clinical practices. These results indicate that generally applicable, data-driven models for clinical decision support in critical care settings are feasible.
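
To make the ensemble and the two headline metrics concrete, the following is a minimal Python sketch and not the implementation used in this work. It assumes the ensemble is a weighted average of the two models' predicted mortality probabilities (soft voting); the abstract does not specify the combination rule, so that choice, the function names, the 0.5 decision threshold, and the toy values are all illustrative assumptions.

```python
from typing import Sequence


def average_ensemble(p_a: Sequence[float], p_b: Sequence[float],
                     weight_a: float = 0.5) -> list:
    """Blend two models' predicted probabilities (assumed soft-voting rule)."""
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same patients")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


def recall(y_true: Sequence[int], scores: Sequence[float],
           threshold: float = 0.5) -> float:
    """Fraction of true deaths (label 1) flagged at the given threshold."""
    flagged = [s >= threshold for y, s in zip(y_true, scores) if y == 1]
    return sum(flagged) / len(flagged) if flagged else 0.0


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the chance a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for illustration only, not for datasets of ICU scale.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and probabilities standing in for an external test set;
    # these are NOT results from MIMIC-III or eICU.
    y_external = [1, 0, 1, 0, 0, 1]
    p_xgb = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7]   # hypothetical XGBoost outputs
    p_nn = [0.8, 0.3, 0.7, 0.2, 0.2, 0.5]    # hypothetical neural net outputs
    p_ens = average_ensemble(p_xgb, p_nn)
    for name, p in [("xgboost", p_xgb), ("nn", p_nn), ("ensemble", p_ens)]:
        print(name, round(recall(y_external, p), 3),
              round(auc_roc(y_external, p), 3))
```

In an external-validation setting, both base models would be fit on the training cohort only, and the functions above would be applied to probabilities predicted for the held-out cohort.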