Contributed to every stage of the data science pipeline, from data ingestion, cleaning, and validation, through building and evaluating machine learning models, to deploying AWS cloud infrastructure that delivered model predictions.
•Developed real-time ADT integration using AWS services (API Gateway, Lambda, SQS, S3, Secrets Manager) that responded to patient admissions with cached Random Forest model predictions
•Helped transition legacy SFTP server to managed AWS Transfer Service, improving reliability and reducing file transfer time
•Ingested data from multiple sources using Apache Spark, including CMS claims, electronic health records, and medical device data
•Created reports using Matplotlib and Plotly to track process and outcome measures
•Designed, trained, and evaluated multiple machine learning models using MLlib and PyTorch to predict outcomes including behavioral health admissions, SNF discharges, significant A1c increases, and hospice eligibility
•Participated in recurring technical journal clubs