Machine Learning Engineer specializing in search, discovery, and recommendation systems, with experience building ranking platforms, ML infrastructure, and LLM-powered retrieval systems.
Experience
2022 – Now
New York, NY
Worked on infrastructure powering fuboTV's recommendation and personalization systems serving 1.5M+ subscribers, including ML ranking services, feature infrastructure, and distributed pipelines for candidate generation and recommendation scoring.
Helped build and scale fuboTV's recommendation platform, leading its transition from heuristics-based request-time logic to a precomputed machine learning recommender supporting personalized content discovery.
Partnered with data science to productionize a LightGBM ranking model for LiveTV recommendations, powering the platform's highest-traffic carousel.
Led the architecture, development, and introduction of sp-ranking-service, a FastAPI/Python ML ranking service, into a Golang microservice ecosystem, establishing CI/CD pipelines and Kubernetes deployment patterns for productionizing models.
Designed and implemented the team's first feature store, introducing an online/offline architecture with Bigtable for low-latency inference and BigQuery for scalable training pipelines.
Built Airflow-based recommendation precomputation pipelines generating predictions ahead of request time, improving latency by up to 20% across personalization services.
Redesigned the content representation and candidate generation pipeline to support multiple metadata sources, decoupling recommendation features from vendor-specific schemas and enabling recommendations for international content as the platform expanded into European markets.
Developed distributed recommendation feature pipelines using Scala, Scio, and Apache Beam on Google Cloud Dataflow, generating user profiles and recommendation features used by personalization systems.
Re-architected large-scale Scio data pipelines by replacing inefficient groupBy operations with aggregateByKey patterns and optimizing distributed joins as the platform scaled.
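The groupBy-to-aggregateByKey migration above was done in Scio/Scala, but the core idea carries over to plain Python. A minimal sketch (illustrative names, not the production pipeline) contrasting materializing full groups against folding each value into a small per-key accumulator:

```python
from collections import defaultdict

def group_then_reduce(pairs):
    """Anti-pattern analogue of groupBy-then-reduce: materialize every
    value per key, then aggregate. In a distributed engine this shuffles
    and holds entire value lists in memory."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)  # whole group kept around
    return {k: sum(vs) for k, vs in groups.items()}

def aggregate_by_key(pairs, zero=0, seq_op=lambda acc, v: acc + v):
    """aggregateByKey analogue: fold each value into a running
    accumulator, so only one small accumulator per key is retained."""
    accs = defaultdict(lambda: zero)
    for key, value in pairs:
        accs[key] = seq_op(accs[key], value)
    return dict(accs)

events = [("user_a", 3), ("user_b", 1), ("user_a", 2)]
```

In Beam/Scio the same shift matters even more: combiner-style aggregation runs on each worker before the shuffle, so far less data crosses the network.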
2020 – 2022
Boston, MA
o Functioned as an individual contributor reporting to the Director of Software Development, working across teams to build internal tooling & backend services for data-intensive SaaS applications used by the top 25 biopharmaceutical companies to optimize clinical trials (~$40B industry) through the novel use of machine learning
o Developed power analysis capabilities in Eureka Trial Optimizer, a flagship SaaS product, by extending the Flask backend's statistical layer, enabling end users to view power & sample-size calculations for hypothetical clinical trials
o Architected & built a suite of microservices with REST endpoints for Eureka Digital Trial Solutions using Docker, FastAPI, & AWS Redshift/RDS; the suite acted as a middleware layer between external vendors and Eureka's eScreening, eConsent & ePRO modules, helping optimize clinical studies by identifying patients most likely to meet study criteria (site criteria optimization) and sites most likely to have patients for a trial (site selection optimization)
o Developed & tested a mission-critical data pipeline in AWS Glue (PySpark/Redshift) to update OMOP-compliant medical terminologies/vocabularies in an internal data model used by ConcertAI data products and the Eureka Foundation module
o Developed a data pipeline in AWS Glue (PySpark) using OMOP ontologies that increased the codification & standardization rate of clients' electronic health record (EHR) data by 30–50%
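The power and sample-size work above sat behind a Flask statistical layer; the underlying statistics can be sketched with a stdlib-only normal-approximation power calculation for a two-arm trial with a binary endpoint (function names are illustrative, not the product's actual API):

```python
import math

def norm_cdf(z):
    # Standard normal CDF via the error function (stdlib only).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    # Inverse CDF by bisection; ample precision for power calculations.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-sample test of proportions."""
    z_alpha = norm_ppf(1.0 - alpha / 2.0)
    pbar = (p1 + p2) / 2.0
    se0 = math.sqrt(2.0 * pbar * (1.0 - pbar) / n_per_arm)        # under H0
    se1 = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_arm)  # under H1
    z = (abs(p1 - p2) - z_alpha * se0) / se1
    return norm_cdf(z)
```

For example, detecting a 50% vs 60% response rate with roughly 388 patients per arm yields about 80% power at α = 0.05, in line with standard tables.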
2019 – 2020
Boston, MA
o Functioned 50% as a technical lead for product data science (A/B testing analysis, experimental design, analytics support for strategic business initiatives, project scoping & exploratory data analysis) and 50% as the sole analytics engineer building robust & efficient data pipelines for search data assets, a critical input for demand-based fleet planning
o Deployed Lifetime Value & RFM machine learning models on Airflow for easier orchestration, better logging, monitoring & automation than the previous deployment on AWS Lambda & an always-running EC2 instance
o Empowered & educated the analytics & data science team on using dbt & Airflow to author & automate analytical modeling pipelines, creating a multiplier effect & cutting turnaround time by 50% without data engineering support
o Developed & automated a search data pipeline using Dask (structuring raw JSON on S3 into Parquet files queryable via AWS Athena), dbt for transforming data on Redshift, & Airflow for orchestration & automation
o Optimized the financial data mart ETL through incremental dbt logic & Redshift performance tuning (altering sort & distribution keys), increasing query performance ~20x for analytics team members identifying transactions with waivers
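The RFM model mentioned above is straightforward to sketch. A minimal dependency-free version using rank-based binning as a stand-in for production quantile logic (all names and the binning scheme are illustrative):

```python
from datetime import date

def rfm_scores(orders, today, n_bins=5):
    """Score each customer 1..n_bins on Recency, Frequency, Monetary value.
    `orders` is a list of (customer_id, order_date, amount) tuples."""
    stats = {}
    for cust, d, amt in orders:
        last, freq, mon = stats.get(cust, (date.min, 0, 0.0))
        stats[cust] = (max(last, d), freq + 1, mon + amt)

    def bin_by_rank(values, reverse=False):
        # Rank customers by metric, then map rank -> 1..n_bins
        # (higher score = better customer on that dimension).
        ranked = sorted(values, key=lambda kv: kv[1], reverse=reverse)
        return {cust: 1 + (i * n_bins) // len(ranked)
                for i, (cust, _) in enumerate(ranked)}

    # Recency: fewer days since last order is better, so sort descending.
    r = bin_by_rank([(c, (today - s[0]).days) for c, s in stats.items()],
                    reverse=True)
    f = bin_by_rank([(c, s[1]) for c, s in stats.items()])
    m = bin_by_rank([(c, s[2]) for c, s in stats.items()])
    return {c: (r[c], f[c], m[c]) for c in stats}
```

Wrapped in an Airflow task, a function like this recomputes scores on a schedule with logging and retries, rather than running on an always-on instance.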
2018 – 2019
Boston, MA
2017 – 2018
Greater Boston
o Member of the Business & Clinical Intelligence (BCI) team charged with end-to-end support, maintenance, & development of the data warehouse (DW) infrastructure, ETL, BI reporting, and ad-hoc analysis capabilities for Iora's data assets
o Architected & redesigned the Medicare Risk Adjustment (MRA) Engine ETL, migrating from a legacy PostgreSQL database to Amazon Redshift by rewriting procedural PL/pgSQL functions as set-based SQL, reducing processing time from 6 hours to 5 minutes
o Updated & added CMS's logic for the Risk Adjustment Processing System (RAPS) & Encounter Data Processing System (EDPS) in the MRA Engine, enabling blended risk adjustment factor (RAF) calculations for every patient
o Redesigned & updated the MRA Engine's risk model logic, reference, and mapping tables, adding ETL flexibility to calculate the RAF for the entire patient population for any payment period
o Designed an ETL pipeline to support the reconciliation process with CMS by identifying Medicare Advantage patients with Hierarchical Condition Categories (HCCs) unpaid by CMS that have claim diagnosis evidence in Iora's data warehouse
o Designed & implemented analytic mining queries to generate a clinical suspects reporting list (possibly under-coded patients based on clinical mining rules for depression, breast cancer, CKD, DME oxygen, acute stroke, etc.)
o Redesigned & optimized legacy SQL, cutting query execution time by over 50% for quarterly KPI & cost reporting analytics (admissions, ER visits, specialist visits, imaging visits, etc.) for Carpenters of MA (client) patients
o Designed a Python/Jupyter notebook analysis of Carpenters' clinical data (cholesterol levels, BMI, etc.), creating charts and statistics for different measurements to present at the annual meeting with the sponsor
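The blended RAF calculation described in the MRA Engine bullets above can be sketched as a weighted sum of RAPS- and EDPS-sourced scores. All coefficients and the blend weight below are illustrative placeholders, not actual CMS model values:

```python
# Hypothetical demographic and HCC coefficients (placeholders only).
DEMO_COEF = {"F_70_74": 0.395}
HCC_COEF = {"HCC18": 0.302, "HCC85": 0.331}

def raf(demographic, hccs, coef=HCC_COEF):
    """Sum the demographic coefficient and each HCC's coefficient."""
    return DEMO_COEF[demographic] + sum(coef[h] for h in hccs)

def blended_raf(demographic, raps_hccs, edps_hccs, raps_weight=0.75):
    """Blend RAPS- and EDPS-sourced RAF scores for a payment period.
    The RAPS/EDPS weighting varied by payment year; 0.75 is illustrative."""
    raps = raf(demographic, raps_hccs)
    edps = raf(demographic, edps_hccs)
    return raps_weight * raps + (1.0 - raps_weight) * edps
```

The "unpaid HCC" reconciliation work amounts to diffing the HCC sets the warehouse supports with evidence against the sets CMS actually paid on.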
Education
Cornell University
Master of Engineering (MEng)
Stony Brook University
Bachelor of Engineering (B.E.)
ABRHS
Conant Elementary School