# Abirvab D.

> Machine Learning Engineer | Search, Discovery & Recommender Systems | Ranking, RAG, LLM Systems

Location: New York, New York, United States
Profile: https://flows.cv/abirvab

Machine Learning Engineer specializing in **search, discovery, and recommendation systems**, with experience building **ranking platforms, ML infrastructure, and LLM-powered retrieval systems**.

At fuboTV, I work on production recommendation systems serving 1.5M+ users, developing ranking services, feature infrastructure, and distributed pipelines for candidate generation and personalization. My work includes building ML ranking services, designing feature stores, and scaling data pipelines with Apache Beam and Scio. Recently, I've expanded into AI engineering, building systems such as agentic RAG pipelines, hybrid retrieval architectures (BM25 + embeddings), and LLM-based search applications with query understanding, reranking, and interactive feedback loops.

I'm particularly interested in roles focused on:

- Recommender Systems & Personalization
- Search & Discovery Systems
- AI / LLM / Agent Applications
- ML Data & Inference Infrastructure

GitHub: github.com/adeb09

## Work Experience

### Software Engineer, Machine Learning @ fuboTV
Jan 2022 – Present | New York, NY

Worked on infrastructure powering fuboTV's recommendation and personalization systems serving 1.5M+ subscribers, including ML ranking services, feature infrastructure, and distributed pipelines for candidate generation and recommendation scoring.
- Helped build and scale fuboTV's recommendation platform, transitioning from a heuristics-based request-time approach to a precomputed machine learning recommender by developing ranking services, feature infrastructure, and candidate generation pipelines supporting personalized content discovery.
- Partnered with data science to productionize a LightGBM ranking model for LiveTV recommendations, powering the platform's highest-traffic carousel with personalized results.
- Led the architecture, development, and rollout of sp-ranking-service, a FastAPI/Python ML ranking service, within a Golang microservice ecosystem, establishing CI/CD pipelines and Kubernetes deployment patterns to productionize models.
- Designed and implemented the team's first feature store, introducing an online/offline architecture with Bigtable for low-latency inference and BigQuery for scalable training pipelines.
- Built Airflow-based recommendation precomputation pipelines that generate predictions ahead of request time, improving latency by up to 20% across personalization services.
- Redesigned the content representation and candidate generation pipeline to support multiple metadata sources, decoupling recommendation features from vendor-specific schemas and enabling recommendations for international content as the platform expanded into European markets.
- Developed distributed recommendation feature pipelines in Scala with Scio and Apache Beam on Google Cloud Dataflow, generating user profiles and recommendation features used by personalization systems.
- Re-architected large-scale Scio data pipelines by replacing inefficient groupBy operations with aggregateByKey patterns and optimizing distributed joins as the platform scaled.
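The groupBy-to-aggregateByKey rewrite in the last bullet can be illustrated in plain Python. This is a simplified single-machine analogue of the Scio combinators with hypothetical watch-event data, not the actual Dataflow pipeline: the point is that a fold-style aggregation keeps one accumulator per key instead of materializing every value per key.

```python
from collections import defaultdict

# Hypothetical (user_id, watch_minutes) events.
events = [("u1", 10), ("u2", 5), ("u1", 20), ("u2", 15), ("u1", 30)]

def group_by_key_then_sum(pairs):
    """groupByKey style: buffers every value per key before reducing.
    On a distributed runner this shuffles the full value list per key."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)  # all values held in memory per key
    return {k: sum(vs) for k, vs in groups.items()}

def aggregate_by_key_sum(pairs):
    """aggregateByKey style: folds each value into a running accumulator,
    so only one number per key is kept (and can be combined map-side)."""
    acc = defaultdict(int)
    for k, v in pairs:
        acc[k] += v  # constant memory per key
    return dict(acc)

assert group_by_key_then_sum(events) == aggregate_by_key_sum(events) == {"u1": 60, "u2": 20}
```

Both produce the same totals; the combine-based version is the one that scales, because the per-key partial sums can be merged before the shuffle.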
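The hybrid retrieval pattern mentioned in the summary (BM25 + embeddings) needs some way to fuse the two result lists; reciprocal rank fusion (RRF) is one standard choice. A minimal sketch with hypothetical document IDs, not necessarily the fusion used in these systems:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.
    k=60 is the conventional RRF smoothing constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from a lexical (BM25) and a dense (embedding) retriever.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
# fused == ["d1", "d3", "d2", "d4"]
```

RRF only needs ranks, not raw scores, so it sidesteps calibrating BM25 scores against cosine similarities.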
### Senior Software Engineer @ ConcertAI
Jan 2020 – Jan 2022 | Boston, MA

- Individual contributor reporting to the Director of Software Development, working across teams to build internal tooling and backend services for data-intensive SaaS applications used by the top 25 biopharmaceutical companies to optimize clinical trials (~$40B industry) through the novel use of machine learning.
- Developed power analysis capabilities in Eureka Trial Optimizer, a flagship SaaS product, by extending the Flask backend's statistical layer, enabling end users to view power and sample size calculations for hypothetical clinical trials.
- Architected and built a suite of REST microservices for Eureka Digital Trial Solutions using Docker, FastAPI, and AWS Redshift/RDS; the suite acted as a middleware layer between external vendors and Eureka's eScreening, eConsent, and ePRO modules, helping optimize clinical studies by identifying patients most likely to meet study criteria (site criteria optimization) and sites most likely to have patients for a trial (site selection optimization).
- Developed and tested a mission-critical data pipeline in AWS Glue (PySpark/Redshift) to map updated OMOP-compliant medical terminologies and vocabularies into an internal data model used by ConcertAI data products and the Eureka Foundation module.
- Developed a data pipeline in AWS Glue (PySpark) using OMOP ontologies that increased the codification and standardization rate of clients' electronic health record (EHR) data by 30-50%.

### Senior Analytics Engineer (Data Science / Analytics) @ Zipcar
Jan 2019 – Jan 2020 | Boston, MA

- Served half-time as technical lead for product data science, covering A/B testing analysis, experimental design, analytics support for strategic business initiatives, project scoping, and exploratory data analysis; and half-time as the sole analytics engineer building robust, efficient data pipelines for search data assets, a critical input to demand-based fleet planning.
- Deployed Lifetime Value and RFM machine learning models on Airflow for easier orchestration and better logging, monitoring, and automation than the previous always-on deployment via AWS Lambda and an EC2 instance.
- Trained the analytics and data science team on dbt and Airflow for authoring and automating analytical modeling pipelines, creating a multiplier effect and cutting turnaround time by 50% without data engineering support.
- Developed and automated a search data pipeline using Dask (structuring raw JSON on S3 into Parquet files queryable by AWS Athena), dbt for transforming data on Redshift, and Airflow for orchestration and automation.
- Optimized the financial data mart ETL through incremental dbt logic and Redshift performance tuning (sort and distribution keys), improving query performance roughly 20x for analytics team members identifying transactions with waivers.

### Senior Data Engineer @ Zipcar
Jan 2018 – Jan 2019 | Boston, Massachusetts, United States

### Business Intelligence Engineer @ Iora Health
Jan 2017 – Jan 2018 | Greater Boston

- Member of the Business & Clinical Intelligence (BCI) team responsible for end-to-end support, maintenance, and development of the data warehouse infrastructure, ETL, BI reporting, and ad-hoc analysis of Iora's data assets.
- Architected and redesigned the Medicare Risk Adjustment (MRA) Engine ETL, migrating from legacy PostgreSQL to Amazon Redshift by rewriting procedural PL/pgSQL functions as set-based SQL, reducing processing time from 6 hours to 5 minutes.
- Updated and extended CMS logic for the Risk Adjustment Processing System (RAPS) and Encounter Data Processing System (EDPS) in the MRA Engine, enabling blended risk adjustment factor (RAF) calculations for every patient.
- Redesigned the MRA Engine's risk model logic, reference tables, and mapping tables, adding the ETL flexibility to calculate the RAF for the entire patient population for any payment period.
- Designed an ETL pipeline to support the reconciliation process with CMS by identifying Medicare Advantage patients with Hierarchical Chronic Conditions (HCCs) unpaid by CMS that have claim diagnosis evidence in Iora's data warehouse.
- Designed and implemented analytic mining queries to generate a clinical suspects reporting list (possibly under-coded patients, based on clinical mining rules for depression, breast cancer, CKD, DME oxygen, acute stroke, etc.).
- Redesigned and optimized legacy SQL, cutting query execution time by over 50% for quarterly KPI and cost reporting analytics (admissions, ER visits, specialist visits, imaging visits, etc.) for patients of client Carpenters of MA.
- Built a Python/Jupyter notebook analysis of Carpenters' clinical data (cholesterol levels, BMI, etc.), producing charts and statistics across measurements for the annual meeting with the sponsor.

### Sr. IT Database Analyst / Healthcare Data Analyst @ UnitedHealth Group
Jan 2014 – Jan 2017 | Greater Boston Area

- Member of the Data Management Analytics (DMA) team, developing quality metric reports in SQL, reviewing ETL code changes, confirming data model and mapping updates, and verifying data quality for each client in the data warehouse.
- Developed SQL scripts to standardize the measurement and gathering of metrics used for data quality checks of development builds.
- Validated neoFM, an internal JavaScript/Perl tool that flagged and clustered quality checks on over 5,000 FM charts per client for every data release.
- Developed a Python post-processing step for the neoFM pipeline that decreased its false positive flag rate on FM charts by an average of 30%, and by as much as 90%, through the use of heuristics.
- Created a Bash/Python process to automate the post-processing step during the research phase, allowing thorough, efficient testing of heuristics on historical data to measure how far the flag rate would drop without introducing false negatives.
- Focused the heuristics on FM charts flagged merely because of additional new data, which is expected and should not be flagged; reducing these flags led to faster, more efficient FM reviews for the DMA team.
- Developed PySpark scripts to perform data quality checks on Hadoop builds for a life sciences project.
- Led ongoing design and development of a Spark engine for computing and gathering data quality metrics on Hadoop builds.
- Began automating volumetric quality checks by developing a Python/SQL process to algorithmically detect significant volumetric changes and surface only those charts, reducing volumetric chart review by over 50%.

### Product Development Engineer @ Varada Innovations Inc.
Jan 2013 – Jan 2014 | Ithaca, NY

- Tested and developed a novel soft-tissue medical device prototype.
- Verified experimental data against a finite element model.

### Consultant @ eNeura Therapeutics
Jan 2013 – Jan 2014 | Sunnyvale, CA

- Industrial MEng project to redesign the Spring TMS migraine relief device from eNeura Therapeutics.
- Served in a systems engineer role, working cross-functionally to safely package electrical components into mechanical case prototypes.
- Designed two prototypes, 3D printed by Shapeways; electrical components will be housed in the final prototype.

### Research Intern @ Lux Research Inc.
Jan 2013 – Jan 2013

- Collected and organized market trends in microfluidics and medical device electrodes.
- Gathered and estimated pricing inputs for a business model predicting market trends in microfluidics from 2012-2022.
- Contributed to "A Materials Perspective on Medical Sensors," the Quarter 2 State of the Market Report for the Bioelectronics Service.

### Research (REU) Intern @ Harvard School of Engineering and Applied Sciences
Jan 2012 – Jan 2012 | The Mooney Lab

- Worked independently 40-45 hours/week with postdocs on the optimization of ultrasound-responsive hydrogels.
- Researched a novel on-demand drug delivery system using alginate-based hydrogels as the scaffold holding the drug, with ultrasound breaking bonds for on-demand release.
- Focused on varying ultrasound and gel parameters to observe how drug release rates were affected.
- Gained significant independence on the drug delivery project while my mentor was away for 2.5 weeks of the REU program.
- Presented the research twice at a symposium with a poster, and wrote a paper with the possibility of publication.

### Undergraduate Researcher @ BME Department, Stony Brook University
Jan 2011 – Jan 2012

- Worked in Professor Qin's Orthopedic Bioengineering Research Lab analyzing data from experiments in which ultrasound was run through six femur bones at every 10° around three axes, to verify whether ultrasound can be used in place of microCT scanning.
- Developed MATLAB programs to analyze microCT scans of bone taken the previous semester, changing the scan angle to allow virtual ultrasound testing in Wave3000 and virtual mechanical testing in Abaqus.

### Undergraduate Research @ BME Department, Stony Brook University
Jan 2011 – Jan 2011

- Worked in Professor Judex's Integrative Skeletal Adaptation and Genetics Lab, under the supervision of a graduate student, studying the effects of multiple exposures of bone to zero gravity.
- Carried out microCT scans of mice femurs and gathered data for further analysis.

### Customer Service Technician @ Inforonics
Jan 2008 – Jan 2009 | Littleton, MA

- Answered calls to help customers with calling card problems.
- Performed various services on calling cards, including adding minutes for customers and selling cards over the phone.
- Helped VoIP customers set up and install VoIP service.

### Volunteer @ Emerson Hospital
Jan 2007 – Jan 2008

- Worked in transport, helping transfer patients from room to room.
- Worked at the front desk, directing visitors around the hospital.
- Helped create Excel spreadsheets for volunteer records.

## Education

### Master of Engineering (MEng) in Biomedical Engineering
Cornell University

### Bachelor of Engineering (B.E.) in Biomedical Engineering
Stony Brook University

### ABRHS

### Conant Elementary School

## Contact & Social

- LinkedIn: https://linkedin.com/in/abirdeb

---

Source: https://flows.cv/abirvab
JSON Resume: https://flows.cv/abirvab/resume.json
Last updated: 2026-04-13