Passionate Software Engineer with experience in machine learning and quantitative modeling spanning research and industry, specializing in natural language processing, bioinformatics and mobile security. You can reach me at: saksham.arora.23@dartmouth.edu.

Experience

Savant Bio

Founding Engineer

2025 — Now

New York, United States

Transforming medical data into actionable insights.

Roivant Sciences

Software Engineer - AI Product (Roivant Health)

2025 — Now

New York, United States

Building 0 to 1 @ Savant Bio (Roivant's latest health technology incubation)

Roivant Sciences

Software Engineer - ML/Product, Real-World Evidence (RWE)

2024 — 2025

New York, United States

Worked on accelerating patient pre-screening for clinical trial recruitment

Roivant Sciences

Software Engineer - Voice AI (Computational Research, Sumitomo Pharma America)

2024 — 2024

New York, United States

Worked under Dr. Carson Tao on the Computational Research team at Sumitomo Pharma America (contracted).

Architected and implemented an end-to-end automatic speech-to-text system (from data pipelines to model re-training) that transcribes and de-hallucinates doctor-patient conversations for the extraction of meaningful audio and linguistic signals.

Prompt-engineered and deployed LLMs such as GPT-4, GPT-3.5, Mistral-7B, and Llama-3 for speaker diarization, using zero-shot, few-shot, and chain-of-thought prompting approaches.

Fine-tuned Mistral-7B-Instruct and Llama-3 models using PEFT-based approaches, including LoRA, QLoRA, DoRA, and QDoRA, for the speaker diarization pipeline, achieving state-of-the-art results.

Evaluated the internal SMPA pipeline against open-source and proprietary ASR models from Google Cloud, Amazon, and Microsoft Azure, achieving SOTA results on multilingual recordings.

Roivant Sciences

Software Engineer - Data Engineering (Sumitomo Pharma America)

2023 — 2024

New York, United States

Orchestrated distributed ETL pipelines for a production Redshift data warehouse that streamlines sales team tracking and analysis for improved customer targeting.

Led the design and implementation of an automated AWS Lambda-based ETL pipeline that incrementally extracts data from the RAVE Medidata API and integrates it into AWS Aurora.

Created 20+ data jobs with SCD2 and history tables using SQL and Matillion for Redshift.

Collaborated with data engineers and Urovant consultants to normalize pharmaceutical data.

Education

Dartmouth College

Mathematics

2019 — 2023

Delhi Public School - R. K. Puram

2004 — 2018
