Working under Dr. Carson Tao on the Computational Research Team at Sumitomo-Pharma America (contracted)
•Architected and implemented an end-to-end automatic speech-to-text system (from data pipelines to model re-training) that transcribes and de-hallucinates doctor-patient conversations to extract meaningful audio and linguistic signals.
•Prompt-engineered and deployed LLMs such as GPT-4, GPT-3.5, Mistral-7B, and Llama-3 for speaker diarization using zero-shot, few-shot, and chain-of-thought prompting approaches.
•Fine-tuned Mistral-7B-Instruct and Llama-3 models using PEFT-based approaches, including LoRA, QLoRA, DoRA, and QDoRA, for the speaker diarization pipeline, achieving state-of-the-art results.
•Evaluated the internal SMPA pipeline against open-source and proprietary ASR models from Google Cloud, Amazon, and Microsoft Azure, achieving SOTA results on multilingual recordings.