As a passionate Software Engineer, I thrive on turning complex challenges into elegant, scalable solutions.
Experience
As part of the ML team, I architect, train, and deploy domain-specific large language models on our on-premises GPU clusters to automatically detect and classify adverse events in clinical and pharmacovigilance data.
Key Responsibilities
Design and implement end-to-end fine-tuning pipelines for transformer-based models (e.g., BERT, T5) using PyTorch and Hugging Face Transformers on local high-performance servers.
Optimize data preprocessing workflows, including annotation parsing, tokenization, and domain-specific vocabulary expansion, to maximize model performance.
Collaborate with cross-functional stakeholders (data scientists, clinicians, and DevOps) to define the adverse-event taxonomy and refine training datasets.
Integrate Kubernetes-based orchestration and Docker containers for reproducible training and inference across development and production environments.
Monitor model metrics (accuracy, precision, recall, latency) and iterate on hyperparameters, leveraging tools such as Weights & Biases for experiment tracking.
Key Achievements
Achieved a 25% improvement in model recall on rare adverse-event categories by customizing the tokenizer and augmenting training data.
Reduced average inference latency by 40% through model quantization and optimized hardware utilization.
Automated weekly retraining workflows to incorporate new clinical reports.
2020 — 2021
San Jose, California, United States
Incorporated cloud computing into demanding programs to reduce costs and improve the reliability and performance of tests, using EC2, DynamoDB, and a suite of other AWS services.
Utilized a broad tech stack, including Node.js, Python, Java, .NET, React, Angular, and various database technologies.
Developed robust, scalable software across the stack, from UI to database, for diverse environments including cloud, on-prem, and mobile.
Designed solutions that incorporate telemetry, monitoring, and logging, using Python libraries to make software behavior easier to track and diagnose.
Led the development of a water quality monitoring system, utilizing Python and Node.js, to process and analyze data from IoT sensors, improving real-time reporting accuracy.
Implemented a cloud-based data storage solution with MySQL, enhancing data retrieval efficiency and scalability for large datasets.
Collaborated on integrating machine learning algorithms in Python, using TensorFlow and scikit-learn to predict water quality trends, which contributed to a predictive maintenance model for water infrastructure.
Developed a user-friendly dashboard with React, enabling clients to easily access and visualize water quality metrics in real time.
Education
University of Pennsylvania
Master of Science - MS
Ramapo College of New Jersey
Bachelor of Science - BS
Fordham University
Bachelor of Science - BS
Bergen County Academies