AI/ML Engineer with 5+ years of experience designing, building, and deploying production-grade machine learning and deep learning systems.
Experience
2025 — Now
California, United States
• Contributed to the development and extension of GPU-accelerated Python inference services using FastAPI, supporting transformer-based models for internal reference and customer validation workloads.
• Worked on Retrieval-Augmented Generation (RAG) prototypes using LangChain for orchestration and LlamaIndex for document ingestion, focused on internal benchmarking and customer proof-of-concept demonstrations.
• Evaluated vector retrieval approaches using FAISS (GPU) for internal experimentation and supported Pinecone-based setups exclusively for external customer PoCs, maintaining clear separation between tooling.
• Applied parameter-efficient fine-tuning techniques (LoRA / QLoRA) during controlled experimentation using Hugging Face Transformers, leveraging pre-trained checkpoints and existing GPU infrastructure.
• Supported data preprocessing and evaluation workflows using Pandas, NumPy, and SQL to enable structured experimentation, offline evaluation, and performance comparison across model variants.
• Assisted in optimizing inference workflows through ONNX export and collaboration on TensorRT benchmarking, observing improvements in latency and throughput.
• Containerized ML services using Docker and collaborated with platform teams to deploy GPU workloads on Kubernetes, emphasizing scalability and efficient resource utilization.
• Used MLflow for experiment tracking and model version comparison within prototyping and benchmarking pipelines to support reproducibility.
• Supported customer-facing validation workloads on AWS (EC2 GPU instances, EKS, S3, ECR), assisting with deployment verification, benchmarking, and technical demonstrations.
• Participated in internal model evaluation discussions, contributing analysis on LLM behavior, hallucination patterns, and grounding quality in RAG systems under senior guidance.
2020 — 2023
India
• Developed and deployed end-to-end machine learning pipelines using Python, PyTorch, and TensorFlow for enterprise-facing applications.
• Built supervised learning models (classification and regression) on large, structured datasets, improving prediction accuracy by 10–15% through feature engineering and iterative experimentation.
• Performed data preprocessing, feature engineering, and EDA using Pandas, NumPy, and Scikit-learn, enabling stable and reusable training pipelines.
• Integrated ML models into production systems in collaboration with software engineering teams, following internal deployment and validation standards.
• Deployed and monitored models using Azure Machine Learning, leveraging cloud-based compute for training and batch inference workloads.
• Conducted hyperparameter tuning and A/B testing to improve model performance, stability, and inference efficiency.
• Used Jupyter Notebooks and visualization tools to communicate experimental results and insights to technical and business stakeholders.
• Worked within Agile/Scrum development processes, participating in sprint planning, backlog grooming, daily stand-ups, and retrospectives to deliver incremental ML features.
Education
Montclair State University
Master's degree
GITAM Deemed University