Machine Learning Engineer with 5+ years of experience building production-scale AI systems, specializing in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and autonomous AI agents.
Experience
2025 — Now
New York, United States
Working on production-scale LLM systems, RAG pipelines, and enterprise AI platforms with a focus on reliability, scalability, and evaluation.
●Designed and deployed RAG pipelines integrating vector databases and LLM APIs, improving retrieval accuracy by 32% and reducing hallucinations by 27%
●Fine-tuned transformer-based LLMs using domain-specific datasets and RLHF, increasing response relevance by 35%
●Built automated LLM evaluation frameworks measuring hallucination, latency, and answer quality, improving reliability by 30%
●Developed scalable AI microservices using Python, FastAPI, Docker, and Kubernetes for production inference systems
●Engineered RAG architectures using LangChain, LlamaIndex, Pinecone, and FAISS for semantic search
●Implemented distributed ML pipelines using PyTorch, Hugging Face, and Ray Serve for GPU-based inference
●Designed monitoring systems using Prometheus and Grafana for model observability and performance tracking
2025 — 2025
San Francisco, CA
Built autonomous AI agents and multimodal systems for web interaction and task execution using reinforcement learning and LLMs.
●Designed autonomous AI agents for multi-step web tasks, improving task completion accuracy by 34%
●Developed multimodal pipelines combining vision transformers and LLMs for UI understanding
●Implemented hierarchical planning systems converting natural language into structured actions, reducing failures by 27%
●Applied reinforcement learning (PPO, actor-critic) to optimize agent behavior in dynamic environments
●Built scalable data pipelines for large-scale web interaction datasets
●Optimized distributed training and inference on GCP with TPU infrastructure
●Deployed models using Vertex AI with monitoring and automated retraining pipelines
2020 — 2023
India
Developed NLP and computer vision pipelines for document processing, automation, and fraud detection in financial systems.
●Built OCR + NLP pipelines extracting structured data from financial documents, reducing manual effort by 70%
●Developed document classification and entity extraction models, improving accuracy by 35%
●Designed validation rules for fraud detection, enabling real-time anomaly detection
●Engineered ML pipelines using TensorFlow, PyTorch, OpenCV, and spaCy
●Deployed containerized ML services using Docker, FastAPI, Kubernetes, and AWS
●Implemented end-to-end MLOps workflows, including CI/CD and model monitoring
Education
University of North Texas