• Fine-tuned computer vision models for car classification using state-of-the-art LLMs and CV architectures to address edge cases, raising production accuracy from 76% to 98% and cutting the misclassification rate by 74%.
• Engineered an automated model training pipeline with a FastAPI backend and Weights & Biases experiment logging, cutting training and processing time by 25% and tracking 100+ experiments.
• Deployed models for real-time inference on Triton Inference Server, achieving <50 ms latency per request and supporting 1,000+ concurrent inference requests in production.
• Managed AWS cloud infrastructure, enabling scalable storage, distributed training, and fault-tolerant deployment, reducing downtime by 30% and improving model availability to 99.9%.