• Designed and deployed a transformer-based demand forecasting system on AWS, integrating historical sales, seasonality signals, and external variables; improved forecast accuracy by 15% and reduced stockout incidents by 20% across supply chain operations
• Built a production-grade retrieval system for regulatory and compliance documents using hybrid search (dense + lexical), reducing document lookup time by 40% and shortening decision turnaround for compliance teams
• Engineered high-throughput model serving infrastructure on Kubernetes (EKS) and optimized inference with batching and GPU acceleration, increasing real-time inference throughput by more than 10x under peak load
• Developed end-to-end MLOps pipelines with automated training, validation, and deployment using MLflow and CI/CD workflows, reducing model release cycles from weeks to days and improving deployment reliability
• Implemented distributed data pipelines using Spark and Kafka to process ~8M daily records, enabling near real-time feature generation and reducing data latency by 50%