AI Engineer with 4+ years building production-scale ranking systems, forecasting models, and GenAI solutions across large retail ecosystems. Skilled in PyTorch, PySpark, Scala, Snowflake, Delta Lake, vector databases, RAG pipelines, Triton inference, and feature-store architectures.
Experience
2024 — Now
United States
• Constructed real-time product-ranking models using PyTorch, feature stores, and Snowflake signals, delivering a 6.8%
conversion lift across high-traffic categories through optimized inference pipelines and continuous monitoring workflows.
• Architected demand-forecasting system integrating PySpark pipelines, incremental Delta ingestion, and SHAP analysis,
reducing weekly inventory volatility by 12% and stabilizing replenishment across electronics and seasonal assortments.
• Engineered anomaly-detection framework combining embeddings, Kafka inputs, and drift-detection metrics, detecting
pricing irregularities 23% faster and safeguarding pricing experimentation across merchandising clusters.
• Developed retrieval-grounded assistant using vector embeddings, RAG routing, and structured prompts, decreasing agent
resolution time by 18% for warranty, billing, and device-troubleshooting queries.
• Refined multi-agent LLM orchestrations for search-intent rewriting, improving relevance ranking by 9.4% across long-tail
queries through prompt specialization, model pruning, and contextual grounding from historical navigation logs.
• Implemented response-safety filtering with quantized instruction-tuned models, reducing hallucination rates by 31% and
ensuring compliant customer assistance workflows across enterprise geographies and product categories.
2020 — 2023
India
• Designed an item-embedding pipeline using Scala, PySpark, and Delta Lake to cluster SKUs by behavioral affinity,
improving cross-category recommendation depth by 14% during seasonal campaigns.
• Built scalable candidate-generation models with hierarchical sampling and ANN retrieval, raising same-session discovery
rates by 9% while decreasing tail-exposure gaps across diverse merchandise hierarchies.
• Constructed a feature-store-backed training fabric integrating transactional events, feed signals, and contextual
metadata, shortening model refresh latency from 22 hours to 9 hours during peak traffic.
• Deployed sequence-based re-ranking models in Triton inference servers with optimized batching heuristics, reducing P95
latency by 31% and sustaining stable throughput during mega-sale concurrency surges.
• Developed synthetic-cart augmentation using behavioral perturbations and constrained sampling, enhancing robustness
to sparse-item interactions and improving add-to-cart prediction recall by 6% across underrepresented assortments.
• Implemented continuous-monitoring dashboards with drift metrics, embedding-health diagnostics, and slice-based
stability alerts, enabling rapid anomaly triage and maintaining consistent performance in dynamic retail patterns.
• Collaborated with merchandising analysts to validate model shifts through uplift experiments, aligning algorithmic
behaviors with commercial objectives and ensuring compliant rollout of recommendation updates across categories.