Remote
Lead engineer for large-scale LLM serving and GPU infrastructure powering tier-1 customers across multi-cloud Kubernetes platforms.
Designed and operated large-scale LLM serving pipelines using disaggregated prefill/decode on multi-node H100 clusters (vLLM, Go, KubeRay), cutting P99 latency by 60% and increasing throughput 4x for mission-critical production workloads
Built fair-share GPU scheduling across shared clusters using Kueue, KEDA, and fractional GPU partitioning, cutting idle GPU cost by ~$200K/month while maintaining 99.9% job success and cluster availability
Designed and deployed multi-tenant, cloud-native Kubernetes platform (AppHub IDP) with GitOps (ArgoCD, Tekton), service mesh, and disaster recovery; cut deployment time from 2 days to 15 minutes across multiple engineering teams
Standardized end-to-end observability for AI agents using OpenTelemetry, Prometheus, and high-cardinality metrics, enabling sub-second root cause analysis of distributed failures and reducing incident MTTR
Built MLflow-backed experiment and model infrastructure on Kubernetes (PostgreSQL, MinIO) as a shared platform, automating deployments via Go CLIs and Terraform across AWS, GCP, and Azure
Contributed open-source patches to NVIDIA Dynamo, improving distributed inference throughput and memory efficiency on A100/H100 GPU clusters
Led ISO/IEC 27001 and SOC 2 Type II compliance programs, architecting security controls across cloud infrastructure
Seattle, Washington, United States
Developed and benchmarked distributed inference systems using Triton Inference Server, Kubeflow, and Kubernetes, achieving significant throughput improvements for LLM serving workloads in HPC environments
Built RAG (Retrieval-Augmented Generation) pipelines using LangChain and LLMs, enabling semantic search over enterprise knowledge bases with sub-2-second query latency
Deployed and managed multi-node GPU clusters using Slurm workload manager and HPCC systems, running MPI-based distributed training jobs with NVIDIA cuDNN and CUDA optimization
Implemented end-to-end MLOps pipelines using Apache Airflow and MLflow, automating model training, validation, and deployment workflows on AWS and GCP
Architected Nginx-based reverse proxy and service mesh configurations with Kubernetes, improving API gateway performance and enabling zero-downtime deployments
Contributed to internal developer platform (IDP) tooling using Helm Charts and YAML-based Kubernetes operators, reducing onboarding time for new engineering teams
2019 – 2022
Bengaluru, Karnataka, India
Designed and maintained distributed microservices architecture using Java, Python, and Scala on Kubernetes, supporting enterprise clients across banking and financial services sectors
Built and optimized CI/CD pipelines using Jenkins, Docker, and Terraform on AWS and Azure, reducing deployment time by automating infrastructure provisioning and release management
Implemented container orchestration solutions using Kubernetes and Docker Swarm, migrating legacy monolithic applications to cloud-native microservices architecture
Developed high-throughput data processing systems using Apache Airflow and SQL, handling large-scale ETL workflows for enterprise analytics platforms
Engineered reliability improvements including Nginx load balancing, DNS configuration, and Single Sign-On (SSO) integrations, improving system uptime and security posture for client-facing applications
Collaborated with cross-functional teams to deliver SOC 2-compliant software systems, implementing security controls and audit logging across distributed services running on AWS and Azure infrastructure
2021 – 2022
Bengaluru, Karnataka, India
Co-founded an edtech startup building an integrated, single-window application portal that streamlines student access to academic and career resources; contributed to the platform’s design, functionality, and market readiness.
Drove product adoption by onboarding schools, colleges and universities across regions.
Built and led a team of 20 across tech, operations, and outreach functions.
Formed strategic partnerships with resource and software providers and onboarded an advisory board.
Led core functions including team building, operational strategy, and technology development.
Led platform design, market readiness, and go-to-market planning for the student-facing application.
Vellore Area, India
Education
Stevens Institute of Technology
Master's degree
Vellore Institute of Technology