Software Engineer with 5+ years of experience specializing in AI/ML cloud platforms, distributed systems, and cloud-native architectures across AWS and GCP. Strong expertise in Kubernetes (EKS/GKE), microservices, and infrastructure-as-code with Terraform.

Experience

Prowesys

Software Engineer, AI/ML & Cloud Platform

2024 — Present

Led cloud-native deployment of distributed systems on AWS and GCP, leveraging Kubernetes (EKS/GKE) to build scalable, highly available services across multi-region environments using Terraform.

Built and optimized event-driven data pipelines using Kafka and RabbitMQ, processing 500M+ messages/day with S3/ADLS-backed storage and reducing event propagation latency by 70%.

Developed observability and monitoring solutions using Grafana, ELK, CloudWatch, and GCP Profiler, improving system visibility and reducing MTTR by 35% across production services.

Built and deployed LLM inference pipelines using vLLM and Triton Inference Server, improving system scalability and resource utilization across Kubernetes-based deployments.

Developed end-to-end LoRA and SFT fine-tuning pipelines for domain-specific models, improving compliance text classification accuracy using BigQuery and Snowflake feature engineering workflows.

Designed and deployed FastAPI-based ML services serving 1M+ daily requests, leveraging async processing and Redis caching to reduce p95 latency by 60%.

University of North Texas

Student Assistant

2023 — 2024

Tata Elxsi

Software Engineer, Distributed AI Systems

2020 — 2023

Bengaluru

Designed and deployed identity resolution microservices in Python/C++, unifying 50M+ Entra ID and HR records with versioned audit trails, self-healing workflows, and high availability.

Refactored a monolithic system into a scalable microservices architecture, using contract testing and phased rollouts to improve system throughput by 45% without regressions.

Built high-throughput data pipelines using Kafka and Spark Streaming, processing 200M+ events/day with secure archival to S3/Glacier and compliance with FINRA/SEC retention policies.

Built and deployed LLM inference services using Triton Inference Server and vLLM, enabling scalable, production-grade model serving for enterprise workloads.

Optimized large language model performance (GPT, BERT, LLaMA) through quantization and efficient inference strategies, reducing memory usage by 60% and lowering inference latency.

Developed and optimized multi-modal inference pipelines with advanced decoding strategies (Greedy, Beam, Sampling), improving token generation performance by 20%.

Education

University of North Texas

Master's degree

Sree Venkateshwara College of Engineering

Engineer's degree