# Gautam B.

> AI Engineer | AI/ML & Full Stack | Java, Python, Spring Boot, AWS, React | LLM Agents (RAG, LangGraph) | MS CS @ GMU

Location: San Francisco, California, United States

Profile: https://flows.cv/gautamb

Performance-focused Software Engineer with hands-on experience in deep learning inference optimization on GPU/CPU architectures, distributed systems, and full-stack development. MS CS at George Mason University (3.87 GPA) with published NLP research.

What I work on:

▸ DL & GPU Inference Optimization: Built Sentinel AI, a distributed CUDA-accelerated PyTorch pipeline (Go gRPC ingestion → Kafka → Python GPU workers on A100s via SLURM). Achieved 30× latency reduction (500ms → 16.1ms) at 24.3 FPS/1080p. CPU side: multi-backend auto-detection (OpenVINO, ONNX Runtime, INT8 quantization) with memory arena pre-allocation and async multi-threaded inference, for a 10–15× speedup (500ms → 30–50ms).

▸ Systems & Concurrency: Built a concurrent KV store in C++ with 16-way hash sharding, RW synchronization, and per-shard LRU caching: 10.7M ops/sec, sub-3µs p95 latency. Benchmarked concurrency designs under multi-threaded load: 2.2× throughput, near-linear scaling through 16 threads. Validated with 25+ GoogleTest cases and ThreadSanitizer.

▸ AI/ML & Agentic Systems: Architected HireReady, a 9-agent LangGraph pipeline with parallel DAG execution and hybrid evaluation (semantic similarity + NER). 200 analyses, 0 failures. Hardened with Resilience4j + Redis caching, sustaining 100-concurrent load tests with 0 failures under fault injection. Also built TRUST Agents, a 4-agent fact-checking system with Delphi consensus verification.

▸ Backend Engineering: 1 year at Backflipt building Spring WebFlux microservices: optimized DocumentDB reads (p95 latency −35ms), fixed async race conditions with request deduplication, standardized error contracts with correlation IDs (debug time −20%), and automated regression testing (15 min → 6 min). Proficient in React, FastAPI, Flask, PostgreSQL, MongoDB, Redis.

▸ Cloud & DevOps: AWS (EC2, RDS, S3, DocumentDB), Docker, Kubernetes, Jenkins, GitHub Actions, Kafka, Prometheus, Grafana. AWS Cloud Practitioner + AI Practitioner certified.

▸ Publications:

- Code-Mixed Telugu-English Hate Speech Detection (arXiv: 2502.10632)
- How Does A Multilingual LM Handle Multiple Languages? (arXiv: 2502.04269)

300+ LeetCode problems solved. Open to backend, systems, AI/ML, and full-stack roles; always happy to talk distributed systems, GPU optimization, or interesting problems.

## Work Experience

### Software Developer @ KeelWorks Foundation

Jan 2026 – Present

### Associate Software Engineer @ Backflipt

Jan 2023 – Jan 2023 | Hyderabad, Telangana, India

## Education

### Master of Science - MS in Computer Science

George Mason University

## Contact & Social

- LinkedIn: https://linkedin.com/in/satya2603
- GitHub: https://github.com/GautamaShastry/
- LeetCode: https://leetcode.com/u/gautam-2603/
- Portfolio: https://gautamportfolio.com

---

Source: https://flows.cv/gautamb

JSON Resume: https://flows.cv/gautamb/resume.json

Last updated: 2026-04-10