# Harthik Sonpole

Actively Seeking Full-Time Roles | MSCSE at Santa Clara University | AWS Certified Solutions Architect | Machine Learning | NLP | Python | Linux

Location: Santa Clara, California, United States
Profile: https://flows.cv/harthik

I build AI that solves real-world problems, from recommendation systems to NLP and text classification. Proficient in Python, PyTorch, and TensorFlow, I am also an AWS Certified Solutions Architect who knows SQL inside out. On the side, I craft cross-platform mobile apps for iOS and Android with Flutter. Let's create something groundbreaking together.

## Work Experience

### Software Developer @ SCU Frugal Innovation Hub
Jan 2024 – Present | Santa Clara, CA

- Designed and deployed production-grade agentic LLM services for automated report generation, summarization, and internal knowledge workflows, integrating secure LLM agents into live systems.
- Built LLM-backed cloud APIs with authentication, rate limiting, request isolation, and usage controls to ensure safe and reliable production inference.
- Architected serverless AI backends on AWS and Firebase, optimizing cold starts, function concurrency, and data access patterns for low-latency inference.
- Implemented cost-optimized LLM inference pipelines, balancing throughput and latency through batching, caching, and selective model routing.
- Developed event-driven cloud workflows that integrate LLM services with internal tools and data sources for automated document processing and analytics.
- Set up CI/CD pipelines for AI services, enabling automated builds, secure deployments, and rapid iteration without downtime.
- Improved system reliability and observability by adding structured logging, metrics, and alerts for production AI workloads.
- Collaborated cross-functionally with researchers, product teams, and educators to translate AI research prototypes into scalable, production-ready systems.
### Software Application Developer @ AtlasProds Technologies LLP
Jan 2022 – Jan 2023 | Hyderabad, Telangana, India

- Designed and deployed AWS-native AI platforms for large-scale semantic retrieval and LLM-powered knowledge systems.
- Architected vector search services processing 1M+ documents using transformer embeddings and distributed indexing, achieving sub-second P95 latency.
- Built multi-stage retrieval pipelines (vector partitioning, approximate nearest neighbor search, and similarity re-ranking) to enable high-throughput, cost-efficient inference.
- Implemented retrieval-augmented generation (RAG) pipelines combining vector search with LLM-based reasoning and summarization for enterprise knowledge applications.
- Developed end-to-end MLOps pipelines on AWS, covering data ingestion, embedding generation, model versioning, CI/CD, and controlled model rollouts.
- Integrated AI services across hybrid cloud and on-prem environments, ensuring secure data access and scalable inference.
- Optimized inference performance and cloud costs through index sharding, caching strategies, batching, and infrastructure tuning.
- Collaborated with platform, data, and product teams to productionize research prototypes into reliable, scalable AI systems.

## Education

### Master of Science (MS) in Computer Science
Santa Clara University

## Contact & Social

- LinkedIn: https://linkedin.com/in/harthikss9

---

Source: https://flows.cv/harthik
JSON Resume: https://flows.cv/harthik/resume.json
Last updated: 2026-04-10