# Sai Siva Hemanth P.

> AI Engineer at Apple

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/saisivahemanthp

- Results-driven Backend Software Engineer with 8+ years of expertise architecting scalable microservices, high-performance REST/gRPC APIs, and cloud-native solutions across AWS, Azure, and GCP.
- Deep expertise in distributed systems and data engineering, now applying these foundations to GenAI platform development. Currently focused on RAG architectures, LLM orchestration with LangChain/LangGraph, and building production AI systems at scale.
- Passionate about the intersection of scalable systems and AI. Staying current with the rapidly evolving GenAI ecosystem: memory, MCP, safety, governance, and responsible AI.
- Proven track record building enterprise systems using Python, Java Spring Boot, and Go, with deep experience in event-driven architectures and big data pipelines.
- Expert in full SDLC delivery from design to production, with proven experience in SaaS platform migrations, real-time data processing, and performance optimization using Kafka, Redis, Kubernetes, and distributed databases. Consistently deliver solutions that reduce operational costs, improve system reliability, and drive significant business growth.
- Known for turning messy, ambiguous problems into scalable, production-grade systems that deliver real business impact. I’ve led cross-functional teams, influenced product roadmaps with rapid prototypes, and bridged engineering with business needs through AI-first thinking.

Technical Skills

- Languages: Python [Flask, FastAPI], Go, Java [Spring Boot] | Familiarity: JavaScript [Node.js]
- Database: PostgreSQL, MySQL, Snowflake, Elasticsearch, Cassandra
- Cloud skills: GCP, Azure, AWS
- Big Data: Apache Spark, Apache Flink, Apache Airflow, Iceberg, Hadoop
- Libraries: Pandas, scikit-learn, PyTorch
- AI/ML Tools: LangChain, LangGraph, LlamaIndex, OpenAI, vector databases
- Technologies: RESTful web services, Kafka, Redis, gRPC, Protocol Buffers, Flask, Kibana, Grafana
- CI/CD: Kubernetes, Docker, Git, Jenkins, ELK
- AI Concepts: LLMs, Prompt Engineering [ReAct], RAG, LangGraph, LangChain, LlamaIndex, Model Context Protocol (MCP), Agent2Agent Protocol, LangSmith, LLM Evaluation, Similarity Search, Guardrails, Memory

Looking to contribute to teams building AI-native products, large-scale systems, and data-intelligent platforms. Let’s connect if you're building the future.

I'm on an H-1B visa and would require visa sponsorship.

#GenAI #AIEngineering #BackendEngineer #DataPipelines #LLM #RAG #TechLeadership #OpenAI #ProductEngineering #VibeCoding #DataEngineering #Microservices #DistributedSystems

## Work Experience

### Senior AI Engineer @ Apple

Jan 2026 – Present | San Francisco Bay Area

Building Agentic AI at Apple

### Senior Software Engineer @ Walmart Global Tech

Jan 2023 – Jan 2026 | San Francisco Bay Area

🎯 Driving scale, efficiency, and innovation across the Walmart Creator platform using microservice architecture, data engineering, cloud, and AI/ML infrastructure.

Key Projects & Impact

🚀 Big Data-ML Pipeline Infrastructure – Multi-Source Data Ingestion & Processing

Built enterprise-scale data pipelines processing TBs daily using Apache Spark on Google Dataproc, orchestrated via Airflow across multi-cloud infrastructure. Collaborated with ML/Data Science teams on creator classification models, reducing onboarding SLA from 10-15 days to 72 hours (75% improvement) and unlocking multi-million-dollar revenue streams.
Optimized Spark performance by 47% (15→8 hours) and delivered $120K annual savings through custom rate limiting and partitioning strategies. Automated the ML deployment pipeline using Vertex AI Model Registry (MLOps).

🤖 Intelligent Support AI Agent - Production LLM+RAG Service

Built and deployed a production microservice using LLM and RAG, processing 10K+ documents with OpenAI embeddings and Chroma vector DB, achieving 85% response accuracy. Optimized performance through async processing and multi-level caching, reducing latency from 3.2s to <800ms (76% improvement). Engineered a comprehensive safety layer with content filtering, prompt injection prevention, and PII detection. Deployed on Azure Kubernetes with auto-scaling, supporting a 150% increase in user engagement and a 70% reduction in support tickets.

📩 Real-time Event Processing & Data Pipeline

Architected an event-driven notification service using Apache Kafka and Java for Creator approval workflows. Provided technical leadership and mentored three junior engineers, delivering automated end-to-end processing that reduced review time by 80% and saved hundreds of engineering hours quarterly through scalable microservice architecture.

🎯 Led 4+ hackathons and built AI prototypes (trend detection engines, AI voice storefronts, competitor analysis), earning leadership recognition and influencing the product roadmap.

### Senior Software Engineer @ Chainalysis

Jan 2022 – Jan 2023 | United States

Blockchain Risk Intelligence Platform – Real-time Address Monitoring System

- Architected and developed enterprise-grade blockchain address monitoring microservices using Java Spring Boot, enabling compliance teams to perform real-time risk assessment and streamline user onboarding for decentralized apps.
- Implemented event-driven architecture with Apache Kafka to process blockchain transaction data at scale, integrated with PostgreSQL for risk scoring and compliance audit trails.
- Delivered transformational business impact by establishing automated compliance workflows that enabled onboarding of 50+ decentralized applications, resulting in 300% revenue growth and positioning the platform as a trusted Web3 gateway.
- Deployed a cloud-native solution using Docker containerization and Kubernetes orchestration, ensuring high availability and scalability for processing millions of blockchain addresses daily.

Tech stack: Java, Spring Boot, Kafka, PostgreSQL, Kubernetes, Docker

### Software Engineer @ Clarifai

Jan 2021 – Jan 2022 | San Francisco Bay Area

ML Data Infrastructure and Platform:

- Developed a highly scalable async metadata tagging API using Go/Redis task queues, reducing latency from 8 seconds to 200ms.
- Built a highly available (99.5%) gRPC endpoint ingesting data from S3/GCS/Azure, achieving 5x upload acceleration, which helped secure multiple high-value enterprise partnerships and increased revenue through improved platform performance.
- Led the end-to-end software development lifecycle while collaborating with product, business, and UI/UX teams.
- The feature improved customer satisfaction by making the user journey easy and pleasant, since it was the first point of action on our platform.
- It increased platform scalability by offloading work from the front end, which had previously made numerous POST requests, and made the tagging feature extensible.

Data Pipeline - Ingesting from Cloud Object Storage(s)

- Developed a feature to ingest data from the three major cloud storage providers (AWS S3, Google Cloud Storage, and Azure Blob Storage) into Clarifai’s data input pipeline.
- Implemented it as a scalable async microservice running on Kubernetes pods in an AWS EKS cluster, using Go, PostgreSQL, and Redis, and integrating the SDKs provided for S3, Google Cloud Storage, and Azure Blob Storage.
- Collaborated with cross-functional teams to gather user stories, design requirements, and implementation details.
  Co-authored the design doc with the team lead.
- The feature delighted customers and was the most awaited functionality, since it eases uploading high volumes of large media datasets onto the Clarifai platform at scale.

### Member of Technical Staff @ Nutanix

Jan 2018 – Jan 2021 | San Jose

Worked with the Data Analytics team for "Nutanix Files", a petabyte-scale distributed file storage solution from Nutanix with more than 1,500 enterprise customers and multi-million dollars in annual revenue.

SaaS Platform Migration & API Development – Anomaly Detection & Capacity Analytics

- Architected and developed high-performance REST APIs for anomaly detection and capacity explorer services using Python Flask, implementing caching strategies, pagination, and asynchronous processing to optimize API response times and handle concurrent user loads.
- Led critical backend development during the enterprise SaaS migration from on-premises infrastructure to AWS cloud, leveraging EC2, RDS (PostgreSQL), S3, Lambda, Kafka, and Snowflake to achieve enhanced scalability, 99.9% availability, and reduced operational costs.
- Optimized API performance through database query optimization, connection pooling, and AWS ECS containerization, enabling the platform to scale from single-tenant to multi-tenant SaaS architecture supporting hundreds of concurrent users.
- Implemented a cloud-native data pipeline integrating the Snowflake data warehouse with real-time Kafka streaming and S3 storage, ensuring seamless data flow between analytics services and improved system reliability.
- Developed a cluster monitoring, alerting, and log collection service that gathers critical cluster information across several multi-node clusters in a data center. This project improved the efficiency of resource utilization and management by 75% and saved $250K on resource spending.
### Web Application Developer @ San Jose State University

Jan 2017 – Jan 2017 | San Jose, California

### Software Development Intern @ Deary LLC

Jan 2017 – Jan 2017 | San Jose, CA

## Education

### Master of Science in Computer Engineering

San José State University

Jan 2016 – Jan 2017

### Bachelor of Technology in Electronics and Communications Engineering

Jawaharlal Nehru Technological University

Jan 2011 – Jan 2015

## Contact & Social

- LinkedIn: https://linkedin.com/in/hemanthpratury

---

Source: https://flows.cv/saisivahemanthp

JSON Resume: https://flows.cv/saisivahemanthp/resume.json

Last updated: 2026-03-22