# Mitch Not Mitchell

> Staff Software Engineer | Agent Infrastructure | MCP | Go | Kubernetes

Location: San Francisco, California, United States

Profile: https://flows.cv/mitchnotmitchell

Staff Software Engineer with 8+ years building distributed infrastructure and AI agent systems at scale. Creator of a scalable Go autonomous agent platform with 1,860+ MCP tool endpoints, durable execution, state checkpointing, fault recovery, and sandboxed compute isolation serving 17 enterprise customers. Deep expertise in Go, Python, Kubernetes, service-oriented architectures, and the Model Context Protocol (MCP). Specializing in agent execution infrastructure: trust-boundary isolation modes, circuit breakers, and containerized multi-tenant environments for autonomous AI systems.

Highlight: Designed and built an autonomous agent infrastructure platform in Go at Galileo AI – featuring multi-agent orchestration, state checkpointing, and fault recovery – deploying across 17 enterprise customers and earning company-wide recognition at All-Hands as "most complex agent built at Galileo."

Looking to bring deep MCP and agent infrastructure experience to a team pushing the boundaries of autonomous AI systems, agent safety, and scalable inference infrastructure.

## Work Experience

### Software Engineer - Agent Infrastructure, Platform @ Galileo

Jan 2025 – Present | Burlingame, California, United States

AI observability and evaluation platform for enterprise ML teams.

• Designed and built a large-scale Go autonomous agent platform (1,860+ MCP tool APIs across 72 modules) with sandboxed execution environments, state checkpointing, fault recovery, and trust-boundary isolation – recognized at All-Hands as "most complex agent built at Galileo."
• Served as Incident Lead for 4 P0 production incidents (6,200+ stuck jobs, Redis misconfigurations across 3 clusters, production API outages), resolving all within 24 hours with documented root causes and permanent fixes.
• Implemented agent execution infrastructure and auth frameworks – OAuth-based RBAC across 17 customer environments, tiered trust boundaries, rate limiting and circuit breakers for 15+ external integrations, and OpenTelemetry metrics with 12 Grafana dashboards.
• Built a durable execution engine with 5-minute checkpointing, atomic state writes, fault-tolerant consensus, and batch API optimization (50% cost reduction, 90% prompt cache hit rate) – enabling 24-hour autonomous execution cycles with crash recovery across 23 concurrent agent workers.
• Led enterprise customer migration from v1 to v2 with a Go data-migration CLI, ClickHouse/Postgres backup-restore, DNS cutover automation, and Grafana tracking – zero data loss, zero unplanned downtime.
• Managed deployments for 17 enterprise customers including Fortune 20 and Fortune 50 companies across AWS, GCP, and Azure, driving CVE remediation, GPU model bundle configuration, and TLS 1.3 hardening.

### Staff Software Engineer, MLOps @ Primer.ai

Jan 2023 – Jan 2024 | San Francisco, California, United States

AI platform for NLP and document understanding, serving defense and intelligence customers.

• Automated ML workflow deployments using GitHub Actions, transitioning 12+ production ML deliverables from manual Kubernetes deployments to ArgoCD automated workflows, eliminating deployment toil.
• Consolidated development infrastructure by 43% and parallelized ML workloads on GPU instances, reducing build times by 80%.
• Developed an IaC platform with GitOps automation, reducing infrastructure fulfillment from weeks to under 10 minutes (93% reduction), enabling self-service provisioning for ML research and sales MVPs.
• Designed and developed PrimerCLI (Bash, Python, Go), a centralized developer toolkit adopted by 100 engineers, reducing manual processes by 1,200 hours annually.
• Led the design and implementation of Kubernetes for the company's production migration to Azure, achieving 46% completion and delivering $2.8M in cost savings.

### Senior DevOps Engineering Lead, Platform Operations @ Capella Space

Jan 2021 – Jan 2023 | San Francisco, California, United States

A space technology company providing SAR imagery and satellite solutions for commercial and government use.

• Reduced build times by 80% by creating a unified toolkit that eliminated 8,000 lines of redundant code, containerized CI processes, and consolidated GitLab pipeline modules.
• Overhauled cloud infrastructure using IaC, improving code delivery reliability by 30% and building a centralized tools cluster that enabled local-parity development and testing environments.
• Built a redundant container registry for production systems, preventing satellite imagery pipeline outages during peak collection windows.

### DevOps Engineer @ Concourse Labs

Jan 2020 – Jan 2021

Cloud governance automation solutions specializing in Security-as-Code to prevent data breaches.

• Built the company's first CI/CD pipeline using Terraform and IaC (300 commits), integrating AWS CloudFormation and Azure deployments and enabling 1.5+ production deliveries per day.
• Reduced production errors by 50% through expanded end-to-end testing, frontend and unit test coverage, and automated monitoring and alerting.

### DevOps Engineer @ Google

Jan 2018 – Jan 2020 | Mountain View, CA

• Built a speech model evaluation system adopted by 300 engineers and 80 PMs, comparing quality data across usage history, forecasting performance, and shortening release cycles through automated tooling.
• Developed automated Python tooling that cut 800 hours from data collection, using ML models to detect user sentiment, map user journeys, and surface pain points.
• Sole designer of mouse-to-keyboard interaction for Google Assistant on web; filed for U.S. patent.
• Managed GPU-focused cloud workloads and CI runners for speech model training and evaluation pipelines.
• Led infrastructure migration enabling 100K+ Google employees to work remotely during COVID-19: maintaining communications, troubleshooting engineering environments, and adapting internal tools for external access.
• Modernized ML build/release documentation for Assistant (5 years outdated, 2 tech stack changes); met with dozens of engineers to map processes for bleeding-edge ML release cycles.
• Volunteered for Google's AI Ethics advisory group (2019-2020), contributing to discussions on responsible AI development, model interpretability, and evaluation standards.

### Founder @ Unifide

Jan 2017 – Jan 2018 | San Francisco Bay Area

Founded and led a consumer platform that unified social, messaging, email, and media APIs into a single web UI with composable widgets. Built the backend in Python/Flask, recruited and led a 5-person engineering team, and shipped the MVP.

## Education

### Associate's in Communication and Media Studies

American River College

### Computer Science

California State University-Sacramento

## Contact & Social

- LinkedIn: https://linkedin.com/in/mitchnotmitchell

---

Source: https://flows.cv/mitchnotmitchell

JSON Resume: https://flows.cv/mitchnotmitchell/resume.json

Last updated: 2026-04-10