# Steve Sledzieski

> Infrastructure Solutions Architect

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/stevesledzieski

Systems Architect with 16+ years of experience leading the design of scalable, high-availability infrastructure and automation frameworks across multi-cloud and on-prem environments. Proven impact on $1B+ ARR platforms, specializing in KPI-linked observability, capacity planning, and reliability engineering at 10,000+ VM scale. Drove cross-functional alignment from customer discovery through feature delivery. Known for building structured solutions to complex, production-critical challenges.

I love a good problem. If something interests me, I dig in: reading papers, watching videos, taking courses, borrowing books from my local library (and buying a copy if I really like one), and inevitably delivering monologues to my wife about why it's important.

CONTACT ME

Book a time: https://calendly.com/ssledzie

Email me at: ssledzie@gmail.com

## Work Experience

### Software Engineer @ Strong Tower Consulting

Jan 2025 – Present

- Contributing to evaluation infrastructure for Jules, Google's offline coding assistant.
- Developed tooling to prepare evaluation datasets for human raters and to automate reporting on rater feedback.
- Built scripts to extract code metrics from evaluation data (LOC, language distribution) and implemented AI-powered signal generation for complexity analysis and change categorization (bug fixes, features, refactors).
- Created dashboards and wrote SQL queries to process and analyze metrics from Jules task executions.

### Staff Infrastructure Engineer @ CloudTrucks

Jan 2025 – Jan 2025 | San Francisco, California, United States

- Designed and scaled GCP infrastructure for a containerized Python/Django/Celery/PostgreSQL platform; owned uptime, observability, capacity planning, and on-call rotations for critical systems.
- Automated infrastructure provisioning using Terraform, Packer, and Docker Compose, cutting manual overhead by 80% and enabling consistent, rapid deploys across environments.
- Developed production-grade tooling in Python and Bash, including webhook-based incident workflows and log analyzers to monitor production Celery worker saturation and task latency. Used AI-assisted development with Cursor IDE and GitHub Copilot.
- Implemented full-stack observability with Grafana, Prometheus, and Telegraf; integrated Sentry for MTTR reduction.
- Developed CI/CD pipelines in CircleCI with enforced linting, test gates, and rollback safety, reducing regressions and improving deploy confidence.
- Led infrastructure reviews for new features, shaping rollout strategies, safe database migrations, and scale-readiness plans across service boundaries.

### Staff Engineer @ Broadcom

Jan 2023 – Jan 2024

### Staff Engineer & Technical Lead, Network and Security @ VMware

Jan 2014 – Jan 2024 | Palo Alto, CA

- Architected an automated traffic generation framework using Python and scalable design patterns (ZeroMQ, Redis, pywin32) to monitor and optimize large enterprise infrastructure performance (3000+ VM workloads).
- Identified key scalability and performance factors for NSX Security Access Monitoring, Identity Firewall, and DFW solutions, achieving 99.99% availability and target P99 for key metrics.
- Led a team of 4+ engineers, managed scope and specifications for 50+ timely product releases, and collaborated with Product Management.
- Developed technical assets (best practices, deployment guides, and customer sizing and configuration guidance) to enable sustaining engineering, sales, and customer support. Led customer escalations within SLAs.
- Managed product environments with Jenkins and Infrastructure as Code (IaC), including deployment to private clouds; qualified L2 network stack performance with SmartNIC DPU, NUMA, and x86 architectures.
- Architected large infrastructure environments for microsegmentation, with solutions scaling to 1000+ workloads.
- Developed an automated, cross-platform framework for monitoring and collecting performance metrics using Python and shell scripting, public APIs, and custom data collectors built with pyvmomi.
- Developed Python performance frameworks and Django dashboards for diverse system topologies, traffic profiles, and metrics, improving reporting metrics by 8-13%.
- Developed a semi-automated Python framework to orchestrate recovery and monitor the security posture of a distributed disaster recovery solution during failover events; scaled KPIs and ensured 99.99% SLAs.
- Improved source code to enhance feature performance, scalability, and reliability. Used profiling techniques to identify and fix a memory leak in a JNI application.

### Senior Systems and Scale Engineer @ VMware

Jan 2007 – Jan 2014 | Palo Alto, CA

- Designed and implemented automated scalability frameworks for View Server appliances (10,000+ real VMs) using View Client APIs. Created stubs for authentication libraries to simulate failures and authentication flow cases.
- Managed a large internal deployment (50+ hosts, 10,000+ VMs on NFS network and iSCSI/Fibre Channel storage) to manage and broker connections for 10K virtual desktops.
- Executed operations at scale across VMware View (End User Computing) lifecycle management.
- Designed automated frameworks to orchestrate production workflows (PowerShell, Linux OS, kernel fundamentals).

## Education

### Bachelor of Science (BS) in Computer Science

California Polytechnic State University-San Luis Obispo

### Practical Deep Learning for Coders

Artificial Intelligence

### Blockchain Technology

Stanford University

### Project Management

Stanford University

## Contact & Social

- LinkedIn: https://linkedin.com/in/steve-sledzieski
- GitHub: https://github.com/bitorchard

---

Source: https://flows.cv/stevesledzieski

JSON Resume: https://flows.cv/stevesledzieski/resume.json

Last updated: 2026-04-10