# Steve Sledzieski

> Infrastructure Solutions Architect

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/stevesledzieski

Systems Architect with 16+ years leading the design of scalable, high-availability infrastructure and automation frameworks across multi-cloud and on-prem environments. Proven impact on $1B+ ARR platforms, specializing in KPI-linked observability, capacity planning, and reliability engineering at 10,000+ VM scale. Drove cross-functional alignment from customer discovery through feature delivery. Known for building structured solutions to complex, production-critical challenges.

I love a good problem. If something interests me, I dig in: reading papers, watching videos, taking courses, borrowing books from my local library (and buying a copy if I really like one), and inevitably delivering monologues to my wife about why it's important.

CONTACT ME
Book a time: https://calendly.com/ssledzie
Email me at: ssledzie@gmail.com

## Work Experience
### Software Engineer @ Strong Tower Consulting
Jan 2025 – Present
• Contributing to evaluation infrastructure for Jules, Google's offline coding assistant
• Developed tooling to prepare evaluation datasets for human raters and automated reporting on rater feedback
• Built scripts to extract code metrics from evaluation data (LOC, language distribution) and implemented AI-powered signal generation for complexity analysis and change categorization (bug fixes, features, refactors)
• Created dashboards and wrote SQL queries to process and analyze metrics from Jules task executions

### Staff Infrastructure Engineer @ CloudTrucks
Jan 2025 – Jan 2025 | San Francisco, California, United States
• Designed and scaled GCP infrastructure for a containerized Python/Django/Celery/PostgreSQL platform; owned uptime, observability, capacity planning, and on-call rotations for critical systems.
• Automated infrastructure provisioning using Terraform, Packer, and Docker Compose, cutting manual overhead by 80% and enabling consistent, rapid deploys across environments.
• Developed production-grade tooling in Python and Bash, including webhook-based incident workflows and log analyzers to monitor production Celery worker saturation and task latency. Used AI-assisted development with Cursor IDE and GitHub Copilot.
• Implemented full-stack observability with Grafana, Prometheus, and Telegraf; integrated Sentry for MTTR reduction.
• Developed CI/CD pipelines in CircleCI with enforced linting, test gates, and rollback safety, reducing regressions and improving deploy confidence.
• Led infrastructure reviews for new features, shaping rollout strategies, safe database migrations, and scale-readiness plans across service boundaries.

### Staff Engineer @ Broadcom
Jan 2023 – Jan 2024

### Staff Engineer & Technical Lead, Network and Security @ VMware
Jan 2014 – Jan 2024 | Palo Alto, CA
• Architected an automated traffic generation framework in Python using scalable design patterns (ZeroMQ, Redis, pywin32) to monitor and optimize large enterprise infrastructure performance (3000+ VM workloads).
• Identified key scalability and performance factors for NSX Security Access Monitoring, Identity Firewall, and DFW solutions; achieved 99.99% availability and met P99 targets for key metrics.
• Led a team of 4+ engineers; managed scope and specifications for 50+ on-time product releases in collaboration with Product Management.
• Developed technical assets (best practices, deployment guides, and customer sizing and configuration guidance) to enable sustaining engineering, sales, and customer support. Led customer escalations within SLAs.
• Managed product environments with Jenkins and Infrastructure as Code (IaC), including deployment to private clouds; qualified L2 network stack performance across SmartNIC DPU, NUMA, and x86 architectures.
• Architected large infrastructure environments for microsegmentation with solutions scaling to 1000+ workloads.
• Developed an automated cross-platform framework for performance metric monitoring and collection using Python and shell scripting, public APIs, and custom data collectors built on pyvmomi.
• Developed Python performance frameworks and Django dashboards for diverse system topologies, traffic profiles, and metrics, improving reported metrics by 8-13%.
• Developed a semi-automated Python framework to orchestrate recovery and monitor the security posture of a distributed Disaster Recovery solution during failover events; scaled KPIs and met 99.99% SLAs.
• Improved source code to enhance feature performance, scalability, and reliability. Used profiling techniques to identify and fix a memory leak in a JNI application.

### Senior Systems and Scale Engineer @ VMware
Jan 2007 – Jan 2014 | Palo Alto, CA
• Designed & implemented automated scalability frameworks for View Server appliances (10,000+ real VMs) using View Client APIs. Created stubs for authentication libraries to simulate failures & authentication flow cases.
• Managed a large internal deployment (50+ hosts, 10,000+ VMs on NFS and iSCSI/Fibre Channel storage) that managed and brokered connections for 10K virtual desktops.
• Executed lifecycle management operations at scale for VMware View (End User Computing).
• Designed automated frameworks to orchestrate production workflows (PowerShell, Linux OS, kernel fundamentals).


## Education
### Bachelor of Science (BS) in Computer Science
California Polytechnic State University-San Luis Obispo

### Practical Deep Learning for Coders
Artificial Intelligence

### Blockchain Technology
Stanford University

### Project Management
Stanford University


## Contact & Social
- LinkedIn: https://linkedin.com/in/steve-sledzieski
- GitHub: https://github.com/bitorchard

---
Source: https://flows.cv/stevesledzieski
JSON Resume: https://flows.cv/stevesledzieski/resume.json
Last updated: 2026-04-10