Experience
2023 — Now
2020 — 2022
Implemented two generations of metrics pipeline improvements handling petabytes of data weekly while reducing cost by more than 70%. Created cloud service provider telemetry streams widely used for critical operations and security workflows. Rolled out multiple global changes across regions and cloud providers and across generations of infrastructure configurations. Worked with teams across the organization to solve specific observability and operational issues, improve the observability baseline, and plan for systemic improvements. Advocated for Snowflake’s needs with vendors and open source leaders.
2019 — 2020
Developed the Lightstep Research initiative to provide in-depth technical information and working code for practices, techniques, and observations using cloud service APIs. Initiative to monitor cloud APIs in real time at global scale. Experience-based technical writing on development, observability, organization structures and challenges, and other topics as requested. Interacted with customers on various media, providing guidance for building and using observability effectively. Product feedback from a practitioner’s perspective. Public speaking on research and organizational practices, including Chaos Engineering. Public relations quotes and interviews for various publications.
2017 — 2019
Prototyped, developed, demonstrated, and advocated for use of shared instrumentation and metrics libraries across the company. Presented on and advocated for microservices strategies and best practices. Presented on and trained teams in effective operational troubleshooting. Requirements gathering, systematic analysis, design, and prototyping of new merchandising data platform services. Mentorship of engineers at all experience levels. Participation in and guidance of various subject matter working groups including observability / operability, messaging, microservices, and platforms. Analysis and suggested prioritization of company-wide security posture, including ad hoc pentesting.
2016 — 2017
San Francisco Bay Area
Design, implementation, and production deployment of a log aggregation system handling more than 500k messages per second and 500 MB/s with best-in-class operational insight. Led, implemented, and advocated use of a distributed tracing system using the OpenTracing API across a large portfolio of microservices. Applied insights from distributed tracing to isolate previously invisible failure modes. Led investigation of container-based deployment models and container orchestration systems. Reviewer / editor for Twilio Cloud Security Standard. Participated in on-call rotation and incident handling. Led and participated in Chaos Gamedays to verify system resilience. Languages, services, and platforms used: Go, Python, Java, OpenTracing, Datadog, BigQuery, Kinesis, HAProxy, AWS, Google Cloud Platform.
Education
Yale University
Master of Arts
Ohio Wesleyan University