21 years of experience building systems and software to tackle some of the most challenging problems at startups and at small and large companies, at a scale of millions to billions of users. I build reliable and efficient systems that provide the infrastructure enabling products to delight users.
Experience
2023 — Now
San Mateo, California, United States
I lead the Scalability engineering team (engineering manager, formerly tech lead).
My team owns the TACO (squeeze load testing) and C3 (capacity prediction) systems mentioned in https://corp.roblox.com/newsroom/2025/06/roblox-infrastructure-supporting-record-breaking-games, in addition to the load testing platform and rotrace, a tool that traces HTTP API requests from Roblox games to find inefficiencies and measure the impact of games on the Roblox platform.
We design and develop systems for:
• Capacity prediction, bottleneck detection, and performance analysis
• Load testing
• Squeeze testing
Our systems analyze production data and load-testing results with ML to provide predictions for scaling Roblox systems. We built a tailored system to scale for record-breaking growth driven by Grow a Garden and Steal a Brainrot.
I'm the co-creator, with Jan Berkold, and initial developer of the TACO squeeze testing system mentioned in https://corp.roblox.com/newsroom/2025/06/roblox-infrastructure-supporting-record-breaking-games. Took it from 0 to 1, idea to first production run, in 3 weeks. It now runs automated squeeze tests to prepare for major traffic peaks.
2019 — 2022
San Francisco Bay Area
Observability team.
Previously: provisioning, deployment pipeline, CI, containerization, MySQL reliability, Redis/caching reliability and performance.
As a member of the infrastructure group, I build software and systems to support reliability, scalability, and developer effectiveness, with the goal of providing the best experience to Airtable's customers. I also work across functions to make reliability part of our processes and culture.
2015 — 2019
Menlo Park, California
Production Engineering Tech lead (staff+ level leading 8 engineers) in the Presto team (~30 engineers, part of AI infra).
Facebook's data warehouse operates at exabyte scale. I deployed what was then the largest Presto cluster in production.
• Technical leadership: defined the strategy and the roadmap. Grew the team. Mentored team members.
• Scalability and reliability of Presto in the Data Warehouse/Lake: exabytes of data, tens of thousands of machines.
• Migration of Facebook's A/B testing framework to Presto Raptor: hundreds of petabytes with < 30s p90 latency on experiments that impact billions of users. Clusters in multiple regions. Drove the reliability, scalability, and performance work.
• Testing and deployment of DCTCP: the scale of Presto at Facebook and the amount of data it processed saturated the network. DCTCP is a congestion-control protocol that maximizes both throughput and reliability.
• Non-disruptive deployments of Presto: I led the evolution of the deployment process to update Presto clusters without disruption (~10k internal daily active users) across datacenters.
• Automation, onboarding, and capacity management of ads products with Presto on sharded MySQL and Thrift; use case: customer-facing dashboards and reports for ads.
• Observability for Presto's infrastructure and query execution: with clusters executing complex SQL queries on thousands of distributed nodes, I developed new tools to scale debugging, monitoring, and address internal customer needs quickly. Deep debugging of the JVM and Presto internals.
• Maintainer of the Presto Python client library (https://github.com/prestodb/presto-python-client): I created the new Presto Python client library, initially to support transactions; it went on to become the reference library.
• Open source community engagement (F8 Classroom, The Diff Podcast, meetups, coordination, definition of processes).
2012 — 2015
Paris
Built the backend and infrastructure of the two core products from scratch. Grew the team and the culture.
Founding engineer: reported directly to the founders, sat on the executive board, participated in defining the strategy, defined the development process, and built, ran, and scaled the infrastructure. This was a hands-on role where I worked as a tech lead while also designing systems and writing a lot of code.
• Backend programming: log parsing in C/C++ (able to filter and parse tens of terabytes of data on a single machine in under an hour) and Python, processing in Python (and prototypes in Scala with Spark).
• Created and developed a distributed workflow execution framework (Python, Amazon SWF, S3, EC2): powers the crawls and distributed computations. Implemented a dataflow model with Python code, horizontally scalable and resilient to failures.
• Distributed systems architecture on dedicated servers and in the cloud (built the platform from scratch on Amazon AWS)
• Optimization and troubleshooting of performance-intensive algorithms for data processing and analytics
• Configuration management with Puppet and Ansible
• Amazon AWS (EC2, S3, SWF, Route 53, ...) programming (custom tools and integration into applications) and architecture (automation and immutable/disposable servers)
• Linux system administration
2011 — 2012
Paris Area, France
Founding engineer. Reported directly to the founders.
I co-developed a machine learning framework (Python, Scipy/Numpy) with a Python embedded DSL that compiled to computations running on AWS. Implemented a dataframe API in Python (before pandas existed).
Education
EPITA: Ecole d'Ingénieurs en Informatique
System