Responsible for managing CockroachDB infrastructure. Key accomplishments: implemented and rolled out a multi-region active/active deployment across 3 geographical regions. Designed and implemented a custom metrics framework in which any metric can be defined in a YAML file and captured from any SQL query or external command (open source TBD). Identified a gap in CockroachDB's Point-In-Time-Recovery product offering, and designed and implemented a solution that improves both RPO and RTO (open source git repo: https://github.com/sql888/cockroach/tree/gli/add_restore_from_cdc_cmd).
Responsible for managing the Storage/Queue team's Kafka infrastructure. Successfully migrated from Confluent Kafka Cloud to Roblox's in-house Kafka infrastructure, saving the company millions of dollars. Customized open-source Apache Kafka to fit Roblox's internal use cases.