Software Engineer at Databricks. Master's degree in Computer Science & Engineering from the University of Michigan. Skier.
2023 — Now
San Francisco, California, United States
Spearheaded the design and rollout of an event-driven system for model metadata events, migrating customers to our unified governance platform and completing the MLOps experience.
Drove cross-functional alignment across backend, infra, and product teams to define system requirements and SLAs, balancing trade-offs between scalability, availability, and throughput.
Continuously evolved the system architecture in response to shifting load demands, refining the design to maintain performance and adaptability through each phase of the project.
Built a lightweight producer and listener service that subscribes to metadata changes, supporting 100K events per day and generating $500K yearly.
Automated downstream workflows with reliable, idempotent execution in response to events.
Delivered an in-UI “duplicate and promote” workflow, enabling customers to copy model versions across environments without leaving their current view.
Reduced context switches by 80% and accelerated migration tasks with asynchronous job execution and real-time progress updates.
Automated access control checks to maintain security and compliance during model transfers.
Implemented key tracing capabilities for MLflow 3, improving visibility for GenAI applications.
Unified all trace repositories behind a single endpoint and search metric, enabling a cohesive view and analysis of related traces across the Databricks ecosystem.
Integrated trace insights with the model registry, enabling tracking of key GenAI development metrics.
Designed and implemented a mechanism for models to live in Databricks-managed storage.
Provided live support for engineering incidents during on-call shifts. Delivered code and configuration changes that reduced subsequent alerts by 50%.
2021 — 2023
San Mateo, California, United States
Distributed systems engineering role responsible for developing and scaling a distributed data orchestration platform handling over 1 million requests a day and 10 PB of data. Handled customer issues and expanded testing along key dimensions.
Designed and implemented a new multi-threaded snapshotting mechanism for a Raft-based journaling system that improved snapshotting performance by over 90% using Apache Ratis, gRPC, and RocksDB.
Optimized internal metadata storage using RocksDB by tuning compression parameters to reduce the metadata size by 70%.
Designed and implemented internal chaos testing by integrating the Chaos Mesh fault injection framework into the nightly regression tests using Kubernetes, Helm, and Go.
Created an internal microbenchmarking tool using Java JMH for critical-path performance regression and evaluation. Identified a 30% performance gap between gRPC and Alluxio gRPC.
Expanded monitoring capabilities by creating dozens of nightly dashboards and graphs, built using Go, covering system correctness, performance regression, and stability.
Handled customer issues in both PoC and production contexts, delivering support and fixes for S&P 500 clients.
Ann Arbor, MI
Academic Environment
Wrote an autograder for class projects in Python and C++ that accounts for non-deterministic or non-standard output. Its 90% grading accuracy reduced human intervention by 90% and cut correctness grading time by 50%.
Autonomously set projects’ grading rubrics from the professor’s general guidelines, reducing his involvement in the grading process by 90%.
2019 — 2019
Luxembourg City, Luxembourg
Data Science Intern – Growth Environment | Multicultural teamwork
Designed and implemented new document retrieval algorithms for Talkwalker’s AI document categorization engine, using XGBoost to quantify improvements and Elasticsearch to search and filter results, reducing the human input needed to achieve the same accuracy by 10%.
Developed a comprehensive test suite in Java for Talkwalker’s AI document categorization engine using the JUnit framework, exposing five bugs in the current architecture.
Adapted to a multilingual environment of over 30 nationalities, drawing on my bilingualism and biculturalism to seek solutions from employees of all linguistic backgrounds.
Lyon Area, France
Start-up Environment | Autonomy | Scarce resources
Charged with delivering an iOS application that streamlines social security paperwork for traveling French professionals, using Xcode and Swift for front-end mobile development and Google Firebase with its iOS SDK as the backend and database service.
Chose a Google Firebase NoSQL database for its simplicity in handling customer information and designed the database structure, improving front-end development efficiency.
Education
University of Michigan
Master of Science - MS
University of Michigan
Bachelor of Science - BS
2009 — 2016
Cité Scolaire Internationale de Lyon
High School Diploma
2009 — 2016