Tech lead of Twitter's Presto and Zeppelin, helping evolve Twitter's SQL federation system into a world-class large-scale system that processes ~10 PB of data daily. Also led and contributed to other cross-team and cross-functional projects spanning a wide range of data systems, including BigQuery, Druid, and Spark/Neo4j for graph analytics.
Presto
•Led Twitter's Presto federation (10+ Presto clusters with 3000+ nodes, processing ~10 PB of data daily).
•Drove a project to create an end-to-end machine learning pipeline that learns from request logs to forecast the resource usage of Presto SQL queries (92%+ accuracy) and to store data exploration jobs in JupyterLab.
•Contributed to the router and scheduler to improve system performance, reducing P99 query queue time by 90%.
BigQuery
•Worked on the low-code ML project, advocating the use of BigQuery ML to simplify ML pipeline components such as model training and the feature store. Helped rebuild Twitter's notification ML models and reduce the dislike rate by 2+%.
Zeppelin
•Led Twitter's Zeppelin notebook service (~400 weekly active users).
•Drove the migration of Twitter's on-premises Zeppelin to Kubernetes in the cloud (GCP), following a move-and-improve strategy.
Druid
•Worked on a unified ingestion web service for Apache Druid, a real-time analytics database, to manage data ingestion jobs.
Graph Analytics
•Contributed to the company-wide graph analytics project and evaluated a hybrid architectural design with Spark and Neo4j for next-generation large-scale graph analytics.