Build analytics products at scale. I bring together product experience and distributed systems expertise to create reliable, high-performance infrastructure that users depend on.
Experience
2023 — Now
San Francisco Bay Area
Tech lead on ETL pipelines to cloud storage & warehouse
Ops lead on real-time events API & warehouse ingestion
2021 — 2023
San Jose, California, United States
Ads Callback Platform
➤ Developed and maintained distributed stream processing pipelines that ingest ad impressions and engagements in real time for downstream use cases such as Ads Serving, Billing, and prediction models.
2020 — 2021
Sunnyvale, California, United States
User Data Platform / Membership Platform
➤ Designed and implemented a large-scale, stateless RESTful Profiling API that serves as a centralized profiling data ecosystem, provides data to customers across the company, and handles 150K+ QPS globally
➤ Designed and built a secure data ingestion pipeline to consume customers' profiling data from Yahoo Grid; the ingested data is the source for multiple company-wide backend services. The redesigned pipeline reduced onboarding of new Grid feeds from weeks to days
➤ Implemented and maintained a scalable, replicated, low-latency key-value NoSQL user database service that holds 3B+ user accounts and serves 3M+ QPS globally
➤ Led the team's service operations and maintenance, including keeping packages and platforms up to date, building configuration management tools and CI/CD pipelines for service deployment on 1000+ hosts, and providing clear documentation and an easy-to-understand operational model
2019 — 2019
Sunnyvale, California, United States
Identity Platform
➤ Implemented a stats-logging-service API that handles 1B+ HTTP requests per day for the Yahoo login service and cut the hours spent on stats log analysis by 30%
➤ Wrote unit and integration tests for the API using JUnit/JMockit and Cucumber, reaching 100% coverage
➤ Used Splunk to analyze account activity, and performed data exploration with distributed computing (Hadoop, Hive, and PySpark) to identify login patterns and prevent hacking activity
Project: Password Security Guard
➤ Implemented a new web service serving 700M+ global accounts, including Yahoo, AOL, and others, that prevents 50% of users from using breached passwords
➤ Collaborated with UI designers to implement a grid layout for the new change-password page, adding easy-to-understand instructions that cut users' time on the page by 20%; built on a Node.js architecture and developed and tested in Docker
➤ Built a robust server using Undertow that exposes a RESTful API secured with SSL certificates
➤ Integrated code across the Yahoo login service stack, the frontend stack, and the new service, including unit tests and integration tests
2015 — 2018
Hsinchu County/City, Taiwan
Design Signoff Team
➤ Developed various machine learning algorithms to build predictive analytics metrics that improve chip performance. The scalable solution also reduced run time by 20%
➤ Automated 5+ workstreams in the IC design verification phase using high-performance Python-based algorithms. The work resulted in a 30% performance increase and was standardized into a workflow at MediaTek adopted by more than 50 engineers
➤ Managed 3 cross-team smartphone projects and applied a more data-driven approach to IC design verification, static timing analysis, dynamic power analysis, yield analysis, and failed chip troubleshooting
Project: Dynamic Power Prediction
➤ Developed a machine-learning solution that predicts IC chip power fluctuation with an error rate under 10% and a 7x speedup
➤ Trained multiple Python-based neural network and XGBoost models
Education
National Taiwan University
Master's Degree
National Taiwan University
Bachelor's Degree
University of Southern California