A data scientist who is good at programming and loves visualization.

Experience

Google

Software Engineer

2023 — Now

San Francisco Bay Area

Apple

Data Engineer

2019 — 2023

San Francisco Bay Area

Established end-to-end automated test pipelines for evaluating computer vision models, covering data capture, model execution, evaluation, failure-analysis breakdowns, and visualization generation; optimized the pipelines for large-scale datasets.

Designed test cases and built regression datasets for sanity tests, which quickly generate initial test reports or pinpoint regressed areas, shortening test turnaround time.

Developed tools to visualize computer vision model results, giving algorithm developers and management more intuitive examples in delivery reports to help them understand model performance and triage failure patterns.

Developed an HTML-based test report that presents overall model performance and detailed breakdown metrics so readers can find the information directly.

Built a PostgreSQL database that stores evaluation metrics, along with web-based visualization and query tools that show clear trends in model performance.

San Francisco County Transportation Authority

Technology, Data & Analysis Intern

2019 — 2019

San Francisco Bay Area

ConnectSF visualizations [JavaScript (Vue.js, leaflet.js, d3.js) | PostgreSQL | Python]

Created web-based visualization tools for presenting the number of accessible jobs and trip patterns within San Francisco.

Designed data pipelines for formatting data, and connected the database to the front end through a RESTful API to enhance security and efficiency.

Aimed to help the planning department interact with the data to make decisions about city transportation over the next 50 years.

链家/Lianjia (Homelink Real Estate Agency Co., Ltd.)

Data Analyst

2017 — 2018

Qingdao, Shandong, China

Database Administration [MySQL | Hive]

Established a MySQL database from scratch by cleaning historical data stored in different formats, and built data pipelines for future data imports.

Handled the Qingdao branch's data requests and queried the headquarters database.

Real Estate City Map [JavaScript (Echarts.js) | MySQL]

Visualized data on top of the Baidu map, showing the locations and information of 200+ stores, 700+ competitors' stores, and 90% of Qingdao apartments.

Helped inform decisions regarding the opening of stores or the closing of unprofitable stores.

Employee Separation Rate Prediction [R | MySQL]

Developed logistic regression models to predict the separation rate of employees.

Improved employee management by updating predictions monthly and providing the top resignation reasons for individual agents.