Tech lead - Batch processing system:
· Responsible for leading a team of 4 SDEs to build an offline analytics ETL application that takes customer event data and transforms it into joined and aggregated business metrics shown in a UI.
· Designed the batch-processing system architecture, led communication with UX, business, and dependency teams, and implemented the data lake ingestion and data publishing components.
· The system will run analytics ETL jobs for 30k customers at a peak throughput of 5.2 GB/hour.
Tech lead - Online low-latency analytics system:
· Responsible for leading a team of 3 SDEs to build an online low-latency analytics application that ingests business metrics data from different data sources into Elasticsearch and serves the data to the UI.
· Designed the system architecture and the public API schema, and implemented an Elasticsearch query builder that translates API requests into Elasticsearch queries.
· The system serves UI reporting requests for 30k customers at 50 TPS, over 100M documents totaling 55 GB.
Tech lead - Re-architecture of the Amazon advertising campaign placement reporting engine to reduce UI latency, improve scalability and development velocity, and reduce hardware cost:
· Responsible for leading a team of 2 SDEs to improve online reporting latency in the Amazon advertising UI and the scalability of the campaign placement index, reduce hardware cost, and increase developer velocity.
· Designed a re-architecture solution for the existing Amazon advertising system, implemented a new ETL job to create a new dataset, migrated five clients to the dataset, and built a test script to verify data consistency between the new and old datasets.
Tech lead - Development of an offline report generation system that allows Amazon advertisers to generate campaign performance reports at daily granularity, covering up to two years.