Developed the Python backend for Heartbeat, an internal Amazon application with 35,000 users that provides data visualizations and text analyses for 1B+ customer feedback data points.
•
Integrated NLP and machine learning models into the backend to support text summarization, autocomplete search, and semantic search. Wrote unit and integration tests using Python mock objects and the behave library.
•
Parallelized a serverless text summarization system (built on AWS Lambda), cutting latency 12.5x from 5,000 ms to 400 ms per page of 50 feedback records.
•
Served as lead POC for a data migration project upgrading the business metrics framework to publish Heartbeat user and usage data for weekly business reviews with senior leadership.
•
Created an API to vertically partition dashboard object configurations in DynamoDB, storing annotations in an S3 bucket to improve scalability and to support identifying trending customer feedback topics.
•
Led team hardware budgeting for 2023 (internally referred to as IMR), forecasting system cloud spend through historical usage analysis and project/peak estimation.
Designed the system architecture for an in-app notifications system covering dashboard activity in Heartbeat, such as data filter changes/updates and user subscriptions.
•
The team launched and expanded the in-app notifications system after my internship, building on my initial design.
•
Implemented 4 APIs for retrieving, posting, and updating user subscription and notification data in DynamoDB tables.
Built a natural language processing recommendation engine in Python that recommends content to users based on cosine similarity scores between items in a bag-of-words model.
•
Generated forecasts in R for digital ecosystem data metrics using time series techniques, including Holt-Winters and decomposition.
Created lagged linear mixed-effects models in R to predict the value of an individual's privacy and evaluated these models for statistical significance.
•
Classified a collection of behavioral words and personality traits using two data-driven approaches: multidimensional scaling and agglomerative hierarchical clustering.
Education
2018 — 2022
University of California, Berkeley
Bachelor's degree
2014 — 2018
Thomas Jefferson High School for Science and Technology