# Tommy Lee

> Software Engineer at Disney | MS Data Science

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/tommylee

Data Analyst with 4 years of combined work experience in the energy and real estate industries and an MS in Applied Data Analytics (Data Science) from Boston University. I have a solid foundation in data analytics and a strong understanding of statistics and machine learning.

- Skills: API Data Extraction, Web Scraping, Data Visualizations
- Knowledge: Hypothesis Testing, Central Limit Theorem, Probability Theory, Linear Regression, Logistic Regression, KNN (k-nearest neighbors), Naïve Bayes, Decision Tree, Random Forest, Neural Networks, K-Means Clustering, Hierarchical Clustering, NLP (Natural Language Processing)
- Tech Stack: Python, MySQL Workbench, R Studio, Tableau, Looker Studio, AWS (S3, Athena and Quicksight), JupyterHub, Terminal/Command Prompt, Cron/Task Scheduler, Jira, MongoDB, SalesForce

## Work Experience

### Software Engineer @ The Walt Disney Company

Jan 2024 – Present

As a Software Engineer at Disney, I design, build, and maintain large-scale data pipelines and backend systems that power analytics and business intelligence across the streaming organization. My work focuses on creating reliable, high-performance data infrastructure that supports millions of daily records and enables downstream applications and reporting platforms. I develop and optimize ETL workflows using Python, SQL, Airflow, and cloud-native tools, ensuring data accuracy, scalability, and fault tolerance. I also build automation and monitoring solutions to improve system reliability and reduce manual intervention. Collaborating closely with data scientists, analysts, and product teams, I translate complex business logic into efficient, production-grade data models and APIs.
Over the past year, I’ve improved pipeline latency, enhanced observability, and contributed to platform-wide initiatives that strengthen Disney’s data ecosystem and engineering standards.

### Product Data Analyst @ SkySlope

Jan 2022 – Jan 2024

- Served as the source of truth and overseer for SkySlope’s product and MLS (multiple listing service) datasets from SQL, SalesForce and Mongo by analyzing and testing the data to confirm it is reliable and accurate
- Specialized in researching B2B and B2C revenue data for DigiSign (SkySlope’s version of DocuSign), used for predictive modeling to pinpoint agents and brokerages more inclined to adopt DigiSign’s subscription plan, enhancing the targeting of customer support efforts
- Collaborated with data engineers and developers to address challenges in ETL processes and data integrity
- Created, maintained and presented Tableau and Looker dashboards for operations, marketing and customer support teams and PMs to visualize and monitor customer behavior and product usage, which sped up data-driven business decisions by at least 75%
- Upheld a 100% completion rate on daily ad-hoc reports that define target audiences for new feature launches and describe historical customer product usage and behavior for retention, using analytic tools such as SQL, Python and Heap

### Graduate Student @ Boston University

Jan 2021 – Jan 2023

### Data Analyst @ GridX, Inc.
Jan 2020 – Jan 2022

- Automated the analysis process and ad hoc reporting using Python scripts, reducing manual workload for all team members
- Managed, educated, and built multiple data visualizations using AWS Quicksight and Athena for internal team and external clients to display summary statistics and detect anomalies daily
- Analyzed a MySQL database that includes billing data, customer information and SQMD (Sequential Meter Data) and settled any data errors
- Communicated with PG&E through email to validate and resolve incorrect or missing EDIs (Electronic Data Interchanges)

### Energy Advisor @ GridX, Inc.

Jan 2020 – Jan 2020

- Informed customers about CCAs (Community Choice Aggregations) and how they help reduce a customer’s PG&E bill
- Assisted customers over the phone in understanding their PG&E bill statements
- Mentored call center team on how to query from MySQL database to improve data retrieval speed

### Yelp Data Analytics Passion Project @ Self-Employed

Jan 2019 – Jan 2020

Analysis Link: https://tommylee3014.wixsite.com/yelp-boba

Using Yelp's Fusion API, I extracted the number of reviews, ratings, geospatial data, and more for all milk tea stores in California. The goal is to find out which boba stores in California are the best and why.

Phase 1: Experimentation (Sept 2019 - Dec 2019)

- Developed Python scripts that automatically extract and cleanse data for boba shops in 482 cities
- Used an NLP library to find popular keywords within multiple milk tea shop reviews
- Wrote SQL queries to join tables as well as aggregate information
- Created numerous interactive dashboards in Tableau tailored to a general audience

Phase 2: Enhancement (Jun 2020 - Jul 2020)

- Implemented parallel computing to reduce execution time (runs 2-3 times faster)
- Exported data to an Excel file and a MySQL database
- Created an AWS EC2 instance for the Python script to run on a daily schedule using cron

### Tradeshow Events Assistant (Temp) @ SIGN DISPLAY AND ALLIED CRAFTS LOCAL UNION 510 TRAINING TRUST

Jan 2019 – Jan 2020 | San Francisco Bay Area

### Finance and Corporate Strategy/ Data Science Intern @ Western Digital

Jan 2017 – Jan 2017 | San Jose, CA

- Experimented with machine learning by designing a news aggregator for multiple websites using web scraping
- Calculated and presented cash flows and net present values in PowerPoint graphs for visualizations during meetings

## Education

### Master of Science - MS in Applied Data Analytics

Boston University

### Bachelor of Science - BS in Statistics Major/ Economics Minor

University of California, Davis

## Contact & Social

- LinkedIn: https://linkedin.com/in/tommy-lee-ds
- Website: https://github.com/tolee824

---

Source: https://flows.cv/tommylee

JSON Resume: https://flows.cv/tommylee/resume.json

Last updated: 2026-04-05