# Seungjoon Lee

> Staff SWE @ Apple | Data infrastructure management, Startup Advisor, Educator

Location: Los Altos, California, United States
Profile: https://flows.cv/seungjoon

With over 10 years of experience in data engineering, I am currently at Apple, where I manage 10+ PB of data and 100 TB of memory, handling 34B daily events and operating more than 2,000 jobs. I am responsible for designing, developing, and maintaining data platforms, infrastructure, and pipelines that support machine learning, analytics, and web services for various Apple products and features.

I am passionate about solving complex data problems and delivering innovative solutions that enhance efficiency, quality, and performance. I have a strong background in Python, Django, Spark, Hadoop, AWS, Kubernetes, and other cutting-edge technologies and tools that enable me to build scalable, reliable, and secure data systems.

I also have a B.S. in Electrical Engineering and Computer Science from UC Berkeley, where I learned the fundamentals of data engineering and computer science. I am always eager to learn new skills and collaborate with other professionals to achieve shared goals.

## Work Experience

### Staff Software Engineer @ Apple
Jan 2020 – Present | Cupertino, California, United States

- Managing 10+ PB data and 100TB memory
- Handles 34B daily events
- Operating more than 2,000 jobs

### Senior Site Reliability Engineer @ Box
Jan 2018 – Jan 2020 | United States

- Built highly available DataPlatform Infrastructure using AWS, Terraform.
  Plus, built an infrastructure RESTful API and a cost optimization/tracking service with Django
- Built a CI/CD pipeline to build custom AMIs with Packer
- Built Airflow services in multiple regions
- Managed hundreds of Elasticsearch, Cloudera Hadoop, Qubole, and Kafka clusters as SRE
- Used Puppet to build on-premise hosts
- Used Docker/Kubernetes to build containerized services

### Principal Architect @ ScoreData Corporation
Jan 2017 – Jan 2018 | United States

Architected the entire platform service from scratch

- Machine Learning platform as a service (PaaS)
- RESTful API service with the Django framework
- Kubernetes with Dockerized services
- Deployment with Ansible Playbooks
- Data pipeline with Apache Spark, Hive and Airflow

### Senior Rocket Scientist, Data Infrastructure acq. Sizmek & A9 @ Rocket Fuel Inc.
Jan 2017 – Jan 2017 | United States

- Introduced Apache Airflow for efficient workflow scheduling and applied it to the data pipeline.
- Built a new data pipeline handling 100TB/day and 2M/sec, using Spark, HBase, Hive and HP Vertica

### Principal Data Engineer @ Polymorph (acq. Walmart)
Jan 2015 – Jan 2017 | San Francisco Bay Area

Built high-impact products for the advertising industry

- Architected a scalable ad data pipeline from scratch using Apache Spark (40B+/month)
- Built a scalable real-time Apache Spark Streaming pipeline with Apache Kafka (100K+ records/sec)
- Maintainable data workflow management using Spotify Luigi
- Microservice architecture (fraud detection, delivery reports, audience segmentation, machine learning for CTR optimization, etc.)
- Lambda architecture (real time + batch)
- Built big data processing on top of cost-effective Amazon EMR infrastructure
- Integrated data processing with external companies for BI
- Data processing optimization with Apache Parquet
- AWS Athena and QuickSight integration

### Senior Software Engineer @ ShareThis
Jan 2013 – Jan 2015 | Palo Alto, CA

- Led a Platform (Media Business) team
- Architected an ad business intelligence RESTful API framework for scalable insight products
- Optimized multiple internal advertising exchange data flows
- Researched social network data for digital advertising models (including VAST and VPAID)
- Scalable advertising user segmentation (5B+/month) flow with campaign delivery tracking
- ETL with Hadoop (60M+ events/month), Sqoop, and Oozie or shell scripts

### Web Application Developer @ ShareThis
Jan 2011 – Jan 2013 | Palo Alto, California

- Collaborated with senior engineers and the Ad Operations team to determine project plans and delegate tasks to the engineering team.
- Developed the ShareThis advertising platform, integrating first- and third-party ad networks to optimize the social network and delivery process
- Implemented object-oriented PHP patterns for better maintainability.
- Akamai and Memcached scalable services for delivering advertisement segments.
- RDBMS for large data systems and MongoDB for a real-time campaign log service.
- Front-end web programming (XHTML/HTML5/CSS/OO JavaScript/jQuery)
- API integration for first- and third-party ad networks with Zend Mail, Force.com, Google DoubleClick and DART services.
- Back-end Java programming with efficient time scheduling.

### Webmaster @ KGSA (Korean Graduate Student Association)
Jan 2010 – Jan 2010

Managed the KGSA website, encouraging members to communicate with each other online
http://www.kgsa.net

### Computer Engineering Soldier @ KOREAN ARMY
Jan 2005 – Jan 2007

- Created dynamic website tools for soldier training
- Took the Korean army from pen-and-paper communication to modern internet-based communication by building a new, more functional website.
- Developed a new communication webpage for Korean soldiers to improve workflow and daily scheduling procedures.

## Education

### B.S. in Electrical Engineering and Computer Science
University of California, Berkeley

## Contact & Social

- LinkedIn: https://linkedin.com/in/seungjoonlee1984
- Portfolio: http://www.seungjoonlee.com

---

Source: https://flows.cv/seungjoon
JSON Resume: https://flows.cv/seungjoon/resume.json
Last updated: 2026-04-12