# Jie C.

> Software Engineer

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/jiec

6+ years of hands-on experience building data processing pipelines. Solid knowledge of large-scale data storage, batch and stream data processing, query optimization, distributed systems, and machine learning pipelines.

Technical skills:
- Programming languages: Python, C++, SQL
- Data storage and query engines: MySQL, Snowflake, Presto, SparkSQL
- Data pipeline tools: Kafka, KStream, Airflow, dbt, Pandas
- Statistical modeling and deep learning: Scikit-learn, PyTorch, Stan

## Work Experience

### Senior Software Engineer @ Gatik
Jan 2024 – Present

### Software Engineer @ Zoox
Jan 2022 – Jan 2023

- Designed and implemented a database solution to store autonomous vehicle sensor measurements for creating high-definition (HD) maps.
- Created a version control and data lifecycle management solution for updating HD maps.

### Data Engineer @ Meta
Jan 2021 – Jan 2022

- Created large-scale batch data processing pipelines for ads data.
- Worked extensively on troubleshooting and query optimization for Presto and Spark SQL, gaining a deep understanding of MPP databases.
- Set up A/B testing experiments to analyze the impact of individual features.

### Software Engineer @ Zymergen, Inc.
Jan 2019 – Jan 2021 | Emeryville, California

Worked on data warehouse construction and database performance improvement:
- Designed and created data warehouse tables, and implemented both a batch ETL solution using Airflow and a streaming ETL solution using Kafka Streams
- Created indexes and analyzed query execution plans to optimize SQL queries
- Designed and created views to provide abstraction over tables
- Analyzed query patterns from DB logs to provide suggestions for data warehouse design

### Data Scientist @ Zymergen, Inc.
Jan 2017 – Jan 2019 | Emeryville, California

Developed a machine learning pipeline with Airflow:
- Worked with statisticians to build machine learning models using Scikit-learn and Stan
- Created data schemas for the machine learning pipeline using Avro
- Extracted data using SQL queries and applied complex data transformations using Pandas
- Stored model predictions and parameters in S3 and retrieved them with Athena

### Research Associate @ The Ohio State University
Jan 2016 – Jan 2016 | Columbus, Ohio Area

Managed terabytes of scientific data for the Polar Earth Observing Network (POLENET) project:
- Designed and created tables in PostgreSQL to store GPS time series data
- Performed server maintenance, including database backups and statistics updates
- Applied regression models to extract long-term trends and seasonal patterns in time series data

## Education

### Master's Degree in Geology/Earth Science, General
The Ohio State University

### Bachelor's Degree in Geological and Earth Sciences/Geosciences
Zhejiang University

## Contact & Social

- LinkedIn: https://linkedin.com/in/jiechen1

---

Source: https://flows.cv/jiec
JSON Resume: https://flows.cv/jiec/resume.json
Last updated: 2026-04-01