# Catherine Shen

> Software Engineer at Plaid

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/catherineshen

Passionate about software engineering and proficient in Java and Python programming. Currently working on several cloud-based data applications, with work experience in machine learning, data engineering, and software engineering.

▪ Language: Java, Python, Scala
▪ Data Engineering: MySQL, Cassandra, Hadoop, MapReduce, Spark, Pig, Hive, Kafka
▪ Tools and Platform: AWS, GCP, Git, Bash, Docker, Django, Flask, Dash
▪ Front End: JavaScript, jQuery, HTML & CSS
▪ Data Visualization: Tableau, Power BI, D3.js

## Work Experience

### Software Engineer @ Plaid
Jan 2024 – Present | San Francisco Bay Area

Storage Infrastructure

Design, implement, and maintain robust distributed storage systems, optimizing scalable database solutions such as TiDB and MongoDB for large-scale data operations.

### Software Engineer @ Opendoor
Jan 2021 – Jan 2024 | San Francisco Bay Area

Data platform / Data Infrastructure / Observability / DevOps

Change Data Capture / DBT / Data Quality Framework / Airflow / Kubernetes

### Senior Software Engineer @ Palo Alto Networks
Jan 2020 – Jan 2021 | San Francisco Bay Area

Building data infrastructure and a Big Data platform
• Backend: Java Spring, Kafka, Kubernetes, GCP
• Pipeline: Spark, Python, Airflow, Prometheus

### Software Engineer @ Earnin
Jan 2019 – Jan 2020 | San Francisco Bay Area

Infrastructure

Building an end-to-end data science platform
• AWS Kinesis, Spark, Lambda, DynamoDB, Kubeflow, Kubernetes, Airflow, Jenkins

### Alumni Consultant @ Insight Data Science
Jan 2019 – Jan 2020

Mentored data engineering fellows on their Insight projects.
### Data Engineering Fellow @ Insight Data Science
Jan 2019 – Jan 2019 | San Francisco Bay Area

Implemented a batch data processing platform using HDFS and Spark to analyze 3 TB of GitHub event data, letting users find social influencers within the GitHub network
• AWS, Spark, HDFS, S3, Airflow, Flask

### Graduate Teaching Assistant @ University of Maryland
Jan 2018 – Jan 2018 | Washington D.C. Metro Area

- Designed 2 labs and mentored graduate students for the Big Data course
• Cloud computing lab: AWS Lambda, Route53, DynamoDB, SageMaker and S3
• Apache Spark lab: Introduction to Apache Spark, SparkML, Spark Streaming

### Research Assistant @ University of Maryland
Jan 2017 – Jan 2018 | Washington D.C. Metro Area

- Designed a Human Affect Analytics Pipeline on AWS: built data pipelines that collect, process, and compute emotion analysis using OpenCV in Python, and developed deep-learning models for automatic human emotion detection using OpenCV and TensorFlow
- Dockerized ML models and deployed them with AWS Lambda, S3 and EC2
- Wrote a Python package for automating the social network mining process using asynchronous programming and distributed web scraping
- Implemented sentiment analysis using scikit-learn, spaCy and statsmodels

### Data pipeline Engineer @ Fuchun Oriental Real Estate Investment.
Jan 2015 – Jan 2017 | Guangzhou, China

- Data Integration, Data warehouse, ETL pipelines
• Python, SQL, SSIS, MS SQL Server

## Education

### Master of Science in Business Statistics in Data Science & Artificial Intelligence
University of Maryland

### Information System & Statistics
UCLA

### Data Engineering on Google Cloud Platform Specialization in Cloud Engineering
Google pour les pros

### Full Stack Web Developer Nanodegree in Computer Science
Udacity

### Bachelor’s Degree in Economics, Information System
Guangdong University of Foreign Studies

## Contact & Social

- LinkedIn: https://linkedin.com/in/chuqiao-catherine-shen
- Website: https://catherine-shen.medium.com/

---

Source: https://flows.cv/catherineshen

JSON Resume: https://flows.cv/catherineshen/resume.json

Last updated: 2026-03-23