# Siyuan Zhu

> Data Scientist & Data Engineer | Experienced in Python, SQL, AWS, and Machine Learning | Columbia MS Applied Analytics

Location: Brooklyn, New York, United States

Profile: https://flows.cv/siyuanzhu

I’m a Master’s student in Applied Analytics at Columbia University, actively seeking full-time opportunities as a Data Scientist, Data Analyst, or Data Engineer starting in 2025. With hands-on experience at Publicis Groupe, Alibaba, and a fintech startup, I’ve built end-to-end solutions using Python, SQL, AWS, PyTorch, Spark, and LLMs (GPT, LLaMA2, BERT), ranging from NLP-driven medical diagnosis engines to cloud-based recommendation systems and financial analytics platforms.

Technical Highlights:

- ✅ Languages & Tools: Python (pandas, scikit-learn, PyTorch), SQL, R, Java, MongoDB, PostgreSQL
- ✅ ML & NLP: XGBoost, SVD, LSTM, BERT/ClinicalBERT, sentiment analysis, recommendation systems
- ✅ Cloud & Big Data: AWS (Lambda, EC2, S3), Oracle Cloud, Spark, Kafka, Flink, dbt
- ✅ Visualization: Tableau, Power BI, matplotlib, seaborn, Streamlit
- ✅ Data Engineering: ETL pipelines, data warehousing, cross-platform integration

I enjoy building scalable, data-driven systems that solve real-world problems and deliver measurable business impact. A native Mandarin speaker, I’m also passionate about hiking, cooking, and exploring innovations in AI and analytics.

📬 Let’s connect on LinkedIn or reach out directly at zhusiyuan0101@gmail.com.

## Work Experience

### Software Engineer @ General Motors

Jan 2025 – Present | Austin, Texas, United States

### Data Engineer @ Cmind Inc

Jan 2024 – Present | Boston, MA

- Designed and automated ETL/ELT data pipelines (Python, SQL) to ingest and transform 20k+ financial data sources from REST APIs (e.g., FRED, Morningstar) into an Oracle/PostgreSQL Autonomous Data Warehouse, achieving 99.9% reliability.
- Migrated MongoDB to a centralized data warehouse on Oracle Cloud and built cross-database synchronization for seamless data integration; implemented data quality checks, scheduled daily updates, incident alerts (Slack, email), and failure retry logic, reducing product error rate by 90%.
- Engineered CSORE, a composite profitability indicator that synthesizes peer group metrics and financial indicators reflecting company profitability and quarterly performance, to improve EPS (Earnings Per Share) prediction accuracy.
- Applied NLP (prompt engineering, few-shot learning with GPT-4o, FinBERT) to extract evasiveness, bullishness, and sentiment signals from earnings call transcripts, enhancing EPS predictive modeling, and integrated the results into warehouse tables for downstream BI.

### Data Scientist Intern @ Publicis Groupe

Jan 2023 – Jan 2023 | Shanghai, China

- Analyzed advertising strategies and monitored competitive placements, helping secure key client order renewals and win new pitches.
- Preprocessed data (pandas) and applied regression models (linear regression, decision tree, XGBoost) to remove outliers, improving data quality and reliability for strategic marketing decisions.
- Evaluated time series models (ARIMA, LSTM) to predict traffic trends and select promising video types for traffic allocation.
- Managed data (SQL) and built dashboards (Tableau) delivering market trend insights and performance benchmarks for 10+ clients.
### Teaching Assistant of CSE 3241: Introduction to Database @ The Ohio State University College of Engineering

Jan 2022 – Jan 2022 | Columbus, Ohio, United States

- Collected feedback from students, graded 200+ weekly SQL assignments, and delivered weekly reports to the professor.
- Held office hours twice a week to help students learn the algorithms and concepts of database systems and to answer their questions on SQL and database design projects.

### Data Analyst Intern @ Alibaba Group

Jan 2022 – Jan 2022

- Developed an RFM model in R to segment 23k+ customers into 8 categories, enabling targeted marketing strategies.
- Analyzed 7M shopping records from 44k users on Taobao to identify key behaviors driving customer retention.
- Presented retention insights and actionable recommendations to high-level stakeholders, improving decision-making.

## Education

### Master of Science - MS in Applied Analytics

Columbia University | Jan 2023 – Jan 2024

### Undergraduate in Data Analytics

The Ohio State University | Jan 2018 – Jan 2022

## Contact & Social

- LinkedIn: https://linkedin.com/in/siyuan-zhu

---

Source: https://flows.cv/siyuanzhu

JSON Resume: https://flows.cv/siyuanzhu/resume.json

Last updated: 2026-04-01