# Moya Zhu

> Software Engineer @ Nvidia ♦ ex-Amazon Robotics Data Science intern

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/moya

I am a Software Engineer with a strong machine learning and data science background. At Shoreline.io, I design and implement semantic search solutions for the website and remediation runbook catalog using large language models and embedding techniques. Prior to joining Shoreline.io, I completed my MS degree in Data Science at Columbia University, where I learned and applied various data analysis, machine learning, and deep learning methods to text, image, audio, and protein structure data. I also gained valuable industry experience as a data science intern at Amazon Global Robotics and UNDP, where I designed and implemented data preprocessing and modeling pipelines using AWS SageMaker. I am open to job opportunities in software engineering, machine learning engineering, and data analytics in general.

## Work Experience

### Software Engineer @ NVIDIA
Jan 2024 – Present | Santa Clara, California, United States

### Software Engineer @ Shoreline.io (Acquired by NVIDIA)
Jan 2023 – Jan 2024 | Redwood City, California, United States

Semantic Search for website and runbook catalog

- Designed and applied four comprehensive GPT prompt-based test metrics to assess search results and embedding models
- Evaluated embedding models (Ada, Multiqa); chunked and embedded runbooks into FAISS and SQLite databases using LangChain
- Established a Git workflow that automates embedding file generation and Docker image creation for deployment
- Implemented a responsive search endpoint for 800 runbooks with associated commands, achieving a 500 ms response time
- Developed a server-specific custom search index workflow that lets users search their private runbooks, using an optimistic concurrency control approach, caching with S3 storage, and Redis background workers
- Mentored and supervised two interns through onboarding,
collaborating to launch the search endpoint in production

End-to-end data pipelines for ticketing tools (PagerDuty, Opsgenie, ServiceNow)

- Created and configured multiple Airbyte connectors (TypeScript) and data streaming scripts for data extraction, transformation, daily synchronization, and loading into PostgreSQL, providing users with near-instant data access
- Unified the schemas of diverse data sources and migrated 14,000 tables using Flask and SQLAlchemy, improving codebase clarity
- Implemented interactive UI data visualization reports (React.js), transforming backend SQL queries into filter-enabled graphs
- Enabled runbook generation using GPT-3.5 and prompt techniques to auto-create generative AI incident remediation

### Data Science Intern @ Amazon Global Robotics
Jan 2022 – Jan 2022

- Designed and implemented a data preprocessing and modeling pipeline using AWS SageMaker for dynamic chute-mapping updates on the sortation floor, replacing an expensive runtime simulator
- Embedded high-dimensional destination distribution data with station spatial locations into multi-channel tensors and aggregated scatter images to extract floor information; built multi-layer perceptron surrogate models to predict package throughput evaluation metrics, improving on the baseline by 28.8% MAE and 0.527 R²
- Compared the performance of PCA-reduced linear regression, MLP, CNN, and pretrained networks; interpreted features that measure floor congestion; and developed recommendations for further modeling improvements

### Data Visualization Scientist Intern @ UNDP
Jan 2022 – Jan 2022

Analyzed website data impact and visualized it in a dashboard using web scraping and Google Analytics tools

### FairVision researcher @ Purdue University CAM2
Jan 2020 – Jan 2022

Co-lead of the FairVision CAM2 subteam. Used generative adversarial networks (GANs) to remove dataset biases from the training data of image recognition and classification algorithms.

### S.O.S challenge participant @ TechPoint
Jan 2020 – Jan 2020

- Developed, with the team, a one-stop-shop COVID-19 web app providing COVID forecasts, news, state policies, and real-time safe places to travel
- Prototyped and designed the UI of the app
- Implemented all features in the app and ensured interactive, smooth transitions
- Created the Policy Checker feature by collecting states' social distancing policies and visualizing them on an interactive geomap

### Computational Biomolecular & Mesoscopic Physics Group-Research Assistant @ Purdue University
Jan 2020 – Jan 2020

- Analyzed and superimposed protein structures with Python Bio.PDB and calculated RMSD between structures
- Manipulated, scratched, and visualized protein residue sequences using software including PyMOL, Chimera, and Discovery Studio

## Education

### Master of Science - MS in Data Science
Columbia University
Jan 2021 – Jan 2022

### Bachelor of Science - BS in Applied Statistics and Data Science
Purdue University
Jan 2017 – Jan 2021

## Contact & Social

- LinkedIn: https://linkedin.com/in/moya-zhu-b93283173

---

Source: https://flows.cv/moya
JSON Resume: https://flows.cv/moya/resume.json
Last updated: 2026-03-22