# Yachao Lu

> Principal Software Engineer at Roblox

Location: San Mateo, California, United States
Profile: https://flows.cv/yachao

I'm interested in architecting large-scale distributed systems and complex algorithms. I take pride in the systems I have built and constantly improve my designs and architectures. My PhD thesis focused on Deep Web Data Analytics (Web Hidden Database Exploration).

## Work Experience

### Principal Software Engineer @ Roblox

Jan 2020 – Present | San Francisco Bay Area

* Scaled up a 10+ million QPS distributed system on the critical path of Roblox's content platform
* Designed and built the entire data ingestion pipeline from scratch at Roblox, which has one of the largest matchmaking traffic volumes in the world, including counters, logging, SNS/SQS data processing, Elasticsearch, and Hadoop/HDFS
* Designed and built brand-new ML online prediction services that integrate with legacy .NET Core applications using ONNX; horizontally scaled and optimized to meet low-latency and high-availability requirements

### Senior Software Engineer @ JD.COM

Jan 2018 – Jan 2020 | San Francisco Bay Area

* Built an experimentation platform that makes A/B testing and result monitoring as easy as clicking buttons for scientists
* Built core components of an innovative AutoML time-series prediction platform in a production environment that automates the entire time-series prediction workflow

Technologies included:

* Spring Boot, Dask, Vue, Express
* Redis, MySQL, Spark cluster
* Full-stack web applications, distributed data pipelines, microservices, AutoML

### Graduate Research Assistant @ George Washington University

Jan 2013 – Jan 2018

Tech Part:

* Designed and developed Timbr, a large-scale Meteor application that turns an arbitrary web page into structured JSON data, with plug-and-play APIs and interfaces for users without programming experience (i.e., an automatic web crawler/scraper)
* Used Vue.js and D3.js in Timbr's front-end component to visualize and collect users' actions on target websites, provide view components, and dynamically visualize the generated JSON data
* Used Node and Electron in Timbr's back-end component to render pages and mimic users' actions to crawl/sample web pages
* Deployed Timbr on both AWS and local servers to crawl/sample millions of web pages from sites such as eBay, Amazon, and GitHub

Research Part:

* Focused on Markov Chain Monte Carlo (random walk) methods for real-world web data mining, sampling from the hidden databases of online social networks, graph theory, and web information retrieval
* Made aggregate estimates over entire online social network datasets based on the captured samples, e.g., the total number of Facebook users
* Worked on speeding up the convergence rate of the state-of-the-art random walk sampling method by about 50 percent on both graphical models and Bayesian hierarchical models
* Used a revised version of Timbr to show the on-the-fly sampling and estimation process of random walk methods on real-world websites such as Amazon and eBay

### Data Scientist Intern @ District Department of Transportation (DDOT)

Jan 2017 – Jan 2017 | Washington D.C. Metro Area

* Communicated with internal divisions and leveraged data analysis to optimize average roadway operation time and reduce errors (e.g., missing data, duplicate data) in MS SQL Server
* Built a pipeline (MS SQL Server and Python) to process real-time traffic event/accident data, and streamlined the workflow, including data extraction, cleaning, and exploratory analysis
* Developed and automated dashboards and visualizations using Tableau and Python packages (Matplotlib and smtplib)
* Deployed machine learning models on traffic accident data to better prioritize the CCTV cameras shown on the monitoring screen of DDOT's Transportation Management Center
* Created maps and layers containing hundreds of communication networks (routing, detection devices) using ArcGIS

## Education

### Doctor of Philosophy (Ph.D.) in Computer Science

The George Washington University

### Master's Degree in Computer Science

The George Washington University

### Bachelor of Applied Science (B.A.Sc.) in Computer Software Engineering

Nanjing University of Posts and Telecommunications

## Contact & Social

- LinkedIn: https://linkedin.com/in/yachao-lu

---

Source: https://flows.cv/yachao
JSON Resume: https://flows.cv/yachao/resume.json
Last updated: 2026-04-12