# Fanyue Xia

> Software Engineer at Databricks

Location: United States
Profile: https://flows.cv/fanyue

## Work Experience

### Software Engineer at Databricks @ Databricks
Jan 2024 – Present | Mountain View, California, United States

Spark Structured Streaming

### Teaching Assistant 15418 Parallel Computer Architecture @ Carnegie Mellon University School of Computer Science
Jan 2023 – Present | Pittsburgh, Pennsylvania, United States

- Developed a course project on optimizing code under the message-passing model using OpenMPI.
- Regularly held office hours to help students debug and profile their code, and offered additional conceptual reviews.

### Software Engineer Intern @ Google
Jan 2023 – Jan 2023 | Sunnyvale, California, United States

- Developed a service that supports same-container task migration for Borg jobs (with Borg public APIs, Spanner DB and Queue, Scaffolding, and gMock for testing).
- The service supported Borg job upgrades without rescheduling outside the container, thereby reducing downtime and saving resources when new packages need to be installed or updated.
- Scaled out the service and improved efficiency by running multiple state-machine processors to support concurrent job updates.

### Research Assistant at CAOS Lab @ Carnegie Mellon University School of Computer Science
Jan 2023 – Jan 2023 | Pittsburgh, Pennsylvania, United States

- Worked with Prof. Dimitrios to benchmark the performance gains from hugepage usage in virtual machines.
- Benchmarked a Memcached workload under different hugepage settings for both hosts and guests.
- Found that hugepage benefits may come more from reduced page-walk cycles than from fewer TLB misses, contrary to prior expectations.

### Teaching Assistant 15445 Database @ Carnegie Mellon University School of Computer Science
Jan 2023 – Jan 2023 | Pittsburgh, Pennsylvania, United States

- Developed course projects (BusTub Database); designed two-phase locking protocols with different isolation levels and autograder test cases.
- Led weekly office hours to help students debug, and held additional office hours to clarify complex database concepts.

### Research Assistant at Living Edge Lab @ Carnegie Mellon University School of Computer Science
Jan 2022 – Jan 2022 | Pittsburgh, Pennsylvania, United States

- Worked with Prof. Satya to enable cloud-native applications to discover nearby cloudlets for low-latency edge computation.
- Experimented with Linux network namespaces to set up a WireGuard VPN as a user without root privileges, for message encryption.

### STEP Intern @ Google
Jan 2022 – Jan 2022 | Mountain View, California, United States

- Developed a tool that automatically helps clients select a more efficient data retrieval method.
- Constructed a data processing pipeline (Flume) that scales the above tool to support clients with hundreds of millions of rows in a table.
- Identified all clients capable of performance improvement via the above pipeline; quantified the possible loading time reduction (75%).

### Research Assistant @ Carnegie Mellon University
Jan 2021 – Jan 2022 | Pittsburgh, Pennsylvania, United States

- Constructed a data processing pipeline (Python and pandas) to transform noisy raw data from CSV files into clean training and testing datasets for machine learning algorithms.
- Defined metrics and quantified the effects of promotions on sales trends; distinguished promotions with large effects; generated an additional promotion-effect feature; compared prediction accuracy with and without that feature.
- Applied k-means clustering to group similar customers and discover statistical patterns within each cluster.
- As a result, facilitated more accurate prediction of future sales based on real-time signals.

## Education

### Bachelor of Science - BS in Computer Science
Carnegie Mellon University School of Computer Science

## Contact & Social

- LinkedIn: https://linkedin.com/in/fanyue-xia-380754221

---

Source: https://flows.cv/fanyue
JSON Resume: https://flows.cv/fanyue/resume.json
Last updated: 2026-04-05