# Carol Sun

> Software Engineer @ Databricks

Location: San Francisco, California, United States
Profile: https://flows.cv/carolsun

## Work Experience

### Software Engineer @ Databricks

Jan 2022 – Present | San Francisco, California, United States

- AI Systems - ML Feature Store
- AI Systems - Agent Framework & Evaluation
- AI Data Quality - Data Classification

### Software Engineer Intern @ Datadog

Jan 2021 – Jan 2021

Data Visualization Library Team

### Software Engineer Intern @ Databricks

Jan 2021 – Jan 2021

Machine Learning Platform Team - Feature Store

- Implemented a pipeline that records, in the model registry, which dataset features are used to train machine learning models
- Contributed to a feature-packaging Python class that lets customers create and manage features

### STEP Intern @ Google

Jan 2020 – Jan 2020

- Worked on the CDPush (Configuration and Data Push) Team in Google Cloud Core Infrastructure, collaborating with two other STEP interns
- Designed a visualization feature that lets Googlers compare the status of their push to previous pushes, giving a better understanding of the push process and a better estimate of its completion time

### Machine Learning Intern @ OpenX

Jan 2019 – Jan 2019 | Pasadena, CA

- Created a machine learning model that predicts whether ad requests will monetize, using early delivery stack features
- Investigated several types of models, such as neural networks and boosted trees, using TensorFlow 2.0
- Investigated the potential of the project using Google Cloud Platform's AutoML Tables
- Provided a proof of concept for the data science team, which is actively working to create a production-level model

### Data Science Biomedical Research Intern @ National Institutes of Health (NIH): Intramural Research Program (IRP)

Jan 2017 – Jan 2018 | Bethesda, Maryland

- Worked at the National Institute of Neurological Disorders and Stroke under Dr. Bielekova, studying how to improve a Multiple Sclerosis molecular diagnosis test by adjusting for DNA
- Ran simulations to determine the effect of standard deviation, sample size, and minor allele frequency on the power of the approach
- Optimized a random-forest model that predicted the rate of progression for Multiple Sclerosis using the method for adjustment

### Data Analysis Intern @ David Geffen School of Medicine at UCLA

Jan 2015 – Jan 2017 | Greater Los Angeles Area

- Developed a blood-based cancer diagnosis using methylation data from circulating cell-free DNA (cfDNA)
- Designed a computational method to estimate the fraction of tumor-derived cfDNA, using a maximum likelihood approach and beta distributions to model cell-free DNA in blood samples
- Ran over 200 simulations investigating the effects of sequencing depth and the fraction of tumor-derived cfDNA, and predicted liver cancer status

## Education

### Bachelor of Science - BS in Computer Science

Caltech

## Contact & Social

- LinkedIn: https://linkedin.com/in/carolkailesun
- Website: https://carolksun.github.io

---

Source: https://flows.cv/carolsun
JSON Resume: https://flows.cv/carolsun/resume.json
Last updated: 2026-04-05