Waltham, Massachusetts, United States
2024 — 2025
Building recommendation and search systems, MCP servers, and agents for ZoomInfo
● Built and owned the Contact Recommendation public API for ZoomInfo, serving 1M+ users in production and handling 700 QPS at peak load with P95 latency of 30ms. This system is a vector search engine that provides personalized contact recommendations (across 300M contacts) for sellers to engage with.
● Built streaming pipelines that process ~100M records daily for the ZoomInfo Homepage to recommend accounts to sellers.
● Large Language Model Serving: Built and maintained the ZoomInfo self-hosted LLM platform providing large-scale inference services (35k tokens/s) with TGI/vLLM, reducing costs by 79% compared to vendor APIs.
● Large Language Model Fine-tuning: Built and maintained LoRA fine-tuning services on the ZoomInfo self-hosted LLM platform for internal teams building custom models.
● MLOps: Designed and operated production ML pipelines using Airflow, implemented monitoring and alerting with Datadog, and built CI/CD workflows with Jenkins and ArgoCD while managing infrastructure with Terraform.
● Built a Python MCP framework that lets internal teams expose existing services as MCP servers.
● Built and managed several MCP servers used across the team.
Boston, Massachusetts, United States
Building the GenAI Hiring Platform for Merck
● Platform Development: Led a team to build the GenAI hiring platform from scratch, covering design, experimentation, implementation, and testing, in compliance with software development lifecycle (SDLC) standards.
● Efficiency Improvement: The platform saved 2 hours per job requisition created.
● Scalability: Ensured the platform provided a smooth experience for 15,000 hiring managers generating job requisitions daily.
● Primary Responsibilities: Developed REST APIs, designed data pipelines (Databricks), and managed databases (RDS, Athena, DynamoDB).
● Secondary Responsibilities: Developed recommendation systems using Large Language Models (LLMs), with extensive prompt engineering in LangChain.
2021 — 2024
Boston, Massachusetts, United States
Delivered end-to-end AI solutions (ETL, model development, deployment)
● Time-series: Developed data pipelines over 12 million rows of machining records in Databricks, built regression models, and deployed them as Azure Functions.
● Time-series: Developed a fraud detection model to identify anomalous activity in transaction data.
● NLP: Built entity recognition models that reduced keying time by 90% for a financial company, with MLflow pipelines to monitor model performance.
● NLP: Built an end-to-end document clustering pipeline (K-means, DBSCAN, t-SNE) on AWS SageMaker that reduced human labor by 70% for a financial company.
● CV: Built an end-to-end video classification pipeline on Azure ML that reduced incorrectly broadcast streaming content by 80% for a streaming company.
2019 — 2021
McLean, Virginia, United States
Developed features for an MLOps platform.
● Created standard project templates that incorporate MLOps practices, including CI/CD, monitoring, and retraining.
● Built tools for governing model lifecycle and performance, including memory profiling, drift detection, and explainability.
● Built, containerized, and deployed ML models on Azure and AWS.
Developed data and modeling pipelines for an MLOps platform.
● Developed web scraping tools to extract form data from websites.
● Developed complex SQL queries to extract information from structured data.
● Developed scripts for cleaning and processing unstructured data.
● Developed computer vision and NLP models (object detection, sentiment analysis, and machine translation).
● One paper on Explainable AI accepted at CVPR 2021 (https://arxiv.org/abs/2005.10284)
Education
Georgia Institute of Technology
Master's degree
National Cheng Kung University
Master's degree
National Chiao Tung University