Software Engineer at Meta building AI/ML infrastructure that powers model serving and capacity management at scale.
Menlo Park, CA
Building internal platform tooling that enables ML engineers across Meta to manage GPU capacity, model serving quotas, and inference resources at scale.
Led the end-to-end design and execution of the Quota Wizard, a self-service tool that simplified how ML engineers allocate and transfer GPU serving capacity. Drove adoption from 25% to 70.4% of all quota moves through MLHub, and reduced what was previously a multi-step, hours-long manual process into a single automated request that completes in minutes. Hosted training sessions for GenAI and RecSys teams and connected directly with ML platform leaders to iterate on feedback.
Executed E2E Capacity Accounting across ML infrastructure in collaboration with cross-functional teams and MLHub. Introduced model-type as a new GPU tracking unit, unifying capacity queries across Facebook-Taxonomy and Model Type — giving platform customers a single, consistent view of their resource allocation.
Led the Solver Explainability project, building a self-service "Capacity Tab" that enables ML engineers to debug capacity allocation decisions independently. Collaborated with Capacity team engineers to refine upstream data pipelines and designed the UX for surfacing allocation logic, reducing dependency on manual support.
Designed and shipped a Quota Usage Dashboard serving 700+ unique users with 285 daily events. This surfaced previously undiscoverable model-level resource utilization data with aggregated filtering and hover-card model details, enabling ML engineers to make informed capacity decisions without navigating multiple disconnected data sources.
Built a Permission Management UX for the inference capacity platform, working directly with designers to tailor the experience for ML platform customers. Drove over 600 permission updates in H2 2025, enabling teams to manage access controls without filing manual requests.
San Francisco Bay Area
Built the developer-facing APIs and AI systems that power LLM-driven NPC characters for creators in Meta Horizon Worlds.
Drove the creation of the `horizon/npc` TypeScript SDK, a new module offering 15 API methods that enable game creators to build LLM-powered conversational AI characters with custom game logic, dialogue control, and embodiment behaviors. Proposed and aligned the data structure for access control in the TS class with 4 stakeholders across 3 teams.
Led the AI Transparency launch — a cross-platform initiative required for legal and policy compliance of generative AI content. This included three interconnected systems: automated AI World tagging that programmatically labels AI-generated content, a policy acknowledgment NUX ensuring users consent to AI interaction terms, and an NPC Engagement Reception system built with Unity shaders in C# that visually guides users through 6 interaction stages with 4 distinct icons.
Built the initial LLM fallback system spanning WWW, C#, and TypeScript, ensuring graceful degradation when model inference fails or returns unexpected results. This was a critical unblock for the Shootball flagship game launch in H1 2024.
Led the youth safety strategy for AI-generated content, designing and coordinating access controls to exclude users under 13 from LLM-powered NPC interactions. This required alignment across multiple cross-functional teams including policy, legal, and platform safety.
Drove Scripted Avatars to broad creator adoption — updated avatar editing preview, VR creation flows, and configured roll-out GKs for 2P and MHCP access.
Reduced the HUR bug count by 40% as Bug POC, created the NPC Team Testing guidance doc, and served as Eng Excellence Champion maintaining test coverage above 70%. Drove AI Policy QED coverage and fixed critical issues for a Bobber Bay demo at Connect.
Raleigh-Durham-Chapel Hill Area
Feature owner of the Service Map and Dashboard in the InsightFinder SaaS application.
Developed a data collector agent (Golang) to collect various metric and log data from customer-provided APIs, AWS S3, or Kubernetes. Employed goroutines to parallelize collection and increase data collection throughput.
Designed and implemented multiple data pipelines handling data ingestion from various sources. Implemented RESTful APIs for third-party integrations, and managed authentication using OAuth, JWT, and mTLS.
Used Cassandra as the data store for reliable persistence, ensuring data durability and integrity.
Spearheaded efforts to enhance system-wide observability. Developed and integrated logic to collect data across different categories, enabling comprehensive data aggregation and providing valuable insights for stakeholders.
2021 — 2023
Redwood City, CA
GraphStudio, the graph database development toolkit, is a highly interactive application built with Angular, TypeScript, Gin, and Golang. It processes the user interaction input and transforms it into data models that the backend database understands.
Ran spikes with Postman for new features in early development and collaborated with upstream teams (core database, infrastructure) to architect RESTful APIs.
Practiced test-driven development. Added unit tests (Jasmine) and automated integration tests (Protractor).
Improved the cloud CI/CD pipeline, making releases twice as fast as the original process. Integrated cloud-customized code and functions (TypeScript, Golang) into the main product using build flags and environment variables.
Implemented a new visualization library for ETL data transformation. Refactored the data model transformation and rendering logic.
Data Loading Error Report
Improved user data loading efficiency 3x with a new UI that consumes results from the Kafka data pipeline.
Polled the backend internal API with recursive HTTP requests every 10 seconds and used a lock to handle multiple loading results arriving from the distributed system.
Stored the results in a key-value store (etcd) with caching and dynamically displayed error messages in the UI.
Data Ingestion from Cloud Platforms
Simplified the data ingestion logic and reduced user data import time by 80% with a new UI workflow. Supported data ingestion from AWS S3, Google Cloud Platform (GCP), and Azure Blob Storage (ABS).
Collected basic login credentials in the UI and transformed them into data models used by the backend (Golang) to create the internal Kafka connector.
Day to Day Work
Worked with designers and PMs to conduct user interviews and usability tests with customers and potential users using Figma.
Followed an Agile sprint development life cycle and worked on end-to-end CI/CD pipeline development in a Docker environment.
Chapel Hill, NC
Participated in daily maintenance of the Project Gutenberg production site and construction of the new Project Gutenberg site. The production site has over 6 million downloads per month. The role was equivalent to that of a full-stack engineer.
Compiled .htaccess files on a CentOS 7 server to manage site links for the PG development site. Assisted with deployment of a new CentOS 8 server, using Ansible for PHP and Apache installation and firewall setup.
Used SQLAlchemy and PostgreSQL to implement an ORM layer in Libgutenberg, greatly simplifying maintenance. Investigated and fixed a long-standing anomaly in search results on the PG development site, caused by the Genshi templates used by the PG search engine, Autocat3.
Tested and improved the CSS and HTML5 design of the Gutenberg site based on a survey of over 200 users. Improved the taxonomy structure and redesigned the Bookshelf cataloging function using Django on the new PG site.
Education
2019 — 2021
The University of North Carolina at Chapel Hill
Master of Science in Information Science (MSIS)
2018 — 2018
University of California, Berkeley
Bachelor's exchange student
2015 — 2019
Tianjin University
Bachelor of Management