Experience
2024 — Now
Seattle, Washington, United States
Designed an OpenAI-style Responses API and server architecture (ReAct loop, multi-turn, multi-tool, conversation
management), standardizing agent execution across runtimes and enabling consistent client integration across GenAI
workloads. Adopted by 40 services; processes approximately 1K requests/min.
Led tool invocation integration end-to-end: implemented multi-tool orchestration including MCP tools, function calling,
and web search; coordinated cross-team delivery to provide a consistent developer experience while enforcing server-side
tool governance.
Drove throughput and safety improvements: implemented streaming tool-call execution and token/size limit enforcement
to prevent runaway executions and large-payload failures, improving p95 latency by 20% and reducing tool timeouts by 65%.
Led Long-Term Memory (LTM) design with Product + Data Science; produced an end-to-end architecture for offline
memory extraction plus online retrieval from user history to enable personalized agent behavior.
Implemented a reusable worker service (async job dataplane) for reliable offline extraction with leases/retries/backoff;
designed as a general platform for future async workloads, with a capacity of ~4.3M jobs/day.
Developed RAG tooling for agent services, implementing retrieval mechanisms on Oracle Database and OCI OpenSearch
to provide fast, relevant knowledge access.
Built RAG agent-space isolation (per-agent “space” deployed as an isolated pod/runtime boundary) to improve security and
reliability under concurrent workloads, launching thousands of agents per month.
Enabled multi-vendor model selection for agent generation (across OpenAI GPT, Google Gemini, and Meta Llama),
including per-agent configuration and safe defaults to support enterprise requirements.
2023 — 2024
Seattle, Washington, United States
Developed and maintained key features for the TikTok Shop Merchant platform using React and TypeScript.
Optimized the loading speed of the homepage TikTok Shop card, reducing skeleton screen time from 4s to under 500ms (8x
improvement).
Collaborated with PM and UX teams to improve the user onboarding experience, leading the design and implementing it
independently. Increased the first-time successful onboarding rate from 5.61% to 16.2% and reduced user time spent by
32.9%.
Seattle, Washington, United States
Worked on a core data center service. Implemented an automated region build pipeline, accelerating the region build process.
2020 — 2021
Seattle
2019 — 2019
Seattle, WA
Worked as a full-stack engineer on Amazon SageMaker (AI platform).
Implemented a self-publish feature to help data scientists publish their models.
Backend: Scala
Frontend: ReactJS | Redux | CSS
AWS: DynamoDB | SNS | S3
Education
2019 — 2019
UC Irvine
Master of Science - MS
2014 — 2018
Southeast University
Bachelor's degree