Experience
2025 — 2025
San Francisco, CA
Designed and deployed a probabilistic ETA prediction model using DoorDash’s NextGen architecture, replacing single-point baselines with non-parametric distribution-based forecasts to capture uncertainty.
Led multiple iterations of Base Layer model experiments, improving MAE by 3% and CRPS by 7% over production benchmarks.
Explored and evaluated 20+ modeling strategies and distance-aware loss functions for predicting non-parametric distributions to balance accuracy, long-tail performance, and reliability at scale.
Optimized Decision Layer model parameters through grid search, raising On Time Accuracy by 0.41% and reducing 20-min Lateness by 3.1% without degrading other business metrics.
Delivered two superior model candidates launched as online experiments serving 5M+ daily orders, now progressing toward full production rollout.
2024 — 2024
Led full-stack development for a RAG-based conversational search engine and an LLM-driven Slack Bot, enabling real-time intelligent query handling for enterprise users.
Developed an LLM-based Slack Bot, delivering context-aware responses (drawn from search results, chat history, and file attachments).
Enhanced Lepton Search by integrating SearXNG as search backend and incorporating WizardLM and Llama3.
Optimized query concurrency and data pipelining, reducing average LLM response times while supporting concurrent user queries.
Containerized Slack Bot server with Docker and deployed both SearXNG backend and Slack App on Kubernetes.
Toronto, Ontario, Canada
Pioneered a real-time BCI signal classification pipeline, enabling rapid detection and visualization of neural responses with minimal latency.
Extended OpenBCI codebase to handle real-time signal data ingestion and filtering from OpenBCI headsets.
Crafted a CNN-based model in ONNX to classify P300 wave signals at 82% accuracy.
Incorporated classifier into OpenBCI GUI, delivering live feedback within 100 ms.
Markham, Ontario, Canada
Spearheaded development and integration of advanced Computer Vision pipelines (YOLOv7) and Neural Rendering (Nerfstudio) into full-stack solutions, boosting detection and interactive 3D capabilities.
Adapted Nerfstudio into a Flask–React web interface.
Developed and fine-tuned a YOLOv7-based custom hand detector for motion-blurred videos.
Integrated legacy PyTorch code with MindSpore.
Led data collection, hyperparameter tuning, and environment setup for projects using YOLO, MANO, ViT, and NeRF.
Toronto, Ontario, Canada
Contributed to a distributed inference framework for deep learning models using PyTorch and Node.js, focusing on partition algorithms, performance optimization, and scalability.
Designed and implemented a partition algorithm for convolution layers in distributed YOLOv5 inference.
Applied linear scheduling to distributed inference tasks, cutting overall latency by 41% on various models.
Collaborated on a Node.js-based orchestration module with Prof. Li and Ph.D. students, ensuring seamless scaling and communication for large-scale AI workloads.
Education
UC San Diego
Master of Science - MS
University of Toronto
Bachelor of Applied Science - BASc
Nanjing Foreign Language School