# Leting Ni

> MLE Intern @ DoorDash | CS @ UCSD | ECE+AI @ UofT | Ex MLE @ Huawei | Ex SWE @ Lepton

Location: San Francisco, California, United States
Profile: https://flows.cv/leting

## Work Experience
### Machine Learning Engineer @ DoorDash
Jan 2025 – Jan 2025 | San Francisco, CA
• Designed and deployed a probabilistic ETA prediction model using DoorDash’s NextGen architecture, replacing single-point baselines with non-parametric distribution-based forecasts to capture uncertainty.
• Led multiple iterations of Base Layer model experiments, improving MAE by 3% and CRPS by 7% over production benchmarks.
• Explored and evaluated 20+ modeling strategies and distance-aware loss functions for predicting non-parametric distributions to balance accuracy, long-tail performance, and reliability at scale.
• Optimized Decision Layer model parameters through grid search, raising On Time Accuracy by 0.41% and reducing 20-min Lateness by 3.1% without degrading other business metrics.
• Delivered two superior model candidates launched as online experiments serving 5M+ daily orders, now progressing toward full production rollout.

### Software Engineer @ Lepton AI
Jan 2024 – Jan 2024
• Led full-stack development for a RAG-based conversational search engine and an LLM-driven Slack Bot, enabling real-time intelligent query handling for enterprise users.
• Developed an LLM-based Slack Bot that delivers context-aware responses drawn from search results, chat history, and file attachments.
• Enhanced Lepton Search by integrating SearXNG as search backend and incorporating WizardLM and Llama3.
• Optimized query concurrency and data pipelining, reducing average LLM response times while supporting concurrent user queries.
• Containerized Slack Bot server with Docker and deployed both SearXNG backend and Slack App on Kubernetes.

### Research Assistant @ University of Toronto
Jan 2023 – Jan 2024 | Toronto, Ontario, Canada
• Pioneered a real-time BCI signal classification pipeline, enabling rapid detection and visualization of neural responses with minimal latency.
• Extended OpenBCI codebase to handle real-time signal data ingestion and filtering from OpenBCI headsets.
• Crafted a CNN-based model in ONNX to classify P300 wave signals at 82% accuracy.
• Incorporated classifier into OpenBCI GUI, delivering live feedback within 100 ms.

### Machine Learning Research Engineer @ Huawei Canada
Jan 2022 – Jan 2023 | Markham, Ontario, Canada
• Spearheaded development and integration of advanced Computer Vision pipelines (YOLOv7) and Neural Rendering (Nerfstudio) into full-stack solutions, boosting detection and interactive 3D capabilities.
• Adapted Nerfstudio into a Flask–React web interface.
• Developed and fine-tuned a YOLOv7-based custom hand detector for motion-blurred videos.
• Integrated legacy PyTorch code with MindSpore.
• Led data collection, hyperparameter tuning, and environment setup for projects using YOLO, MANO, ViT, and NeRF.

### Research Assistant @ University of Toronto
Jan 2021 – Jan 2021 | Toronto, Ontario, Canada
• Contributed to a distributed inference framework for deep learning models using PyTorch and Node.js, focusing on partition algorithms, performance optimization, and scalability.
• Designed and implemented a partition algorithm for convolution layers in distributed YOLOv5 inference.
• Applied linear scheduling to distributed inference tasks, cutting overall latency by 41% on various models.
• Collaborated on a Node.js-based orchestration module with Prof. Li and Ph.D. students, ensuring seamless scaling and communication for large-scale AI workloads.


## Education
### Master of Science - MS in Computer Science
UC San Diego

### Bachelor of Applied Science - BASc in Computer Engineering
University of Toronto

### Nanjing Foreign Language School


## Contact & Social
- LinkedIn: https://linkedin.com/in/leting-ni

---
Source: https://flows.cv/leting
JSON Resume: https://flows.cv/leting/resume.json
Last updated: 2026-03-29