# Leting Ni

> MLE Intern @ DoorDash | CS @ UCSD | ECE+AI @ UofT | Ex MLE @ Huawei | Ex SWE @ Lepton

Location: San Francisco, California, United States

Profile: https://flows.cv/leting

## Work Experience

### Machine Learning Engineer @ DoorDash

Jan 2025 – Jan 2025 | San Francisco, CA

- Designed and deployed a probabilistic ETA prediction model using DoorDash's NextGen architecture, replacing single-point baselines with non-parametric distribution-based forecasts to capture uncertainty.
- Led multiple iterations of Base Layer model experiments, improving MAE by 3% and CRPS by 7% over production benchmarks.
- Explored and evaluated 20+ modeling strategies and distance-aware loss functions for predicting non-parametric distributions to balance accuracy, long-tail performance, and reliability at scale.
- Optimized Decision Layer model parameters through grid search, raising On Time Accuracy by 0.41% and reducing 20-min Lateness by 3.1% without degrading other business metrics.
- Delivered two superior model candidates launched as online experiments serving 5M+ daily orders, now progressing toward full production rollout.

### Software Engineer @ Lepton AI

Jan 2024 – Jan 2024

- Led full-stack development for a RAG-based conversational search engine and an LLM-driven Slack Bot, enabling real-time intelligent query handling for enterprise users.
- Developed an LLM-based Slack Bot that delivers context-aware responses drawn from search results, chat history, and file attachments.
- Enhanced Lepton Search by integrating SearXNG as the search backend and incorporating WizardLM and Llama3.
- Optimized query concurrency and data pipelining, reducing average LLM response times while supporting concurrent user queries.
- Containerized the Slack Bot server with Docker and deployed both the SearXNG backend and the Slack App on Kubernetes.
### Research Assistant @ University of Toronto

Jan 2023 – Jan 2024 | Toronto, Ontario, Canada

- Pioneered a real-time BCI signal classification pipeline, enabling rapid detection and visualization of neural responses with minimal latency.
- Extended the OpenBCI codebase to handle real-time signal data ingestion and filtering from OpenBCI headsets.
- Crafted a CNN-based model in ONNX to classify P300 wave signals at 82% accuracy.
- Incorporated the classifier into the OpenBCI GUI, ensuring live feedback within 100 ms latency.

### Machine Learning Research Engineer @ Huawei Canada

Jan 2022 – Jan 2023 | Markham, Ontario, Canada

- Spearheaded development and integration of advanced Computer Vision pipelines (YOLOv7) and Neural Rendering (Nerfstudio) into full-stack solutions, boosting detection and interactive 3D capabilities.
- Adapted Nerfstudio into a Flask–React web interface.
- Developed and fine-tuned a YOLOv7-based custom hand detector for motion-blurred videos.
- Integrated legacy PyTorch code with MindSpore.
- Led data collection, hyperparameter tuning, and environment setup for projects using YOLO, MANO, ViT, and NeRF.

### Research Assistant @ University of Toronto

Jan 2021 – Jan 2021 | Toronto, Ontario, Canada

- Contributed to a distributed inference framework for deep learning models using PyTorch and Node.js, focusing on partition algorithms, performance optimization, and scalability.
- Designed and implemented a partition algorithm for convolution layers in distributed YOLOv5 inference.
- Applied linear scheduling to distributed inference tasks, cutting overall latency by 41% on various models.
- Collaborated on a Node.js-based orchestration module with Prof. Li and Ph.D. students, ensuring seamless scaling and communication for large-scale AI workloads.
## Education

### Master of Science - MS in Computer Science

UC San Diego

### Bachelor of Applied Science - BASc in Computer Engineering

University of Toronto

### Nanjing Foreign Language School

## Contact & Social

- LinkedIn: https://linkedin.com/in/leting-ni

---

Source: https://flows.cv/leting

JSON Resume: https://flows.cv/leting/resume.json

Last updated: 2026-03-29