# HONGYU LU

> Staff Software Engineer, AI Infra & HPC | LLM Serving (SGLang / TensorRT-LLM) | E-commerce Search Architecture (Recall/Ranking/Indexing)

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/hongyulu

## Work Experience
### Staff Software Engineer @ ByteDance
Jan 2020 – Present | San Jose, California, United States
Staff Software Engineer – AI Infrastructure (LLM Inference & RL Systems)
- Led the architecture and optimization of large-scale LLM inference systems, focusing on long-context serving, speculative decoding, and prefill/decode disaggregation to improve latency, throughput, and cost efficiency.
- Built a high-performance LLM serving stack with deep integration of SGLang, TensorRT-LLM, and custom GPU optimizations (KV-cache management, batching, scheduling), enabling efficient serving of large models under production constraints.
- Designed advanced request routing and scheduling strategies for search-driven multi-request workloads, maintaining high GPU utilization while meeting strict latency SLOs.
- Developed and scaled RL training infrastructure (Verl, AReaL) supporting SFT, RLHF, and agentic RL workloads, improving training efficiency and system scalability.
- Built an end-to-end e-commerce search system from scratch, including recall, ranking, indexing, and data feedback loops.

### Software Engineer @ MEGVII旷视
Jan 2018 – Jan 2020 | Shanghai, China
- Built a storage system for small data, including its core components for data storage, data consistency, and system reliability.
- Built a search engine, including its core components for high-performance computing, distributed storage, and system reliability.


## Education
### Master in Computer Science
Shanghai Jiao Tong University

### Bachelor in Physical Science
Shanghai Jiao Tong University


## Contact & Social
- LinkedIn: https://linkedin.com/in/hongyu-lu-431392298

---
Source: https://flows.cv/hongyulu
JSON Resume: https://flows.cv/hongyulu/resume.json
Last updated: 2026-04-12