# HONGYU LU

> Staff Software Engineer, AI Infra & HPC | LLM Serving (SGLang / TensorRT-LLM) | E-commerce Search Architecture (Recall/Ranking/Indexing)

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/hongyulu

## Work Experience

### Staff Software Engineer @ ByteDance

Jan 2020 – Present | San Jose, California, United States

Staff Software Engineer – AI Infrastructure (LLM Inference & RL Systems)

- Led the architecture and optimization of large-scale LLM inference systems, focusing on long-context serving, speculative decoding, and prefill/decode disaggregation to improve latency, throughput, and cost efficiency.
- Built a high-performance LLM serving stack with deep integration of SGLang, TensorRT-LLM, and custom GPU optimizations (KV-cache management, batching, scheduling), enabling efficient serving of large models under production constraints.
- Designed request routing and scheduling strategies for search-driven multi-request workloads, maintaining high GPU utilization while meeting strict latency SLOs.
- Developed and scaled RL training infrastructure (Verl, AReaL) supporting SFT, RLHF, and agentic RL workloads, improving training efficiency and system scalability.
- Built an end-to-end e-commerce search system from scratch, including recall, ranking, indexing, and data feedback loops.

### Software Engineer @ MEGVII旷视

Jan 2018 – Jan 2020 | Shanghai, China

- Built a storage system for small data, with core components covering data storage, data consistency, and system reliability.
- Built a search engine, with core components covering high-performance computing, distributed storage, and system reliability.
## Education

### Master in Computer Science

Shanghai Jiao Tong University

### Bachelor in Physical Science

Shanghai Jiao Tong University

## Contact & Social

- LinkedIn: https://linkedin.com/in/hongyu-lu-431392298

---

Source: https://flows.cv/hongyulu

JSON Resume: https://flows.cv/hongyulu/resume.json

Last updated: 2026-04-12