# Aidan Do

> LLM Inference Optimization @ Fireworks AI | Nvidia/Meta Open Source Contributor | Ex-Canva

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/aidando

🔹 On the Performance team at Fireworks AI, making inference faster.

🔹 Open-source contributor to NVIDIA (FlashInfer, CUTLASS) and Meta (PyTorch)

## Work Experience

### Software Engineer, Inference @ Fireworks AI

Jan 2025 – Present | San Francisco Bay Area

- Improved VLM inference latency/throughput by 1.6x–2.2x across 3 customer workloads, helping land $_M ARR in new revenue
- Improved TTFT of Kimi K2.5 by 11%–21% for the 5–10 image case by writing a custom position embedding bicubic interpolation CUDA kernel, outperforming torch.nn.functional.interpolate by 100x–1700x
- Improved MXFP8 CUTLASS GEMM performance by 10–40% at batch ≤32 via the swap-AB trick, increasing system-wide tok/s by 4–5% across ___B+ annual output tokens

### NVIDIA/Meta - Open-source Contributor @ NVIDIA

Jan 2026 – Present

Contributor to NVIDIA (FlashInfer, CUTLASS) and Meta (PyTorch) open-source projects:

- PyTorch: Rewrote CUDA upsample_bicubic2d kernel to parallelize across batch and channel dimensions, giving a 4.3–43x speedup for VLM position embedding resizing
  - https://github.com/pytorch/pytorch/pull/174578
- FlashInfer: Upstreamed FP8 Groupwise GEMM optimization for small-M decode shapes, 10–40% faster at batch ≤32
  - https://github.com/flashinfer-ai/flashinfer/pull/2327
- CUTLASS: Fixed SM100 (Blackwell) FP8 profiler bug by correcting the epilogue shape_div divisibility condition for non-multiple-of-64 N tiles
  - https://github.com/NVIDIA/cutlass/pull/2946

### Software Engineer @ Canva

Jan 2024 – Jan 2025 | Sydney, NSW

- Designed and built fraud response service over gRPC, MySQL, DynamoDB, AWS, handling 10–100x projected growth in fraud cases

### Meta AI Open-source Contributor @ Meta

Jan 2024 – Jan 2025

- 7th most active contributor (as of 19 Jan 2024) to official Meta AI projects (Llama-stack/Llama-recipes).
- Full list of merged PRs: https://bit.ly/4kLbRQm

### Software Engineer @ Semantic Sciences Pty Ltd

Jan 2023 – Jan 2024

Developed and maintained enterprise application managing $800M+ in research funding allocation for Australia's main medical research organization

### Junior Software Engineer @ NQRY

Jan 2021 – Jan 2022 | Adelaide, South Australia, Australia

Maintained a criminal investigation application, implementing mission-critical features for law enforcement organizations

### Software Engineer Intern @ NQRY

Jan 2021 – Jan 2021 | Adelaide, South Australia, Australia

## Education

### Bachelor of Software Engineering (Honours)

University of Adelaide

## Contact & Social

- LinkedIn: https://linkedin.com/in/aidando

---

Source: https://flows.cv/aidando

JSON Resume: https://flows.cv/aidando/resume.json

Last updated: 2026-04-10