Led Google's on-device generative AI initiatives, scaling a high-performance GenAI inference framework to hundreds of millions of users.
• Tech lead manager of Google's on-device LLM inference framework, LiteRT-LM (https://github.com/google-ai-edge/LiteRT-LM), enabling the deployment of Gemini Nano to hundreds of millions of devices across multiple platforms and optimizing performance for on-device accelerators (blog post: https://developers.googleblog.com/en/on-device-genai-in-chrome-chromebook-plus-and-pixel-watch-with-litert-lm/). LiteRT-LM powers:
* Chrome built-in LLM API (https://developer.chrome.com/docs/ai/built-in)
* LLM Inference API for developers (https://developers.googleblog.com/en/large-language-models-on-device-with-mediapipe-and-tensorflow-lite/)
* On-device GenAI on Chromebook Plus (https://9to5google.com/2024/10/08/recorder-app-chromebook/)
* Gemini Nano on Android (https://ai.google.dev/gemini-api/docs/get-started/android_aicore)
* Image generation in Pixel Studio (https://www.theverge.com/2024/8/13/24219655/google-pixel-studio-ai-image-generation-app)
• Achieved world-record on-device inference speeds for Stable Diffusion models, pushing the boundaries of mobile AI capabilities. (Paper: https://arxiv.org/abs/2304.11267, Blog: https://ai.googleblog.com/2023/06/speed-is-all-you-need-on-device.html)