# Mani Ananth

> Computer Architect | Software Engineer (ML Performance)

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/mani

Experienced Computer Architect and Technical Leader with a background in AI acceleration and HW/SW codesign, focusing on end-to-end application performance and Perf/Watt. I am passionate about computer system performance and making applications run faster!

## Work Experience

### Software Engineer - Tech Lead @ Google
Jan 2024 – Present

Leading teams working on performance optimizations for Gemini on GPUs and on analytical modeling of workload performance on current and future generation GPUs.

### Member of Technical Staff @ Cerebras Systems
Jan 2021 – Jan 2024 | San Francisco Bay Area

Performance and software optimization for DL training on the world's largest chip.

- Architected and built a SW stack for training CV models (CNNs and Diffusion) on the Cerebras WSE. Also served as tech lead for the 15-person team working on this effort.
- Optimized performance and led SW bring-up to enable LLM training on next-generation ASIC HW and system (CS3).

### Senior GPU Architect @ NVIDIA
Jan 2017 – Jan 2021 | San Francisco Bay Area

I was part of NVIDIA's core Deep Learning Architecture group, working on HPC and ML kernel performance. Before that, I was an Architect in the SM Architecture group, working on improving energy efficiency. My contributions involved studying the mapping of Deep Learning applications onto the SM and Tensor cores, and working on micro-architectural strong-scaling features to reduce energy consumption. I also spent time prototyping JIT translators for generating performant GPU assembly.

May 2020 - June 2021
- Delivered High Performance ML Kernels for CUDA libraries
- Also contributed to CUTLASS OSS - https://github.com/NVIDIA/cutlass

Jun 2017 - May 2020
- Worked on the architecture and design of the GPU SM and Tensor cores
- Developed Binary translators for generating performant GPU assembly

### Graduate Research Assistant @ Georgia Institute of Technology
Jan 2017 – Jan 2017 | Greater Atlanta Area

Mathematical modeling of performance scaling in many core tiled CPU architectures

### CPU Architecture Intern @ NVIDIA
Jan 2016 – Jan 2016 | San Francisco Bay Area

Performance Modelling and Analysis for a next generation ARM CPU

### ASIC Intern @ NVIDIA
Jan 2016 – Jan 2016

Tegra SoC Memory Controller Group

### Graduate Teaching Assistant @ Georgia Institute of Technology
Jan 2016 – Jan 2016

GTA for CS6290 - High Performance Computer Architecture offered in the Online Masters in CS Program by Prof. Milos Prvulovic (Professor of Computer Science, GATech)

### Graduate Student Researcher @ Georgia Institute of Technology
Jan 2015 – Jan 2016 | Atlanta

- Memory Controller Latency analysis and breakdown in Many Core SMP systems
- Wireless Processor - DRAM interface Evaluation for Multi Channel DRAM Memory Systems

### ASIC Design Engineer @ NVIDIA
Jan 2014 – Jan 2015 | Bengaluru, Karnataka, India

## Education

### Master of Science in Computer Science
Georgia Institute of Technology

### Bachelor of Technology (B.Tech.) in Electrical and Electronics Engineering
National Institute of Technology, Tiruchirappalli

### Non Degree Option in Computer Science
Stanford University
Jan 2020 – Jan 2020

## Contact & Social

- LinkedIn: https://linkedin.com/in/manikandanananth50
- Website: https://site.mananth.dev

---

Source: https://flows.cv/mani
JSON Resume: https://flows.cv/mani/resume.json
Last updated: 2026-03-23