•Leading the product design lifecycle to incubate multiple AI products, including AI Training, Inference, Hardware Operation, Performance Benchmark, Digital Twin for 6G Network Design Simulation, and Digital Human Generation. Responsibilities include UX research, product validation, market positioning, and value proposition development to showcase NVIDIA’s cutting-edge technologies to enterprises and NCPs.
•Designing job schedulers for HPC and supercomputing environments, both in the cloud and on-prem, leveraging SLURM and Kubernetes (K8s)-based architectures; collaborating with infrastructure and AI platform teams to deliver solutions optimized for large-scale distributed computing.
•Researching and designing systems for AI infrastructure, including tools for monitoring, debugging, and visualizing multi-node, large-scale training workloads (thousands of GPUs), supporting pre-training, post-training, and inference pipelines.
•Driving cross-org alignment by overseeing multiple AI training, inference, and infra products; partnering with engineering directors to unify fragmented initiatives into a single, coherent product strategy.
•Collaborating with design managers to develop learning materials on AI-focused design, growing team expertise and establishing design best practices for complex technical domains.