# Sanjay Prabhakar

> ML SWE - Inference Platform @ Together AI

Location: San Francisco, California, United States
Profile: https://flows.cv/sanjayprabhakar

AI & ML Performance Engineer with expertise in GPU kernel optimization, inference acceleration, and multi-GPU programming, skilled in C++, CUDA, and Python. Experienced in deploying and optimizing LLMs on AWS with vLLM, building high-performance kernels, and reducing latency on NVIDIA and AMD architectures. Strong background in computer vision pipelines (TensorRT, DeepStream) and active open-source contributor (CuPy). Passionate about bridging AI research and systems performance to deliver scalable, production-ready solutions.

## Work Experience
### ML SWE - Inference Platform @ Together AI
Jan 2025 – Present | San Francisco, California, United States

### AI Engineer @ IpserLab
Jan 2025 – Jan 2025

### Computer Vision and Machine Learning Intern @ Agot (Acquired by HME)
Jan 2023 – Jan 2023 | Pittsburgh, Pennsylvania, United States
- Optimized object detection and segmentation models using DeepStream’s TensorRT integration, achieving a 40% increase in throughput through layer fusion, kernel auto-tuning, and memory bandwidth optimizations.
- Leveraged NVIDIA’s Deep Learning Accelerator (DLA) cores on Orin to offload compute-intensive workloads, balancing GPU and DLA execution for maximum throughput and power efficiency on edge devices.
- Engineered low-latency video pipelines by integrating RTSP streams with the NVIDIA DeepStream SDK, reducing end-to-end inference latency by 35%.
- Optimized segmentation models using NVIDIA TAO and DeepStream, achieving a 20% improvement in IoU and deploying efficiently on NVIDIA Xavier and Orin.
- Led the development and launch of a food waste management solution that combined a novel ML algorithm for data forecasting with vision-based analysis, resulting in a 50% reduction in waste.
- Integrated vision-language models (GPT-4V, LLaVA) into computer vision pipelines, enabling multimodal scene understanding and improving complex scene interpretation accuracy by 30%.
- Developed and deployed Transformer-UNet-based segmentation and detection models on AWS SageMaker, orchestrating automated deployments on a Kubernetes cluster with Argo CD and Docker.


## Education
### Master's degree in Artificial Intelligence
Northeastern University

### Bachelor of Engineering - BE in Computer Science
BMS Institute of Technology and Management


## Contact & Social
- LinkedIn: https://linkedin.com/in/sanjay-prabhakar-northeastern

---
Source: https://flows.cv/sanjayprabhakar
JSON Resume: https://flows.cv/sanjayprabhakar/resume.json
Last updated: 2026-04-11