# Sumanth R Hegde

> LLMs @ Anyscale || Prev: C3 AI, UC San Diego, IIT Madras

Location: San Francisco, California, United States
Profile: https://flows.cv/sumanthrhegde

I'm currently a software engineer at Anyscale, working on the LLM team. Broadly, I'm passionate about machine learning.

I'm currently building SkyRL: https://github.com/NovaSky-AI/SkyRL

Previously, I did my master's in computer science at UCSD. During my undergraduate and graduate studies, I worked on a range of projects in natural language processing, computer vision, and computer systems. I've also made open-source contributions to HuggingFace's PEFT (parameter-efficient fine-tuning) and Accelerate libraries. In Summer 2023, I was a Data Science intern at C3 AI, working on fine-tuning language models for their Generative Search application. I also have previous industry experience as a machine learning intern at Hakimo and HyperVerge.

Much of my undergraduate work was in computer vision, and I also have a publication in the UDC workshop at ECCV 2020.

## Work Experience
### Software Engineer @ Anyscale
Jan 2024 – Present | San Francisco, California, United States
Software Engineer working on the LLM team at Anyscale!

Current Projects: SkyRL, SkyThought
- One of the core contributors to SkyRL (https://github.com/NovaSky-AI/SkyRL), building a full-stack library for post-training LLMs
- Core contributor to https://novasky-ai.notion.site/skyrl-v0 - implemented a scalable remote server for RL training on SWE-Gym and helped build an asynchronous multi-turn rollout implementation, improving SWE-Bench performance of Qwen-3-8B by 5.8%.
- Core contributor to https://novasky-ai.notion.site/skyrl-sql, one of the first open-source models trained with multi-turn RL on Text2SQL - matching GPT-4o and o4-mini on the Spider benchmark.
- One of the core maintainers of the SkyThought repo: https://github.com/NovaSky-AI/SkyThought/commits?author=SumanthRH
- Worked on standardized, scalable evaluation for reasoning models 

Past Project: LLMForge, Anyscale's fine-tuning framework
- Added support for different fine-tuning tasks (such as instruction tuning and causal LM) 
- Improved model source support to allow bringing any HuggingFace model with any chat template to fine-tune on Anyscale.
- Worked on preference tuning and function-calling fine-tuning
- Improved DPO training speed by 20-40% with prefix sharing: https://github.com/frankxwang/dpo-prefix-sharing 
- Led development of an SDK for models trained on Anyscale: https://docs.anyscale.com/reference/llm_models

### Student Researcher @ UC San Diego
Jan 2023 – Jan 2024
Worked on language models in Prof. McAuley's lab at UCSD

### Graduate Teaching Assistant @ UC San Diego
Jan 2022 – Jan 2024
- Teaching Assistant for CSE 232: Principles of Database Systems and CSE 21: Mathematics for Algorithms and Systems. 
- Responsibilities included conducting weekly discussion sessions, preparing question papers for examinations, etc. Best part: Office hours!

### Data Science Intern @ C3 AI
Jan 2023 – Jan 2023 | Redwood City, California, United States
- Set up a fine-tuning codebase for language models from scratch for use in C3’s Generative Search application
- Features: support for different causal and sequence-to-sequence models, the ability to mix different training datasets (for a text-to-text or a causal language modelling task), visualization of metrics on multiple evaluation datasets, parameter-efficient fine-tuning and quantization support, etc.
- Trained 10B+ parameter models on 1M+ samples using DeepSpeed and 🤗 Accelerate.

### Machine Learning Engineer Intern @ Hakimo
Jan 2023 – Jan 2023
Worked on video-based object detection methods to improve Hakimo's Remote Guarding solution

### Undergraduate Student Researcher @ Indian Institute of Technology, Madras
Jan 2020 – Jan 2021
- Demonstrated fast reconstruction of a 12-frame video from a single image captured by a lensless camera, reducing inference time from 2 hours to 30 milliseconds.
- Proposed an efficient reconstruction framework: a physics-aware neural net trained in an adversarial fashion, with a feature-based loss for producing photorealistic videos.

My Bachelor's thesis can be found here: https://tinyurl.com/sumanth-btech-thesis

### Undergraduate Student Researcher @ Indian Institute of Technology, Madras
Jan 2020 – Jan 2020
- Created a novel deep-learning-based model for image restoration, resulting in a publication at ECCV Workshops 2020 and placing 2nd out of 150 teams in the Under Display Camera Challenge.
- Developed a two-stage pipeline for directly processing megapixel images, with a simulation scheme for data augmentation.
- Rectified severe blur and low-light conditions in the images, obtaining >12% improvement in image quality with 88% (7.8M) fewer parameters than existing work.

### Deep Learning Intern @ HyperVerge Inc.
Jan 2019 – Jan 2019 | Bengaluru Area, India
- Implemented a learning-based face detection algorithm for Know-Your-Customer services, reducing false positives by 10x and false negatives by 2.5x on HyperVerge’s benchmark.
- Trained a Multi-task Cascaded Convolutional Neural Network on more than 200,000 images to beat the previous model, which had >99.5% accuracy.
- Analysed client data and employed hard positive mining and data augmentation to improve recall by 5%.


## Education
### Master's degree in Computer Science
UC San Diego

### Bachelor of Technology - BTech (Honours) in Electrical Engineering
Indian Institute of Technology, Madras


## Contact & Social
- LinkedIn: https://linkedin.com/in/sumanthrhegde
- GitHub: https://github.com/sumanthrh

---
Source: https://flows.cv/sumanthrhegde
JSON Resume: https://flows.cv/sumanthrhegde/resume.json
Last updated: 2026-04-05