San Francisco, California, United States
Software Engineer working on the LLM team at Anyscale!
Current Projects: SkyRL, SkyThought
• One of the core contributors to SkyRL: https://github.com/NovaSky-AI/SkyRL, building a full-stack library for post-training LLMs
• Core contributor to https://novasky-ai.notion.site/skyrl-v0 - implemented a scalable remote server for RL training on SWE-Gym and contributed to building an asynchronous multi-turn rollout implementation, improving the SWE-Bench performance of Qwen3-8B by 5.8%.
• Core contributor to SkyRL-SQL (novasky-ai.notion.site/skyrl-sql), one of the first open-source models trained with multi-turn RL on Text2SQL, matching GPT-4o and o4-mini on the Spider benchmark.
• One of the core maintainers of the SkyThought repo: https://github.com/NovaSky-AI/SkyThought/commits?author=SumanthRH
• Worked on standardized, scalable evaluation for reasoning models
Past Project: LLMForge, Anyscale's fine-tuning framework
• Added support for different fine-tuning tasks (such as instruction tuning and causal LM)
• Improved model source support, allowing users to bring any HuggingFace model with any chat template and fine-tune it on Anyscale.
• Worked on preference tuning and function-calling fine-tuning
• Improved DPO training speed by 20-40% with prefix sharing: https://github.com/frankxwang/dpo-prefix-sharing
• Led the development of an SDK for models trained on Anyscale: https://docs.anyscale.com/reference/llm_models