# Devashish Lal

> Machine Learning Systems Engineer @ Meta | Kernel Fusion @ SGLang | ex-MLE @ Rivos Inc, Samsung Research | Ex SWE II @ Dell Technologies | GSoC 2024 @ Blender

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/devashish

I’m a Machine Learning Systems Engineer focused on high-performance LLM inference, compiler systems, and hardware–software co-design, with a strong emphasis on open-source contributions.

A core part of my work has been building and upstreaming performance improvements to widely used inference frameworks. I’ve led development of compiler fusion stacks, contributed fused kernels to open-source serving libraries, and optimized attention, MoE, and dense workloads to deliver real-world performance gains.

My contributions include reducing kernel counts (~10%) through custom torch.compile fusions, improving memory efficiency (up to ~90% reduction in critical cases), and stabilizing large-scale CI pipelines that run 80+ LLMs nightly.

Beyond kernel-level optimization, I work deeply in:

- torch.compile and graph-level transformations
- Kernel fusion and lowering pipelines
- LLM serving engines and inference runtimes
- Hardware backend integration for custom accelerators
- Performance modeling and MLPerf-style benchmarking

I enjoy working at the boundary between research and systems: taking new model architectures (MLA, linear attention, MoE) and making them fast, memory-efficient, and production-ready.

Open-source collaboration is central to how I operate: I care about building reusable infrastructure, upstreaming optimizations, and enabling the broader ecosystem to run models faster and more efficiently.

My goal is simple: close the gap between cutting-edge model innovation and the systems required to serve it at scale.

## Work Experience

### Machine Learning Systems Engineer @ Meta
Jan 2025 – Present | Menlo Park, CA

ML systems engineer on the MTIA software team, developing high-performance LLM inference solutions for the MTIA stack.

### Machine Learning Compilers Engineer @ sgl-project
Jan 2025 – Present

Co-leading the effort to implement the sgl-fusion compilation stack, powered by torch.compile, Dynamo, and Inductor. Optimizing LLMs through kernel fusion and democratizing LLM architecture by developing fusion passes such as Fused Activation (MLP up projection + activation + quantization), RMSNorm + Quant, and more.

### Machine Learning Systems Engineer @ Rivos Inc.
Jan 2025 – Jan 2026 | Santa Clara, CA

Led development of SGLang and FlashInfer on the Rivos stack.

### Machine Learning for XR @ Samsung Research America (SRA)
Jan 2024 – Jan 2024 | Mountain View, California, United States

- Evaluated Machine Learning (ML) models on the latest Galaxy devices, with a specific focus on 3D, NeRF, Gaussian Splatting, and Generative AI applications.
- Conducted research and evaluated state-of-the-art papers and projects in the field of 3D imaging and generative models.
- Preprocessed ML training data, ensuring its suitability for 3D and generative tasks.
- Actively participated in the team's innovation process, contributing to prototypes, research papers, and patent development.
- Collaborated closely with other team members to establish an effective pipeline for implementing and deploying ML solutions.
- Proposed creative and innovative solutions to meet product goals, with an emphasis on 3D, NeRF, Gaussian Splatting, and Generative AI.

### Software Engineer, Google Summer of Code 2024 @ Blender
Jan 2024 – Jan 2024 | Mountain View, CA

- Created file import nodes for the Geometry Nodes system, minimizing disk usage and enabling the use of Blender as a data visualization tool.
- Refactored existing 3D data importers, exposing functions to retrieve raw mesh data.
- Added CSV import capabilities to Blender, including CSV import for Geometry Nodes.
- Integrated point cloud importers into Geometry Nodes.
- Maintained the CMake build system and added new unit tests to improve coverage.

### Research Assistant @ USC Institute for Creative Technologies
Jan 2024 – Jan 2024 | Los Angeles, California, United States

- Overhauled the One World Terrain project, with 3D scene re-creation from voxel point clouds using custom data formats and importers, reducing memory usage by 90% in Unreal Engine.
- Researched physics simulation for Gaussian splatting. Spearheaded virtual reality demos for interactable scenes with more than 10 million points using PyTorch and Unreal Engine on the Meta Quest Pro platform.

### Software Engineer @ USC Institute for Creative Technologies
Jan 2023 – Jan 2024 | Los Angeles, California, United States

Engineered real-time extraction and processing of 3D face meshes from webcam feeds while reducing memory footprint, deployed as an interactive learning experience at Lawrence Hall, University of California, Berkeley.

### Software Engineer 2 @ Dell Technologies
Jan 2022 – Jan 2022 | Bengaluru, Karnataka, India

- Directed technical transformations, engineered reactive systems that decoupled teams, and conceived shared infrastructure libraries and reusable user interface components, saving 80 hours of development time per sprint.
- Built distributed full-stack applications using ASP.NET, Angular, MongoDB, and RabbitMQ to provide a best-in-class e-commerce user experience, directly impacting B2B revenue.

### Software Engineer @ Dell Technologies
Jan 2020 – Jan 2022 | Bengaluru, Karnataka, India

- Spearheaded micro-frontend architecture design and organization-wide adoption.
- Engineered runtime stitching of Webpack bundles and helped teams adopt the micro-frontend framework.

### Software Engineer Intern @ Dell Technologies
Jan 2020 – Jan 2020 | Bengaluru Area, India

- Pioneered end-to-end testing and testbeds to automate SSO authentication, reducing organization-wide regressions by 15% and QA hours by 30%.
- Improved overall tracing and created Splunk dashboards, facilitating data-driven decisions, identifying critical errors, and reducing system failures by 22%.

### Software Engineer @ Resolute AI
Jan 2019 – Jan 2020 | Bangalore

- Pioneered a video surveillance dashboard incorporating multiple WebRTC streams in a single view.
- Integrated OpenCV face and action detection models to provide smart surveillance capabilities.

### Software Engineer @ Dell EMC
Jan 2019 – Jan 2019 | Bangalore

- Engineered a dashboard to analyze speech data from sales calls, with real-time feedback and suggestions during a call.
- Integrated the Azure Speech SDK into internal NLP pipelines.
- Contributed fixes to the NLTK library back to the source repository.

## Education

### Masters in Computer Science
University of Southern California

### Bachelor of Technology - BTech in Computer Science
Manipal University Jaipur

## Contact & Social

- LinkedIn: https://linkedin.com/in/devashish-lal-096868176
- Portfolio: https://www.youtube.com/@CodeBlazeX
- GitHub: https://github.com/BLaZeKiLL

---

Source: https://flows.cv/devashish
JSON Resume: https://flows.cv/devashish/resume.json
Last updated: 2026-04-10