Software engineer with extensive experience in both large-scale distributed systems and resource-constrained modern embedded systems, with expertise in building highly scalable and sustainable software stacks, platforms, and tools. A leader with the ability to translate vision into execution.
Experience
2022 — Now
Sunnyvale, California, United States
2025 - Now | Distributed Systems & Infrastructure Monitoring
• Lead a team in architecting, building, and maintaining a distributed data collection platform, ingesting real-time telemetry from millions of servers across a global fleet. This platform underpins critical fleet health monitoring, diagnostics, and resource management systems.
• Defined the technical roadmap for high-throughput, low-latency data collection, employing technologies such as Pub/Sub for decoupled message distribution and Spanner for persistent state management and metadata.
• Oversee a diverse suite of data collectors, ensuring resilient and scalable data pipelines. Designed and implemented sharding strategies to distribute workload and enhance fault isolation across collector instances.
2022 - 2025
TL for the data center server management software stack powering next-generation AI.
• Delivered a stateful implementation of the Baseboard Management Controller's (BMC) web server, which reduced latency of all data center client application queries by 90% and delivered a 25x performance improvement in critical queries.
• Reduced BMC CPU utilization across the datacenter fleet by ~50%
• Devised a query language and data model for gathering telemetry from datacenter servers, along with an interpreter that client applications use to execute queries.
• Co-authored the DMTF RedPath specification, which makes the query language an industry standard.
• Designed and implemented a client-side query cache that improved the performance of Sensor Monitor diagnostic applications by 10x.
• Designed an Event Pipeline to provide real-time telemetry for time-sensitive applications and save resources (memory and CPU utilization) in both client applications and servers.
Participated in DMTF forums as a Google representative to bring forward technical challenges seen internally or drive new feature requests, ensuring Google aligns with the upstream-first paradigm.
2020 — 2022
Redmond, Washington, United States
• Owned BMC features end to end for next generation Azure datacenter platforms
• Designed and implemented components at all layers of the ASPEED SoC-based system, e.g. u-boot, kernel, synchronization primitives, resource allocators, memory management, security, I/O systems, persistence, etc.
• Developed systemd-enabled userspace applications in modern C++, integrated with D-Bus and using Boost libraries.
• Designed platform management layer on top of MCTP over I2C and I3C
• Built and maintained Yocto Linux distributions
• Developed REST APIs for OpenBMC Redfish Servers
• Built and managed firmware integration and signing pipelines
• Practiced test-driven development using gTest and gMock
• Designed and developed acceptance tests using Robot Framework
Active Skills: C/C++17, systemd, D-Bus, Redfish, IPMI, I2C, UART, SPI, PCIe, MCTP, PLDM, REST, Boost, Beast, HTTPS
2019 — 2020
Folsom, California
Key player in product bring-up and in developing POCs on shift-left platforms
Collaborated with internal and external teams to debug and fix principal storage software problems
Delivered firmware design and architecture specifications
Recognitions:
• NSG Strategy Acceleration Division Recognition Award (Q3'17)
For standardizing NVMe SSD Management by delivering NVMe-MI standard
• NSG Internal SSD Engineering Recognition Award
For increasing the velocity of NVMe/PCIe validation
2015 — 2019
• Developed extensible NVMe, PCIe, MCTP, and NVMe-MI embedded software stacks for low-power Cortex-A53 processors on Intel's next-generation SSDs
• Responsible for validation architecture of NVMe-MI over SMBus for PCIe NAND SSDs
• Formulated workable action plans and developed and deployed test suites and automation solutions to achieve optimum firmware conformance to product and protocol specifications within the committed timelines
• Established effective coordination with architecture, development, tools, test, and execution teams by driving cross-organization meetings and trainings
• Led common code initiatives to maximize code reuse and eliminate redundant effort and code.
• Integrated PCIe/NVMe (LeCroy and Oakgate) and SMBus (Aardvark and Beagle) protocol exercisers and analyzers into the test automation environment, reducing test execution man-hours by 10x
• Developed strategic alternatives for addressing false failures in the regression environment, reducing test execution noise severalfold
• Worked with third-party tool vendors to drive feature requests that met new feature validation requirements
• Designed test suites for CI pipelines
Education
Arizona State University
Master's Degree
SRM IST Chennai