Building highly-available & scalable distributed data systems
I am a seasoned software engineer who builds large-scale, resilient, performant, high-availability backend systems, with deep expertise in distributed systems and algorithms.
Built the core data contention algorithms of a distributed relational database, taking them from the whiteboard to powering large enterprise-grade workloads (such as the record streaming traffic for the Super Bowl in 2024) with scalability, fault tolerance and high availability. The work centered on distributed transactions, isolation levels, and related areas.
Mentored engineers as they ramped up on the core architecture, helping them deliver solutions to complex problems and setting them up for career success. Wore multiple hats and worked on complex problems with multiple stakeholders (engineering, field, product) across geographies, in remote and hybrid settings, to onboard database workloads from various sectors (retail, finance, healthcare, etc.).
Received a special mention in the acknowledgements section of the book "Patterns of Distributed Systems" for helping with patterns on clock skew and consistency.
Also worked on other query layer features, such as adding partial index support for distributed NoSQL and auto "analyze" to collect statistical data for distributed Postgres.
Other highlights:
✅ Helped recruit the right candidates for engineering roles with 60+ interviews.
✅ Climate and cost win: Proposed and pushed for a developer process change that reduced wasted compute cycles in the CI/CD test pipeline, leading to large cost savings.
✅ Received an impact award at a quarterly company town hall.
✅ Featured in the “Emerging Builders Spotlight” by 8VC (an investor in YugabyteDB).
✅ Featured speaker from the engineering team at the 2022 annual Distributed SQL Summit.
✅ Closely mentored an intern to reduce bulk data loading time by ~30% by removing stalls and introducing efficient pipelining between the query and storage layers.
✅ Promoted good software practices to reduce the scope for errors and bugs, and helped cross-geographic team members with small time zone overlaps reach convergence in involved design discussions and reviews.
✅ Helped unblock customer PoCs with the field and product teams.
Performance monitoring and tuning: Designed a low-overhead framework in the database query layer to collect lock acquisition and wait times for finding performance bottlenecks in data workloads. Patent filed (as first author), pending approval ("Lock Wait Tracer," 7000-25900/4893US).
Improving the CI/CD process: Created a backport tracker dashboard to help engineering managers ensure that no regressions occur due to missed backports.
Developed and optimized microservices to support data replication, backup and restore between on-premises and cloud virtual machines, a crucial part of Nutanix’s public cloud offering “Xi”, which serves the IT needs of enterprise-grade customers across the globe. Notable features include orchestrating migration and optimal placement of VMs on restoration at the target site, network mapping, identifying and resolving performance bottlenecks, and REST API version management.
Nutanix Engineering Super Hero awardee at a 2019 quarterly all-hands.