# Sahil Gandhi

> Infra @ Sierra | Prev @ CFLT, MSFT, META

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/sahilgandhi

Hello there! My name is Sahil Gandhi. I have a strong passion for backend, distributed, and infrastructure engineering, and I regularly keep up with current trends in systems and technologies in these fields. I hold degrees from UCLA — a BS in Computer Science and Electrical Engineering and an MS in Distributed and Big Data Systems — have published papers in the database world (and several white papers in other fields), and hold several patents in the systems space.

## Work Experience

### Engineer @ Sierra
Jan 2025 – Present | San Francisco, California, United States

Voice LLM Infrastructure and Product

### Senior Software Engineer II @ Confluent
Jan 2023 – Jan 2025 | San Francisco Bay Area

Confluent Cloud Compute Platform & Elasticity (Control Plane)

- Tech lead for a regional multi-cloud, multi-k8s deployment framework supporting the next generation of the regional Confluent Kora engine, including scaling, scheduling, disruption budgets, and more
- Mentored and grew a sub-team of 4 engineers to deliver the first few iterations of the framework
- Drove Vertical Pod Autoscaler (VPA) adoption, slashing Kubernetes cluster spend by $10k+/mo
- Automated node-type migrations for Kafka, yielding savings of $2M+/yr and improving end-to-end latency by 20%
- Developed scaffolding Terraform components for seamless migration of monitors and dashboards from Datadog to New Relic

### Senior Software Engineer @ Confluent
Jan 2022 – Jan 2023 | San Francisco Bay Area

Founding engineer on Confluent Cloud Fleet Management (Control Plane). Designed, developed, and deployed scalable solutions for day-1+ operations on all Confluent Cloud clusters, including:

- A workflow engine that replaced manual dataplane upgrades across the fleet for any cluster type (Kafka, KSQL, etc.) and any operation (Upgrade, Restart, Shrink, etc.), reducing fleet-wide rollouts from a 1–2 month endeavor to a 1–2 day task
- A microservice deployment engine that eliminated manual Helm-based deployments and handles Confluent's scale of 10,000+ deployments/day
- A microservice that aggregates data from multiple sources via change-data-capture pipelines, database queries, and API scraping, giving engineers a unified view of the entire fleet — accompanied by a ReactJS-based UI for better visibility
- An infrastructure CLI to securely communicate with the above services via SSO authentication

### Software Engineer II @ Confluent
Jan 2021 – Jan 2022 | San Francisco Bay Area

Confluent Cloud Fleet Management (Control Plane)

### Software Engineer @ Confluent
Jan 2020 – Jan 2021 | San Francisco Bay Area

Confluent Cloud Fleet Management (Control Plane)

### Software Engineering Intern - Research @ Microsoft
Jan 2019 – Jan 2019 | Redmond, Washington

I worked on Microsoft Research's BuildXL (https://github.com/microsoft/BuildXL), a distributed build engine that is widely used by internal developers on the Windows, Office, and other teams and was recently open sourced. My project was to revamp the logging infrastructure so that our binary logs would be forwards- and backwards-compatible for analysis, as well as platform-, language-, and engine-version agnostic. I spent a few weeks researching different serialization formats (Protobuf, Cap'n Proto, AVRO, etc.) as well as database and logging technologies (Kafka Streams, MongoDB, LiteDB, RocksDB, etc.), prototyping these services to see whether they would scale to our needs (millions of events, gigabytes of compressed binary log files, custom data types/classes/formats). In the end, I chose Protobuf objects with RocksDB to create a new version of the logs that was indexed for very fast retrieval for popular customer questions while still supporting the complexity of our custom classes and remaining independent of the build engine version.
Using the new logs, I improved the performance of the most popular analyzers by anywhere from 5x to 200x and let customers quickly gain key insights into how to speed up their builds further.

### Graduate Researcher @ UCLA Scalable Analytics Institute
Jan 2019 – Jan 2019

I researched under Dr. Carlo Zaniolo and PhD student Ariyam Das on real-time data-stream databases and graph visualizations. We used Datalog, a database query language similar to Prolog, to build a real-time stream-querying system. In particular, we focused on handling negation in real-time queries — determining, as new data arrives, what is "not" in the database — and on running recursive queries over large and growing data sets. In addition, I spearheaded an NLP application for parsing natural English into valid Datalog queries. A final project of mine involved Hoeffding Anytime Trees: extending the "Extremely Fast Decision Tree" (https://arxiv.org/pdf/1802.08780.pdf) and "Very Fast Decision Tree" implementations with bagging and other heuristics to increase performance and to work on continuous data streams.

### Undergraduate Researcher @ UCLA Scalable Analytics Institute
Jan 2018 – Jan 2019 | Westwood, California

### Software Engineering Intern @ Facebook
Jan 2018 – Jan 2018 | Menlo Park, California

I worked on the RTP Tools and Automation team, where we created tools for test/validation engineers and worked on the CI/CT infrastructure. My project was to get several services to talk to each other to form a single, more coherent automated testing tool for testing and validating new Facebook hardware. One key component of my project was automating the creation and updating of test results, which saved several hours of manual updates per person, per week.
During this process, I also implemented a more logical grouping of hardware reservations, which gave us a better coverage map for tests over a period of time and enabled automated re-testing in the future. Part of my project also involved migrating some existing CLI functionality into a UI so that more information could be shown when choosing a group of devices or a group of tests to validate. Finally, in my free time I worked on the validation portal, fixing any bugs I could find and optimizing it wherever possible; by the end of the internship I had optimized the DB design and queries to speed up the portal by more than 50%.

### Software Engineering Intern @ ViaSat Inc.
Jan 2017 – Jan 2017 | Greater San Diego Area

I worked with a team to build infrastructure that monitors data from the routers and visualizes it for valuable business analytics. For the infrastructure, we used Docker containers and Apache Flume to create a flow of data from the hubs to our local lab environment. We then automated the infrastructure deployment and AWS/VMware VM provisioning with Ansible so it could be generalized to any future deployment. Once production data was being captured, we processed it as Spark streams using Scala and stored the processed data in OpenTSDB. We then graphed the data with Grafana to let business leaders make critical decisions about the Arclight network.

### Software Engineering Intern @ Reltio
Jan 2016 – Jan 2016 | Redwood Shores, California (Bay Area)

I primarily worked on the company's Master Data Management (MDM) product, tailoring it to individual customers' needs. This responsibility included developing and modifying features in the API (Java, Cassandra, Elasticsearch), the UI (Qooxdoo), and metadata configurations (JSON). In addition, I found and fixed several major bugs in the product's code base before the next version of the API and UI went live.
Throughout the internship, I used Git extensively to manage my workflow and collaborate on the code with developers from around the world, and I gained a critical understanding of both the SaaS and PaaS service models and how each can benefit a company's future.

### Software and Product Development Intern @ Smart Monitor
Jan 2014 – Jan 2014 | San Jose, California

I tested the smartwatch the company produced and assisted in designing and pitching a new product idea to Genentech. I also designed a new prototype product for the company and created a business plan analyzing its potential market and opportunities. I concurrently interned at IntelliVision, splitting my time according to each company's needs.

### Software and Product Development Intern @ IntelliVision
Jan 2014 – Jan 2014 | San Jose, California

I created a website for one of the company's products and researched various markets and business opportunities for the company's future. I also tested product software for accuracy and readiness for market use. I concurrently interned at Smart Monitor, splitting my time according to each company's needs.

## Education

### Master of Science (MS) in Computer Science
UCLA

### Bachelor of Science (BS) - Summa Cum Laude in Computer Science and (Electrical) Engineering
UCLA

### High School Diploma
Monta Vista High School

## Contact & Social

- LinkedIn: https://linkedin.com/in/sahilmgandhi
- Portfolio: http://www.sahilmgandhi.com
- GitHub: https://www.github.com/sahilmgandhi

---

Source: https://flows.cv/sahilgandhi
JSON Resume: https://flows.cv/sahilgandhi/resume.json
Last updated: 2026-04-11