San Francisco, California, United States
(Post-acquisition) Still building tools to 10x accountants! We are hiring; please reach out if any open role is a fit!
(Pre-acquisition) AI-powered tools to 10x accountants!
2019 — 2023
San Francisco, California
Part of the compute fabric team, which was responsible for providing an out-of-the-box, scalable, secure and extensible compute platform that could dynamically launch clusters of machines for data engineering, data science and machine learning workloads.
Specific projects I led include:
1. Removing public IPs from cluster ingress. This required a re-architecture of our instance bootstrapping process, which became far more efficient once we removed SSH setup commands that were very resource-intensive at scale.
2. PrivateLink connectivity from clusters to the Databricks control plane. This unblocked multiple 7- and 8-figure enterprise deals.
3. A load testing service that allowed virtually zero-cost stress tests for cluster management components by mocking out the cloud provider layer. The mocks were later also used as components in company-wide integration testing frameworks.
4. Evangelizing, prototyping and implementing mTLS between machines that process user data and the Databricks control plane. This secured communication between these nodes and the control plane, so all other teams working on these nodes could design event-driven systems securely and easily. The event-driven architecture was needed in various places for the scalability and stability of the platform. Examples include data ingestion via Kafka, key encryption and faster autoscaling of disks when full.
I then moved to our serverless sub-division, which completely abstracted clusters from customer-facing products.
1. Worked on visibility and auto-remediation of Spark executor pods, reducing on-call issues for this class of failures by more than 10x.
2. Re-architected cluster warm pools to minimize customer-facing failures via lazy allocation during scheduling, and consolidated warm pools across different control planes (driving down the cost of supporting them).
San Francisco Bay Area
Completed an end-to-end feature that allowed customers to run custom initialization scripts on each container of a Spark cluster that Databricks creates for them on AWS/Azure.
Redesigned the backend to support access control, deliver logs to cloud storage, and improve scalability.
Introduced a friendly UI and created unit and integration tests with substantial code coverage.
Wrote the backend code in Scala and the frontend code in React/JavaScript.
2017 — 2017
San Jose, California
Built extensions that automate a series of simple operations to achieve common database and mathematical tasks in a single click, including Table Normalization, Union All and Matrix Multiply, using JavaScript (jQuery) and Python.
Created a utility to upload extensions to an Amazon S3 bucket that is accessible to all Xcalar nodes.
Created both interactive and in-depth exercises for Xcalar Training, and recorded a YouTube tutorial series demonstrating the use of various Xcalar features.
Education
Carnegie Mellon University