SWE at Celonis | MS CS grad at Columbia University
I'm interested in distributed databases and machine learning, and have around 3 years of experience building high-volume data systems that serve ML and data science applications.
Reduced API latency by 40% by optimizing SQL queries, and identified pagination changes that yielded a further 95% improvement.
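The resume doesn't say which pagination change was made; a common one with this kind of payoff is replacing offset pagination with keyset (seek) pagination. The table and column names below are purely illustrative:

```python
# Hypothetical illustration of an offset-to-keyset pagination change;
# the schema and queries are invented for this sketch.

# Offset pagination: the database scans and discards `offset` rows,
# so deep pages get progressively slower.
OFFSET_PAGE = """
SELECT id, payload FROM events
ORDER BY id
LIMIT %(page_size)s OFFSET %(offset)s
"""

# Keyset pagination: resume from the last id seen, which an index on
# `id` can satisfy directly no matter how deep the page is.
KEYSET_PAGE = """
SELECT id, payload FROM events
WHERE id > %(last_seen_id)s
ORDER BY id
LIMIT %(page_size)s
"""

def next_page_params(last_seen_id: int, page_size: int = 100) -> dict:
    """Build bind parameters for the keyset query."""
    return {"last_seen_id": last_seen_id, "page_size": page_size}
```

The keyset form turns an O(offset) scan into an index seek, which is where order-of-magnitude improvements on deep pages typically come from.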
• Profiled production Kubernetes pod usage patterns based on CPU and memory metrics and identified a 15% cost-reduction opportunity.
• Designed and implemented an error framework that differentiates between user and system errors, reducing alert noise and ultimately improving developer productivity.
Architected and built a Dynamic Error Classification System.
• Dynamically categorizes any error in the system with a readable error message to improve UX, and controls retry behaviour based on the error type.
• Decreased the time to deploy an error-classification change from multiple hours to 1 minute.
• Removed the dependency on engineers and code changes to classify errors; Product Managers and Support staff can now handle errors themselves.
• Reduced errors displayed to users by 50% for specific sources.
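One way such a system can work, consistent with the points above, is config-driven classification rules that non-engineers can edit. Everything here (rule patterns, fields, retry policies) is an illustrative sketch, not the actual design:

```python
import re
from dataclasses import dataclass

# Hypothetical sketch of a config-driven error classifier; rule names,
# fields, and retry policies are illustrative only.

@dataclass
class Classification:
    category: str        # "USER" or "SYSTEM"
    user_message: str    # readable message shown in the UI
    retryable: bool      # drives retry behaviour

# Rules live in config (editable by PMs/support), not in code, so a
# classification change is a config deploy rather than a code release.
RULES = [
    (re.compile(r"permission denied|invalid credentials", re.I),
     Classification("USER", "Please check your connection credentials.", False)),
    (re.compile(r"timeout|connection reset", re.I),
     Classification("SYSTEM", "A temporary issue occurred; retrying automatically.", True)),
]

DEFAULT = Classification("SYSTEM", "Something went wrong; our team has been notified.", True)

def classify(error_text: str) -> Classification:
    """Return the first matching classification, else a safe default."""
    for pattern, classification in RULES:
        if pattern.search(error_text):
            return classification
    return DEFAULT
```

Because the rule table is data rather than code, updating a message or retry policy takes minutes and requires no engineer.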
Designed and implemented a feature in our job scheduler (Handyman) to automatically schedule jobs based on resource needs across machines with different hardware resources (RAM, disk storage).
• Implemented mainly to support ingestion jobs that download multi-GB files: these jobs are automatically scheduled on nodes with large disk storage, and reruns are pinned to the same node so they can continue ingesting the same file without re-downloading it.
Built a Destination Cost Recommendation Framework.
• Automatically collects metadata statistics about the data warehouses used with Hevo and stores them in a data lake.
• Automatically analyzes these statistics and recommends ways for users to reduce the cost of using their warehouse with Hevo.
Improved ingestion rate by 8x for Google Analytics Connector by sampling data volume and intelligently distributing workload across parallel jobs.
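Distributing sampled volume across parallel jobs can be sketched as a balancing problem. The greedy longest-processing-time heuristic below is an assumed approach for illustration, not the connector's actual algorithm:

```python
import heapq

# Hypothetical sketch of volume-aware work distribution: sample the row
# count per day, then greedily assign days to parallel jobs so each job
# gets a roughly equal share of total volume.

def distribute(day_volumes, n_jobs):
    """day_volumes: {day: sampled_row_count}. Returns n_jobs lists of
    days, balanced by sampled volume (greedy LPT heuristic)."""
    heap = [(0, i) for i in range(n_jobs)]  # (assigned volume, job index)
    heapq.heapify(heap)
    assignments = [[] for _ in range(n_jobs)]
    # Largest days first, always onto the currently lightest job.
    for day, volume in sorted(day_volumes.items(), key=lambda kv: -kv[1]):
        load, idx = heapq.heappop(heap)
        assignments[idx].append(day)
        heapq.heappush(heap, (load + volume, idx))
    return assignments
```

Balancing by sampled volume rather than by date count is what prevents one job from being stuck with all the heavy days.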
Designed and implemented an autonomous and robust integration with Kafka as a source.
• Designed to scale out when it detects a high data volume at the source and scale back in to save costs; uses linear regression and source data retention thresholds to decide when to expand capacity to accommodate the extra data.
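The scale-out decision can be sketched as fitting a linear trend to recent consumer-lag samples and projecting it against the retention horizon. The sampling shape, thresholds, and formula below are illustrative assumptions:

```python
# Hypothetical sketch: fit a least-squares line to recent lag samples
# and scale out if the projected lag at the retention horizon exceeds
# what the current consumers can safely drain.

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y ~ a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def should_scale_out(lag_samples, retention_secs, max_safe_lag):
    """lag_samples: list of (t_secs, lag_msgs) pairs, at distinct times."""
    xs = [t for t, _ in lag_samples]
    ys = [lag for _, lag in lag_samples]
    slope, intercept = linear_fit(xs, ys)
    projected = slope * (xs[-1] + retention_secs) + intercept
    return projected > max_safe_lag
```

Projecting against the retention window matters because data a consumer falls too far behind on is deleted by the broker; scaling out before that point is what makes the integration autonomous.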
Integrated Firebolt as a Destination.
• Worked through ambiguous requirements and early-stage documentation to deliver Firebolt on time, relying on library greps and debugging tools.
• Delivered the Firebolt integration first in the market, giving Hevo a competitive advantage and exclusive partnership deals.
• Added new features such as Parquet support and new key types in our Mapping component.
Optimized the sidelined-events flow.
• Reduced the time for each sidelined event to become visible to users from 5+ minutes to 1 minute (at least a 5x improvement).
• Added visibility so users can understand the state of their events as soon as possible.