# Pavan Kalyan Damalapati

> SWE at Celonis | MS CS grad at Columbia University

Location: New York, New York, United States
Profile: https://flows.cv/pavankalyandamalapati

I'm interested in distributed databases and machine learning, and I have around 3 years of experience building high-volume data systems that serve ML and data science applications.

I document a lot of my technical learnings and experiments on https://pavan-kalyan.dev

I'm interested in the following areas:
-> Backend Development. My work at Hevo has mostly consisted of backend software development. It gave me exposure to a wide variety of server-side technologies such as message queues (Kafka), databases (MySQL, Postgres, RocksDB, InfluxDB, MongoDB), data warehouses (Snowflake, BigQuery, Firebolt), and Java server frameworks (DropWizard).

-> Computer Science research on databases, distributed systems, and data visualization systems. I'm currently pursuing a master's in CS at Columbia University, where I hope to do some research in the above fields.

-> Open Source contributions. I have contributed to multiple open source repositories like Jackson, DepsCloud, Trickster, etc.

## Work Experience
### Software Engineer @ Celonis
Jan 2024 – Present | New York, United States

### Graduate Research Assistant @ Columbia University
Jan 2022 – Jan 2023 | New York, New York, United States
Worked under Dr. Eugene Wu and Zachary Huang in the Wu Lab on improving databases and ML.
- Helped build a Python library to train tree-based ML models on SQL databases.
  - Paper published: https://dl.acm.org/doi/10.1145/3592980.3595318
- Helped build a visualization library to display many-to-many joins for Wide Table Analytics.
  - Paper published: https://dl.acm.org/doi/10.1145/3597465.3605224
- Researched and helped evaluate Text-to-SQL performance using GPT-4.
  - Paper published: https://arxiv.org/abs/2310.18742

### Software Engineer Intern @ Addepar
Jan 2023 – Jan 2023 | New York, United States
- Reduced API latency by 40% by optimizing SQL queries, and identified further pagination changes that result in a 95% improvement.
- Profiled production usage patterns of Kubernetes pods based on CPU and memory metrics and identified a 15% cost-reduction opportunity.
- Designed and implemented an error framework that differentiates between user and system errors, which reduced alert noise and ultimately improved developer productivity.

### Software Development Engineer 2 @ Hevo Data
Jan 2021 – Jan 2022 | Bangalore Urban, Karnataka, India
Architected and executed a Dynamic Error Classification System.
- Allows us to dynamically categorize any error in the system with a readable error message to improve UX, and to control retry behaviour based on the error.
- Decreased the time to deploy an error classification change from multiple hours to 1 minute.
- Removed the dependency on engineers and code changes to classify errors. Product Managers and Support staff can now handle errors.
- Cut errors displayed to users by 50% for specific sources.

Designed and implemented a feature in our job scheduler (Handyman) to automatically schedule jobs based on resource needs across machines with different hardware resources (RAM, disk storage).
- This was mainly implemented to support ingestion jobs that required downloading multi-GB files. These jobs were automatically scheduled to run on nodes with large disk storage and automatically rerun on the same node to continue ingesting the same file without re-downloading it.

Built a Destination Cost Recommendation Framework.
- Automatically collects metadata statistics about the data warehouses used with Hevo and stores them in a data lake.
- Automatically calculates these statistics and makes recommendations to users for reducing the cost of using the warehouse with Hevo.

Improved the ingestion rate of the Google Analytics connector by 8x by sampling data volume and intelligently distributing the workload across parallel jobs.

Mentored multiple interns.

### Software Development Engineer @ Hevo Data
Jan 2020 – Jan 2021 | Bengaluru, Karnataka, India
Designed and implemented an autonomous and robust integration with Kafka as a source.
- Designed to scale out when it detects a high volume of data in the source and to scale in afterwards to save costs. Used linear regression and source data retention thresholds to automatically expand to accommodate the extra data.

Integrated Firebolt as a Destination.
- Tackled ambiguous requirements and early-stage documentation to deliver the Firebolt integration on time, relying on library greps and debugging tools.
- Delivered the first Firebolt integration in the market, giving Hevo an advantage and exclusive partnership deals.
- Added new features such as Parquet support and new key types in our Mapping component.

Optimized sideline events flow.
- Reduced the time taken for each sidelined event to be visible to the users by at least 2x (5+ minutes to 1 minute).
- Added visibility for users to understand the state of the events as soon as possible.

### Software Development Intern @ Hevo Data
Jan 2020 – Jan 2020 | Bengaluru Area, India
Built multiple scalable and fault-tolerant integrations with various sources, including Microsoft Ads, Google Search Console, and SendGrid, resulting in a total of $60,000 ARR for Hevo.

Increased schema handling capacity from 1 at a time to 10,000 at a time by shipping a bulk processing feature to process schemas.

Implemented a feature to identify incompatible schema mappings in real time and proactively alert users, resulting in a 15% drop in incompatible-mapping error rates.

### Solution Specialist Intern @ Microsoft
Jan 2019 – Jan 2019 | Bengaluru Area, India
- Built a chest X-ray diagnosis system using deep learning on the Azure platform, with PyTorch and Python.
- Created a front end in ReactJS that accepts X-rays and returns the probability of each affliction.

### Software Development Intern @ TIF Labs Pvt Ltd
Jan 2017 – Jan 2017 | Bangalore
Developed the frontend and backend for AlphaCare using AngularJS and Node.js to enable smoother interactions between patients and nurses, improving patient experience. AlphaCare has been deployed in dozens of hospitals in India. Used a relational database model, HTML/CSS, and message queues.


## Education
### Master of Science - MS in Computer Science
Columbia University
Jan 2022 – Jan 2023

### Bachelor of Technology in Computer Science and Engineering
Manipal Institute of Technology
Jan 2016 – Jan 2020


## Contact & Social
- LinkedIn: https://linkedin.com/in/pavan-kalyan-damalapati

---
Source: https://flows.cv/pavankalyandamalapati
JSON Resume: https://flows.cv/pavankalyandamalapati/resume.json
Last updated: 2026-03-23