Experience
2024 — Now
• Building and deploying the Zipline AI control plane on top of Chronon OSS (https://chronon.ai/), integrating it with multiple clouds (GCP and AWS) to manage lifecycle execution of Spark and Flink jobs on Dataproc / EMR and to enable compute sharing.
• Expanded data interoperability of Spark jobs to support heterogeneous sources, including Apache Iceberg, BigQuery native/external tables, and Apache Hudi.
• Improved application stability and resource footprint by diagnosing and resolving critical JVM memory leaks in Dockerized services, using Java Flight Recorder for heap and thread-dump analysis.
• Reduced feature fetching latency by 30% by migrating payload serialization from JSON to Avro binary, significantly decreasing vector payload sizes and network overhead.
• Stabilized the deployment pipeline by implementing safe, idempotent database migrations using Liquibase and containerizing the full local development environment via Docker.
• Added a robust end-to-end integration test suite covering the CLI, compute engine, and inference service across cloud providers.
• Tuned Spark and Flink configurations to ensure job stability. For Spark, profiled jobs to identify bottlenecks such as partition skew and adjusted configurations accordingly; for Flink, added new operators to extend stream-processing logic, such as writing to additional sinks.
Scala + Spark + Flink + Iceberg + Docker + Liquibase + Hudi + Dataproc + EMR + BigTable + Postgres
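The Spark skew tuning described above can be illustrated with a configuration fragment; the keys are standard Spark 3.x adaptive query execution settings, but the values here are illustrative examples, not the actual production configuration.

```properties
# Illustrative settings for mitigating skewed joins via adaptive query
# execution; values are examples, not a production configuration.
spark.sql.adaptive.enabled                                   true
spark.sql.adaptive.skewJoin.enabled                          true
spark.sql.adaptive.skewJoin.skewedPartitionFactor            5
spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes  256MB
spark.sql.adaptive.advisoryPartitionSizeInBytes              64MB
```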
2021 — 2024
• Machine learning infrastructure engineer on Shepherd, Stripe's feature engineering platform.
https://stripe.com/blog/shepherd-how-stripe-adapted-chronon-to-scale-ml-feature-development
Spark + Flink + Airflow + Scala + Java + Hive + Iceberg
• Led migration of Stripe's early merchant fraud detection offline ML model from a legacy feature engineering platform to Shepherd. Implemented the Shepherd-based features and conducted extensive offline evaluation to ensure the new features matched the old ones. Backtested features to confirm that score distributions and recall were in line with those of the pre-existing features.
• Led technical implementation of the first asynchronous Shepherd-based ML model for merchant fraud. Built the online flow: an event consumer subscribed to various Kafka topics that fetched online features from the feature store and conditionally triggered model scoring downstream. Worked with a team of ML engineers to assemble an automated backtesting and training-data pipeline using offline-computed point-in-time training data, built with Airflow + Flyte + Iceberg.
• Led technical design and development of first cut of Stripe's new core product infrastructure managing multi-entity accounts/businesses. See https://docs.stripe.com/get-started/account/orgs.
Responsible for ideation + development of the read-optimized tree-graph storage strategy representing a Stripe enterprise business with multiple accounts and/or entities. Built read + write APIs and workflows, including various locking schemes to support concurrent updates to the tree while maintaining correctness. Java + RPC + Protobuf + Bazel + MySQL + Mongo
• Developed real time stream processing jobs using Flink + Scala + Bazel + Kafka to monitor the merchant experience of Stripe's customer base across the world.
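The concurrent tree updates mentioned above can be sketched with a version-based optimistic locking scheme; all class and method names below are hypothetical, since the actual implementation is internal to Stripe.

```python
# Hypothetical sketch of optimistic locking for concurrent tree updates.
# Names and the version-check scheme are illustrative, not Stripe's design.

class ConcurrentModificationError(Exception):
    """Raised when a writer's snapshot is stale."""

class Node:
    def __init__(self, node_id, parent_id=None):
        self.node_id = node_id
        self.parent_id = parent_id
        self.version = 0  # incremented on every successful write

class TreeStore:
    def __init__(self):
        self.nodes = {}

    def put(self, node):
        self.nodes[node.node_id] = node

    def reparent(self, node_id, new_parent_id, expected_version):
        """Move a node under a new parent; fail if another writer got there first."""
        node = self.nodes[node_id]
        if node.version != expected_version:
            raise ConcurrentModificationError(node_id)
        node.parent_id = new_parent_id
        node.version += 1
        return node.version
```

A writer reads a node (and its version), attempts the update with that version, and retries from a fresh read on conflict.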
2021 — 2021
Led and designed architecture plans to migrate existing data pipeline infrastructure away from MongoDB, including:
• a pros-and-cons proposal for replacing MongoDB with AWS Redshift or Snowflake
• a proof of concept detailing the use of Airbyte to load data from multiple AgileMD sources into AWS Redshift, with dbt owning transformations within Redshift
• a cost-benefit analysis of building and hosting the open-source data frameworks listed above versus buying a managed ETL/ELT service
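The dbt-owned transformations proposed above would look something like a minimal staging model; the model, source, and column names here are hypothetical.

```sql
-- models/staging/stg_visits.sql -- hypothetical model; source and column
-- names are invented for illustration.
{{ config(materialized='view') }}

select
    visit_id,
    patient_id,
    cast(admitted_at as timestamp) as admitted_at
from {{ source('agilemd_raw', 'visits') }}
```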
2019 — 2021
Built asynchronous microservices in Java using AWS CloudFormation, Lambda, DynamoDB, SQS, and EventBridge to solve cross-border movement problems for Amazon Logistics.
Architected and developed an end-to-end solution to migrate production data from AWS DynamoDB to Amazon's internal data lake for further processing by dependent business intelligence and data engineering teams, using AWS CloudWatch Events, S3, and SNS.
Led a performance readiness assessment of the team's services (>10) in preparation for increased traffic during the 2020 Q4 holiday season. Analyzed previous traffic patterns and load tested individual services to determine whether their limits could handle the forecasted traffic. Scaled services horizontally or vertically on a case-by-case basis (larger AWS EC2 instances, more EC2 instances, enabling DynamoDB autoscaling, etc.).
Completed integration of Amazon Devices Logistics onto the team's platform and served as lead and primary point of contact for new technical issues and features.
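Because SQS delivers messages at least once, the asynchronous services above must handle duplicate deliveries idempotently. A minimal sketch, with hypothetical names and an in-memory dedup set standing in for a durable store (e.g., a DynamoDB conditional write):

```python
# Hypothetical sketch of idempotent handling of at-least-once queue
# deliveries. Names are illustrative, not the actual service design.

class IdempotentConsumer:
    def __init__(self, handler):
        self.handler = handler
        # In production this would be a durable store with a conditional
        # write, e.g. DynamoDB; a set suffices to show the idea.
        self.seen = set()

    def process(self, message_id, body):
        """Run the handler once per message id; skip duplicates."""
        if message_id in self.seen:
            return False  # duplicate delivery, no side effects
        self.handler(body)
        self.seen.add(message_id)
        return True
```

Re-delivering the same message id is then a no-op, so downstream side effects happen exactly once per logical event.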
2019 — 2019
Education
Columbia University
Bachelor of Science (B.S.)
Elon University