Experience
2020 — Now
Menlo Park, California, United States
Technical lead in the AI Infra team:
• Context data services for GenAI apps
• Feature engineering infrastructure
• ML developer experience
• Data science notebook platform
Technical lead in the data platform team:
• Rearchitected events ingestion pipelines to handle heavy spikes in trading activity
• Evolved Spark execution backends into multi-cluster architecture with autoscaling for high availability, reliability, scalability and cost efficiency
• Built an orchestration service for Spark batch pipelines to improve integration and security and to enable infrastructure evolution; migrated the entire batch workload onto this service
• Onboarded backend teams to use our batch processing infrastructure, with stronger availability and SLAs than typical analytical workloads
• Designed, and am currently leading, the migration of all Spark workload execution to Kubernetes infrastructure
• Jump-started work on building aggregation analytics infrastructure based on Apache Pinot
• Worked on cost savings initiatives and helped significantly reduce Spark infrastructure costs
• Spearheading the effort to modernize Robinhood's Airflow-based workflow platform; presented our work at Airflow Summit 2024
• Led multiple vendor evaluations for infrastructure solutions
2017 — 2020
San Francisco Bay Area
Technical lead in the data infrastructure team:
• Built a self-service streaming compute platform based on Apache Flink and SQL. Subject matter expert in Apache Flink internals and usage recommendations. Presented the work at Flink Forward 2019
• Managed the Apache Druid ecosystem as the backend for a self-service metrics framework; details presented in an Airbnb blog post
• Apache Spark infrastructure: worked on unifying batch compute on the Apache Spark platform; delivered a query auditing feature; worked on a metrics platform for all Spark jobs, optimization of resource-intensive Spark jobs, and solutions to enhance data engineer productivity
2016 — 2017
Los Altos, CA
Data infrastructure, web crawling & data extraction
2013 — 2016
Redwood City, CA
Developed and maintained big data and real-time data infrastructure.
Projects I have led or worked on include cross-cluster Hive dataset replication, authorization, performance instrumentation and improvements in our Hive query infrastructure, and writing modeling data pipelines in Spark.
Most recently led the development and architecture of an Apache Spark based data pipeline framework, from inception and prototype through delivery, as its technical lead. The framework is now the main engine powering all data pipelines in our model building infrastructure. Key elements I was responsible for include:
• Defined the roadmap for the project
• Built the prototype and defined the overall architecture and design of the infrastructure
• All things performance: identified performance bottlenecks and added features and optimizations to the framework to fix them
• Wrote large parts of the framework as well as data pipelines on top of it
• Grew and supported the engineering team that delivered this to production
• Planned and tracked project milestones toward delivery
• Served as the go-to person for all Apache Spark related issues across the engineering team
2007 — 2013
Redmond
Core developer in the SQL Server Manageability team, working on various layers of the stack including web services, framework, and UI. Worked on SQL Server and SQL Azure releases.
Implemented web services in C# that provide database management functionality for SQL Azure. In particular, implemented the OData web service protocol, an authentication protocol and its associated encryption scheme, and worked on service instrumentation and deployment.
Education
Cornell University
BS