I am a hands-on engineer focused on building data infrastructure, platforms, and methods for data-centric products. I like making data and computing more accessible to people and products.
Experience
2023 — Now
San Francisco, California, United States
2019 — 2023
San Francisco, California
• Migrated Spark compute to be backed by Kubernetes. The migration led to a 2x reduction in infrastructure cost and gave the added benefit of autoscaling and running multiple versions of Spark concurrently.
• Built a DSL for the workflow orchestration engine. The DSL provides an easy-to-use interface for Data Scientists to trigger and manage their ML workflows.
• Terraformed the entire network infrastructure for the Data Platform.
• Introduced accelerated compute using GPUs for model training and serving.
• Migrated batch processing from disparate compute to a Kubernetes-based environment. Ensured backward compatibility with existing APIs and interfaces. Provided a computational environment that scales up and down on demand and saves costs.
• Implemented adaptive resource allocation for Kubernetes pods. The allocation model is based on probabilistic matching of prior executions. This feature resulted in about a 45% reduction in computing cost for batch job execution.
• Implemented a data quality framework. The framework integrates into the Spark writers and does automated trend analysis before persisting the datasets.
• Implemented an application that periodically provides AWS credentials on developer machines, tied to SAML authentication. This improved security and compliance around data and infrastructure access.
• Designed and implemented a multitiered data backend for storing and serving near real-time user interactions. The implementation replaced an in-memory solution, which resulted in a 10x reduction in infrastructure costs while keeping the latency nominal.
• Implemented an artifact caching and serving layer that feeds the batch pipelines, with a p99 latency under 100 ms and high availability.
• Mentored other engineers on design, code quality, and best practices.
2017 — 2019
San Francisco Bay Area
• Built a stable and automated data pipeline on Google Cloud by migrating Kafka, implementing Spark jobs, and creating the underlying scheduling, which reduced operational spend and increased data throughput up to 2 GB/sec.
• Implemented a manifest-based workflow and taxonomy for Apache Airflow by creating pre-defined templates and introducing continuous integration, reducing code entropy and eliminating manual components.
• Developed a data anonymization and deletion framework to ensure adherence to federal and GDPR privacy guidelines.
• Implemented Spark jobs for product analytics, ETL, schema extraction, and enrichment on an ongoing basis to provide data for product and machine learning workflows as well as business leaders.
• Coached a team of data engineers as the tech lead, from interviewing and onboarding new hires to providing ongoing mentoring to set them up for success.
• Created an automated data monitoring framework allowing for proactive triaging and improving data availability.
• Led a project to implement a TensorFlow-based ML pipeline for valuation of promotional ad units.
2017 — 2017
San Francisco Bay Area
• Built the first version of Unity’s consolidated data platform.
• Created a geo-distributed data ingestion pipeline, schema inference, and an ETL pipeline that fed into a SQL query layer. Collaborated with other business units and onboarded their datasets to ensure alignment across diverse business and technology requirements.
• Streamlined business intelligence, machine learning, and data engineering workflows to eliminate fragmentation and improve data consumption and production feasibility.
2016 — 2017
San Francisco Bay Area
• As the manager of 5 backend engineers, migrated homegrown infrastructure to Kubernetes, growing capacity to support growth from 13M to 150M daily users.
• Coached and developed individuals, from job training through performance management, and hired 3 new team members.
Education
The University of Texas at Austin