Software developer experienced in building and maintaining reliable, scalable systems, with diverse experience both in the US and overseas.
2024 — Now
New York, New York, United States
Built infrastructure and tooling powering our entire offline analytics platform, enabling product and data science teams to analyze petabytes of sensor data from millions of devices. Primarily worked with Spark / Databricks, AWS services (Database Migration Service, Aurora, Glue, Step Functions, DynamoDB), Dagster, and Terraform.
ETL and Data Validation
Overhauled and led efforts to build end-to-end validation tooling for the replication of proprietary time-series data, improving company confidence in replicated data.
Redesigned the replication path for Aurora RDS databases, cutting costs in half while improving maintainability and correctness. As part of this work, became an active member of the internal database working group, helping with critical efforts such as the company-wide Aurora 1 -> 2 upgrade. Contributed upstream to Spark Protobuf to improve Protobuf parsing in Spark 3.4.0+.
Data Pipelines + Precomputed Reports
Maintained our in-house Data Pipelines framework, built on top of AWS Step Functions and Spark, which serves terabyte-scale reports to customers.
Cost Savings
Led multiple efforts to save the company upwards of $3 million a year by:
Using spot instances, and diversifying across machine type and AZ to reduce spot interruptions.
Optimizing queries and the platform (Delta vacuum jobs, table optimizations, etc.) to reduce waste and total costs.
Improving S3 lifecycle and tiering to reduce the cost of petabytes of storage.
Improving Databricks compute utilization.
Also improved cost monitoring to prevent future regressions.
Table Access Controls & Compliance
Leading efforts to implement granular table access controls to improve privacy and least-privilege posture and to increase auditability for a growing internal user base.
Includes leading an effort to federate access controls to data owners and to improve controls on both product and internal enterprise data.
San Francisco, California, United States
Led a full-stack product team and executed cross-functional full-stack work across various verticals on the Samsara platform. Scale was a key concern, with millions of data points consumed every day, as was product flexibility, since our enterprise customer base had a broad set of needs across almost every industry. Technologies used include Kinesis, MySQL, Go, GraphQL, TypeScript, React, and Spark.
Gateway Health
Helped lead a months-long cross-functional effort to build and deploy a new report that gave customers insight into the health of their fleets and gave our internal teams the tools necessary to debug those issues.
used hardware data ingested from hundreds of thousands of devices and indexed results for efficient access
built offline computation logic in Spark providing a platform for broad analytical capabilities into health information
complex, multi-stage rollout with stakeholders both in and out of the organization
Reporting and Data Analytics
Built, optimized, and maintained full stack reports using a variety of internal reporting systems, notably a Spark cluster enabling us to query near-petabyte-scale data.
vehicle activity and usage to help clients identify low usage vehicles
time spent at certain sites
colocation reporting for field service teams
an analysis of inconsistencies in ingested odometer data to help inform firmware diagnostics efforts
Data Management Collaborations
Served as the full-stack representative for a cross-functional group formed to deploy a system that catalogs and describes data in our data warehouse for easier consumption by internal teams.
2019 — 2020
San Francisco, California, United States
2017 — 2019
Fukuoka, Japan
Built backend and full-stack services supporting LINE stickers and LINE Wallet, used by over a hundred million users around the world.
LINE Custom Stickers
Built the core sticker rendering for a product that allowed users to add custom text to sticker sets created by designers. Modeled the sticker text as SVGs and built an SVG rendering pipeline using Java Graphics, integrating with a CDN, a Redis cache, MongoDB, and a Spring Boot application. Was responsible for the entire stack, including server allocation, load balancing and scaling, and release monitoring.
Additionally, built a partner-facing CMS for creating and modifying these sticker sets using Kotlin, Spring Boot, MongoDB, and Custom Elements.
LINE Wallet
Helped build, monitor, and maintain the LINE Wallet tab, which interfaced with a suite of financial products built by LINE. Maintained and improved Redis caching for data from downstream services, as well as a rate-limited messaging API that consolidated all financial messages from LINE into a single channel.
Sped up internal testing by rebuilding the internal CMS using Kotlin, Vue.js, and TypeScript.
SRE
Overhauled and consolidated multiple application alerting pipelines into a single pipeline using Prometheus. Also rebuilt our batch job monitoring and analytics.
Education
2010 — 2014
The University of Texas at Austin
B.S