Software Engineer focusing on big data, with TypeScript, Python, and AWS experience. Always on the lookout for a good puzzle. I walk the line between backend and infrastructure, with experience building multi-terabyte data lakes and streaming event systems.
Experience
Greater Seattle Area
Team lead for the XM Directory Analytics Team. Owner of two terabyte-scale ETL pipelines, with a track record of delivering features on time and reducing operational overhead. Some of our larger projects include:
Transaction Enrichment Based Segments
Designed enhancements to our query language to support a new resource type
Implemented new filters in our query API for the new resource type
Built new ETL pipeline for new resource type
Delivered the project on time with no major bugs reaching production
Contact History Ingestion Optimization
Reduced peak ingestion lag from ~5 hours to ~5 minutes
Eliminated the source of half our customer-reported bugs
Worked cross-org to add extra context in events, eliminating ~25% of calls made by our service
Improved directory cache lookup hit rate from 5% to ~90%
Improved contactId cache lookup hit rate from ~2% to 70%
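The cache-hit-rate work above follows a common pattern: a lookup cache in front of a backing service that tracks its own hits and misses so the hit rate can be measured and tuned. A minimal sketch (class and fetch callback are illustrative, not the production code):

```python
# Minimal sketch of a lookup cache that measures its own hit rate.
# LookupCache and the fetch callback are illustrative names; the real
# directory/contactId caches are service-specific.
from typing import Any, Callable, Dict


class LookupCache:
    def __init__(self, fetch: Callable[[str], Any]):
        self._fetch = fetch                # fallback to the backing service
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._fetch(key)           # one call to the backing service
        self._store[key] = value
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Measuring the rate this way is what makes "5% to ~90%" style improvements verifiable in production.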
Contact Centric Dashboards
Built onboarding UI for contacts datasets
Mentored back-end project lead on parallel data loading (chunking) enhancement
Worked cross-org with the dashboards team to ensure our datasets are displayed correctly in their UI
Measured and tuned alert noise generated by the new system, reducing alert fatigue
2019 — 2022
Greater Seattle Area
Back-end engineer focusing on big data platforms, with TypeScript, Scala, and AWS experience
Survey Based Segments
Wrote design doc, organized design review, negotiated scope with PMs, created tasks and timeline
Designed legacy-question-id to fieldset-field-id conversion, enabling ~50 services to query survey data
Standardized the legacy question id conversion process with the data platform and data lake teams
Implemented legacy id conversion in the segments query engine using Node.js and TypeScript
Added survey question support to the segments query API using Node.js and TypeScript
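At its core, the conversion above is a mapping from legacy question ids to canonical fieldset field ids. A minimal sketch (the mapping table and id formats are hypothetical; the real conversion lives in the segments query engine):

```python
# Minimal sketch of legacy-question-id to fieldset-field-id conversion.
# The table contents and id formats are illustrative only.
from typing import Dict, Optional

LEGACY_TO_FIELDSET: Dict[str, str] = {
    "QID1": "fieldset/survey-1/field-101",
    "QID2": "fieldset/survey-1/field-102",
}


def convert(legacy_id: str) -> Optional[str]:
    # Return the canonical fieldset field id, or None if unmapped,
    # so callers can distinguish "unknown question" from a bad query.
    return LEGACY_TO_FIELDSET.get(legacy_id)
```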
Query Optimization
Created a query scale-testing tool with cache-defeating randomization and statistical sampling in Python
Implemented a batching query scheduler in TypeScript, increasing query throughput 10x in production
Scale-tested segmentation at 2,000 queries/hr on 1M-10M row datasets, unblocking large customers
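The batching scheduler idea above is simple: queued queries are grouped into fixed-size batches so the engine makes one batched round trip instead of N individual ones. A minimal sketch in Python (the production scheduler is in TypeScript; names and batch size are illustrative):

```python
# Minimal sketch of a batching query scheduler: group pending queries
# into fixed-size batches and execute each batch in one round trip.
# run_batched and execute_batch are illustrative names.
from typing import Callable, List, Sequence


def run_batched(queries: Sequence[str],
                execute_batch: Callable[[List[str]], List[str]],
                batch_size: int = 10) -> List[str]:
    results: List[str] = []
    for i in range(0, len(queries), batch_size):
        batch = list(queries[i:i + batch_size])
        results.extend(execute_batch(batch))  # one round trip per batch
    return results
```

Throughput improves because per-request overhead (scheduling, connection setup, query planning) is amortized across the whole batch.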
Qualtrics Data Lake
Designed a file-based ingestion API using S3 presigned URLs, file upload events, and SQS
Implemented the file upload API in Scala with the Play Framework, used to upload 100 GB/hr into Parquet files
Developed an Avro schema store with a Scala API backed by S3, storing schemas for ~100k datasets
Implemented reschematization with SQS FIFO queues and Nomad batch jobs to regenerate Parquet files
Created and implemented a disaster recovery plan; added backups with automatic restore for data and services
Ran quarterly disaster recovery tests, including automated recovery from a simulated full data-center failure
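The schema store above is essentially a keyed object store mapping each dataset id to its Avro schema (which is itself JSON). A minimal sketch, with an in-memory dict standing in for the S3 bucket (the real store is a Scala API backed by S3; key layout here is hypothetical):

```python
# Minimal sketch of a schema store keyed by dataset id. A dict stands
# in for the S3 bucket; schemas are stored as JSON strings, since Avro
# schemas are declared in JSON. Key layout is illustrative.
import json
from typing import Dict, Optional


class SchemaStore:
    def __init__(self) -> None:
        self._bucket: Dict[str, str] = {}  # stand-in for an S3 bucket

    def put(self, dataset_id: str, schema: dict) -> None:
        # One object key per dataset, mirroring an S3 layout.
        self._bucket[f"schemas/{dataset_id}.avsc"] = json.dumps(schema)

    def get(self, dataset_id: str) -> Optional[dict]:
        raw = self._bucket.get(f"schemas/{dataset_id}.avsc")
        return json.loads(raw) if raw is not None else None
```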
Chicago, IL
Designed and implemented fast rollbacks, reducing downtime from bad code by fifteen minutes (40%)
Planned and implemented Auth0 integration into our SaaS product, reducing security risk
Implemented an authorization framework for the SaaS organization hierarchy using JWTs, Python, and JavaScript
Designed and implemented an authorization framework eliminating the risk of forgetting auth on new endpoints
Implemented a CI/CD pipeline used by 40 developers to deploy 8 times a day with zero downtime
Implemented async database clients for managing organizations and users using Python and asyncio
Organized and ran a bi-weekly game night with co-workers
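The "can't forget auth on new endpoints" guarantee usually comes from a default-deny dispatcher: every handler must opt in with an explicit policy, so an endpoint with no policy is rejected rather than silently open. A minimal, framework-agnostic sketch (all names are hypothetical, not the production framework):

```python
# Minimal sketch of default-deny endpoint authorization. Handlers must
# register a policy via @requires; dispatch rejects any handler with
# no policy, so a forgotten decorator fails closed. Names are
# illustrative, not the production framework.
from typing import Callable, Dict

_policies: Dict[str, Callable[[dict], bool]] = {}


def requires(policy: Callable[[dict], bool]):
    def register(handler: Callable):
        _policies[handler.__name__] = policy
        return handler
    return register


def dispatch(handler: Callable, user: dict) -> int:
    policy = _policies.get(handler.__name__)
    if policy is None or not policy(user):
        return 403  # default deny: no policy means no access
    return handler(user)


@requires(lambda user: "admin" in user.get("roles", []))
def delete_org(user: dict) -> int:
    return 200


def new_endpoint(user: dict) -> int:  # author forgot to attach a policy
    return 200
```

Failing closed turns a missing decorator from a security hole into an immediately visible 403 during development.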
Education
2014 — 2017
University of Michigan
Bachelor's degree