•Built and maintained 100+ automated data pipelines to provide accurate and timely
business data used for loan underwriting and KYB practices. (Node.js/Puppeteer scrapers,
Apache Beam, Scio, Google Dataflow, Airflow orchestration, BigQuery, Elasticsearch)
•Led migration from Apache Beam/Dataflow to dbt for SQL-based transformations to
improve developer velocity and testing (dbt, Airflow, BigQuery, Elementary)
•Created an integrated data platform to orchestrate pipelines through business events for
improved latency, native monitoring and reporting, and built-in tooling for Ops to facilitate
data transfers (React, Ruby, Python, Parquet, Google BigQuery, Google Pub/Sub)
•Refactored existing auditing tools and added new ones so the internal team can review
business documentation quickly and accurately (React, Ruby)