Experience
2021 — Now
San Francisco, California, United States
Data Platform
• Created Wellfound’s data platform, powering company-wide analytics and in-product data features.
• Operate large-scale data pipelines in Dagster and dbt, ingesting and modeling multi-source datasets into AWS Redshift and an AWS-based Data Lake.
• Expose and evolve real-time data APIs using FastAPI with Pydantic validation.
• Maintain and improve MLOps infrastructure: LLM Proxy layer with real-time logging, uptime, performance, and cost observability; integrated Braintrust for LLM evaluation.
• Scale and optimize HuggingFace ML classification and embedding pipelines, serving billions of predictions in production.
Candidate Sourcing Service
• Aggregate candidate and company data from diverse internal and external sources.
• Transform, normalize, and ML-enrich data (classification, tagging, semantic similarity).
• Index and serve 1B+ candidate profiles and millions of companies in Elasticsearch for high-performance, recruiter-facing search and matching.
Tech: AWS (S3, Redshift, DMS, Athena, Lambda, CloudWatch), Elasticsearch, Python, FastAPI, Pydantic, HuggingFace, dbt, Dagster, Datadog, SQL, LLM
2018 — 2021
San Francisco Bay Area
• Built and led Envoy’s Data Engineering function from the ground up, enabling self-serve analytics, machine learning, and experimentation across the company.
• Managed and expanded a cross-functional data team to 9; introduced engineering best practices (QA, testing, observability) to improve reliability and trust in analytics.
• Designed and implemented customer cohort feature adoption models in dbt + Looker, delivering actionable product insights.
• Developed a scalable event tracking ingestion pipeline (Segment → AWS Glue Data Lake → Redshift) with downstream data models powering analytics, A/B tests, and reporting.
• Delivered embedded, customer-facing analytics inside the Envoy product using AWS DMS, Django, Pandas, React, and Vega.
• Integrated Salesforce and Intercom with internal analytics pipelines, syncing millions of user attributes for sales and support.
Tech: AWS Glue, Redshift, Data Lake, dbt, Python, Django, React, Datadog, Segment, Amplitude, Looker
2016 — 2020
San Francisco Bay Area
After 10 years as a leader at 8tracks, I moved to an advisory position.
I contributed to data and ML initiatives until 8tracks was acquired, then led the technical transition to the new owners.
2006 — 2016
San Francisco Bay Area
• Co-founded 8tracks, scaling it into a beloved music discovery platform with 8M monthly active users and a vibrant, loyal community.
• Built and led a passionate, high-performing engineering, product, and design (EPD) team from 1 → 12, fostering creativity, collaboration, and a deep connection to our mission.
• Engineered a high-performance service, consistently maintaining sub-100ms API responses at scale.
• Pioneered data science-driven music recommendations using word2vec within months of its publication.
• Developed a highly scalable in-house product analytics and experimentation platform, processing 1B+ events/month to drive product decisions.
Tech: Ruby, Redis, Redshift, StatsD, D3.js, SQL, AWS, sklearn
2015 — 2016
San Francisco Bay Area
• Established real-time event observability with StatsD → Graphite/Grafana to monitor deliveries, inventory, and driver location.
• Integrated Domino Data Lab to enable data scientists to independently develop and deploy ML models.
• Consolidated data warehousing into Redshift and standardized LookML for consistent, reliable analytics.
• Built evaluation and visualization tools for monitoring delivery ETA prediction model performance.
Tech: StatsD, Graphite, Grafana, Redshift, Looker, Python, Domino Data Lab
Education
ESIEA - École d'Ingénieur·e·s d'un numérique utile