Experience
2019 — Now
San Francisco, California, United States
Recognized for ML innovation: Architected a Python-based model inference framework that won the Engineering Innovation Award from Indeed CEO Chris Hyams, demonstrating the ability to lead high-impact compliance and governance programs in a fast-paced environment.
Implemented CIS Controls to secure data science initiatives and keep them compliant.
Led cross-functional collaboration between Legal, Security, and AWS architects to design a privacy-compliant system for processing and storing PII and user demographic data at scale, navigating complex privacy laws, corporate policies, and risk mitigation strategies. The system became the foundation of the AI Ethics data science team's analysis work.
2016 — 2019
San Francisco, California, United States
Real-Time Data Infrastructure: Architected scalable data ingestion pipeline processing billions of events from millions of concurrent users, leveraging Apache Kafka, Spark Structured Streaming, and AWS auto-scaling to achieve sub-second latency
ML-Driven Ad Optimization: Redesigned Unity Ads auction algorithms, reducing computational complexity and incorporating machine learning models for bid optimization and real-time ad delivery decisions
A/B Testing Framework: Built experimentation platform for ad auction improvements, enabling data-driven decision making through statistical analysis of revenue impact, user engagement, and algorithm performance metrics
2014 — 2016
San Bruno, California, United States
Anomaly Detection System: Developed ML-based anomaly detection features for YouTube using causal impact analysis (https://goo.gl/7VjJme), identifying abusive content and spam through statistical modeling of user behavior patterns and content metrics and significantly reducing spam
Scalable Pattern Recognition: Built real-time detection algorithms processing millions of videos daily, leveraging statistical methods to identify irregular patterns in view counts, engagement metrics, and user interactions
SOX Compliance Framework: Architected compliance testing infrastructure for ad revenue systems using Java and PostgreSQL, implementing automated validation through Jenkins CI/CD pipelines to ensure financial transaction accuracy
Revenue Integrity: Developed real-time revenue tracking and validation system, creating integration tests that verified millions of daily ad transactions while maintaining SOX compliance requirements for financial reporting
2012 — 2014
San Francisco Bay Area
Large-Scale Data Pipeline: Engineered distributed crawling and indexing ecosystem processing 100M+ app reviews across 1M+ applications from Google Play, Apple App Store, Amazon, Microsoft, and Nook platforms, achieving 99.9% reliability
NLP & Sentiment Analysis: Developed natural language processing algorithms to extract user sentiment, automatically detect software defects from review text, and transform unstructured feedback into actionable product insights
Scalable Architecture: Built high-throughput data collection infrastructure using distributed systems principles, handling real-time ingestion and processing while maintaining data quality and deduplication across multiple app stores
Analytics Platform: Created analytics dashboard that synthesized review data into business intelligence, enabling product teams to identify critical bugs, feature requests, and user satisfaction trends across competitive landscapes
2010 — 2012
San Francisco Bay Area
WebDriver Architecture: Implemented the foundational WebDriver API for Chromium, designing REST API interfaces and extending Chrome's IPC system to enable automated browser testing, now used by millions of developers worldwide
Privacy-Compliant Data Pipeline: Built scalable crawling infrastructure for user post data and search histories, implementing comprehensive security controls, privacy safeguards, and audit trails while maintaining high throughput
Personalized Search Algorithm: Integrated Twitter feeds and external data sources to enhance search relevance, developing a Chrome extension that dynamically reordered results based on user interests and social signals
Distributed Testing Infrastructure: Architected fully automated testing harness on EC2 for clustered machine tests, incorporating advanced image comparison algorithms and fuzzy template matching integrated with Ocean book scanning OCR for 40+ languages
Education
University of Michigan