The ML Signal/Feature Platform team owns critical data infrastructure and the central platform that accelerates and scales signal/feature development across the entire company.
I'm the founder, architect, and TL of a novel "micro-data platform" (internal codename "Galaxy"), which provides:
• annotation-based generic DataFlow Java API (DSL) with an extension for incremental processing
• managed signal registry
• state management and execution orchestration
• ownership model and automated monitoring/alerting
• fully automated generation, collection, and serving of metadata (lineage, data coverage and distribution, delay, etc.)
• strong governance, signal discovery, and signal life-stage dev flow
• rich tooling (command line & UI)
In addition, I serve as uber-TL, responsible for an umbrella of projects:
* User Signal Platform V1/V2: company-wide central platform for realtime user interest & intent understanding
* Discovery-indexing data pipelines and their migration to "Galaxy"
* Common infrastructure for realtime signals, driven by engagement events (Kafka consumer-based)