# Siyang Xie

> Software Engineer at Kumo.ai

Location: Mountain View, California, United States

Profile: https://flows.cv/siyang

6yr Google + 6yr Pinterest experience on various backend systems, including data processing frameworks, a micro-data (microservice-ish big data) platform, ML platform, ranking & recommendation infra, and product/application backend services.

At work, I review existing system and architectural designs, crystallize a holistic view, draw up strategic plans, and innovate on top of world-class engineering practices from Pinterest, Google and the open-source community. I seed, drive and grow projects of large scope: I start by rendering a vision into deliverables, then shift energy towards tech leadership such as strategic planning, communication and cross-org collaboration.

## Work Experience

### Software Engineer @ Kumo.AI

Jan 2022 – Present | Mountain View, California, United States

### Engineering Manager, Content Acquisition & Media Platform @ Pinterest

Jan 2018 – Jan 2022

* Media (video & image) platform: end-to-end media ingestion infra (uploading, transcoding, and delivery/serving); visual signal computation infra; media optimization.
* Content Acquisition/Ingestion: link scraping/crawling and parsing; link clustering, propagation & aggregation between pin links and images; high-quality localness and trustworthy signals; crawled rich pin creation/ingestion infra; localized canonical pin ranking, etc.

### Staff Software Engineer & Tech Lead, ML Data/Feature Platform @ Pinterest

Jan 2017 – Jan 2018 | San Francisco Bay Area

The ML Signal/Feature Platform team owns critical data infrastructure and the central platform that boosts and scales signal/feature development efforts across the entire company.
I'm the founder, architect and TL of a novel "micro-data platform" (internal codename "Galaxy"), which provides:

- annotation-based generic DataFlow Java API (DSL) with an extension for incremental processing
- managed signal registry
- state management and execution orchestration
- ownership model and automated monitoring/alerting
- fully automated metadata (lineage, data coverage & distribution, delay, etc.) generation, collection and serving
- strong governance, signal discovery, and signal life-stage dev flow
- rich tooling (command line & UI)

In addition, I serve as uber-TL responsible for an umbrella of projects:

* User Signal Platform V1/V2: company-wide central platform for realtime user interest & intent understanding
* Discovery-indexing data pipelines and their migration to "Galaxy"
* Engagement event-driven (Kafka consumer based) common infrastructure for realtime signals

### Senior Software Engineer, Content Platform @ Pinterest

Jan 2016 – Jan 2017 | San Francisco Bay Area

Built from scratch a unified user signal platform comprising a development framework, infrastructure, offline data access and online serving. I devised a novel data processing + serving lambda architecture that spreads incremental computation across batch, real-time and serving-time. The user signal platform provides user features/signals to power Homefeed Ranking, Ads Ranking and other candidate retrieval/generation systems at Pinterest.

Alongside the project, I introduced modern Dependency Injection technology to Pinterest and laid out a modularized framework for building processing/serving applications. I built a highly efficient real-time event processing framework/infra that processes xxx K/sec event qps with 25x higher efficiency than the old infra.

### Senior Software Engineer, Homefeed @ Pinterest

Jan 2016 – Jan 2016 | San Francisco Bay Area

Proposed and drove a framework/platform initiative for user modeling/feature engineering in the Homefeed recommendation system.
Built a unified user signal framework and service, fully integrated with the entire ML infra of Homefeed, including the model DSL, offline training/eval pipeline, and online scoring/serving path.

### Software Engineer @ Google

Jan 2010 – Jan 2016 | Kirkland, WA

"People/Contacts": Delivers a people-oriented Contacts experience for Google users. Fully owned the backend service that provides person resolution and deduplication suggestions to help users clean up their address books.

"Social/G+ Discovery": Part of the infra team that built an in-house large-scale streaming/incremental social affinity signal computing infrastructure, and an online scoring service at the core of Google's social graph backend that serves social affinity scores to clients such as Gmail (autocomplete), Hangout (chat roster ranking), Contacts (frequently contacted list), G+ (friend recommendation), etc.

"Crash": Optimized and re-architected the infrastructure for collecting, processing, and analyzing application crash reports for high efficiency, scalability, flexibility and reliability.

## Education

### M.S. in Computer Engineering

UC Irvine

### M.S. in Physics

UC Irvine

### B.S. in Physics

University of Science and Technology of China

## Contact & Social

- LinkedIn: https://linkedin.com/in/siyang-xie

---

Source: https://flows.cv/siyang

JSON Resume: https://flows.cv/siyang/resume.json

Last updated: 2026-04-10