# Swapna M

> Staff Software Engineer at Apple

Location: Sunnyvale, California, United States
Profile: https://flows.cv/swapnam

I’m a Principal Software Engineer with over 15 years of experience designing and scaling data-intensive systems and ML platforms at Apple and Yahoo. My expertise lies in building resilient, high-throughput streaming platforms, real-time data lakes, and enabling ML workflows at scale.

At Apple, I work on the Apache Flink platform team, building infrastructure to operationalize Flink and Iceberg jobs while helping teams optimize and tune workloads for large-scale operation. I lead efforts within the Trust & Safety Data Lake and Governance team, designing scalable infrastructure to enable secure, governed, and efficient access to sensitive data. I work closely with cross-functional teams to build real-time ingestion pipelines using Flink and Iceberg, support ML experimentation through JupyterLab integration, and ensure robust data governance and compliance practices.

Previously at Yahoo, I spearheaded petabyte-scale HBase migrations, developed Flurry Push for personalized mobile engagement, and built scalable image search systems integrating Getty content—powering millions of users and contributing to significant revenue growth.

I thrive at the intersection of platform engineering, data systems, and ML infrastructure, consistently focusing on scalability, usability, and impact.

## Work Experience
### Staff Software Engineer @ Apple
Jan 2022 – Present | California, United States
Currently enhancing the Flink SQL ecosystem to enable analysts and ML engineers to easily build and operate streaming pipelines, integrating Kafka and REST catalogs for greater flexibility and expressiveness. Working on improving observability and data lineage tools to strengthen job traceability and debugging. Actively coordinating and assisting in tuning and optimizing large-scale Flink + Iceberg jobs to improve performance and scalability.


Built scalable ML platforms and tools to accelerate anti-fraud development in the Trust and Safety team. Designed real-time Flink-based sequence models to detect account takeover patterns in iMessage, optimizing Flink + RocksDB state (~5TB) and creating reusable Flink SQL frameworks for streaming aggregates. Set up JupyterLab environments to improve experimentation and streamlined data access from Snowflake, HDFS, and S3. Led Spark batch job migration from Hive to Iceberg format, achieving faster runtimes through predicate pushdown and optimized partitioning and storage layouts.

### Lead Software Engineer @ Apple
Jan 2021 – Jan 2022
Led the design and implementation of a centralized, secure data lake for fraud and abuse detection across Apple services. Transitioned legacy Spark pipelines to Flink for near real-time ingestion from 300+ Kafka topics, adopting Apache Iceberg for robust schema evolution, ACID guarantees, and performance gains. Developed self-serve schema evolution tools, Py4J-based cross-language APIs, and schema caching, achieving ~40% compute savings and driving broad adoption across Trust & Safety and ML teams.

### Principal Software Engineer @ Yahoo
Jan 2019 – Jan 2020
Responsible for migrating all data from privately managed Flurry HBase clusters to Yahoo’s centralized, multi-tenant HBase platform to optimize infrastructure costs and reduce operational overhead. The migration involved transferring 2.5 petabytes of data without impacting live services. To ensure service continuity, the backend was enhanced to support parallel writes to both source and destination clusters. Internal migration tools were upgraded to handle secure cluster transitions, while performance was optimized using HBase snapshots and bulk load techniques. The effort included close collaboration with cross-functional teams to coordinate service dependencies, security, and migration schedules.

### Senior Software Engineer @ Yahoo
Jan 2015 – Jan 2019 | United States
Flurry Push enables app developers to send targeted messages to re-engage and retain users. Designed and implemented the evaluator component, which includes a scheduler, applies delivery limits across campaigns, and guarantees one-time delivery per campaign. Enhanced HBase server-side row filtering (HBASE-20618), reducing disk I/O when evaluating users by timezone.

Automated the ingestion pipeline that processes the Yahoo Knowledge Graph data packs with a 2-day SLA. Integrated SIFT keypoint-based image deduplication to eliminate duplicates across providers. Improved relevancy using partner-based signals and query category.

### Lead Software Engineer @ Yahoo
Jan 2012 – Jan 2015 | Bengaluru Area, India
Involved in various aspects of the Image Search content processing infrastructure. Implemented a real-time processing pipeline based on Storm and HBase to ingest partner content with a 15-minute SLA.

Resurrected the batch-processing pipeline on Hadoop to recreate the web crawl index for image search.

### Software Engineer @ Yahoo
Jan 2010 – Jan 2012 | Bengaluru Area, India
Designed and implemented the serving infrastructure for slotting Getty and Yahoo-owned media content above Bing results on Image Search. The work involved offline analysis of signals from various systems to determine query intent, relevancy, and content availability. Improved user engagement by tuning the search index's native ranking in coordination with the research team.

Enhanced the image search slideshow feature by adding more content providers and integrating the automated slideshow generation pipeline from the research team.

### Senior Service Engineer @ Yahoo
Jan 2008 – Jan 2010 | Bengaluru Area, India
Actively involved in different aspects of the Bing transition, in particular benchmarking, BCP planning, and continuous monitoring, in close coordination with external teams.

Completely reworked the pipeline that pulls data from different streams (mainly crawlers, editorial databases, and the feedback loop system) into the Feeder systems.

Enhanced monitoring and self-recovery of all multimedia search components by adding application-level monitoring to Nagios and Uranus graphs, which reduced the incident rate.

### Intern @ NetApp
Jan 2007 – Jan 2008
Involved in developing a performance log analyzer for automated troubleshooting by applying data mining techniques to performance logs of NetApp filers.


## Education
### B.E. (Hons) Mechanical Engineering, M.Sc. (Hons) Information Systems
Birla Institute of Technology and Science, Pilani


## Contact & Social
- LinkedIn: https://linkedin.com/in/swapna-m-54603ba

---
Source: https://flows.cv/swapnam
JSON Resume: https://flows.cv/swapnam/resume.json
Last updated: 2026-04-12