I’m a Principal Software Engineer with over 15 years of experience designing and scaling data-intensive systems and ML platforms at Apple and Yahoo. My expertise lies in building resilient, high-throughput streaming platforms and real-time data lakes, and in enabling ML workflows at scale.

Experience

Apple, Staff Software Engineer

2022 — Now

California, United States

Currently enhancing the Flink SQL ecosystem to enable analysts and ML engineers to easily build and operate streaming pipelines, integrating Kafka and REST catalogs for greater flexibility and expressiveness. Working on improving observability and data lineage tools to strengthen job traceability and debugging. Actively coordinating and assisting in tuning and optimizing large-scale Flink + Iceberg jobs to improve performance and scalability.

Built scalable ML platforms and tools to accelerate anti-fraud development on the Trust and Safety team. Designed real-time Flink-based sequence models to detect account takeover patterns in iMessage, optimizing Flink + RocksDB state (~5 TB) and creating reusable Flink SQL frameworks for streaming aggregates. Set up JupyterLab environments to improve experimentation and streamlined data access from Snowflake, HDFS, and S3. Led the migration of Spark batch jobs from Hive to the Iceberg format, achieving faster runtimes through predicate pushdown and optimized partitioning and storage layouts.

Apple, Lead Software Engineer

2021 — 2022

Led the design and implementation of a centralized, secure data lake for fraud and abuse detection across Apple services. Transitioned legacy Spark pipelines to Flink for near real-time ingestion from 300+ Kafka topics, adopting Apache Iceberg for robust schema evolution, ACID guarantees, and performance gains. Developed self-serve schema evolution tools, Py4J-based cross-language APIs, and schema caching, achieving ~40% compute savings and driving broad adoption across Trust & Safety and ML teams.

Yahoo, Principal Software Engineer

2019 — 2020

Responsible for migrating all data from privately managed Flurry HBase clusters to Yahoo’s centralized, multi-tenant HBase platform to optimize infrastructure costs and reduce operational overhead. The migration involved transferring 2.5 petabytes of data without impacting live services. To ensure service continuity, the backend was enhanced to support parallel writes to both source and destination clusters. Internal migration tools were upgraded to handle secure cluster transitions, while performance was optimized using HBase snapshots and bulk load techniques. The effort included close collaboration with cross-functional teams to coordinate service dependencies, security, and migration schedules.

Yahoo, Senior Software Engineer

2015 — 2019

United States

Flurry Push enables app developers to send targeted messages to re-engage and retain users. Designed and implemented the evaluator component, which includes a scheduler, applies delivery limits across campaigns, and guarantees one-time delivery per campaign. Enhanced HBase server-side row filtering (HBASE-20618), reducing disk I/O when evaluating users by timezone.

Automated the ingestion pipeline that processes Yahoo Knowledge Graph data packs within a 2-day SLA. Integrated SIFT keypoint-based image de-duplication to eliminate duplicates across providers. Improved relevancy using partner-based signals and query category.

Yahoo, Lead Software Engineer

2012 — 2015

Bengaluru Area, India

Worked on various aspects of the Image Search content processing infrastructure. Implemented a real-time processing pipeline based on Storm and HBase to ingest partner content within a 15-minute SLA.

Resurrected the batch-processing pipeline on Hadoop to recreate the web crawl index for image search.

Education

Birla Institute of Technology and Science, Pilani

B.E. (Hons) Mechanical Engineering
