๐๐
๐ฝ๐ฒ๐ฟ๐ถ๐บ๐ฒ๐ป๐๐ฎ๐๐ถ๐ผ๐ป ๐ฃ๐น๐ฎ๐๐ณ๐ผ๐ฟ๐บ: Designed and implemented a high-performance platform handling over 3 billion API requests daily. It delivers real-time insights with a 95th percentile execution time of 3 milliseconds, enabling rapid iteration and optimization. Features include user segment and cluster creation for targeted experimentation, enhancing performance across diverse user groups. This platform fosters collaboration with transparent, actionable data. Contributed to DRUID by adding a parquet extension, a P-Value aggregator, and fixing bugs.
๐๐ถ๐บ๐ฒ๐น ๐๐ฎ๐๐ฎ ๐๐ฃ๐: As a core engineer, developed and deployed GIMEL, an Apache Spark-based big data framework. GIMEL provides an efficient abstraction layer for data access across storage systems and integrates with Jupyter notebooks for interactive exploration. My contributions, including the GIMEL Thrift server, enable robust, secure data access and streamline the data science workflow through rapid prototyping and visualization.
๐จ๐ป๐ถ๐ณ๐ถ๐ฒ๐ฑ ๐๐ฎ๐๐ฎ ๐๐ฎ๐๐ฎ๐น๐ผ๐ด: Developed PayPalโs Unified Data Catalog, a metadata system for governing data and AI assets across files, tables, models, and dashboards in both on-premises and cloud environments. As a founding engineer, I was crucial in designing and implementing this comprehensive system.
๐ก๐ฒ๐
๐๐๐ฒ๐ป ๐๐ฎ๐๐ฎ ๐ ๐ผ๐๐ฒ๐บ๐ฒ๐ป๐ ๐ฃ๐น๐ฎ๐๐ณ๐ผ๐ฟ๐บ: Designed a scalable, autonomous data movement platform with schema registry and automated TSDB monitoring. Technologies used include Apache Gobblin, Airflow, Aerospike, Oracle, and Spark.
๐๐ถ๐ฟ๐๐ ๐ฃ๐ฎ๐ฟ๐๐ ๐ง๐ฟ๐ฎ๐ฐ๐ธ๐ถ๐ป๐ด ๐๐ป๐ณ๐ฟ๐ฎ๐๐๐ฟ๐๐ฐ๐๐๐ฟ๐ฒ (๐๐ฃ๐ง๐): Led the development of an internal client and server-side event tracking platform. This platform includes APIs and libraries for efficient data capture, processing, and real-time report generation. It is essential for tracking web traffic and KPIs for various PayPal products and marketing campaigns.