Experience
2022 — Now
San Francisco Bay Area
2020 — 2022
2015 — 2020
Engineer on Uber's Data Platform team working on query engines, focusing mainly on query planning, pushdowns, and latency optimizations.
Neutrino:
Neutrino is Uber's trimmed-down version of Presto for querying realtime databases at sub-second latencies through aggressive pushdown of joins and aggregations. Worked on support for the AresDB connector in Neutrino. Added sort and top-N pushdowns to the Presto Pinot connector.
Presto Hudi namenode RPC optimizations:
Reduced excessive namenode calls and made filesystem listing and access more efficient for Hudi tables. Redesigned split calculation as a lightweight filter on top of the regular file listings done in Presto, using Hudi’s APIs directly and Presto query planning code paths.
Faster Hudi Hive Incremental queries:
Reduced planning overhead and improved query latency for Hudi Hive incremental queries, unlocking 10-30x more efficient warehouse ETLs. This involved rewriting the Hudi/Hive integration from the ground up, specifically focusing on speeding up incremental queries by leveraging Hudi commit data.
Marketplace data systems:
Early engineer on the Gairos team building real-time data/ML pipelines powering pricing/surge services. Designed and implemented a lambda architecture using Apache Spark (batch/streaming) and Apache Samza (streaming), transforming real-time event streams and materializing them into updatable OLAP systems (primarily Elasticsearch) to serve real-time data/features to critical Marketplace services.
2014 — 2015
Mountain View, CA
Engineer on Voldemort (LinkedIn's NoSQL key-value store, inspired by Amazon's Dynamo)
• Designed and implemented storage and Kafka consumption layers for Venice - the new derived data platform on Voldemort.
• Improved bandwidth utilization by adding support for compression in the Voldemort read-only pipeline
• Led migration of existing Voldemort clients to a REST-based coordinator
• Added new admin client APIs and tools to manage Voldemort stores on the Coordinator service
• Designed and implemented a RocksDB storage engine for Voldemort
• Introduced separation of network and storage handling using async storage worker threads
• Reduced queuing delays via better request scheduling in the server
• Used Netty to RESTify the server
• Leveraged client-side request timestamps to provide a Java-GC-aware measure of client-server latencies
• Made handling of the communication buffer on the server more efficient
• Added admission control techniques to enable the server to respond to overloads more gracefully
2010 — 2011
Watchdog Infrastructure: Developed a 'Watchdog' service to monitor and manage Enterprise computational tasks. Developed the basic resource scheduler to spawn remote processes on cluster machines and monitor them continuously for failures.
Dependency Aware Scheduling: Enhanced the basic infrastructure so that inter-task dependencies and task blackout times can be specified in a configuration file. The scheduling algorithm processes these dependencies and schedules tasks only when their dependencies are satisfied. Additionally, to improve resource utilization, a dependency can target a particular checkpoint in another task, enabling the dependent task to start in parallel with the original task.
Fault tolerance: The scheduler handles failures based on policies specified in the configuration file, either restarting all dependent tasks or rerunning only select checkpoints in each dependent task.
Banking Applications: Worked on development of applications supporting the SWIFT protocol, such as IBR (Internal Branch Routing) and OFAC Tools, for Enterprise Financial Messaging.
Education
University of Colorado Boulder
Master of Science
San Francisco State University
Master of Science
Madras Institute of Technology, Anna University