• Leading efforts to nativize Spark workloads using Velox for a 2x speedup and 25% cost reduction.
• Led the operational excellence charter, with key projects including automated troubleshooting, history server reliability, build migration of UDFs, release pipeline stabilization, and error actionability.
• Led the security charter, addressing 10+ vulnerabilities and successfully onboarding the high-risk Novi (cryptocurrency) data pipelines.
• Drove a 2-year migration of Spark workloads to a trusted gateway by implementing token-based authentication, RPC traffic encryption, data-at-rest encryption, executor isolation, user code separation, and shared blob bucket security. See my talk “Securing Apache Spark Applications at Facebook” at Spark + AI Summit 2020.
• Resolved complex dependency hell (500+ dependencies with 200k overlapping classes) in a forked Spark repo, reducing build times from 50 minutes to 30 seconds for 30+ devs.