Software engineer with deep expertise in data platforms, cloud infrastructure, and streaming systems, driving large-scale platform modernization and cost-efficient architecture.
On-Prem Druid to SaaS Migration: Spearheaded the migration of a real-time analytics platform to Imply SaaS, covering infrastructure provisioning, codebase adaptation, observability setup, and stakeholder engagement, with zero downtime and smooth user onboarding.
•
Schema Evolution Governance: Drove a cross-org initiative to modernize schema evolution with governance guardrails. Led system design and stakeholder alignment, and mentored engineers on schema registry integration and evolution patterns.
•
Sub-Second Latency Redesign: Architected and deployed low-latency streaming solutions using Apache Flink, Kafka (MSK), and Druid, reducing average query time from 9s to 0.3s (a 96% reduction) and cutting costs by 98% through Druid-native optimization.
•
Streaming Framework Modernization: Rebuilt the core data ingestion platform using Apache Flink, Kafka (MSK), EKS, and S3, deprecating Oracle and enabling open table formats (Iceberg) on Snowflake for cross-platform interoperability. The pattern enabled other services to use Snowflake as a backend datastore and saved ~$540k in Oracle licensing and ~$360k in annual operational costs.
•
Databricks Adoption: Led the evaluation and POC of Databricks vs. EMR for batch workloads. Designed the migration plan and migrated critical EMR pipelines. Converted 90% of raw tables from Parquet to the interoperable Delta Lake format, saving ~$350k/year.
Responsible for front-end web application development and back-end database and services implementation for an IoT device analytics platform that monitors system health and reports KPIs.
•
Created responsive, user-centric UI components using React/Redux. Fetched data and rendered interactive maps with the Google Maps API. Built data visualizations with Ant Design and D3.js. Handled the user authentication flow with AWS Cognito.
•
Developed back-end services with Python and a serverless framework, including RESTful APIs with API Gateway. Automated data processing with Lambda functions and S3.
•
Automated the deployment flow with AWS CloudFormation and CodePipeline.
•
Managed and tuned the graph database, significantly improving query speed.
•
Practiced Agile and Scrum in 2-week sprints with 12 other developers.
Developed and deployed AWS data pipelines from S3 to Athena and QuickSight using Glue and Lambda, enabling performance analysis on the dashboard.
•
Predicted bus travel times from location-based data using time-series and random forest models in Spark, and integrated the machine learning algorithms with the data pipeline using AWS EMR.
•
Automated the troubleshooting process by mining device logs using SQL with AWS Glue, Athena, and CloudWatch, reducing troubleshooting cycle time by 80%.