Proven architect with 15+ years of engineering experience in the design and development of large-scale applications across data warehousing, big data, and BI environments.
Experience
2021 — Now
San Jose, California, United States
Working on the Platform team, I am responsible for guiding various modernization teams on data store architecture, database design, and process reviews, ensuring business requirements are implemented correctly and without scalability or portability issues.
As a Staff Software Engineer at CDK Global, I lead data modernization and platform engineering initiatives across multiple high-impact projects. I have driven RDS scalability and HA/DR enhancements, implemented partition reorganization and purge/archive frameworks, and optimized SQL and ETL workflows for modernization schemas.
I provide back-end data support for critical programs including Unify and Embedded Reporting, and have delivered tools such as SQL performance analyzers, cross-platform migration utilities, and automation scripts to improve reliability, observability, and cost efficiency.
Partnering with cross-functional teams, I ensure data design best practices, migration strategies, and monitoring frameworks are in place, delivering scalable, performant, and future-ready solutions that enhance both team productivity and customer experience.
2020 — 2021
Worked in Big Data Engineering, focusing on the migration of legacy AppWorx/Lassen-based datasets. This effort involved analyzing and retrofitting around 50 datasets for the LMS team, transforming them into new, efficient denormalized tables and files.
The LMS datasets are built on a homegrown UMP/UDP framework and use technologies such as Hive SQL, Spark SQL, Pig, and Scala, running on Apache Spark. As part of streamlining the BDE SOT tables, I developed custom UDFs using the organization's standard UDF framework to improve processing efficiency and maintainability.
In addition, I converted approximately 50 Pig data flows to Spark SQL, significantly improving performance, scalability, and alignment with modern big data practices.
2019 — 2020
Sunnyvale, California, United States
Led the design and development of the new PruDB2DL process, introducing enhanced logging, tracking, and monitoring for better system observability and performance.
To close key process gaps, I built several automation tools, including a Log Purge and Archive utility, a Job Monitor for long-running jobs, a Basic Data Quality (DQ) process, and a Job Status Reporting solution.
As part of the SIA project, I automated 5 out of 8 NFS feeds using the in-house Fast Data Exchange (FDE) framework built on Scala and Spark, while improving existing code for reliability. I also delivered WSG Alert Feeds and developed a fully automated Reconciliation Feed for the RET team.
Additionally, I created a scalable mass-loading framework to ingest ~3,500 tables into the Data Lake for departments such as GI, and mentored several team members to help the team achieve delivery excellence.
2018 — 2018
San Francisco Bay Area
Worked on the Big Data Science team, responsible for delivering the Sparkle project. Sparkle provides customer metrics to various departments, such as marketing and business analysts.
• Designed and deployed 23–25 ETLs for the Data Science team.
• Provided the data required for models, set up the Model ETL, and uploaded scores.
• Developed a tool to generate DSMF definitions from Excel.
• Developed 15–17 DSMF definitions for the monitoring framework.
Technologies: Hive, Bash scripting, Python, Oozie
2015 — 2017
Santa Clara
The company provides a highly scalable, closed-loop restaurant marketing SaaS platform that ingests data from sources including email, SMS, social, online ordering, loyalty programs, and reservations. The analytics platform gives clients actionable insights about guests, menus, pricing, media mix, and social media.
• Delivered a multithreaded Java-based ETL scheduler, along with tools such as TDE (Transaction Data Extractor), which extracts internal system data into GA (Guest Analytics). Calculated distances between stores from their longitudes and latitudes using the Google API to identify neighboring stores.
• Resolved scaling issues from earlier versions by implementing Pig- and Oozie-based frameworks for new ETLs.
• Architected and delivered the latest version of GA with a new data model, including a highly scalable end-to-end ETL with features such as monitoring, SLA processing, and re-scheduling capabilities.
• Delivered a highly customizable, easy-to-use ad hoc reporting module for the sales team using VBA, later extended to generate QA reports.
• Developed several Pig utilities, including a Murmur hash UDF, a UDF to handle the Parquet int96 timestamp type, and a PigLoadDB function to read data directly from databases.
• Mentored development teams to achieve our project goals.
Gathered business requirements and architected scalable solutions. Provided support for Guest Analytics release activities. Performed cross-platform integration and gap analysis across our Promotion Manager, Campaign Manager, and GA.
Technologies: Hadoop, Hive, HBase, Drill, MonetDB, MySQL, MS SQL Server, Kylin, Oozie, Pig, Druid, Talend, Spark (as the backend engine), SVN, Git.
Education
Jawaharlal Nehru Technological University
Master's degree
Sri Krishnadevaraya University
Master's degree
Stanford University
Certificate
The Johns Hopkins University