Experienced Software Engineer with expertise in architecting and implementing real-time and batch data processing pipelines. Proven ability to design and scale data-intensive distributed applications and to enable analytics that drive business growth.

Experience

Apple, Software Engineer

2023 — Now

London, England, United Kingdom

Lentra, Lead Data Engineer

2022 — 2023

Singapore

Scaled the in-house Customer Profiling product to handle the data volume of the largest telecom company in the Philippines. The system now processes 600,000+ events per second in streaming mode and 25+ terabytes of data daily in batch mode, enabling comprehensive customer analysis and profiling at that scale.

Designed and led the implementation of Customer 360 for one of the first digital banks in the Philippines. Onboarded 100+ banking domain profile attributes for comprehensive customer analysis.

Collaborated with cross-functional teams to architect and implement robust, real-time data processing pipelines using AWS, Flink, and Kafka to consume banking transactions for one of the first digital banks in the Philippines.

Designed and created the roadmap for Customer Profiling V2, with support for Spark Structured Streaming, language-agnostic transformation, and containerized execution.

Led the design and roadmap creation of a data observability product for the internal platform, planned to feature a data catalog, data lineage, intelligent attribute onboarding hints, and platform monitoring. These capabilities aim to reduce the effort needed during product outages and improve overall data management efficiency.

Lentra, Data Engineer II

2019 — 2022

Singapore

Authored Cadenz Profiles, a dynamic customer profiling product with both batch and real-time capabilities that enables automated, intelligent marketing and service decisions. It raised the rate of onboarding profile attributes to 10 attributes per day per engineer.

Authored a set of transformation modules, data sources, and data sinks that plug into Cadenz Profiles and interact with various object stores, databases, and data stores, simplifying integration with different data storage systems.

Led the implementation of an Early Warning System for a major bank in India, built on our in-house Customer 360 product. The system flagged suspicious transactions and collected user feedback, resulting in a notable reduction in duplicate loans obtained by individuals through multiple subsidiaries.

Implemented a framework that mirrors Kinesis events to a Kerberized Kafka cluster, enriching them with additional metadata and ensuring exactly-once processing. This integrates the two systems while maintaining data integrity and reliability.

Supervised a team that reimplemented and revamped a 7-year-old analytics platform using Spark and AWS.

Developed a config-driven, generic framework on Apache Spark that supports customizable, pluggable SQL-based data transformations, making attribute onboarding faster.

Lentra, Data Engineer

2016 — 2019

Chennai, Tamil Nadu, India

Implemented an AWS-based Data Lake, using event-based EMR job creation and data processing on Spark. Added role-based security for table-level and column-level access control, ensuring data protection and privacy.

Collaborated with a team to build an intelligent decision support system that draws insights from historical trends in tenders, bids, competition, and other factors, to gain a competitive advantage in pursuing government and public sector opportunities. Designed the complete architecture of the Data Gathering stage, including the implementation of a crawling and parsing engine.

Focused on performance optimization of Hive queries on the Azure platform. Conducted comparative analysis between HDInsight offerings and Azure Data Lake Analytics, exploring different file formats and compression techniques for improved efficiency.

Collaborated with a team to add new features and tests to generic MapReduce-based ingestion and extraction frameworks, and Kerberized both frameworks to run on a Kerberized cluster.

Education

National Institute of Technology Nagaland

Bachelor of Technology - BTech