•Aligned data solutions with business strategy by partnering with the Data Science team & key stakeholders, running workshops for 25 participants and driving the creation of 80 use cases.
•Helped build data platforms for the Data Science & Analytics teams using big data tools & frameworks such as PySpark, Azure Data Factory & Azure Databricks, improving data flow efficiency by 33%.
•Designed and deployed scalable ETL workflows in Azure Data Factory, ingesting data from sources including Amazon S3 & Azure Blob Storage into Azure Data Lake Storage Gen2 (ADLS Gen2), delivering 100% efficiency.
•Managed and monitored Azure Data Factory (ADF) data pipelines, maintaining a 99% uptime rate and reliable system performance.
•Improved the data quality check framework and led optimization of Spark applications, saving 28 man-hours per activity through data partitioning and caching.
•Conducted data analysis with Spark SQL & the PySpark DataFrame API, delivering key insights and clear reports to stakeholders with 100% consistency.
•Documented data pipelines and data models and conducted training sessions on them, sharing expertise with team members and stakeholders.