Senior Data Engineer | Python, Scala, Spark, Airflow, Databricks, AWS & Snowflake Expert | Passionate about efficient data-driven solutions
Accomplished Senior Data Engineer with 6+ years of experience designing and developing efficient, scalable data solutions. Proven track record of meeting business and technical requirements and exceeding client expectations.
• Led the architecture of key data quality and monitoring frameworks for internal and external data sources and infrastructure, including Spark, Glue, MySQL, Snowflake, Airflow, Tableau, and third-party vendors
• Mentored engineers and members of cross-functional teams by teaching technical concepts and offering guidance
• Designed the technical interview process and interviewed candidates for data-related engineering roles
• Supported analytics for Aura’s suite of identity protection and security products, which helped transform it into a unicorn company
• Developed robust end-to-end batch and real-time streaming pipelines, orchestrated by Airflow, that processed tens of terabytes of data daily using Python, SQL, Scala, Spark, Databricks, and Snowflake
• Optimized Docker image layers for containers orchestrated by Kubernetes and integrated with CircleCI
• Fine-tuned SQL queries, Spark jobs, and clusters to improve performance and reduce overall costs
• Implemented robust metadata synchronization and data transfer between Snowflake and Databricks
• Migrated from AWS-only infrastructure to Databricks on AWS, from Maven to Gradle, from Ansible to Terraform, and from Zeppelin to Databricks notebooks, and integrated Delta Lake
• Developed tools in Python and SQL for tracking table-level data lineage and schema evolution, which were key to simplifying the pipelines and the data model