Led development of a platform that automates ETL pipeline applications: architected the framework, library, and API; implemented key processors such as deduplication and SQL join; built UI functionality; and integrated deployment workflows for Databricks, AWS Glue, and standalone Spark clusters. Designed event logging with InfluxDB, MariaDB, and Amazon SES to support dashboard views and notifications.
Authored a white paper, presented demonstrations to clients, created sample workflows using DataMorph, and recorded informational videos to showcase the platform.
Developed a proprietary ML framework, achieved 97% accuracy with an entity resolution model, and deployed the integrated solution in Scala.