San Jose, California, United States
Built an end-to-end pipeline from scratch that transfers all query plans from Databricks, EMR, and an internal Apache Spark dependency into a Neo4j graph database to improve data discovery
Added a Scala Spark listener to an internal Spark dependency to send all of its data to the Marquez database
Built a Dockerized Python library, with documentation and CI/CD unit and integration tests, that transfers OpenLineage data from Marquez into Neo4j using Cypher queries
Developed Neo4j Cypher queries to identify and reduce duplicate work across all TTD query plans