Greater New York City Area
• Built a knowledge graph capturing the firm's internal and external relationships to extract business insights; built data pipelines that combine multiple sources of structured and unstructured data into one heterogeneous graph with millions of vertices and billions of edges
• Built a distributed, scalable solution in the Hadoop ecosystem for serving graph data, reducing the response time of each batch API call by more than 80% on average by pre-processing milestone data and caching it in sequence files
• Constructed an indexing and generic filtering framework in the Hadoop ecosystem that enables querying the graph by attributes and secondary keys, reducing the number of calls by 50% compared with querying by primary keys only
• Led and implemented data governance standards for the team's graph generation and analytics pipelines, streamlining processes and improving data integrity, and helping comply with consent orders from different government organizations