Worked on a distributed analytics engine that monitors and analyzes the network. The engine runs automatically on unutilized compute resources, removing the need for a separate monitoring server. One of the first engineers on a product team of 6. Practiced test-driven development in a CI/CD pipeline.
•
Designed and developed the backend systems, consisting of Node.js, Redis, MySQL, and Tableau
•
Implemented APIs to enable a Dockerized data transfer service from each node back to our servers
•
Applied probabilistic data structures to reduce data transfer size by 5x
•
Overhauled the data processing pipeline, moving it from the proprietary software Flowforce to a functional model in Node.js (reduced data processing time by 10x)
•
Operated backend servers with VMware vSphere and FreeNAS
Worked on an automated data analysis framework and automation suite. The goal of the suite is to determine the root cause of an issue through automatic analysis of large volumes of collected data.
•
Identified a need to implement a historical record for all analytical tools
•
Installed and managed a Hadoop cluster using Cloudera Manager
•
Designed a database that meets the analytics requirements, using Hive as the framework
•
Modified existing tools in Perl and Python to extract relevant information
•
Developed a user-facing tool in Python to handle all database interactions