•Predictive Modeling: Built predictive models on datasets of approximately 1,000,000 rows and 10,000 columns stored in the Hadoop Distributed File System (HDFS). Used Python and PySpark to clean data, perform feature selection, and train models with several machine learning algorithms, including gradient-boosted trees (GBT), XGBoost, and random forest.
•Natural Language Processing: Built a call analysis tool that extracts every mention of a price from a sales conversation and ranks each price by how likely it is to be associated with a sales offer.
•Anomaly Detection: Created machine learning anomaly detection jobs in Kibana that learn patterns from historical data and flag deviations from those patterns. These jobs helped uncover issues with incoming data.