New York City Metropolitan Area
Machine Learning Linear & Forest Regression—R, MLR, XGBoost, ggplot2
• Designed and implemented a model of compound concentrations in a body of water. The forest model achieved 500% better average predictive performance than previously published prediction methods, without overfitting.
• Multi-threaded and partitioned modeling tasks to make full use of available computing resources. The parallel implementation delivered 19 times better performance in generating and computing mathematical models than sequential execution.
• Organized and evaluated the major contributing dataset factors with the client to integrate the modeling software within their project requirements. Communicated technical requirements, both simple and complex, back to the client to keep the project progressing.
• Designed and implemented a relational PostgreSQL database in second normal form (2NF) to normalize processed data and capture relationships between independent dataset features, further improving model performance metrics.