Livermore, California, United States
• Applied tree-based learning algorithms, including decision trees and ensemble methods (bagging, random forests, AdaBoost), to achieve a 30% reduction in root-mean-squared error (RMSE), enabling groundwater management authorities to predict water sample ages more affordably and mitigate the impact of contaminated water consumption on the Central Valley region’s population
• Analyzed feature importances by evaluating the mean decrease in impurity for each learner, improving the interpretability of the machine learning model and enabling non-technical domain experts to prioritize feature extraction for more cost-effective analysis of the chemical composition of groundwater samples
• Examined predictor variable interactions by generating deciled Partial Dependence Plots (PDPs) through marginalization over complement features, enabling hydrologists to discern the relationship between each feature and variations in water sample age in the Central Valley region