•Built a classification model to identify relevant search results for determining copyright infringement.
•Achieved accuracy and recall gains of over 80% compared to the existing system, resulting in a significant reduction in manual review effort for misclassified records.
•Structured the training data collection pipeline and oversaw the annotation process.
•Owned end-to-end deployment, from analysis and feature engineering through testing; implemented a production-ready service for model training and prediction; visualized and communicated results with a focus on reproducibility.