Designed and implemented a prototype algorithm to extract brand names from product names, along with algorithms to classify product mentions in text selections; both were then implemented in the production application. They helped cut processing time and eliminate false positives.
Emphasized the importance of ground truth for measuring model quality, and used MTurk to generate ground truth for testing each algorithm and model improvement iteration.
Constructed a tweet-to-music-metadata matching algorithm. It combined music stop words and word-importance weights derived from the music corpus, a tweet parser that extracted and grouped important tweet words, and a relatively small number of parameters, so that it did not require a great deal of ground truth to train. It eliminated almost all false positives while generating only a small number of false negatives.
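
For illustration only, a matcher along these lines might be sketched as follows. This is not the original implementation: it assumes IDF-style word weights, treats any word that appears in a large fraction of metadata entries as a music stop word, and omits the grouping of tweet words. The function names and the two parameters (`stop_fraction`, `threshold`) are hypothetical.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def build_weights(corpus, stop_fraction=0.5):
    """Learn stop words and word importance from the music metadata corpus.

    A word found in at least `stop_fraction` of the entries is treated as a
    stop word (assumption); every other word gets an IDF-style weight.
    """
    doc_freq = Counter()
    for entry in corpus:
        doc_freq.update(set(tokenize(entry)))
    n = len(corpus)
    stop_words = {w for w, df in doc_freq.items() if df / n >= stop_fraction}
    idf = {w: math.log(n / df) for w, df in doc_freq.items() if w not in stop_words}
    return idf, stop_words


def match_score(tweet, entry, idf, stop_words):
    """Fraction of the entry's word importance that also appears in the tweet."""
    tweet_words = set(tokenize(tweet)) - stop_words
    entry_words = set(tokenize(entry)) - stop_words
    total = sum(idf.get(w, 0.0) for w in entry_words)
    if total == 0:
        return 0.0
    shared = sum(idf.get(w, 0.0) for w in entry_words & tweet_words)
    return shared / total


def best_match(tweet, corpus, idf, stop_words, threshold=0.8):
    """Return the best-scoring entry, or None if nothing clears the threshold.

    A high threshold favors few false positives at the cost of some misses.
    """
    scored = [(match_score(tweet, e, idf, stop_words), e) for e in corpus]
    if not scored:
        return None
    score, entry = max(scored)
    return entry if score >= threshold else None
```

With only two tunable parameters, a sketch like this can be calibrated against a small labeled set: raise `threshold` until false positives disappear, then check how many true matches are lost.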