I product-manage the team that makes Elicit give accurate and vettable answers to users' research questions
•Set the roadmap for ML features and eval. I talk to key customers to learn how we need to improve Elicit's answers, then translate what I learn into specs for the ML team and into eval designs
•Started and lead the 2-person eval team at Elicit. The team builds evals for every Elicit feature, so that an ML engineer can run a quick command to learn whether the change they just made was an improvement, and can have justified confidence in that answer
•Led our work on data extraction from its inception: worked very closely with key users to build a minimum viable product (MVP) that fit their needs, then product-managed the extension of that MVP. Data extraction is now the core feature of Elicit and we make constant improvements, e.g. https://email.elicit.com/deliveries/dgTd9ggDANJo0WgBjbfL0NY-iZqxBk4O86BA
•Designed the tasks and evals for our paper on answering questions about papers using iterated decomposition: https://arxiv.org/pdf/2301.01751
•Designed and created the dataset for a high-fidelity approach to evaluating Elicit's search, which allowed us to improve precision by around 78% (see https://email.elicit.com/deliveries/dgTd9ggDAKHze6DzewGQc_h8U7zEMbsdsvUZ6pw=)
(When Ought spun off Elicit, I moved along with the entire team from Ought to Elicit. That happened in summer 2023, but my role transitioned earlier, so I've used the role transition date here)