Project: Identification of Peruvian languages
•Compile texts in multiple Peruvian languages from the Internet
•Automate text extraction
•Develop Python scripts to clean text
•Prepare the dataset from the cleaned and labeled texts
•Develop, train, and compare various machine learning models and a recurrent neural network
•Develop a web application to present the results
•Write and present two papers at the SIMBig conference
Project: Classifier committee for gene expression data
•Develop a new machine learning algorithm based on classifier committees
•Automate hyperparameter testing of the developed algorithm on various public datasets
•Compare the algorithm's results against those of the Random Forest algorithm
•Write and present a paper at an IEEE BIBM conference