•Worked on the Edifecs flagship product, building an ML training pipeline that generates recommendations from LLM embeddings fed into a classification layer that identifies relevant ICD10 labels
•Trained an ensemble of LLM and neural network models (BERT, RoBERTa, BGE, Llama3), using charts, sentences, and clinical concepts as features and ICD10 codes as labels, then moved these models to the inference layer to recommend ICD10 codes for new incoming patient charts
•Built a chatbot for product documentation queries, using Llama3 in a RAG architecture with a FAISS vector store
•In a follow-up iteration, replaced FAISS with ChromaDB for production
•Tested various prompts while taking the product documentation chatbot from concept to production
•Developed an application in Java/JSP and Oracle to capture NLP trigger terms, spell-checking terms, and regular expressions
•Created regular expressions, used by the NLP pipeline with spaCy/OpenNLP, to identify specific types of dates, telephone numbers, email addresses, and other entities
•Modified specialized NLP entity extractors, including those for dates and clinical terms
•Ran the end-to-end NLP pipeline (OCR, NLP, CRE, precision/recall calculations)
•Worked on an ANN-based ML model for recommending ICD10 codes from collections of clinical concepts found in patient charts, using Keras, Jupyter Notebook, and pandas