# Manasa Gunampalli

> Senior Data Scientist | Machine Learning, Generative AI, LLMs, RAG | NLP, Python, Spark | AWS, Azure, Databricks | Fraud, Retail & Healthcare Analytics

Location: United States
Profile: https://flows.cv/manasagunampalli

I’m a Senior Data Scientist with 6+ years of experience building and deploying machine learning, NLP, and Generative AI solutions across financial services, healthcare, and retail. I specialize in designing end-to-end AI systems, from large-scale data pipelines and feature engineering to production deployment of machine learning and LLM-powered applications that support real-world decision making.

Key Skills: Python | SQL | Spark | TensorFlow | PyTorch | BERT | LangChain | MLflow | Docker | Kubernetes | FastAPI | AWS | Azure

Machine Learning: Predictive Modeling, Fraud Detection, Credit Risk Modeling, Recommendation Systems, Time Series Forecasting, A/B Testing

NLP & LLMs: Retrieval-Augmented Generation (RAG), BERT Fine-Tuning, Semantic Search, Named Entity Recognition, Sentiment Analysis, Embeddings

Big Data & Data Platforms: Apache Spark, Databricks, ETL Pipelines, Data Lakehouse Architectures, Apache Kafka

MLOps & AI Systems: FastAPI, Docker, Kubernetes, MLflow, CI/CD Pipelines, Model Monitoring, Explainable AI (SHAP, LIME)

Experience Across Industries:

• KeyBank - Built an enterprise RAG-based AI system combining semantic retrieval and large language models to support fraud and compliance investigations. Designed vector search architectures using FAISS and Pinecone, and developed fraud detection and credit risk models that improved investigation efficiency.
• Mayo Clinic - Developed clinical machine learning models for patient readmission prediction and treatment response analysis. Built HIPAA-compliant NLP pipelines extracting insights from physician notes, and engineered healthcare data pipelines using Azure Databricks and PySpark.
• Macy’s - Built demand forecasting and recommendation systems that improved inventory planning and customer engagement. Developed scalable AWS and Databricks data pipelines and applied NLP techniques to analyze customer feedback and sentiment.

## Work Experience

### Senior Data Scientist @ KeyBank
Jan 2023 – Present | Cleveland, OH

• Designed an enterprise RAG platform combining semantic retrieval and LLM reasoning, accelerating fraud investigations by 40% for 50+ analysts.
• Architected scalable RAG pipelines using FAISS, Pinecone, and ChromaDB, enabling sub-second search across large financial document repositories.
• Developed agentic AI workflows using LangChain and LangGraph to automate multi-step reasoning for fraud policy analysis.
• Enhanced fraud narrative classification accuracy by 17-20% through BERT fine-tuning with LoRA and zero-shot classification techniques.
• Reduced fraud false positives by 18% by building ensemble risk models using XGBoost, Random Forest, and LightGBM.
• Implemented near-real-time credit risk scoring pipelines integrating transaction and bureau data.
• Deployed anomaly detection models using Isolation Forest to strengthen AML monitoring and detect suspicious activity.
• Automated large-scale ETL pipelines using Apache Spark, AWS Glue, and S3 to process multi-terabyte financial datasets.
• Operationalized ML services using FastAPI microservices, Docker containers, MLflow, and CI/CD pipelines.
• Integrated model monitoring and explainability frameworks using SHAP and LIME to support regulatory compliance.

### Data Scientist @ Mayo Clinic
Jan 2021 – Jan 2023 | Rochester, MN

• Created clinical prediction models for patient readmission and treatment outcomes, improving model performance by 14%.
• Engineered scalable healthcare data pipelines using Azure Data Factory, Databricks, and PySpark to process large EHR datasets.
• Constructed NLP pipelines extracting medical entities from physician notes, reducing manual chart review by 30%.
• Designed time series forecasting models (ARIMA, SARIMA) for hospital capacity planning.
• Built semantic search systems using FAISS to enable fast retrieval across medical knowledge bases.
• Deployed deep learning models using PyTorch on Azure Kubernetes Service, reducing inference latency by 25%.
• Streamlined ML lifecycle management using Azure Machine Learning for experiment tracking and deployment.

### Data Scientist @ Macy's
Jan 2019 – Jan 2020 | New York, United States

• Led development of a retail analytics platform integrating demand forecasting, inventory optimization, and personalization models.
• Improved demand forecast accuracy by 31% using XGBoost, LightGBM, and TensorFlow.
• Developed recommendation systems increasing customer engagement by 11-14%.
• Established scalable ETL workflows using AWS Glue, Apache Spark, Databricks, and Airflow.
• Applied NLP techniques to analyze customer sentiment and product reviews, reducing processing time by 35%.
• Delivered business insights through Tableau and Power BI dashboards for merchandising and supply chain teams.

## Education

### Master's Degree in Data Science
University at Buffalo

## Contact & Social

- LinkedIn: https://linkedin.com/in/manasareddy461

---

Source: https://flows.cv/manasagunampalli
JSON Resume: https://flows.cv/manasagunampalli/resume.json
Last updated: 2026-04-17