# Ssanidhya Barraptay

> AI Engineer | LLMs, Agentic AI, RAG | Data Science & MLOps | Azure OpenAI, Databricks

Location: Philadelphia, Pennsylvania, United States
Profile: https://flows.cv/ssanidhya

Data Scientist and AI Engineer with 6+ years of experience building production-grade AI, machine learning, and Generative AI solutions across pharma, consulting, transportation, and technology domains. I specialize in transforming complex data into scalable, business-ready intelligence using Python, SQL, cloud platforms, and modern AI architectures.

Currently working as an AI Engineer at GlaxoSmithKline, where I design and deploy agentic AI systems, LLM-powered applications, and RAG pipelines using Azure OpenAI, Databricks, MLflow, and Azure AI Search. My work enables non-technical stakeholders to interact with enterprise data through natural language, reducing decision cycles from hours to minutes while maintaining GxP compliance and auditability.

Previously, I’ve delivered high-impact data science and analytics solutions at organizations including TTX Company, Ernst & Young, and Airbnb, ranging from predictive modeling and deep learning systems to large-scale data engineering pipelines processing millions of records weekly. I’ve led initiatives in NLP, A/B testing, recommender systems, computer vision, and MLOps, consistently driving measurable efficiency gains and revenue impact.

Technically, I bring strong expertise in:
• Machine Learning & Deep Learning (CNNs, Transformers, LSTMs, model evaluation)
• Generative AI & LLMs (RAG, LangChain, prompt engineering, agent frameworks)
• Cloud & Big Data (Azure, AWS, GCP, Databricks, Spark, Delta Lake)
• MLOps & Deployment (Docker, CI/CD, MLflow, monitoring concepts)
• Data Visualization & Storytelling (Power BI, Tableau)

Beyond hands-on development, I enjoy mentoring junior engineers, collaborating cross-functionally, and translating ambiguous business problems into clear, data-driven AI solutions.
📌 Open to opportunities in: Senior Data Scientist | AI Engineer | Machine Learning Engineer | Applied Scientist | GenAI Engineer | Data Engineer (AI-focused)

Let’s connect if you’re building intelligent, scalable systems or looking for someone who can bridge advanced AI with real-world business impact.

## Work Experience

### Artificial Intelligence Engineer @ GSK
Jan 2025 – Present | Philadelphia, Pennsylvania, United States

• Pioneered a Text-to-SQL chatbot leveraging LLMs, RAG, FAISS, and Azure OpenAI, enabling pharma R&D teams to query complex databases in natural language and receive summarized insights, reducing query turnaround time from hours to seconds.
• Leading deployment of the chatbot in React & JavaScript with a team of three developers, driving 75% adoption of the POC within 4 months across the R&D department.
• Drove 40% increase in query execution accuracy while improving scalability of SQL workloads.
• Designed a multi-agent architecture with data dictionaries and semantic layer, improving SQL accuracy, scalability, and execution speed across enterprise data platforms (MS-SQL, Azure, Databricks).
• Implemented Model Context Protocol (MCP) to orchestrate multiple intelligent agents, enabling context-aware routing, dynamic tool selection, and significantly faster SQL generation by modularizing responsibilities across specialized table-level and query-level agents.

### Data Science Analyst @ TTX Company
Jan 2023 – Jan 2025 | Charlotte, North Carolina, United States

• Built an AI-powered Streamlit app using Azure OpenAI, LangChain, RAG, and embeddings to help non-technical teams upload, manage, and query fleet-specific PDFs via natural language - cutting manual lookup time by 70%.
• Contributed to agentic AI frameworks using LLMs to automate fleet-specific query responses, improving issue resolution speed and decision quality.
• Designed and deployed an AI-driven customer support automation system using OpenAI, FAISS, and Azure ML to handle ticket categorization, prioritization, and response generation - reducing manual handling by 40%, cutting response time by 35%, and boosting satisfaction by 25%.
• Developed a hybrid log classification system (Regex + Sentence Transformers + Logistic Regression + LLMs) via FastAPI - improving accuracy by 40% and reducing operational costs by 30%.
• Created 8 Power BI dashboards for major U.S. railroads supporting 7 business units, analyzing 20M+ records with Spark + Python - accelerating data-driven decisions by 60%.
• Automated 80% of railcar data pipelines using Python, Airflow, and PyODBC - streamlining integration and tracking.
• Implemented ARIMA and Isolation Forest models for anomaly detection, delivering real-time alerts to enhance data accuracy.
• Modernized legacy ETL processes with Python, SQL, and Hive - integrating S3 + Oracle data and saving 15 analyst weeks annually.
• Built automation workflows using Power Automate, Power Apps, and Synapse to refresh 15+ recurring MS-SQL reports - saving 50+ analyst hours monthly.
• Designed a scalable Azure analytics pipeline (Data Lake + ADF + Synapse + Power BI) to automate fleet reporting and visualization, reducing reporting time by 60%.

### Graduate Teaching Assistant @ Illinois Institute of Technology
Jan 2024 – Jan 2024 | Chicago, Illinois, United States

• Supported 50 students in ITMD 513, enhancing their understanding of Python programming.
• Facilitated engaging lab sessions to reinforce programming concepts, boosting student participation.
• Designed and graded assignments and exams, aligning with course objectives to accurately assess learning.
### Data Scientist @ EY
Jan 2022 – Jan 2022

• As an Associate Consultant (Data Scientist) in EY's Technology Consulting practice, I worked on a team of problem solvers with extensive consulting and industry experience, assisting clients and consultants across diverse industries worldwide in solving complex business issues from strategy to execution.
• Extracted data from social media sources such as Twitter, Tumblr, Facebook, YouTube, and forums using APIs and web scraping tools.
• Performed sentiment analysis on structured and unstructured datasets using Natural Language Processing, Machine Learning, and Python. Created analytical reports based on the data and presented them to stakeholders.
• Worked as a Deep Learning Consultant for a multinational steel manufacturing company, detecting and localizing defects in 13,000+ steel images covering 4 defect types to improve manufacturing quality and reduce waste from production defects.
• Integrated datasets received from multiple vendors using SQL and Excel, deriving valuable insights from them.
• Handled large volumes of data in SQL, and debugged and troubleshot issues using SQL queries.
• Created dashboards using Power BI, Excel, and Aera. Cleaned, analyzed, and validated the data using Excel and Aera. Performed statistical analysis of the data and created technical reports at the end of each task.
• Served as the team's senior data analyst, guiding junior analysts in their day-to-day tasks.
• Provided data-driven analytical solutions to companies across different sectors and worked closely with clients worldwide.
### Data Analyst II @ EY
Jan 2020 – Jan 2022

• Created a full-fledged web application using Microsoft Power Apps to collect and submit user information, with an automated Microsoft Power Automate flow that directed the collected information to the authorized person.
• Provided Information Management solutions, including Data Cleaning, Data Aggregation, and Quality Assessment using Microsoft Excel & MS-SQL.
• Worked for a multinational pharmaceutical company on Material Master Data Management projects, migrating an acquired company's data into SAP.
• Created 500+ UINs using SAP Atlas & MDG for 28+ countries while achieving multiple successful go-lives in the process.

### Jr. Data Scientist @ Airbnb
Jan 2019 – Jan 2020 | India

• Devised automated recommendation models using Python, Scikit-learn, and AWS S3, analyzing over 6 million booking records to enhance personalized host-guest matching and improve conversion across regional listings.
• Formulated large-scale A/B testing frameworks with SQL and PySpark to evaluate pricing strategy effectiveness, reducing average booking decision time by 12 hours across 4 key markets through data-backed insights.
• Engineered ETL pipelines in Databricks and Apache Airflow to automate data ingestion from multiple APIs and user event logs, cutting manual data processing by 110 hours monthly while ensuring pipeline reliability and compliance.
• Visualized complex behavioral metrics using Tableau and Power BI, delivering 10+ interactive dashboards that monitored host satisfaction and demand fluctuation across 25 global regions, demonstrating strong data storytelling.
• Collaborated with cross-functional analytics and design teams to implement NLP-based review sentiment models, processing 2.8 million guest reviews and identifying top drivers of user retention and platform trust.
• Piloted an internal analytics enablement initiative by introducing SQL query optimization and data validation best practices for 5 junior analysts, accelerating model deployment workflows by 40 hours per sprint through effective knowledge sharing.

### Machine Learning Researcher @ Pianalytix
Jan 2019 – Jan 2019 | Remote

• Studied and worked in depth on various Machine Learning & Deep Learning models to understand the underlying techniques and algorithms and to reduce the time complexity of running these models.
• As an intern, my day-to-day tasks included creating original content on Data Science, Deep Learning & Machine Learning topics.

### Data Science & Business Analytics Intern @ The Sparks Foundation
Jan 2019 – Jan 2019 | Remote

• Collected and analyzed data from various sources available on the internet.
• Designed various Machine Learning models using Python and scikit-learn.
• Created linkages between various datasets within business intelligence software to enable predictive modelling and trend analysis.

### Data Science & Machine Learning Intern @ CodeSpyder Technologies Private Limited
Jan 2018 – Jan 2019 | Pune, Maharashtra, India

• Studied and compared the efficiency of Machine Learning algorithms such as kNN and logistic regression on various datasets, including House Price and Credit Card datasets.
• Implemented character recognition in live videos using a convolutional neural network & multilayer perceptron trained on the EMNIST dataset.
• In my final project, I implemented a stock price prediction model using LSTM.

## Education

### Master's degree in Information Technology
Illinois Institute of Technology

## Contact & Social

- LinkedIn: https://linkedin.com/in/ssanidhyabarraptay24

---

Source: https://flows.cv/ssanidhya
JSON Resume: https://flows.cv/ssanidhya/resume.json
Last updated: 2026-04-18