Data Scientist and AI Engineer with 6+ years of experience building production-grade AI, machine learning, and Generative AI solutions across pharma, consulting, transportation, and technology domains.
Experience
2025 — Now
Philadelphia, Pennsylvania, United States
• Pioneered a Text-to-SQL chatbot leveraging LLMs, RAG, FAISS, and Azure OpenAI, enabling pharma R&D teams to query complex databases in natural language and receive summarized insights, reducing query turnaround time from hours to seconds.
• Leading deployment of the chatbot in React & JavaScript with a team of three developers, driving 75% adoption of the POC within 4 months across the R&D department.
• Drove 40% increase in query execution accuracy while improving scalability of SQL workloads.
• Designed a multi-agent architecture with data dictionaries and a semantic layer, improving SQL accuracy, scalability, and execution speed across enterprise data platforms (MS-SQL, Azure, Databricks).
• Implemented Model Context Protocol (MCP) to orchestrate multiple intelligent agents, enabling context-aware routing, dynamic tool selection, and significantly faster SQL generation by modularizing responsibilities across specialized table-level and query-level agents.
2023 — 2025
Charlotte, North Carolina, United States
• Built an AI-powered Streamlit app using Azure OpenAI, LangChain, RAG, and embeddings to help non-technical teams upload, manage, and query fleet-specific PDFs via natural language - cutting manual lookup time by 70%.
• Contributed to agentic AI frameworks using LLMs to automate fleet-specific query responses, improving issue resolution speed and decision quality.
• Designed and deployed an AI-driven customer support automation system using OpenAI, FAISS, and Azure ML to handle ticket categorization, prioritization, and response generation - reducing manual handling by 40%, cutting response time by 35%, and boosting satisfaction by 25%.
• Developed a hybrid log classification system (Regex + Sentence Transformers + Logistic Regression + LLMs) via FastAPI - improving accuracy by 40% and reducing operational costs by 30%.
• Created 8 Power BI dashboards for major U.S. railroads supporting 7 business units, analyzing 20M+ records with Spark + Python - accelerating data-driven decisions by 60%.
• Automated 80% of railcar data pipelines using Python, Airflow, and PyODBC - streamlining integration and tracking.
• Implemented ARIMA and Isolation Forest models for anomaly detection, delivering real-time alerts to enhance data accuracy.
• Modernized legacy ETL processes with Python, SQL, and Hive - integrating S3 + Oracle data and saving 15 analyst weeks annually.
• Built automation workflows using Power Automate, Power Apps, and Synapse to refresh 15+ recurring MS-SQL reports - saving 50+ analyst hours monthly.
• Designed a scalable Azure analytics pipeline (Data Lake + ADF + Synapse + Power BI) to automate fleet reporting and visualization, reducing reporting time by 60%.
2024 — 2024
Chicago, Illinois, United States
• Supported 50 students in ITMD 513, enhancing their understanding of Python programming.
• Facilitated engaging lab sessions to reinforce programming concepts, boosting student participation.
• Designed and graded assignments and exams, aligning with course objectives to accurately assess learning.
2022 — 2022
• As an Associate Consultant (Data Scientist) in EY's Technology Consulting practice, worked within a team of experienced consultants, helping clients across diverse industries worldwide solve complex business issues from strategy to execution.
• Extracted data from social media sources such as Twitter, Tumblr, Facebook, YouTube, and forums using APIs & web scraping tools.
• Performed sentiment analysis on structured & unstructured datasets using Natural Language Processing, Machine Learning & Python. Created analytical reports based on the data and presented them to stakeholders.
• Worked as a Deep Learning Consultant for a multinational steel manufacturing company, detecting and localizing 4 types of defects across 13,000+ steel images to improve manufacturing quality and reduce waste from production defects.
• Received datasets from multiple vendors, and integrated the datasets using SQL & Excel, deriving valuable insights from them.
• Handled large volumes of data using SQL, and debugged & troubleshot data issues with SQL queries.
• Created dashboards using Power BI, Excel & Aera. Cleaned, analyzed & validated the data using Excel and Aera. Performed statistical analysis of the data & created technical reports at the end of each task.
• Served as the team's senior data analyst, guiding junior analysts in their day-to-day tasks.
• Provided data-driven analytical solutions for companies across different sectors & worked closely with clients all over the world.
2020 — 2022
• Created a full-fledged web application using Microsoft Power Apps to collect & submit user information, with an automated Microsoft Power Automate flow that routed the collected information to the authorized person.
• Provided Information Management solutions, including Data Cleaning, Data Aggregation, and Quality Assessment using Microsoft Excel & MS-SQL.
• Worked for a multinational pharmaceutical company on Material Master Data Management projects to migrate an acquired company's data into SAP.
• Created 500+ UINs using SAP Atlas & MDG for 28+ countries, achieving multiple successful go-lives in the process.
Education
Illinois Institute of Technology