# Varshith rao

> Data Scientist | AI & GenAI Specialist | ML, NLP & Time-Series Expert | Building Scalable Production ML | Turning Data into Business Impact

Location: Austin, Texas Metropolitan Area, United States

Profile: https://flows.cv/varshithrao

With 4 years of experience across fintech, healthcare, and e-commerce, I specialize in designing end-to-end machine learning systems that solve high-stakes business problems. From detecting fraud in real time to forecasting customer behavior, my work sits at the intersection of AI, data engineering, and scalable MLOps.

💡 What I Bring to the Table:

- ⚡ Building real-time ML pipelines handling high-volume data with low latency
- 🧠 Applying advanced models like XGBoost, LSTM, Autoencoders, and Graph Neural Networks
- ☁️ Deploying scalable solutions using AWS SageMaker, Docker, Kubernetes
- 📊 Transforming raw data into insights with SQL, PySpark, and BI tools
- 🔍 Ensuring model transparency using SHAP, LIME

🔥 Impact Highlights:

- Reduced fraud detection latency by 30% while scaling systems to handle 50% more transactions
- Improved customer retention by 5% through predictive modeling
- Accelerated data pipelines by 40%, enabling faster business decisions

I thrive in environments where I can own problems end-to-end, collaborate cross-functionally, and turn ambiguity into scalable solutions. Whether it's engineering features from streaming data or deploying production-grade ML systems, I focus on what matters most: delivering measurable outcomes.

🎯 Certifications & Continuous Learning:

- 🏅 Databricks Certified Data Engineer Professional
- 🏅 Microsoft Certified: Fabric Data Engineer Associate
- 🏅 Data Engineering Foundations – Astronomer
- 🏅 Google Data Analytics Capstone
- 🏅 Python for Everybody – University of Michigan

🌍 I'm always open to opportunities where I can:

- Build high-impact ML systems
- Work on challenging, data-intensive problems
- Contribute to innovative, fast-moving teams

📩 Let's connect if you're working on something exciting in AI, ML, or Data Engineering, or if you just want to talk data.

## Work Experience

### Data Scientist @ S&P Global

Jan 2024 – Present | United States

- Led the development of end-to-end, real-time machine learning pipelines for fraud detection, ensuring high accuracy, low latency, and scalability in high-volume transaction environments.
- Engineered sophisticated features from diverse streaming data sources, including banking APIs, payment systems, user behavior signals, and device intelligence, to enhance model performance and significantly reduce false positives.
- Deployed and managed machine learning and deep learning models (XGBoost, LightGBM, Autoencoders, Isolation Forest, Graph Neural Networks) using AWS SageMaker, Lambda, and Fargate, with containerization (Docker) and orchestration (Kubernetes), reducing processing latency by 30% and scaling to handle 50% higher transaction loads.
- Consolidated and processed structured and cloud-based datasets from platforms such as SQL Server, PostgreSQL, BigQuery, and Snowflake to enable comprehensive fraud analysis and reporting.
- Implemented graph-based anomaly detection techniques to identify complex fraud patterns and uncover hidden relationships within transaction networks, improving risk detection capabilities.
- Partnered with cross-functional teams, including risk, compliance, and product, to align fraud detection systems with regulatory frameworks (GDPR, CCPA, SOX), contributing to a 10% improvement in audit outcomes.
- Developed interactive dashboards and visualization tools in Power BI and Tableau to provide real-time insights into fraud trends, geospatial risk patterns, and operational performance for leadership teams.
- Established automated model monitoring and retraining workflows, incorporating CI/CD pipelines, A/B testing, and explainability methods (SHAP, LIME) to maintain model performance, transparency, and regulatory compliance.

### Data Scientist @ Hindustan Computers Limited (HCL)

Jan 2021 – Jan 2022 | India

- Conducted in-depth analysis of historical e-commerce sales and user behavior data using Python and SQL, identifying seasonality patterns, customer trends, and key performance drivers across digital channels.
- Built and evaluated predictive models (ARIMA, Prophet, LSTM, XGBoost) to forecast 12-month sales and predict customer churn, contributing to a 5% decrease in churn and more reliable revenue projections.
- Designed and generated advanced features by integrating data from CRM systems, web analytics, advertising platforms, and inventory sources to improve customer segmentation and personalization, resulting in higher campaign effectiveness.
- Developed automated ETL pipelines in collaboration with engineering and marketing teams, consolidating data from Shopify, Magento, Salesforce, and Snowflake, reducing data processing time by 40% and accelerating reporting workflows.
- Leveraged PySpark, Pandas, and NumPy to process and analyze large-scale transactional datasets efficiently within a distributed cloud-based environment.
- Deployed ML models through Flask, FastAPI, and Docker, orchestrated with Kubernetes and AWS SageMaker, to streamline deployment pipelines.
- Created real-time dashboards in Power BI and Google Data Studio to monitor KPIs, sales, marketing, and supply chain performance, driving faster decision-making.
- Monitored model performance and system health using Prometheus, Grafana, and AWS CloudWatch, while maintaining CI/CD pipelines in Jenkins and GitHub Actions for continuous improvements.

### Data Scientist @ iView Labs Pvt. Ltd. (Software Development Company)

Jan 2020 – Jan 2021 | India

- Performed comprehensive analysis of high-volume healthcare datasets, including electronic health records (EHR), insurance claims, and patient activity data, leveraging Python and SQL to uncover patterns in disease trajectories, readmission risks, and treatment outcomes.
- Collaborated with clinicians and healthcare stakeholders to translate medical requirements into data science solutions, ensuring models aligned with real-world clinical workflows and decision-making processes.
- Built data pipelines to ingest and harmonize multi-source healthcare data (EHR, imaging metadata, lab systems) using Apache Spark and cloud-based architectures, improving data accessibility and consistency.
- Developed optimization models to improve hospital resource allocation (bed occupancy, staff scheduling), reducing patient wait times and enhancing operational efficiency.
- Delivered interactive dashboards and reporting tools using Tableau and Power BI to track patient outcomes, hospital KPIs, and model predictions, enabling actionable insights for both clinical and administrative teams.
- Created interactive dashboards in Power BI and Tableau to monitor key healthcare KPIs such as patient outcomes, hospital utilization, and treatment efficiency, supporting data-driven decision-making by clinicians and administrators.

## Education

### Master's degree in Business Analytics

The University of Texas at Dallas

### Bachelor of Technology - BTech in Electronics and Communications Engineering

Jawaharlal Nehru Technological University

## Contact & Social

- LinkedIn: https://linkedin.com/in/varshith-rao

---

Source: https://flows.cv/varshithrao

JSON Resume: https://flows.cv/varshithrao/resume.json

Last updated: 2026-04-16