Hyderabad, Telangana, India
• Built ETL pipelines and scripts in SQL, Hive, and PySpark to process 200M+ medical claim records; standardized data across payer systems for ML modeling and reduced manual cohort creation work by 35%.
• Engineered end-to-end ETL workflows using PySpark, SQL, and Airflow to process large-scale healthcare data, ensuring HIPAA compliance and improving data pipeline reliability by 30%.
• Developed Tableau and Power BI dashboards to track provider anomalies, claims denial rates, and member risk scoring.
• Collaborated with actuaries, clinicians, and engineers on forecasting models for utilization, pricing, and fraud detection; implemented KPI monitoring and A/B testing frameworks.