•Used PySpark & SQL to pull data sets of >10 million rows from an Amazon Redshift relational database
•Performed data analysis using pandas, Python & SQL to extract business insights related to email activity
•Worked with Marketing & Consumer Analytics teams to explore how email cadence, email content & consumer demographics influence email activity; visualized findings using matplotlib & seaborn
•Built a logistic regression model & tuned features to predict consumer email activity with >88% accuracy
•Used model findings to propose a new email campaign strategy, tailored to individual consumer needs, that increases overall email open rate from 4.3% to 5.4%