• Built an end-to-end risk detection pipeline over 30-day programmatic ad logs to identify potentially manipulative MFA sites, validating BigQuery joins, aggregating events to the hostname level, and combining behavioral signals with website embeddings for scalable failure detection
• Labeled 50k+ hostnames with a high-value-device heuristic, engineered burstiness and session features, and used statistical tests to isolate robust indicators of suspicious traffic behavior and reduce noise in downstream risk scoring
• Trained an interpretable XGBoost classifier achieving ROC-AUC 0.81, then used SHAP and threshold calibration to convert model outputs into analyst-facing triage decisions for scalable risk review and mitigation prioritization