I designed and engineered Stats.Fan, a modern MLB analytics platform built on a proprietary data pipeline that makes 125+ years of baseball history fun and interactive through live dashboards, trivia, and player comparisons.
🔧 Highlights
Built a full-stack data infrastructure:
•Created a data lake with 20,000+ player HTML snapshots using R2 storage.
•Designed a refinery and scraping engine (using Cheerio) to normalize messy historical data.
•Structured a MongoDB-based data warehouse with clean, queryable player documents.
•Augmented player records with MLB Stats API data (career, season, and live stats).
•Deployed a custom CDN for player headshots tied to normalized player IDs.
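To illustrate the normalization step, here is a minimal sketch of how messy scraped rows might become a clean, queryable player document keyed by a normalized ID. The field names, the ID scheme, and the helpers `parseStat`, `toPlayerId`, and `toPlayerDocument` are all hypothetical; the actual Stats.Fan schema and refinery code are not shown in this summary.

```typescript
// Hypothetical shapes: the real Stats.Fan schema is not shown here.
interface RawSeasonRow {
  year: string; // may be a season like "1905" or a summary label like "Total"
  team: string;
  avg: string; // e.g. ".312", "", or "--"
  hits: string; // e.g. "1,234"
}

interface SeasonStats {
  year: number;
  team: string;
  avg: number | null;
  hits: number | null;
}

interface PlayerDocument {
  playerId: string;
  name: string;
  seasons: SeasonStats[];
}

// Turn a scraped cell into a number, treating blanks and dashes as missing.
function parseStat(cell: string): number | null {
  const cleaned = cell.replace(/,/g, "").trim();
  if (cleaned === "" || /^[-\u2013\u2014]+$/.test(cleaned)) return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

// Build a stable, URL-safe ID from a display name (an assumed scheme).
function toPlayerId(name: string): string {
  return name
    .normalize("NFD") // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Drop non-season rows, coerce types, and sort seasons chronologically.
function toPlayerDocument(name: string, rows: RawSeasonRow[]): PlayerDocument {
  const seasons: SeasonStats[] = rows
    .map((row) => ({
      year: Number(row.year.trim()),
      team: row.team.trim().toUpperCase(),
      avg: parseStat(row.avg),
      hits: parseStat(row.hits),
    }))
    .filter((s) => Number.isInteger(s.year) && s.year > 0)
    .sort((a, b) => a.year - b.year);
  return { playerId: toPlayerId(name), name: name.trim(), seasons };
}
```

Keeping missing values as `null` rather than `0` is a deliberate choice in this sketch: it lets downstream queries tell "no data recorded" apart from a real zero. The same normalized ID could then serve as the lookup key for assets such as headshots.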
Developed a modern, interactive frontend:
•Built with React and ECharts.
•Features include player comparison tools, a trivia generator, and interactive stat widgets.
•Optimized for performance and SEO with clean markup and server orchestration.
Designed orchestration and deployment tooling:
•Built CLI + shell scripts for rsync-based deployments to a DigitalOcean droplet.
•Built GUI tools for scraper configuration, ingestion monitoring, and API response testing.
•Designed a modular scraper architecture with GraphQL-style parameterization.
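One possible reading of "GraphQL-style parameterization" is that the caller names the fields it wants and the scraper runs only the matching extractor modules. The sketch below assumes that interpretation. The `extractors` registry, the `scrape` function, and the markup patterns are hypothetical, and plain regular expressions stand in for Cheerio selectors so the example stays dependency-free.

```typescript
// Hypothetical: each module extracts exactly one field from raw page markup.
type Extractor = (html: string) => string | null;

// Return the first capture group of a pattern, trimmed, or null if absent.
function matchFirst(html: string, pattern: RegExp): string | null {
  const m = pattern.exec(html);
  return m && m[1] !== undefined ? m[1].trim() : null;
}

// Registry of field extractors; a new field is added by registering a module.
const extractors = new Map<string, Extractor>([
  ["name", (html) => matchFirst(html, /<h1[^>]*>([^<]+)<\/h1>/)],
  ["birthYear", (html) => matchFirst(html, /data-birth-year="(\d{4})"/)],
  ["position", (html) => matchFirst(html, /data-position="([^"]+)"/)],
]);

// Run only the requested extractors, so the result mirrors the selection.
function scrape(html: string, fields: string[]): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const field of fields) {
    const extract = extractors.get(field);
    if (!extract) throw new Error(`Unknown field: ${field}`);
    result[field] = extract(html);
  }
  return result;
}
```

Calling `scrape(html, ["name", "position"])` would return an object with exactly those two keys, which keeps each scraping job limited to the data it actually needs.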
End-to-end engineering ownership:
•Owned every stage, from ingestion pipelines and data normalization to frontend UX and production deployment.