1st Place - Final Four Analytics Challenge 2026
The Final Four Analytics Challenge 2026 asked teams to predict the S-curve seed (1–68) for every NCAA Tournament team using only regular-season statistics. Predictions had to cover the full pool of 360+ Division I teams.
The competition ran in three rounds, starting with 200+ teams:
| Round | Format | Focus |
|---|---|---|
| Round 1 | Kaggle submission (RMSE-scored) | Build a model, submit predictions |
| Semifinals | Live video presentation | Explain your approach, findings, and what it means for the NCAA |
| Finals | In-person presentation + Tableau dashboard | Tell the full story - model, insights, and actionable takeaways |
We built a multi-stage ensemble ML pipeline that predicts NCAA seeds, reaching a final Kaggle RMSE of ~1.37 with the generalization model and ~0.11 with the semi-supervised variant:
Raw Stats → Data Cleaning → Feature Engineering (104 features) → Ensemble ML → Constrained Seed Assignment
Models used: HistGradientBoosting, GradientBoosting, ExtraTrees.
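The last two stages of the pipeline above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the function names `blend_predictions` and `assign_seeds`, the equal weights, the model keys, and the convention of giving non-tournament teams a seed of 0 are all assumptions made for the example. It averages per-model raw predictions, then ranks the teams in the field so that every seed from 1 to k is used exactly once.

```python
from typing import Dict, List


def blend_predictions(model_preds: Dict[str, List[float]],
                      weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-model raw seed predictions."""
    total = sum(weights[name] for name in model_preds)
    n_teams = len(next(iter(model_preds.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in model_preds.items()) / total
        for i in range(n_teams)
    ]


def assign_seeds(raw_scores: List[float], in_field: List[bool]) -> List[int]:
    """Give tournament teams unique seeds 1..k by ascending raw score.

    Teams outside the field get 0 (an assumed convention for this sketch).
    """
    field = [i for i, flag in enumerate(in_field) if flag]
    # list.sort is stable, so tied scores keep their input order.
    field.sort(key=lambda i: raw_scores[i])
    seeds = [0] * len(raw_scores)
    for rank, idx in enumerate(field, start=1):
        seeds[idx] = rank
    return seeds


# Toy data: three hypothetical models, four teams, the last one not in the field.
preds = {
    "hist_gb": [1.2, 3.9, 2.1, 40.0],
    "grad_boost": [1.0, 3.5, 2.6, 38.0],
    "extra_trees": [1.4, 4.1, 2.0, 45.0],
}
weights = {"hist_gb": 1.0, "grad_boost": 1.0, "extra_trees": 1.0}

blended = blend_predictions(preds, weights)
seeds = assign_seeds(blended, in_field=[True, True, True, False])
print(seeds)  # [1, 3, 2, 0]
```

Ranking instead of rounding is what enforces the constraint: rounding raw predictions can give two teams the same seed, while ranking cannot.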
Key insight: The NCAA selection committee doesn't assign seeds algorithmically. We reverse-engineered their logic and built features such as automatic-qualifier (AQ) conference penalties and at-large (AL) power-conference bonuses.
| Stage | Features | What It Captures |
|---|---|---|
| Base | 66 | Win%, NET rank, quadrant records, resume scores |
| Conference Strength | +12 | How strong is this team's conference? |
| Committee Logic | +26 | AQ/AL bid adjustments, bubble zone flags, conference penalties |
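As an illustration of the Committee Logic stage, here is a hedged sketch of how a few such flags could be derived from a team's resume. Everything specific in it is hypothetical: the `TeamResume` fields, the `POWER_CONFERENCES` set, the feature names, and the NET-rank bubble window of 40 to 55 are placeholders chosen for the example, not the definitions used in our 26 features.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical set: which conferences count as "power" is an assumption here.
POWER_CONFERENCES = {"ACC", "Big 12", "Big Ten", "Big East", "SEC"}


@dataclass
class TeamResume:
    name: str
    conference: str
    net_rank: int
    is_auto_qualifier: bool  # True if the team holds its conference's automatic bid


def committee_features(team: TeamResume,
                       bubble_lo: int = 40,
                       bubble_hi: int = 55) -> Dict[str, int]:
    """Binary flags approximating committee tendencies (illustrative only)."""
    is_power = team.conference in POWER_CONFERENCES
    return {
        # AQ teams from non-power conferences tend to be seeded lower.
        "aq_conf_penalty": int(team.is_auto_qualifier and not is_power),
        # At-large teams from power conferences tend to be seeded higher.
        "al_power_bonus": int(not team.is_auto_qualifier and is_power),
        # Teams whose NET rank falls inside the assumed bubble window.
        "bubble_zone": int(bubble_lo <= team.net_rank <= bubble_hi),
    }


at_large = TeamResume("Team A", "SEC", net_rank=45, is_auto_qualifier=False)
auto_bid = TeamResume("Team B", "Small Conf", net_rank=120, is_auto_qualifier=True)

print(committee_features(at_large))
print(committee_features(auto_bid))
```

Flags like these let tree-based models split directly on bid type and conference tier instead of having to infer them from raw statistics.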
| Model | Kaggle RMSE | Correct Seeds |
|---|---|---|
| Base Ensemble (4 models, 66 features) | ~2.39 | 408/451 |
| Enhanced Ensemble (5 models, 104 features) | ~1.39 | 411/451 |
| Tournament Blend (7 models) | ~1.37 | 412/451 |
| Semi-Supervised (verified historical labels) | ~0.11 | 448/451 |
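For reference, the two columns in the table above can be computed along these lines. This is a sketch under assumptions: RMSE is the standard root-mean-square error between actual and predicted seeds, and "Correct Seeds" is taken to mean exact matches. The helper names `rmse` and `correct_seeds` and the toy data are ours for illustration.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between actual and predicted seeds."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def correct_seeds(actual: Sequence[int], predicted: Sequence[int]) -> int:
    """Count of teams whose predicted seed exactly matches the actual seed."""
    return sum(1 for a, p in zip(actual, predicted) if a == p)


# Toy data: one of four teams is off by a single seed line.
actual = [1, 2, 3, 4]
predicted = [1, 2, 4, 4]

print(rmse(actual, predicted))           # 0.5
print(correct_seeds(actual, predicted))  # 3
```

The two metrics can move independently: a single large miss raises RMSE sharply while changing the exact-match count by only one.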
We built a full Tableau dashboard to present our findings as a narrative. The story was structured around four questions:
- What did we build, and how accurate is it?
- Which stats actually matter for seeding?
- What patterns did we uncover about how the committee thinks?
- What does this mean for the NCAA?
Visualizations included conference strength heatmaps, model progression charts, feature importance breakdowns, radar charts for team profiles, and actual vs. predicted seed scatter plots.
- Semifinals Solution Demo (Video): Watch here
- Finals Presentation: Final_Presentation.pptx
- Dashboard Screenshots: Dashboard Screenshots.pdf