
Projects

Predicting NHL Goal Probabilities with Machine Learning
Project Overview
  • Title: Predicting NHL Goal Probabilities with Machine Learning

  • One-liner: Using NHL shot data from MoneyPuck.com to model and predict goal probabilities.

  • Short Description:

    • Exploratory analysis of NHL shot data to uncover patterns and scoring trends.

    • Development of logistic regression models with feature engineering.

    • Ongoing development of nonlinear models to improve predictive accuracy.

    • Highlights end-to-end ML workflows, data visualization, and reproducibility.

Skills and Techniques Used
  • Python, pandas, NumPy, scikit-learn, matplotlib

  • Exploratory Data Analysis (EDA)

  • Logistic regression and feature engineering

  • Model evaluation metrics

  • Data wrangling and cleaning

  • Visualization of complex datasets

Key Findings
  • Built a baseline logistic regression model using shot location, type, and context features. It achieved an AUC of 0.72 and a Brier score of 0.062.

  • Compared the baseline model's predictions to MoneyPuck.com’s xG values. The baseline systematically underestimated goal probability relative to xG, and the error patterns pointed to clear ways to improve the model.

  • Engineered additional features (shooter, goalie, team, and game situation info). This improved performance to an AUC of 0.785 and a Brier score of 0.059.

  • Visualizations show that the updated model aligns more closely with expected goal probabilities, with prediction errors centered around zero.
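The two metrics quoted above can be made concrete with a short sketch. This is not the project's evaluation code: the function names and the toy values are hypothetical, and only the Python standard library is used so the arithmetic stays visible. scikit-learn's `roc_auc_score` and `brier_score_loss` compute the same quantities.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random goal is ranked above a random non-goal.

    Ties count as half a win. The pairwise loop is O(n_pos * n_neg),
    which is fine for a small illustration but slow on full shot data.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one goal and one non-goal")
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only, not project data.
toy_goals = [0, 0, 1, 0, 1]
toy_probs = [0.10, 0.40, 0.35, 0.20, 0.80]
toy_brier = brier_score(toy_goals, toy_probs)
toy_auc = roc_auc(toy_goals, toy_probs)
```

Lower Brier scores and higher AUC values are better. Because goals are rare, the Brier score is best read against a constant baseline: always predicting the base rate r scores r(1 - r).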

Figure: Prediction Error Distributions, Base Model (left, baseLogR.png) vs Combined Model (right, refinedLogR.png)
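A prediction error distribution like the ones plotted here could be computed roughly as follows. This is a sketch under stated assumptions: the error is taken as model probability minus the reference xG value (the sign convention is an assumption), the function names are hypothetical, and the numbers are toy values rather than project data.

```python
from statistics import mean, median


def prediction_errors(model_probs, reference_xg):
    """Per-shot error, assumed here to be model probability minus reference xG."""
    if len(model_probs) != len(reference_xg):
        raise ValueError("inputs must have the same length")
    return [m - r for m, r in zip(model_probs, reference_xg)]


def summarize_errors(errors):
    """Summary statistics that indicate whether errors are centered on zero."""
    return {
        "mean": mean(errors),
        "median": median(errors),
        # Share of shots where the model predicts less than the reference.
        "share_below_zero": sum(e < 0 for e in errors) / len(errors),
    }


# Toy values for illustration only, not project data.
toy_model = [0.05, 0.10, 0.02, 0.20]
toy_xg = [0.08, 0.12, 0.03, 0.25]
toy_summary = summarize_errors(prediction_errors(toy_model, toy_xg))
```

Under this convention, a mean and median well below zero indicate systematic underestimation, while values near zero indicate errors centered around zero. The list returned by `prediction_errors` can be passed directly to matplotlib's `plt.hist` to draw the distribution.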

Project Status
  • Started nonlinear modeling

  • Applying Random Forest and XGBoost to capture more complex patterns in gameplay.
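To illustrate why a nonlinear model may help, the sketch below compares logistic regression with a Random Forest on synthetic data where the outcome depends on an interaction between two features. Everything here is an assumption made for illustration: the data generator, the feature meanings, and the hyperparameters are hypothetical and are not taken from the project. XGBoost is omitted to keep the example within scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split


def make_synthetic_shots(n=4000, seed=0):
    """Synthetic stand-in for shot data (NOT MoneyPuck data)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n, 2))  # two hypothetical features
    # The goal probability depends on the product of the two features,
    # which a model that is linear in the features cannot represent.
    logit = -2.0 + 3.0 * x[:, 0] * x[:, 1]
    prob = 1.0 / (1.0 + np.exp(-logit))
    y = rng.binomial(1, prob)
    return x, y


def compare_models(seed=0):
    """Fit both models on the same split and return AUC and Brier score for each."""
    x, y = make_synthetic_shots(seed=seed)
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "logistic": LogisticRegression(),
        # Hyperparameters are illustrative, not tuned.
        "random_forest": RandomForestClassifier(
            n_estimators=200, min_samples_leaf=20, random_state=seed
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        prob = model.predict_proba(x_test)[:, 1]  # probability of a goal
        results[name] = {
            "auc": roc_auc_score(y_test, prob),
            "brier": brier_score_loss(y_test, prob),
        }
    return results


results = compare_models()
```

On data like this, the forest should rank shots better than the linear model because it can split on both features jointly. Whether the same holds for real shot data depends on how much interaction structure the engineered features already capture.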
