Looking forward to your video next week to see what boundary values teams this year are in. Interesting stuff, love the content keep it up!
Hi Jaxdog! Thanks for your support. I will be posting a similar video (to this one) detailing how to use decision boundaries with KenPom stats to predict the final four and the overall champion. Stay tuned! Cheers!
This is exactly what I was looking for, thanks for the analysis. It would be really interesting to do a similar analysis for upsets, looking for statistical significance in the KenPom "four factors".
Hi King! I'm glad you like the video. I agree. I just posted a video on how to find upsets. It uses randomization of KenPom probabilities to pick a few upsets. We'll see how it works. Cheers!
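The reply above only says that upsets are picked by randomizing KenPom probabilities, so the exact procedure is not known from this thread. A minimal sketch of one way to do it, assuming each game has a favorite's win probability and an upset is picked whenever a uniform random draw lands outside that probability. The function name `pick_upsets`, the game labels, and the probabilities are all hypothetical.

```python
import random


def pick_upsets(favorite_win_probs, seed=None):
    """Pick "favorite" or "upset" for each game by random draw.

    favorite_win_probs maps a game label to the favorite's win
    probability (0.0 to 1.0). This is an assumed procedure, not
    necessarily the one used in the video.
    """
    rng = random.Random(seed)  # seeded so the picks are reproducible
    picks = {}
    for game, p_fav in favorite_win_probs.items():
        # rng.random() is uniform on [0, 1); the favorite is picked
        # with probability p_fav, the upset otherwise.
        picks[game] = "upset" if rng.random() >= p_fav else "favorite"
    return picks


if __name__ == "__main__":
    # Hypothetical games and probabilities, for illustration only.
    games = {"Game A": 0.90, "Game B": 0.65, "Game C": 0.55}
    print(pick_upsets(games, seed=2024))
```

With this approach, close games (probabilities near 0.5) produce upsets often and lopsided games rarely, which is roughly how upsets are distributed in a bracket.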
Awesome video, just subscribed! Excited to see the analysis on Monday.
Hi Vince! Thanks and welcome! Looking forward to seeing the selections on Sunday. Cheers!
This was a great video. I was wondering if it would be possible to post the formula of the linear functions that can be found at 5:16?
Hi 2DISCiples! Sure thing. The blue slanted line is AdjO = AdjD + 22.9. So for teams to be above this line, that implies that AdjO - AdjD > 22.9, or simply AdjEM > 22.9. The orange line is AdjO = AdjD + 10.0. Hope this helps!
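The two lines in the reply above can be checked for any team with a few lines of Python. This is a minimal sketch: the 22.9 and 10.0 offsets come from the reply, while the function name, the region labels, and the sample ratings are hypothetical. Treating everything under the orange line as the lower region is an assumption.

```python
# Offsets quoted in the reply; everything else here is illustrative.
UPPER_MARGIN = 22.9  # blue line:   AdjO = AdjD + 22.9
LOWER_MARGIN = 10.0  # orange line: AdjO = AdjD + 10.0


def classify_by_adjem(adj_o, adj_d):
    """Place a team relative to the two lines using AdjEM = AdjO - AdjD."""
    adj_em = adj_o - adj_d
    if adj_em > UPPER_MARGIN:
        return "above blue line"
    if adj_em < LOWER_MARGIN:  # assumed meaning of the orange line
        return "below orange line"
    return "between lines"


if __name__ == "__main__":
    # Hypothetical (AdjO, AdjD) pairs, not real team ratings.
    for adj_o, adj_d in [(121.0, 95.0), (115.0, 100.0), (110.0, 104.0)]:
        print(adj_o, adj_d, classify_by_adjem(adj_o, adj_d))
```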
For round 1, Kentucky fits both AdjO > 120.7 (88% win) and AdjD > 102.6 (90% lose). Curious to hear your thoughts on their round 1 matchup.
Hi Zachary! I totally understand the confusion here. These are conflicting stats. When this happens, I resort to the matchup. According to the KenPom prediction formula, Kentucky has an 86.2% chance of defeating Oakland. So I am picking Kentucky. Hope this helps!
I think you might have mentioned this in one of your videos, but is the historical data you're using to find these criteria the KenPom stats of the teams after their round 1 or round 2 games have been played, or the ratings they had going into the tournament (prior to round 1)?
Hi Minhajghayur! The KenPom stats that I use and discuss in my videos are all *pre-tourney* stats. Cheers!
@@KerrySportsAnalyst Thank you for confirming! The content you're creating is great! Looking forward to tomorrow's video.
The decision boundaries (AdjO and AdjEM) only apply to nine teams currently. All of which are going to be 4 seeds or better. What about finding round 1 winners for teams in the middle of those boundaries for both wins and losses?
Hi Adam! There are currently 13 teams with AdjEM > 22.9 as of 9pm on Thursday night. I definitely think the conference tournaments will give us more teams that satisfy these boundaries. Also, when the final selections are made, we'll see more teams below the lower bounds. However, you are absolutely correct. There will be games with two teams in the "middle" of the boundaries. For those, I plan on using a KenPom spreadsheet from another YouTuber. It is free to download. I might also use my own KenPom model that I created with Python. I just haven't been thrilled with the results yet. Cheers!
Here is a link to the Excel KenPom model video for single game predictions:
ua-cam.com/video/N0fvKpe2sJQ/v-deo.html
So how can you use it for evenly matched teams going up against each other that aren't in the top 17 of KenPom?
For example, in the 2nd round, over 122 wins 80% and under 105 loses 78%. What does that mean for teams in the 110-120 range?
The fact of the matter is that, among teams ranked 18-53, every single one falls between 112 and 120 in offensive efficiency, and every single one is under 102.5 in defensive efficiency (except Indiana State, which is exactly 102.5).
Hi CD2_! That is a very good question. My video does not address how to pick games "inside" the decision boundaries. Part of the answer will be in my next video, where I use decision boundaries to pick the final four and the overall champion. However, these teams are likely to be above the high values. For the "in-between" teams, I plan on using a KenPom single game prediction model created by another YouTuber. It is an Excel spreadsheet that is a free download. My plan is to be "all-in" on KenPom this year. I actually believe the strength of this strategy will be in predicting the overall champion. We'll see how it plays out! Cheers!
Here is a link to the Excel KenPom model video for single game predictions:
ua-cam.com/video/N0fvKpe2sJQ/v-deo.html
Seems a lot simpler to say "the team with the higher AdjEM is expected to win". It will be interesting to see exceptions to that rule.
Hi Kevin! I agree. Especially when it comes to predicting the higher seeds. I think the upsets are exceptions to that rule. But it would be interesting to see if a team with a lower AdjEM would still be predicted to win if they have, say, an elite AdjD and their opponent has a mid-level AdjO. I'll definitely look into this. Cheers!
Did you create the "Round 1 Results" chart or is it available in KenPom?
Hi Tyler! I created that chart using the pre-tourney KenPom stats from 2001 - 2023. I use Python and its various libraries to do almost all of my analysis. I would be happy to share the code if you are interested. Cheers!
@@KerrySportsAnalyst I would actually be interested in looking at the Python code if you don't mind sharing.
I wonder if there'd be a way to throw AI into the mix to make it even more accurate than it already is.
Hey Levi! That's an interesting idea. I've trained machine learning models to predict wins. But the variance in March Madness is hard to predict. Cheers!
Is there any concern about overfitting from using the entire available data to produce the decision boundaries? Also, maybe it's more worthwhile to use, say, the 5 most recent years to pick up on any new trends.
@@mikebabiak Hey Mike! That's a very good question. Overfitting is always a concern. However, after training a model, it is common practice to train the "final" model on all available data before using it to predict future events. That's how I view this. I'm trying to predict 2024.
If you ignore where the decision boundaries came from for a moment, here is how the decision condition "AdjEM > 22.9" played out from 2018 - 2023 in the first round: 2018 - 90.0% (9 of 10), 2019 - 92.3% (12 of 13), 2021 - 88.9% (8 of 9), 2022 - 84.6% (11 of 13), 2023 - 88.9% (8 of 9). There are currently (3/14/2024) 12 teams with an AdjEM value greater than 22.9. Cheers!
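The yearly percentages above can be reproduced from the win counts in the reply. The sketch below also pools the five tournaments into a single rate; that pooled figure is not stated in the thread and is simply computed from the quoted counts. Variable and function names are illustrative.

```python
# (wins, teams) for first-round teams with AdjEM > 22.9, as quoted in the reply.
# 2020 is absent because no tournament was played that year.
first_round = {
    2018: (9, 10),
    2019: (12, 13),
    2021: (8, 9),
    2022: (11, 13),
    2023: (8, 9),
}


def hit_rate(wins, teams):
    """Return the win percentage for a group of teams."""
    return 100.0 * wins / teams


# Per-year rates, rounded to one decimal place as in the reply.
per_year = {year: round(hit_rate(w, n), 1) for year, (w, n) in first_round.items()}

# Pooled rate across all five tournaments (computed here, not from the thread).
total_wins = sum(w for w, _ in first_round.values())
total_teams = sum(n for _, n in first_round.values())
pooled = round(hit_rate(total_wins, total_teams), 1)

if __name__ == "__main__":
    for year, pct in per_year.items():
        print(f"{year}: {pct}%")
    print(f"Pooled: {pooled}% ({total_wins} of {total_teams})")
```

Pooling gives 48 wins in 54 games, about 88.9%, which sits in line with the individual years.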
Surprising that Fairleigh Dickinson's defense was that bad. They looked really pesky against Purdue and even FAU.
Hi Rasak! That surprised me too. Their AdjD for 2023 was 118.425, which was third worst among all D1 schools. Yet they pulled off the upset. Wild!
How did you find that dataset?
You have to pay for it.
Hi MaverickFan! While I do not necessarily recommend paying for data, the annual subscription fee for KenPom is around $22. Also, you can see the pre-tourney data for 2024 before the games start today (3/21/2024). Cheers!