I would also love to see actually predicting future games. Thanks for the content!
Would love to see one on predicting future games. Great video. Very well done
Wow. Would love to see one on predicting future games. Great video.
Really enjoyed this! I'd love to see another video on how to predict future games. Thank you for the tutorial
I would love to see the future games please, I enjoy these videos and it helps me learn
Same for future predictors!!! THX FOR THIS!
another great vid, would love one on future predictions
It would be good if you can put a more detailed guide on Dataquest, to include predicting future matches using rolling averages etc. Would happily sign up just for that!
GREAT video! I will be coding this and implementing it in my personal work. I would love to see a future video on how you go about predicting future games. I would also love to see something just like this for player performance, at a game-by-game level.
Awesome explanation, I used this for my class, thank you
Really great video. I learned some good ways to use list comprehensions in pandas to help with column names on top of the scikit learn fits. Thanks for this.
Great walk through. Would love to see how you update the next values for home and away teams
Excellent video that shows you how to use machine learning to identify the correlated factors that determine the outcome using previous games, but is a little misleading because it doesn't actually show you how to predict outcomes of future games. Would love to know where I can find this information, even if I have to pay for it.
Great video. Please make one about predicting future games.
Please do one for future games!!
Completed the first video, super awesome thank you!!! Does this video help with grabbing player stats and using AVG Reb, PTS, AST, etc to predict stats VS opponents ?
I would really like to learn the technique applied and the logit used to improve the prediction. I gained a lot of knowledge from you!!
Absolute legend.
Please do a predictive video for future games 🙏🏻
simply awesome, thank you
Here I am, waiting for the video on predicting future games... maybe someday it will come
Great tutorial 👌🏾 By any chance, did you make the video on how to update the model?
Amazing channel mate! Are you able to demo how to deploy ML models into production and what we could use to fully automate this end to end? Preferably with systems/platforms that are free to use.
Great video, would be great if you could do one but that predicts total points scored, not necessarily in basketball.
Hi there, just curious: what would you say are the main things to look for when predicting games?
Hi, how could I attach the season to the predictions to see how well the model did for each individual season?
stay strong, Coulibaly is going to be a star
Hi. I have two questions.
a) Where did you find the data to use for your test?
b) How easy is it for someone who doesn't know programming to learn Python?
This was a great video but I would be happy if you would do one for the prediction of future games
When are you doing the one to predict future games, i.e. a value of 2 in the target column? Thanks
How did you decide to choose the ridge classifier?
I wonder why you used the ridge classifier rather than logistic regression in the SequentialFeatureSelector.
Excellent video for learning, but it doesn't actually show you how you can predict future games. Future games do not have all the box score stats, which makes it difficult to project outcomes for the future based upon what this video is demonstrating. Some help or an additional video would be much appreciated showing how to actually use this to predict future games (or games that have not yet occurred).
You would use the data prior to the matchup to predict
Hey! Great video (complicated too, gotta watch it a second time). I would personally benefit very much from a video on how to use this for future matches, pleaseee!
Hey,
How long will it take to run SequentialFeatureSelector with the same parameters, but using RandomForestClassifier or XGBoost as a model? A couple of hours, days?
Hello, what is the algorithm used by the model and where could I get information on the logic behind the algorithm used by the model?? Thank you.
Please make a video on how to predict future games!!🙏🙏
please make video on how to predict future games
Give this man more views so we get another part!!!!
Hi!
Why did we use ridge classification?
would love to see total score predictor sir.
Hi, I'm curious as to why this only results in a 64% accuracy.
For example, something as simple as comparing the records of the teams at the time they've played and predicting the one with higher win% to win would result in around a 68% accuracy for the 2021-22 season.
Is this due to ridge classification?
I've been working with this code for about three weeks now and I have successfully scraped all of the player stats too and want to somehow add a 'lineup' feature that looks at the MP of each player and how productive they tend to be to further improve the model. Any chance you would be willing to help me with that?
You introduced leakage/lookahead bias by scaling the entire dataset at once before training. It would be better to add it to a pipeline that scales the train test splits appropriately.
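A minimal sketch of the pipeline idea in the comment above, not the video's actual code. The data here is synthetic, and the choice of MinMaxScaler, RidgeClassifier(alpha=1.0) and TimeSeriesSplit(n_splits=3) is an assumption made for illustration. The point is only that a scaler placed inside a pipeline is refit on each training fold, so test rows never influence the scaling statistics.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the game data (hypothetical, not the video's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Scaler and classifier are fit together, so each fold's scaling
# statistics come from that fold's training rows only.
model = make_pipeline(MinMaxScaler(), RidgeClassifier(alpha=1.0))

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X):
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print(scores)
```

The same pipeline object can be passed anywhere a plain estimator is accepted, which keeps the scaling step inside each split.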
At 38:48 I get an error when running the function that finds team averages for the last 10 games, and I cannot resolve it. Would it have something to do with the FutureWarning shown in the video?
did u figure this out?
How would I filter out rows of games that were in the playoffs so I just have regular season games in the dataframe?
Doesn't rolling(10) include the current game in the rolling average? Wouldn't that be leakage?
No, because you always predict the next game.
When computing the rolling averages, why did you not use the closed='left' parameter, like you did in your football predictor video? Don't your rolling averages include knowledge of the current game you are predicting?
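The two rolling-average questions above can be checked on a toy frame. This is a sketch under assumptions: the column names `pts_roll`, `pts_roll_prior` and `target` and the window of 3 are made up for the example. It shows both ways of keeping the current game's outcome out of the features: pair a rolling mean that includes the current game with the result of the next game, or shift by one row before rolling so the window holds only earlier games (a similar effect to the 'left' option asked about above).

```python
import pandas as pd

# Toy data: two teams, four games each, already in date order.
games = pd.DataFrame({
    "team": ["A"] * 4 + ["B"] * 4,
    "pts": [100, 110, 90, 120, 95, 105, 115, 85],
    "won": [1, 0, 1, 1, 0, 1, 0, 1],
})
window = 3  # hypothetical window size, smaller than 10 to fit the toy data

by_team = games.groupby("team")
features = games.copy()

# Option 1: the rolling mean includes the current game,
# and the label is the result of the team's NEXT game.
features["pts_roll"] = by_team["pts"].transform(lambda s: s.rolling(window).mean())
features["target"] = by_team["won"].shift(-1)

# Option 2: shift first, so the window only covers games before the current row,
# which is what you want when the label is the current row's own result.
features["pts_roll_prior"] = by_team["pts"].transform(
    lambda s: s.shift(1).rolling(window).mean()
)

print(features)
```

With option 1 the last game of each team has no target, because there is no next game to predict yet.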
I need some clarification here... The data in the training and test sets contains the points scored by each team, so how can the model not predict exactly whether the game is a win or a loss? It literally just has to check if the team has more points and return true if it does... I am confused
We're predicting the winner of the next game. The algorithm doesn't know what happened in the next game when it is making predictions.
@@Dataquestio facepalm
'Customer segmentation in retail using machine learning' please make a video on this topic using real dataset.😥😥🙏🙏
Why do we need player stats which have max in front of them? What is the purpose of max stats ? Can anyone help clarify please?
predict future nfl games please
Hi sir,
Where can I deploy this project?
Having issues with the rolling averages. Receiving a ValueError. I saw a couple of other people with the same error; did anybody manage to figure it out?
Which IDE are you using?
JupyterLab
Hello, please can you show how we can select 2 teams and then have the AI choose who wins? Please write the code in the reply.
Hello brother, can you help me with this code, which is generating the following error?
Code:
df_rolling = df[list(selected_columns) + ["won", "team", "season"]]
def find_team_averages(team):
    rolling = team.rolling(10).mean()
    return rolling
df_rolling = df_rolling.groupby(["team", "season"], group_keys=False).apply(find_team_averages)
Error:
DataError: Cannot aggregate non-numeric type: object.
Did you manage to solve that?
I believe you need to change that line to rolling = team[selected_columns].rolling(10).mean()
@@bena.9440 this works
@@bena.9440 Yes it is
@@bena.9440 u saved me
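A small sketch of the fix suggested in this thread, on made-up data. The frame, the contents of `selected_columns` and the window of 2 are hypothetical stand-ins. The idea is that the rolling mean is taken only over the numeric feature columns, so text columns such as the team name never reach the aggregation.

```python
import pandas as pd

# Made-up stand-in for the games frame; "team" holds strings (object dtype).
df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "B", "B"],
    "season": [2022] * 6,
    "pts": [100, 110, 90, 95, 105, 115],
    "won": [True, False, True, False, True, False],
})
selected_columns = ["pts"]  # hypothetical: the numeric columns to average

def find_team_averages(team):
    # Select the numeric columns first, then roll; a window of 2 fits the toy data.
    return team[selected_columns].rolling(2).mean()

rolling = df.groupby(["team", "season"], group_keys=False).apply(find_team_averages)
print(rolling)
```

Depending on the pandas version, the groupby apply call may print a deprecation warning about grouping columns; the result keeps the original row index either way.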
INSTANT FOLLOW!!!!
We would really like to see a video on predicting future games in the NBA. Even though this would be a horrible use of PyTorch, I would like to see it done with PyTorch, as well as a wide variety of other machine learning models & technologies (sklearn, etc.). It would also be nice to see some work with regards to this done on Kaggle as well, for example using NBA datasets as well as NCAA datasets.
Does this for loop need to be updated for my PC?
for url in standings_pages:
    save_path = os.path.join(STANDINGS_DIR, url.split("/")[-1])
    if os.path.exists(save_path):
        continue
Is anyone facing the 'Cannot aggregate non-numeric type: object' error while trying to run this:
df_rolling = df_rolling.groupby(["team", "season"], group_keys=False).apply(find_team_averages)
I have the same problem with that error.
i have solved this issue
How did u solve that?
@@akashgahlaut4078
@@akashgahlaut4078 How did you resolve it?
The full del block at the beginning should be like this:
del df['index_opp']
del df['mp.1']
del df['mp_opp.1']
del df['mp_max_opp.1']
del df['mp_max.1']
Unrelated, but the previous nba score scraper took like 3 days to scrape 2016-2022. OH MY DAYS.
Yeah, it has to scrape a lot of records (8500), and there is a time.sleep in the loop. Each record should take about 6 seconds to download. There's also a small chance that it will time out after 30 seconds of trying and need to retry. We can guesstimate the runtime with (8500 * 6 + 8500 * .05 * 60) / 3600 = 21.25, so it should take about 21 hours to run.
You could try reducing the sleep time and timeout times for playwright, but there is a risk of getting banned by the server.
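The back-of-the-envelope estimate above can be written as a small function, which makes it easy to see how the sleep time or retry rate changes the total. The parameter names are my own labels for the numbers in the formula (8500 records, 6 seconds each, a 5% retry rate, 60 seconds per retry), not names from the scraper.

```python
def estimate_scrape_hours(n_records=8500, seconds_per_record=6,
                          retry_rate=0.05, retry_cost_seconds=60):
    """Rough runtime in hours: normal downloads plus the expected cost of retries."""
    base_seconds = n_records * seconds_per_record
    retry_seconds = n_records * retry_rate * retry_cost_seconds
    return (base_seconds + retry_seconds) / 3600

hours = estimate_scrape_hours()
print(round(hours, 2))  # 21.25

# Halving the per-record time (at the risk of being banned) shortens the estimate.
faster = estimate_scrape_hours(seconds_per_record=3)
print(round(faster, 2))
```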
How do I add this new season, from October to now, to the date selection line?