Predict Baseball Stats using Machine Learning and Python

  • Published 14 Oct 2024

COMMENTS • 40

  • @imfrshlikeuhh
    @imfrshlikeuhh 2 years ago +19

    the fact that this type of content is FREE is mind blowing

    • @anishapostate4221
      @anishapostate4221 2 years ago

      The fact that people don't know about this is another mind-blowing thing

    • @imfrshlikeuhh
      @imfrshlikeuhh 2 years ago +1

      @anishapostate4221 I wouldn't say that; there are plenty more people who don't know this than do

  • @DanielGarcia-uq8yz
    @DanielGarcia-uq8yz 2 years ago +2

    Great project...love the concept of dataquest's guided project walkthroughs. Thanks Vik

  • @kingofhavila9850
    @kingofhavila9850 2 years ago +1

    That day I joined the webinar slightly late so I was excited about watching this video.

  • @henryryan5194
    @henryryan5194 1 year ago +7

    I might be missing something, but...
    Once you have trained and tested the model, what is the process to apply the model to predict the following year?
    In this video you trained the model to predict the "Next_WAR", which in this case would be the player's 2022 WAR, and then evaluated the model based on the real result vs. your predicted result. But if you wanted to predict 2023 WAR, how would the code need to be adjusted?
    Essentially, how do you use the trained model to predict 2023 player WAR?

    • @willcarroll9762
      @willcarroll9762 1 year ago

      You ever figure it out? I’m struggling there too

    • @Chris-rl6rw
      @Chris-rl6rw 1 year ago

      @willcarroll9762 This model can only predict one year out into the future. To predict 2023, you would need 2022 data. It's not necessarily a full time series analysis, but a linear regression model used to predict the following year's stats. Predicting Next WAR is predicting next year's stat. You could attempt to create a column for 2 years out into the future by shifting the 'WAR' column again and testing how the model predicts two years into the future, and so on. My guess is it may start performing poorly at that stage.

    • @LouieWinehouse
      @LouieWinehouse 1 year ago

      You could train it on the first 3 months of data to predict the next 6 months of the season, or however you want. For my MLB ML model I train it on March-July to predict August-October
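For the question in this thread, here is a minimal sketch of one way to reuse the trained model on the latest season, following the shift idea described in the replies. It is not the code from the video: the toy table, the `predictors` list, and the `Predicted_Next_WAR` column are assumptions made for illustration. Rows whose `Next_WAR` is known are used for training; the most recent season has no known target yet, so those rows are the ones passed to `predict`.

```python
import pandas as pd
from sklearn.linear_model import Ridge

# Toy data standing in for the batting table (all values are made up).
batting = pd.DataFrame({
    "IDfg":   [1, 1, 1, 2, 2, 2],
    "Season": [2020, 2021, 2022, 2020, 2021, 2022],
    "WAR":    [2.0, 3.0, 4.0, 1.0, 1.5, 2.0],
    "HR":     [10.0, 15.0, 20.0, 5.0, 7.0, 9.0],
})
batting = batting.sort_values(["IDfg", "Season"])

# Target: each player's WAR in the following season.
batting["Next_WAR"] = batting.groupby("IDfg")["WAR"].shift(-1)

# Hypothetical predictor list; the real project selects its own columns.
predictors = ["WAR", "HR"]

# Rows with a known target are used for training.
train = batting.dropna(subset=["Next_WAR"])

# The latest season has no target yet; these are the rows to predict on.
latest = batting[batting["Season"] == batting["Season"].max()].copy()

model = Ridge(alpha=1.0)
model.fit(train[predictors], train["Next_WAR"])

# With 2022 as the latest season, this column is the 2023 estimate.
latest["Predicted_Next_WAR"] = model.predict(latest[predictors])
print(latest[["IDfg", "Season", "Predicted_Next_WAR"]])
```

If the features were scaled before training, the latest-season rows would need the same fitted scaler applied before calling `predict`.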

  • @hakeemyatim5363
    @hakeemyatim5363 1 year ago +1

    Hello! This is an awesome project and walkthrough that you've done!
    I actually wanted to try predicting HRs instead of WAR in this model, but when I scaled the data for ridge regression, I got HR numbers between 0 and 1 with the MinMax scaler. If I skip that part, I get the whole number of predicted HRs for the next year. Would it still be accurate if we are just looking at HRs when I skip the scaling?
    Again, great video!

    • @Dataquestio
      @Dataquestio 1 year ago +1

      You don't want to scale your target column. So if you're predicting HRs, you want to scale all of the columns except the HR column.
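A minimal sketch of the advice above, assuming scikit-learn's `MinMaxScaler` and a hypothetical target column named `Next_HR` (the toy values and column names are made up). Only the feature columns are scaled, so predictions stay in home-run units rather than in the 0 to 1 range.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "AB":      [400.0, 500.0, 600.0],
    "SO":      [80.0, 120.0, 100.0],
    "Next_HR": [12.0, 30.0, 21.0],
})

target = "Next_HR"
# Every column except the target gets scaled.
feature_cols = [c for c in df.columns if c != target]

scaler = MinMaxScaler()
df[feature_cols] = scaler.fit_transform(df[feature_cols])

print(df)
```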

  • @tomkmb4120
    @tomkmb4120 2 years ago +1

    Hey Vik, coming here from your more recent video with NBA stats analysis. In this instance, is pybaseball replacing the more manual work being done by playwright and having to parse the specific html in order to scrape the data you need? Is there an equivalent for the NBA to pybaseball? I think there may be one for the NFL that I've seen in places but this is all new to me so I can't be sure. Just struggling a bit with adapting that previous video to be a regular python file instead of following along directly with your Jupyter tutorial is all.

  • @tomkmb4120
    @tomkmb4120 1 year ago

    A little confused on the Sequential Feature Selector, you mention that after normalising the data - it picks the features that it thinks will help with accuracy the most, how is it determining that? Sorry if that's a stupid question.
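On how the selector decides: scikit-learn's `SequentialFeatureSelector` in forward mode starts with no features and, at each step, adds the single feature that gives the best cross-validated score for the estimator it was given (for a regressor with default scoring, that is R²). The sketch below uses synthetic data rather than the baseball table, and the `alpha`, `cv`, and `n_features_to_select` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# In this synthetic data only columns 0 and 3 actually drive the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)

sfs = SequentialFeatureSelector(
    Ridge(alpha=1.0),
    n_features_to_select=2,
    direction="forward",
    cv=5,  # each candidate feature is scored with 5-fold cross-validation
)
sfs.fit(X, y)

# Indices of the columns the selector kept.
selected = np.flatnonzero(sfs.get_support())
print(selected)
```

The same greedy search can run in reverse with `direction="backward"`, which starts from all features and removes one at a time.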

  • @chealol4233
    @chealol4233 3 months ago

    How would you be able to do this for "predicting" whether a player records a hit in a given game? Is that possible?

  • @arundey3971
    @arundey3971 1 year ago +1

    Any idea why the pybaseball package no longer loads? I tried pip install pybaseball, and I get an error.

  • @pushkarratnaparkhi2205
    @pushkarratnaparkhi2205 2 years ago +1

    Great video. Thanks 💯💯

  • @jscott21
    @jscott21 1 year ago

    Incredible video - thank you so much

  • @evanmaurer1968
    @evanmaurer1968 1 year ago

    I appreciate this content sir. Thank you so much!

  • @vitonash
    @vitonash 1 year ago

    A bit confused about the purpose of making the full copy and then calling dropna(). It doesn't seem like the full copy was used at all throughout the rest of the code?

  • @wanjohisamuel8547
    @wanjohisamuel8547 2 years ago +1

    Your videos are amazing. I'm starting to love ML.
    What advice would you give to someone who is starting Data Science...

    • @Dataquestio
      @Dataquestio 2 years ago +1

      That's great to hear, Wanjohi! I actually started a site called Dataquest where you can learn data science from scratch - the data scientist path will teach you all the main data science skills - www.dataquest.io/path/data-scientist/ .

  • @reena3571
    @reena3571 2 years ago

    Thank you immensely for sharing

  • @el_goomba
    @el_goomba 1 year ago +3

    How would you adjust the code to predict 2023 WAR?

    • @kellybjames
      @kellybjames 7 months ago

      Did you solve this?

  • @paperk1d
    @paperk1d 1 year ago

    Is it possible to do this in R? I have just started to learn about programming, so I don't have much knowledge about this

  • @tjans1979
    @tjans1979 1 year ago

    What editor are you using for this?

  • @leassis91
    @leassis91 1 year ago

    thank you for this content!

  • @fudgenuggets405
    @fudgenuggets405 1 year ago

    I don't think pybaseball is working any more. I get a blank .csv at the beginning after supposedly downloading the Fangraphs data.

  • @gianpierrealvarado993
    @gianpierrealvarado993 8 months ago

    Does anyone know why I wouldn’t be able to import pybaseball on JupyterLab anymore? I’m trying to follow along on my own notebook and for some reason I’m getting an error code that the module doesn’t exist. Thanks for any help in advance!

  • @peter93263
    @peter93263 1 year ago

    Can you do something similar for English Premier league soccer?

  • @AlyssaFord-xs3ht
    @AlyssaFord-xs3ht 1 year ago

    I am having trouble finding the batting csv file

  • @zachbroussard8734
    @zachbroussard8734 1 year ago

    I’m not getting the CSV when I run this. Can anyone help?

  • @AbrarMuhtasim
    @AbrarMuhtasim 2 years ago

    'Customer segmentation and clustering in retail using machine learning' with a real data set. Please make a project tutorial on this topic😭😭😭😭

  • @cloudcomputingbd
    @cloudcomputingbd 2 years ago

    nice

  • @emmamutegi5919
    @emmamutegi5919 2 years ago

    I have a problem running this...help
    removed_columns = ['NEXT_WAR', 'Name', 'Team' ,'IDfg', 'Season']
    selected_columns = dataset.columns[~dataset.columns.isin(removed_columns)]
    AttributeError: 'function' object has no attribute 'columns'

    • @Dataquestio
      @Dataquestio 2 years ago

      It looks like 'dataset' is a function for some reason. It should be a pandas DataFrame. Make sure you didn't accidentally assign a function to the `dataset` variable.

    • @turtle1897
      @turtle1897 1 year ago

      @Dataquestio I have that same issue. I have just started Dataquest and was using this as a follow-along project while I wasn't studying. I have some knowledge, but I'm not at this stage yet, just working towards familiarity
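A small sketch that reproduces the error in this thread under one assumed cause: a function named `dataset` exists where a DataFrame was expected. The function body and the toy columns are invented for illustration; the actual cause in a given notebook may differ.

```python
import pandas as pd

def dataset():
    """Hypothetical loader that happens to share the name `dataset`."""
    return pd.DataFrame({"Season": [2021], "WAR": [2.5], "Next_WAR": [3.0]})

# Bug: `dataset` refers to the function itself, which has no `.columns`.
try:
    dataset.columns
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Fix: make sure the name used for column selection holds a DataFrame.
df = dataset()
assert isinstance(df, pd.DataFrame)

removed_columns = ["Next_WAR", "Season"]
selected_columns = df.columns[~df.columns.isin(removed_columns)]
print(list(selected_columns))
```

Running `type(dataset)` in the notebook is a quick way to check what the name currently points to.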

  • @SuperNunera
    @SuperNunera 2 years ago

    Ty for sharing. Amazing content.