This was awesome! I’ve been wanting to scrape sports data in a loop like this for a while. Thanks!
Nice video on web scraping. Love it!!!
Thanks a lot for taking the time to make this video, man. I'm a huge sports fan who's been self-learning Python for the last 4 months or so, and I'm just getting into Jupyter, NumPy, and pandas, so these videos are a great resource for me. Why don't NBA stats sites want you scraping from them? I don't understand what the issue is: is it just that large numbers of requests put pressure on their servers? Will move on to part 2 now!
That's great to hear!! To answer your question: I think (and this is just my basic understanding) websites try to prevent scraping so that bots can't access all the data with malicious intent (e.g., selling the data). So the goal of adding headers is to appear less bot-like (by changing the user agent).
@alexsington Yeah, that makes sense from your end. Web scraping is still so new to me, but I've heard it referred to as 'ethically dubious', and outside of putting strain on the server with many simultaneous requests, I just can't understand why it would be an issue to take a copy of data they've made publicly available.
Do you plan on doing any similar videos for any other sports?
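A minimal sketch of the header idea from the reply above, assuming the requests library; the User-Agent string and the endpoint are examples, not necessarily what the video uses:

import requests

# Browser-like headers so the request looks less bot-like, as described above.
# Which headers (if any) a site actually requires is up to the site.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}

# Example stats.nba.com endpoint; substitute whatever URL you are actually scraping.
api_url = (
    "https://stats.nba.com/stats/leagueLeaders?LeagueID=00&PerMode=PerGame"
    "&Scope=S&Season=2021-22&SeasonType=Regular+Season&StatCategory=PTS"
)

r = requests.get(api_url, headers=headers, timeout=10)
print(r.status_code)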
Would the data that's locked out be available through the nba_api?
At 7:30 when you say that data used to be available but not anymore... is there really no way to get it? For example, if I want to pull a team's offensive and defensive ratings (which have the same problem as those shooting stats), is the only way to get them by scraping the HTML source?
No, you can definitely scrape that too; you just can't use this simple API method. I'd recommend trying something like Beautiful Soup.
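A minimal sketch of that Beautiful Soup route; the URL and table layout here are hypothetical, so the selectors would need to be adapted to the real page:

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical page that renders its ratings table as plain HTML.
url = "https://www.example.com/nba/team-ratings"
headers = {"User-Agent": "Mozilla/5.0"}  # browser-like header, as discussed above

html = requests.get(url, headers=headers, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Grab the first <table>; a real page will likely need a more specific selector.
table = soup.find("table")
if table is None:
    raise SystemExit("No <table> found in the raw HTML")

rows = []
for tr in table.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    if cells:
        rows.append(cells)

# First row as the header, remaining rows as data.
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df.head())

Note that pandas.read_html can parse simple HTML tables in one call, and that pages which build their tables with JavaScript won't expose the data in the raw HTML at all, in which case a browser-automation tool is a better fit.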
Excellent video content and beautifully scraped data.
Great video, really easy to follow along.
Can you look at FT differential vs. W/L and check for statistically significant correlations, comparing large markets to small markets?
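Not something from the video, but a rough sketch of what that analysis could look like with pandas and scipy, using made-up data and hypothetical column names (FT_DIFF, W_PCT, MARKET):

import pandas as pd
from scipy.stats import pearsonr

# Hypothetical team-level data: free-throw differential, win percentage, market size.
df = pd.DataFrame({
    "FT_DIFF": [2.1, -1.4, 0.3, 3.0, -2.2, 1.1],
    "W_PCT":   [0.62, 0.45, 0.51, 0.66, 0.40, 0.55],
    "MARKET":  ["large", "small", "large", "large", "small", "small"],
})

# Correlation (and p-value) overall and split by market size.
for label, group in [("all", df)] + list(df.groupby("MARKET")):
    r, p = pearsonr(group["FT_DIFF"], group["W_PCT"])
    print(f"{label}: r={r:.2f}, p={p:.3f}")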
So cool! Thanks for sharing
I can't access the API anymore with the same methods. Any idea if the NBA has locked down its basic data now too?
Great video, thanks for the info! I was trying to recreate this, but it seems like they've locked the page down quite a bit; any time I pull any of the URLs, it just loads continuously and never renders. Any suggestions/tips for pulling this data consistently?
having the same problem here!
Any luck with being able to pull the data?
Did you work this out?
Good stuff, Fight On bro!
Nice fella 💪💪
Is there a different technique to get team data? Whenever I try this technique for team data, the server connection gets aborted without a response.
If there is any other way, please do tell me.
Thank you for this tutorial
They have an option to share the data which allows you to download it... no need to scrape 🎉
What exactly did you do to solve the final error? I got the same one: I removed headers=headers, but I'm still getting the same error.
If you are trying to scrape a page where the API is not available, then that's the problem. If not, then, to be completely honest, I don't have a good answer for you. I tried running the code with and without headers. It just took a few tries before it ran all the way through without getting blocked, and I don't know why one try failed while another try succeeded.
I kept headers=headers, changed the np.random.uniform() delay to 1 second, and kept rerunning until it finished. (Sorry, my English isn't great; hopefully this helps.)
@ahmadkurniawan6984 Not working for me, bro. Can you help?
@nishchay89 That's the only thing I did, bro. Just keep doing it until it succeeds. Give me your email and I can send you my notebook if you want.
@nishchay89 Done, check your email.
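For anyone stuck on this thread: a rough sketch of the approach described above (keep the headers, pause about a second, and retry until a season succeeds). The URL is a placeholder and the resultSets/rowSet keys are an assumption about the payload shape, so adjust them to whatever your response actually contains.

import time
import requests

headers = {"User-Agent": "Mozilla/5.0"}          # keep headers=headers, per the thread
seasons = ["2012-13", "2013-14", "2014-15"]      # example seasons

all_rows = []
for season in seasons:
    # Placeholder URL: substitute the endpoint used in the notebook.
    api_url = f"https://stats.nba.com/stats/...?Season={season}"
    for attempt in range(5):                     # "keep rerunning until it's done"
        try:
            r = requests.get(api_url, headers=headers, timeout=10)
            data = r.json()                      # raises if the body isn't JSON (blocked/empty)
            all_rows.extend(data["resultSets"][0]["rowSet"])
            break
        except (requests.RequestException, ValueError, KeyError):
            time.sleep(1)                        # ~1 second pause instead of np.random.uniform()
    else:
        print(f"Gave up on {season} after 5 attempts")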
Awesome!
Where is the second part?
How did you get your data extracted into Excel? I combined all seasons from 2012-2021 and then ran "data = pd.read_excel('nba_player_data.xlsx')".
Were you able to scrape all the data?
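For the Excel question above, a minimal sketch of the round trip with pandas, assuming the openpyxl package is installed and using the same filename; df stands in for the combined DataFrame of scraped seasons:

import pandas as pd

# Stand-in for the combined DataFrame of all scraped seasons.
df = pd.DataFrame({"PLAYER": ["Player A", "Player B"], "PTS": [25.1, 18.4]})

# Write to Excel (needs openpyxl), then read it back.
df.to_excel("nba_player_data.xlsx", index=False)
data = pd.read_excel("nba_player_data.xlsx")
print(data.head())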
I keep running it over and over again but I’m getting a KeyError: ‘resultSet’ after it attempts to scrape the data
@jacksondavies8695 Me too, did you ever fix this?
@jacksondavies8695 I was getting the same error; it turns out I didn't enter the URL correctly in the for loop. Make sure it matches the video!
@NGSRumble Yo, I did the exact same thing but I still get the same KeyError.
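A guess at a debugging step for the KeyError above, not the video's code: that error usually means either a misspelled key or a response that isn't the expected JSON, so inspect the keys before indexing. The URL is a placeholder and 'resultSets' is an assumption about the key name.

import requests

headers = {"User-Agent": "Mozilla/5.0"}
api_url = "https://stats.nba.com/stats/..."  # placeholder: use the URL from the video

r = requests.get(api_url, headers=headers, timeout=10)
payload = r.json()

# See which keys actually came back before indexing into them.
print(payload.keys())

# .get() avoids the hard crash if the key is misspelled or missing.
result_sets = payload.get("resultSets")
if result_sets is None:
    print("No 'resultSets' key; check the spelling and the raw response:", r.text[:200])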
Guys, does anyone watching this actually have a winning sports betting model? Is it possible, or are the casinos already on it?
I am not able to scrape data. It crashes when fetching data for 2012-13 playoffs season. Can you help?
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\ProgramData\Anaconda3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\ProgramData\Anaconda3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Nishchay\AppData\Local\Temp\ipykernel_12448\3071084054.py", line 29, in <module>
    r = requests.get(url = api_url).json()
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Could you solve this?
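For context on that traceback: "Expecting value: line 1 column 1 (char 0)" means the response body was empty or not JSON at all, which usually points to a blocked or failed request rather than a code bug. A sketch of a more defensive version of that request line, with a placeholder URL and assumed headers:

import requests

headers = {"User-Agent": "Mozilla/5.0"}
api_url = "https://stats.nba.com/stats/..."  # placeholder: the 2012-13 playoffs URL being fetched

r = requests.get(api_url, headers=headers, timeout=10)

# Look at what actually came back before calling .json().
print("status:", r.status_code)
print("first 200 chars of body:", repr(r.text[:200]))

if r.ok and r.text.strip():
    data = r.json()
else:
    print("Blocked or empty response; retry later or adjust the headers.")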
How can you earn money with these stats?
What do you mean by that?
@alexsington Buying this data, or maybe using it for gambling after analyzing the data.