Scrape LIVE scores - No BeautifulSoup or Selenium NEEDED!

  • Published 2 Feb 2025

COMMENTS • 123

  • @abhijeetbonde8635 3 years ago +21

    I just learned this trick 2 days ago. One of my friends showed me this method, and I was wondering why nobody had uploaded a video on it. And here it comes. Please keep making these videos, they are really helpful.

  • @fatimaelmansouri9338 3 years ago +15

    This is probably the best video I've seen on APIs! This topic is so poorly covered on YouTube! Amazing content, thank you for this!

  • @Diego-ry6bo 3 years ago +12

    This is so helpful and educational John! Keep it up mate! Love your work.

  • @ianchirwa5816 26 days ago

    I watched this video some years back and it helped me a lot, now years later I was in need of the concepts taught here. Been looking all over the internet for this video!! Was afraid it got deleted 😅

  • @dadimoszhanzhad8440 3 years ago +1

    Bro... What! This is next-level scraping. Beyond the complexities of code, yet with all the features. Thank you very much! I love you!

  • @maximuscryptosx9424 2 years ago +12

    Wow. This is exactly what I was looking for. Simply brilliant. Thank you!

  • @StormWolf01 2 years ago +1

    Well, scraping data from the actual API server as opposed to the webpage itself is actually a great idea. Thanks for the vid.

  • @tubelessHuma 3 years ago +2

    You are right. The first step should be to check for an API to make our lives easier. Thanks John.💖

  • @wangdanny178 2 years ago +1

    OK, I think this video solved the problem I posted yesterday under another episode about hidden APIs. THANKS JOHN!

  • @wallstreetx5241 2 years ago +1

    😁 SUPER HELPFUL, one of the best coding videos I've ever watched! You've gained a sub for life!

  • @Swqtt 2 years ago +1

    Great video, it is a lot more useful to work with the API than with Selenium. I cut my time to download everything from 5 minutes to about 1. Thanks

  • @marioandresheviacavieres1923 2 years ago +1

    Thanks for the awesome tip, cheers from Seattle!!

  • @abhijitmondal7831 3 years ago +5

    Wow. That's amazing 🔥 I really like your work.

  • @matiascavalcante4698 1 year ago +1

    Saved a lot of trouble using this method, thanks!

  • @fernandaalves71 1 year ago

    Your work is amazing! Thanks for helping me a lot with these scraping practices!

  • @erichubbard5449 4 months ago +1

    Fantastic content as usual! 🎉

  • @JohnBillot 3 years ago +4

    Superb, clearly presented and explained. Thank you so much.

  • @boiboi1988 3 years ago +1

    Thanks for this tutorial John. Really appreciate what you are teaching here. It solved my web scraping problem. :)

  • @caiopjv 2 years ago +1

    So helpful! Much easier for what I was trying than BeautifulSoup.

  • @luisparada5443 2 years ago +1

    I hope I can buy you a beer sometime man. I appreciate this video for real. Thank you! +1 Follower

  • @hossamgamal8661 3 years ago +1

    Amazing video as always
    keep up the good work

  • @VictorVaughn1 11 months ago +1

    Awesome video! Do you have a video on what to do with all the information that you just scraped, examples of how to use it?

  • @andriuslopes6377 1 year ago +1

    Thank you very much !! I was having trouble extracting data from dynamic websites.

  • @wernerbrasil 2 years ago +1

    Excellent tutorial! Big fan of your videos

  • @scg565081 2 years ago +2

    Thanks for all the tutorials, John. As a newbie to web scraping and data science (never too old to learn at 58), I'm loving the intuitive, plain-English approach you take in your demonstrations. Having watched the 'Scraping News' video and now this one, I wonder how you could refine the script to find a site's search bar and submit a topic to it. For example, my favourite news feed site has a search bar that I use to narrow down my reading material, say 'Ukraine', and it fetches all the news from around the world on that topic. It's at that point that I'd like to scrape the news feeds, which is where your newsfeed script comes into its own. It would be great if you could make a video that handles the search step before the automated scraping. Thanks, and keep up the videos. Easily my favourite go-to learning resource.

    • @lethalbacon16 2 years ago +2

      Well, if you look at the network calls when you search for something, you should be able to track down the endpoint they use for searches. You should then be able to call that endpoint yourself and scrape the data that way.
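
The idea in the reply above could look roughly like the sketch below. It assumes the search box sends a GET request with the term in a query string; the endpoint URL and the parameter names `q` and `page` are made up for illustration, so swap in whatever the Network tab actually shows.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace it with the URL seen in the Network tab.
SEARCH_ENDPOINT = "https://example.com/api/search"


def build_search_url(term, page=1):
    """Build the search URL; 'q' and 'page' are assumed parameter names."""
    query = urllib.parse.urlencode({"q": term, "page": page})
    return f"{SEARCH_ENDPOINT}?{query}"


def fetch_search_results(term, page=1, timeout=10):
    """Call the endpoint and decode the JSON body (this makes a network request)."""
    request = urllib.request.Request(
        build_search_url(term, page),
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_search_url("Ukraine")` only builds the address, which is handy for checking it against the browser's request before sending anything.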

  • @craftsntech2500 3 years ago +1

    So helpful. Thanks for the concept shared freely

    • @JohnWatsonRooney 3 years ago +1

      Thanks glad you enjoyed it

    • @craftsntech2500 3 years ago

      @JohnWatsonRooney Really... You know, I spend lots of time doing this via Selenium in Python, but this just made my life much easier.

  • @abdul2651 1 year ago +1

    OMG, it's so useful!!!!! Got subbed. Thanks!

  • @nikolamilicevic8665 2 years ago

    This is extremely useful, thanks for the tutorial!

  • @SuperPoker1980 2 months ago

    Fantastic, this is the video I was looking for. I was wondering how I could collect past data for matches already played by entering the date as input. Thx

  • @AidarIsayev 2 years ago +1

    Thank you! This is really a game changer. )

  • @grahamlindsay9798 2 years ago +1

    That is really useful, thank you for that.

  • @adnan-hz7ed 1 year ago +1

    Can I access the "Statistics" this way too? For example, if I wanted to write code that checks whether the home team has 4 shots on target and the away team has 0, and other conditions like that.

    • @JohnWatsonRooney 1 year ago

      Yes, I think you could. Follow the same process, but on the page where the stats load, and find the API.

    • @adnan-hz7ed 1 year ago

      @JohnWatsonRooney Hmm, thanks, I will keep trying. It seems a bit difficult: some live games have live stats when you click on them, but I couldn't find any such keys in the JSON file. They were all false, although some should have been true.

  • @vignesh_waran 2 years ago +2

    Thank you so much for this video

  • @kamaleshpramanik7645 2 years ago +1

    Thank you very much Sir.. learning so much.

  • @ThuyTran-bc2mt 3 years ago +7

    What about making a Scrapy Splash tutorial? I hope you will make one.

    • @JohnWatsonRooney 3 years ago +2

      I have one on my channel already, but will be doing more as I do more scrapy videos

    • @ThuyTran-bc2mt 3 years ago +1

      @JohnWatsonRooney It's so great to hear that. I have learned a lot from your videos.

  • @pkpkpk_9811 1 year ago

    This is a perfect, simple video. However, if the API being called changes, how can you parse it, since the old one returns old data?
    Thanks in advance.

  • @i701Dev 3 years ago +1

    Thanks for this video!

  • @playtune9217 1 year ago +1

    Instead of new API calls, can I get data from the browser's network tabs when the API returns data on the client's browser?

    • @JohnWatsonRooney 1 year ago

      If I understand you correctly, yes you can. If you use Playwright or Selenium you can access the network events and have it return the JSON data to you each time it loads a page. I use this method for some sites, depending on what I am doing and how they respond.

  • @ppena120 2 years ago +1

    Super helpful. Thanks

  • @Ionut.C 10 months ago

    Hello, it works great. What should I do if I want the odds before the matches start? Let's say that every morning I want to copy the odds. I notice that each match has a numeric event identifier. How do I find this identifier so that I can copy the odds, and the next day enter the final score next to each event? Thank you and all the best!

  • @hobo_1616 2 years ago +1

    Thank you so much!

  • @bisratgetachew8373 3 years ago +1

    Great Video

  • @niccolotomei316 2 years ago +1

    Thank you!

  • @justinjchambers 2 years ago

    Thanks so much for this tutorial. I was wondering if there is a workaround when a site isn't returning any such XHR data, regardless of which links and buttons you click to try to trigger a response?

  • @lordmo3416 3 years ago +4

    Your structuring is amazing.
    Since the website calls data from the API every 10 seconds or so, why did I get banned when I automated an interval to request updated data from the API?
    Is there a workaround not to get banned?
    Like, what other criteria does the website use to recognize a bot?

    • @lordmo3416 3 years ago

      @Loja Outweb how did you fix yours?

    • @leonardoplaza7677 3 years ago

      @Loja Outweb He mentioned the website probably uses Cloudflare to protect against DDoS attacks. That's why they will block your IP if you make constant requests. Try rotating IPs like he mentioned, or just lower the request rate by polling once a minute.

    • @seankw2880 2 years ago

      @Parth Kulkarni he has another video on that ua-cam.com/video/vJwcW2gCCE4/v-deo.html
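
The "lower the request rate" advice in this thread could be sketched as below. This is only an illustration: the 60-second interval, the doubling back-off, and the 10-minute cap are assumed values, and `fetch` stands for whatever function you already use to call the API. Slowing down reduces the load you put on the site, but it does not guarantee you won't be blocked.

```python
import time


def poll(fetch, interval_seconds=60, max_polls=5, sleep=time.sleep):
    """Call fetch() up to max_polls times, pausing between calls.

    When fetch() raises, the pause doubles (capped at 10 minutes) before
    the next attempt; after a success it returns to the normal interval.
    """
    results = []
    wait = interval_seconds
    for attempt in range(max_polls):
        try:
            results.append(fetch())
            wait = interval_seconds  # success: back to the normal pace
        except Exception:
            wait = min(wait * 2, 600)  # failure: back off before retrying
        if attempt < max_polls - 1:
            sleep(wait)
    return results
```

Passing `sleep` as a parameter keeps the loop easy to try out without actually waiting.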

  • @DjElio100 1 year ago +1

    Thanks man

  • @A.R.-rs 1 year ago

    Can I apply this method to the Flashscore website? I guess that site doesn't have an API URL.

  • @matheosmattsson2811 2 years ago +1

    Could you do a video on something similar but where the API wants a key? I copied the request like you did into insomnia, but I cannot replicate it in there. The response says "no API key provided". I am unable to figure out how the client code in the browser embeds the api key without the request on the network tab knowing about it... The site I am trying to scrape seems to use Vue, if it makes any difference. I tried to inspect the "initiator" javascript file but obviously it is minified and unreadable.

    • @JohnWatsonRooney 2 years ago

      I usually find adding the full headers works, we are then telling the backend we are a browser and we need the information - I'd have to check the site example you mentioned though to see. You can email it to me if you like, email on my YT page.

    • @matheosmattsson2811 2 years ago

      @JohnWatsonRooney Yeah, I thought I had left something out when I tried it a couple of weeks ago. I then saw your video and figured I would give it another shot by copying everything "automatically" (Copy -> cURL command), but it did not help (earlier I built the request myself "from scratch"). I will email you the site and details. Thanks!

    • @XiagraBalls 2 years ago

      @matheosmattsson2811 This method will only work for public APIs, where private API keys aren't required. Usually you encrypt your key details into a hash, send it over, and it's decrypted by the server, where your key is extracted. This means that all an anonymous user would see in the headers from the Network tab is the encrypted hash, and you can't just reuse an existing hash as it will also include a timestamp.
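
For the "add the full headers" suggestion earlier in this thread, a minimal sketch follows. The URL, the header values, and the `X-Api-Key` name in the usage note are placeholders; copy the real ones from the request shown in the Network tab. As the reply above notes, this will not help when the key is a time-limited hash.

```python
import urllib.request

# Hypothetical values: copy the real ones from the browser's request.
API_URL = "https://example.com/api/data"
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept": "application/json",
    "Referer": "https://example.com/",
}


def build_request(url=API_URL, extra_headers=None):
    """Return a Request carrying browser-like headers plus any extras."""
    headers = dict(BROWSER_HEADERS)
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers)
```

A site-specific header can be layered on top, for example `build_request(extra_headers={"X-Api-Key": "..."})`, and the result passed to `urllib.request.urlopen`.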

  • @abhijeetbonde8635 3 years ago +1

    Can you please try to make a video on how to scrape websites that use Cloudflare protection?

  • @killian.1603 2 years ago

    The video is really well explained, thanks for that. However, I'm trying to add a condition for tennis games. How should I add the upcoming set "period" from this API in Python?

  • @stephensunday7653 19 days ago

    Hi, cool video. How can I get the correct score market on a sports betting site, so that I can print the team names with their corresponding odds, e.g. Team A vs Team B, 2:1 at 9.6?

  • @void-qy4ov 3 years ago +1

    For protected API, do you think it is possible to make the first call with selenium, grab a token, and from this point use it in calls toward API using requests ?

    • @JohnWatsonRooney 3 years ago +1

      Yes I think so, you can take the cookie from your selenium request and reuse it in other parts of the code

    • @void-qy4ov 3 years ago

      @JohnWatsonRooney It seems that it is easier with selenium-wire, since you actually get access to all requests and responses, including the headers.

    • @kissoffire76 3 years ago

      @JohnWatsonRooney Did you mean mimicking a login with Selenium in the first place, so that this secret part of the header (call it a token or a cookie or whatever the site owner calls it) becomes accessible? I am working out a strategy for scraping API-protected reviews stored as JSON, sliced by company name, for my master's thesis. However, without the Bearer Authorization code (which you can ONLY SEE by analyzing a JSON GET request in Postman, and ONLY when logged in) it returns only JSON page 0 (regardless of how many pages there might be per company) with only 2 reviews (out of 10 per JSON page when logged in). So if I put all the code from Postman, including the Bearer code, into the header of my web scraping script and skip the Selenium login, I am afraid I would miss some part of the server communication protocol and be blocked or banned (robots.txt doesn't state that anything is forbidden, though). What do you advise?
      Btw, you make awesome tutorials, dude! I am literally living in them these days!
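
Reusing a Selenium session's cookies in plain HTTP calls, as discussed in this thread, might look like the sketch below. It relies on Selenium's `driver.get_cookies()` returning a list of dicts with `name` and `value` keys; the helper names and the optional Bearer token argument are assumptions for illustration, and whether a given site accepts the reused cookie or token is something to check case by case.

```python
def cookies_to_header(selenium_cookies):
    """Join cookies from driver.get_cookies() into a Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def session_headers(selenium_cookies, bearer_token=None):
    """Build headers for later API calls from a logged-in browser session."""
    headers = {
        "Accept": "application/json",
        "Cookie": cookies_to_header(selenium_cookies),
    }
    if bearer_token:
        # Only needed when the site sends an Authorization header.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers
```

The resulting dict can be passed as `headers=` to whichever HTTP client makes the follow-up API requests.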

  • @munyaradzijeche7365 1 year ago

    How do I prefix team names with their log position on soccer upcoming fixture? How do I add Points per game PPG column? Please assist

  • @Analyse_US 3 years ago +1

    Will this approach work with dynamic web pages? Or is requests-html still the best approach for dynamic pages?

    • @JohnWatsonRooney 3 years ago +1

      Yes it will - it cuts out the need to get the data from the page, I’d recommend checking this way out first and see if it can work for you. If it’s not available then rendering the page is the next option

  • @h.screation2817 3 years ago +2

    Sir, which theme do you use in VS Code?

  • @black_platypus 2 years ago +1

    Why have I wasted so much time manually reading out HTML results? 🤯
    I guess I feared the XHR requests might be too inscrutable or there might be too many hurdles, like cookie management, request tokens/nonces etc.
    How often do you run into trouble with those?

    • @JohnWatsonRooney 2 years ago +1

      It’s often down to the individual site, but it’s usually just a new cookie needed. Sometimes parsing the HTML is the best way though! Explore the site first and then decide your approach

    • @black_platypus 2 years ago

      @JohnWatsonRooney I will!
      Thank you for being so helpful in the name of empowering the users again! ❤

  • @brunogarcia2336 2 years ago +2

    John! Amazing video! I am just starting out with coding and it was nice to learn so much from you. Question: how can I set up a filter for live games? For example, show only the live games with a 0x0 score, or where the away team has scored once? Is it possible to filter the live games with parameters? It would be amazing to learn this from you as well. Thank you for your effort!

    • @XiagraBalls 2 years ago

      I think the API would simply return no-score draws as just that - 0 : 0
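
Filtering the live games after the JSON has been fetched, as asked above, could be sketched like this. The sample payload and its key names (`events`, `homeScore`, `awayScore`, `current`) are made up to resemble a live-scores response; the real API may name things differently.

```python
# Hypothetical sample shaped like a live-scores response; real keys may differ.
SAMPLE = {
    "events": [
        {"homeTeam": {"name": "Team A"}, "awayTeam": {"name": "Team B"},
         "homeScore": {"current": 0}, "awayScore": {"current": 0}},
        {"homeTeam": {"name": "Team C"}, "awayTeam": {"name": "Team D"},
         "homeScore": {"current": 2}, "awayScore": {"current": 1}},
        {"homeTeam": {"name": "Team E"}, "awayTeam": {"name": "Team F"},
         "homeScore": {"current": 1}, "awayScore": {"current": 0}},
    ]
}


def filter_games(payload, condition):
    """Return (home, away, home_goals, away_goals) where condition holds."""
    matches = []
    for event in payload.get("events", []):
        home_goals = event.get("homeScore", {}).get("current", 0)
        away_goals = event.get("awayScore", {}).get("current", 0)
        if condition(home_goals, away_goals):
            matches.append((event["homeTeam"]["name"],
                            event["awayTeam"]["name"],
                            home_goals, away_goals))
    return matches


goalless = filter_games(SAMPLE, lambda home, away: home == 0 and away == 0)
away_scored_once = filter_games(SAMPLE, lambda home, away: away == 1)
```

Keeping the condition as a separate function means new filters can be added without touching the loop.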

  • @gisleberge4363 3 years ago

    A few questions. If you use this API endpoint strategy as suggested here, aren't you creating some kind of "imbalance" in the requests that the server could easily detect as automated activity rather than a real person? Is there anything one needs to consider to avoid being blocked when scraping the API like this (apart from the obvious, such as not doing it too fast)? Also, I believe Captcha is not an issue here (it can be a hassle sometimes)?

    • @JohnWatsonRooney 3 years ago +2

      Yes, you can absolutely still be detected and blocked. If scraping lots of data, proxies are a must. With most sites, doing it this way you need the cookie generated from your browser - this cookie data is transferred when we used Insomnia and that allows us access

  • @goncalosilva4974 1 year ago

    How could I get the current minutes?

  • @mgmyo7066 2 years ago +1

    Is that possible with Node.js, sir?

    • @JohnWatsonRooney 2 years ago

      Yes of course, I don’t know Node or JavaScript that well though I’m afraid!

  • @football-scalper 7 months ago

    I can't determine the game minute - is there a solution?

  • @nnld218 2 years ago

    Hi sir, is there any way to scrape a live football video stream?

  • @sharankrishna9815 3 years ago

    Hey!! Thanks for this! It's very informative! :))
    I have a question regarding scraping, could you help me with it?
    Question:
    I have a list of 100 products (X0, X1, X2, ..., X99) along with their prices (P0, P1, P2, ..., P99).
    Is it possible to scrape the Google Shopping price data for all 100 products? And if the price of an individual product on Google Shopping, say product X0's, is greater than the given price (P0), record that as the new price in a new column?
    Your input would be much appreciated!
    Thank you!! :)

  • @mth6311 1 year ago

    So I'm trying to create a live events feed as a personal project for Premier League games: goals, cards, assists, etc. Would it be possible to use this method and not get banned somehow?
    What if I made 6 different scripts to scrape 6 different score websites? Then I'd only be sending 1 request per minute to each site.
    Could this work?

  • @RicardoMilbrath 2 years ago

    Is it possible to get statistics in real time? Bad English (Brazilian boy) :)

  • @plavali_znaem 2 years ago

    I was trying to scrape Internet speeds from Speedtest with this method, but got only 2 entries in the "Name" column under the "Fetch/XHR" tab in the inspector. The "Response" contains only a few characters: "1d" for the first entry and "1gfi" for the other. Is anyone knowledgeable enough to help me find a way around this? Or does the Speedtest webpage not use an API and tables in the first place? (There are speed tests I would like to scrape, and the speeds themselves are plotted on graph curves, so I was thinking the graphs are auto-generated from some table.)

  • @Cubear99 2 years ago

    Can you do a new video about Amazon for 2022? Amazon has changed. I tried it but it does not work anymore, it gives me a 504. I tried in Java and it does give me all the info.

  • @vs6x3 1 month ago

    I learned python too!

  • @vuducanh001 3 months ago

    Could you walk everyone through a step-by-step data analysis project from scratch? Thank you in advance!

  • @coalitea 1 year ago

    This is exactly what I was looking for to scrape off live data on bitcoin etc. But sir, is this illegal?

  • @kuniling 3 years ago +2

    I find your web scraping videos the most useful and user-friendly on YouTube. I'm just wondering if there is a way to scrape an HTML file from the local hard drive for practice, since I spend some time travelling with no internet connection. In addition, I think it would be nice to avoid overloading a server when practising.

    • @JohnWatsonRooney 3 years ago +1

      Sure, save the html to file and open it in Python - it will load into bs4 for scraping practise on the go

    • @kuniling 3 years ago

      @JohnWatsonRooney Wonderful, thank you so much.
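
Practising on a saved page, as suggested in this thread, could look like the sketch below. To keep it runnable without installing anything it uses the standard library's `html.parser` rather than bs4, which is a substitution and not what the reply describes; with bs4 installed, the same `html` string could be passed to `BeautifulSoup(html, "html.parser")` instead. The class and function names are illustrative.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleAndLinks(HTMLParser):
    """Collect the page title and every link href."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_html(html):
    """Return (title, links) extracted from an HTML string."""
    parser = TitleAndLinks()
    parser.feed(html)
    return parser.title.strip(), parser.links


def parse_saved_page(path):
    """Read a page saved from the browser and parse it offline."""
    return parse_html(Path(path).read_text(encoding="utf-8"))
```

Saving a page with the browser's "Save page as" and pointing `parse_saved_page` at the file means no server is contacted while experimenting.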

  • @uttamsharma6358 3 years ago +3

    Will you start a discord server?

    • @JohnWatsonRooney 3 years ago +2

      I’ve thought about it, I will at some point and I’ll post it up so you guys know. Just not sure when yet!

  • @obiwanfisher537 3 months ago

    Man, I have been scraping wrong for so long.

  • @kuhicop 1 year ago

    for bet365 any ideas? :(

  • @TheDzideek1 2 years ago +1

    @John Watson Rooney I got banned by SofaScore "The system identified you as a scraper and banned the IP. To use the data on the website contact the owner and request permission"

    • @JohnWatsonRooney 2 years ago

      unfortunately that's a part of it, you'll need to use proxies ideally to continue - it kinda turns into an arms race

  • @CrazyFanaticMan 2 years ago +1

    Cloudflare didn't even give me a chance, blocked my IP instantly 😂😂

    • @JohnWatsonRooney 2 years ago

      Ah yeah that’s a real possibility, I use my vpn for testing usually but even then a lot of those IPs are blocked already so it’s much harder.

  • @wangdanny178 2 years ago

    I found another problem. When I run scoreslive.py, it raises a JSONDecodeError exception. Could you please help me with that? Thanks in advance.
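
A JSONDecodeError like the one described above usually means the response body was not JSON at all, for example an HTML block page or an empty reply. One way to see what actually came back is sketched below; `decode_json` is an illustrative helper, not part of the video's script, and it does not tell you why the server sent something else.

```python
import json


def decode_json(body_text):
    """Return (data, None) on success, or (None, reason) if the body is not JSON."""
    try:
        return json.loads(body_text), None
    except json.JSONDecodeError as error:
        # Show the start of the body so an HTML error page is easy to spot.
        preview = body_text[:80].replace("\n", " ")
        return None, f"not JSON ({error.msg}); body starts with: {preview!r}"
```

Printing the returned reason, together with the HTTP status code of the response, is often enough to tell a block page from a changed endpoint.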