Don't Start Web Scraping without Doing These First

  • Published 18 Oct 2024

COMMENTS • 61

  • @shoebshaikh6310
    @shoebshaikh6310 3 years ago +21

    By far the best channel on YouTube for web scraping ❤️

  • @tubelessHuma
    @tubelessHuma 3 years ago +7

    My favorite tip: Parse Locally 👍🌹

  • @_domdge_687
    @_domdge_687 2 months ago +1

    Been following you for a week and I've learned so many tips. Thank you!

  • @DIY-Investors
    @DIY-Investors 3 years ago +10

    John, that was a really helpful (top-down) overview. As a visual learner, I almost need a decision tree diagram to take me down the most appropriate route, and so to the right set of tools/routines to use. It’s also helpful to have a video in the 7-10 minute range that focuses on the particular topic in hand. 10 out of 10 from me! 👍

  • @khaliqsalawou3092
    @khaliqsalawou3092 2 years ago +3

    Thank you, John, the tips were really helpful, and I would love it if you could share more of these in the future.

  • @balazseduard4016
    @balazseduard4016 3 years ago +5

    You are the best, man. Much respect, keep up the good work. I learn a ton from you as a beginner.

  • @stevefox42
    @stevefox42 3 years ago +3

    Man! I'm having so much fun learning from your videos.

  • @amonged911
    @amonged911 3 years ago +5

    I think we need a video where you talk about all the challenges we will face when scraping, like IP blocking or problems caused by sending too many requests.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +3

      Yes, good idea. I’ve been thinking about doing a video like that.

  • @theinstigatorr
    @theinstigatorr 3 years ago +3

    Thank you, I just completed my first Scrapy project today.

  • @rtxmax8223
    @rtxmax8223 3 years ago +1

    Your channel is too good for us scrapers!!!

  • @TheJdB21
    @TheJdB21 3 years ago +3

    When building my scraper, I love to do it in a Jupyter notebook first so that I can separate the request and parse parts of the program.

    • @JohnWatsonRooney
      @JohnWatsonRooney 3 years ago +1

      Yes, that's a great way too. I personally never got into notebooks but I certainly see the appeal.
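
Separating the request step from the parse step, as described in this thread, can be sketched without a notebook as well. The version below is only an illustration under stated assumptions: it uses the Python standard library instead of requests or bs4 so it runs without installs, and the names `fetch_once`, `parse_title`, and the User-Agent value are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import Request, urlopen


def fetch_once(url: str, cache_file: Path) -> str:
    """Download url only if cache_file does not exist yet, then read from disk."""
    if not cache_file.exists():
        # Hypothetical User-Agent value; some sites reject the urllib default.
        request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(request, timeout=30) as response:
            cache_file.write_bytes(response.read())
    return cache_file.read_text(encoding="utf-8", errors="replace")


class TitleParser(HTMLParser):
    """Collects the text found inside <title> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title(html: str) -> str:
    """Parse step: works on a string, so it never touches the network."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

Once the page is saved, the parse function can be rerun as often as needed while the selectors are being worked out, and the site only sees one request.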

  • @nurlansalkinbayev3890
    @nurlansalkinbayev3890 3 years ago +1

    Hello John. Thanks for your tips.

  • @higaj
    @higaj 2 years ago +1

    Thank you for the great advice.

  • @bn_ln
    @bn_ln 2 years ago

    Thanks for the great content, your channel is an excellent learning resource. May I ask for a starting suggestion for a project that involves authentication and downloading CSV and Excel files?

  • @l0remipsum991
    @l0remipsum991 3 years ago +1

    Thanks for the tips!

  • @Lahmeinthehouse
    @Lahmeinthehouse 1 year ago

    Nice video! What do you use for screen recording?

  • @RenatoEsquarcit
    @RenatoEsquarcit 3 years ago +1

    Top content as usual

  • @tnssajivasudevan1601
    @tnssajivasudevan1601 3 years ago +1

    Great video Sir.

  • @techmumus6780
    @techmumus6780 3 years ago +1

    Great video! Thanks!!

  • @RonWaller
    @RonWaller 1 year ago +1

    Thanks John, I have 2 questions. First, how do you download the HTML with requests? I tried looking it up and didn't find the solution. Second, looking at the source, what are we supposed to be seeing? I have done that but am not sure what I am looking for. Thanks.
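
One possible reading of the second question: the point of looking at the source is to check whether the values you want are present in the raw HTML at all. With the requests library, the raw HTML is the string returned by `requests.get(url).text`. The helper below is a hypothetical sketch (the name `find_in_source` and the sample page are invented for illustration) that works on any such string.

```python
def find_in_source(html: str, expected_values: list[str]) -> dict[str, bool]:
    """Report, for each expected value, whether it appears in the raw HTML."""
    lowered = html.lower()
    return {value: value.lower() in lowered for value in expected_values}


# Hypothetical page source: the product name is in the HTML, the price is not,
# which would suggest the price is filled in later by JavaScript.
sample_html = "<html><body><h1>Blue Widget</h1><div id='price'></div></body></html>"
report = find_in_source(sample_html, ["Blue Widget", "19.99"])
```

A value that shows up in the browser but not in the downloaded source usually means the page loads it separately, so a plain request plus an HTML parser would not see it.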

  • @daniel76900
    @daniel76900 3 years ago +1

    Parsing locally... man... that was it!!!

  • @alikorloo8425
    @alikorloo8425 1 year ago +1

    It helped, mate. What lib do you recommend for parsing lxml/html? And of course for async requests.get (only) and requests.post (rarely). Minimal libs, just to get the work done. In one of your vids you talked about selectolax, and requests-html in this one. I only need the two functionalities I mentioned above (parsing, requests). Much appreciated. 🙏🏼

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      Thanks, my go-to now is httpx for requests and selectolax for parsing.

  • @Analyse_US
    @Analyse_US 3 years ago

    Gold! Great channel.

  • @BeSharpInCSharp
    @BeSharpInCSharp 3 years ago

    Wonderful video. Do you have any on decision trees?

  • @ahmedgamalelkattan2231
    @ahmedgamalelkattan2231 3 years ago

    We urgently need a video about scraping TripAdvisor using Selenium, please 😀

  • @drac.96
    @drac.96 2 years ago +1

    How would you recommend dealing with iframes? Any tips to extract data from those easily?

    • @JohnWatsonRooney
      @JohnWatsonRooney 2 years ago

      From previous experience, you need to find the actual URL that the iframe is loading and use that; you can usually find it in the source.

    • @drac.96
      @drac.96 2 years ago

      @JohnWatsonRooney Oh okay, so we have to visit the URL to get the content of the iframe, sounds easy enough. Really appreciate the quick reply! I like your videos, they're very informative.
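
The approach in this thread (find the iframe's URL in the page source, then request that URL instead) could be sketched as below. This is a minimal standard-library illustration, not the method used in the video; `iframe_urls` and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcParser(HTMLParser):
    """Collects the src of every <iframe>, resolved against the page URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                # Relative src values are turned into absolute URLs.
                self.sources.append(urljoin(self.base_url, src))


def iframe_urls(html: str, base_url: str) -> list[str]:
    """Return the absolute URLs that the page's iframes load."""
    parser = IframeSrcParser(base_url)
    parser.feed(html)
    return parser.sources
```

Each returned URL can then be requested and parsed like any other page. Iframes without a `src` attribute are skipped here, since there is nothing to request for them.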

  • @spicer41282
    @spicer41282 3 years ago

    Hey John,
    Just recently sub'd...
    These are great tips!
    How about a separate vid for each one?
    Looking over your shoulder,
    The 1st one:
    What will You be looking for? Keeping an eye out for?
    Listening to your train of thought - while you're going through the motion/ process would be awesome!
    Hope you consider this request.

  • @chiamaka2885
    @chiamaka2885 3 years ago

    Wonderful videos you have. How can I select the columns I want to scrape? Maybe the information I need is in columns 1, 2 and 4. How do I do that? Thank you
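
One way the column question might be approached, assuming the data sits in a plain HTML table: collect every row's cells, then keep only the wanted positions. The sketch below uses the standard library rather than bs4, and `select_columns` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows = []
        self.current_row = None
        self.current_cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.current_row = []
        elif tag in ("td", "th") and self.current_row is not None:
            self.current_cell = []

    def handle_data(self, data):
        if self.current_cell is not None:
            self.current_cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.current_cell is not None:
            self.current_row.append("".join(self.current_cell).strip())
            self.current_cell = None
        elif tag == "tr" and self.current_row is not None:
            self.rows.append(self.current_row)
            self.current_row = None
            self.current_cell = None


def select_columns(html: str, wanted: list[int]) -> list[list[str]]:
    """Return each row reduced to the 1-based column numbers in wanted."""
    parser = TableParser()
    parser.feed(html)
    return [
        [row[n - 1] for n in wanted if n <= len(row)]
        for row in parser.rows
    ]
```

Calling `select_columns(html, [1, 2, 4])` keeps the first, second and fourth cell of each row; rows that are shorter than a requested column simply omit it.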

  • @higheringai68
    @higheringai68 3 years ago

    Thanks.

  • @codetechpro
    @codetechpro 1 year ago +1

    Hey John, can you make a short crash course on PhantomJS?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      I’m afraid my JS skills aren’t that good, but I could look into it.

  • @jorgev4656
    @jorgev4656 3 years ago

    Hello John. I would like you to make a video on scraping LinkedIn without Selenium, for job searches. Thanks.

  • @eziola
    @eziola 5 months ago

    I'm starting to see #shadow-root elements that I don't know how to get into. Thoughts on these?

  • @gwulfwud
    @gwulfwud 3 years ago

    Hey man, I have an e-commerce site I'm trying to scrape, and I found that one section of the page I'm trying to get makes a POST call to an API, and it's paginated. With that said, would it be better to just call the API directly for that part instead of scraping it off the page? Follow-up: should I still use Scrapy, or combine it with bs4? One to load and scrape the page and the other just for the POST API call.
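
If that section's data really does come from a paginated POST endpoint, looping over the pages directly might look roughly like this. Everything specific here is an assumption: the `"page"` request field, the `"items"` response field, and the helper names are hypothetical and would need to be replaced with whatever the browser's network tab shows for the real request.

```python
import json
from urllib.request import Request, urlopen


def post_json(url, payload):
    """POST payload as JSON and decode the JSON response (standard library only)."""
    body = json.dumps(payload).encode("utf-8")
    request = Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all_pages(url, post=post_json, max_pages=100):
    """Request page 1, 2, 3... until the API returns an empty "items" list.

    The "page" and "items" field names are hypothetical placeholders.
    """
    items = []
    for page in range(1, max_pages + 1):
        data = post(url, {"page": page})
        batch = data.get("items", [])
        if not batch:
            break
        items.extend(batch)
    return items
```

Passing the `post` function in as a parameter keeps the paging loop testable without a network connection, and `max_pages` guards against an endpoint that never returns an empty page.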

  • @TiaDzn
    @TiaDzn 3 years ago +1

    bell gang!

  • @Automatic-show
    @Automatic-show 1 year ago

    Tnx

  • @surfcow
    @surfcow 3 years ago

    Valuable advice from 50,000 ft, not the usual 500 ft.
    Don't just start coding. Stop, think, design, look harder.
    Do you really understand the specific details of the problem, or are you guessing?

  • @ugwuanyiarinze5626
    @ugwuanyiarinze5626 3 years ago

    I'm looking for a marketplace where people hire scrapers.

  • @sujatapatil9152
    @sujatapatil9152 3 years ago

    Hi John, can you please help me scrape the reviews from the Slickdeals site for all the sublinks of a product? I have tried but failed to do it... please help me.

  • @nimishabhide2950
    @nimishabhide2950 3 years ago

    Why can’t I scrape most Amazon sites?

  • @ankeet7x
    @ankeet7x 3 years ago +1

    bell gang! (2)

  • @dnyaneshctech7409
    @dnyaneshctech7409 3 years ago

    Scrape location-wise loaded content... please.

  • @ALANAMUL
    @ALANAMUL 3 years ago

    How do you scrape a site that has a "Load more" or "Show more" button? Please show us an example.

    • @harigovind6706
      @harigovind6706 3 years ago +3

      The show more button will probably have an href on it; you can send a request to that URL.

    • @ALANAMUL
      @ALANAMUL 3 years ago

      @harigovind6706 thanks
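
The suggestion above, sketched under the assumption that the button is an ordinary link with an `href` (many "load more" buttons instead trigger a JavaScript request, in which case this finds nothing). `find_more_url` and the default labels are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkTextParser(HTMLParser):
    """Records (href, visible text) for every <a> tag that has an href."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_more_url(html, base_url, labels=("load more", "show more")):
    """Return the absolute URL behind the first matching link, or None."""
    parser = LinkTextParser()
    parser.feed(html)
    for href, text in parser.links:
        if text.lower() in labels:
            return urljoin(base_url, href)
    return None
```

Requesting the returned URL and repeating the search on each new page, until `None` comes back, would walk through all the batches.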

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 2 years ago

    Bro, if you're yanking 500k files, saving them all in GitHub is not ideal.

  • @user-zj8id7kc1r
    @user-zj8id7kc1r 1 year ago +1

    Nice video. I use bs4 because a lot of your videos use bs4 and I try to adapt your examples to my projects. Could you do a future video with more complex selectors please :) because I have a lot of problems adapting to something like that lol.