Web Scraping with Python - Start HERE

  • Published 9 Feb 2025

COMMENTS • 94

  • @ianrickey208
    @ianrickey208 1 year ago +22

    I would love to hear you present a real world web crawler design, complete with IP proxies, horizontal scaling, rotating user-agents, anti-bot detection...yadda yadda yadda. I have no doubt this is your bread and butter, but hearing about complexity considerations and tradeoffs would be *very* informative to us all. Just a thought.
    Thanks for everything John!

  • @Noname-ok4tf
    @Noname-ok4tf 5 months ago +1

    Wow this is the best web scraping tutorial I have found on youtube. I appreciate the effort made to make the example as real world as possible, and that you clearly demonstrate how to handle errors.

  • @cosimomastropietro7801
    @cosimomastropietro7801 1 year ago +6

    I got into web scraping about 2 weeks ago, and you are the one I've learned the most from... I'm so excited for this series, thank you man

  • @emphieishere
    @emphieishere 1 year ago +2

    My friend! Thank you for covering this topic in such an understandable and straight-to-the-point manner, it was a pleasure to watch your video

  • @poemPulsetv
    @poemPulsetv 11 months ago +1

    Thank you for sharing this comprehensive tutorial on web scraping with Python! This video is a great starting point for beginners like me who are interested in learning about web scraping techniques and tools.
    I appreciate how you broke down the process step-by-step, covering everything from setting up the environment to extracting data from websites. The explanations were clear, and the examples provided valuable insights into various Python libraries and their functionalities.
    The practical demonstrations helped me understand how to apply the concepts learned in real-world scenarios. I particularly liked the section on handling different types of data structures and navigating through HTML elements efficiently.
    Overall, this video has equipped me with the knowledge and confidence to explore web scraping further. Looking forward to diving deeper into this fascinating topic with your guidance. Keep up the excellent work!

  • @UmerFarooq_697
    @UmerFarooq_697 1 month ago

    Dear, it's a superb method: less rush and more results. Your work is really magical. I have stuck with your channel for scraping.
    It's a hard job, but I expect more advanced tech from your end.
    Bundle of thanks.

  • @mitchconnor8764
    @mitchconnor8764 1 year ago +5

    Thanks for this, looking forward to the rest of the series!

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +5

      I'm going to release part 2 tomorrow! It's ready to go

    • @GhazKhan99
      @GhazKhan99 1 year ago +2

      @JohnWatsonRooney stoked.. 🤟

  • @zedzpan
    @zedzpan 7 months ago +2

    Thank you for this. Learnt so much. The try/except in the function helped a lot as well.
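
A minimal sketch of the kind of try/except helper mentioned above. It assumes a parser whose nodes expose `css_first(selector)` (returning `None` when nothing matches) and `.text()`, as selectolax does. The name `extract_text` is illustrative and not taken from the video.

```python
from typing import Any, Optional


def extract_text(node: Any, selector: str) -> Optional[str]:
    """Return the stripped text of the first match for selector, or None."""
    try:
        # css_first() is assumed to return None when nothing matches,
        # so calling .text() on that result raises AttributeError.
        return node.css_first(selector).text().strip()
    except AttributeError:
        # A missing element becomes None instead of crashing the whole scrape.
        return None
```

With a helper like this, a product card that lacks a price or rating yields `None` for that field and the loop over the remaining products carries on.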

  • @obiwanfisher537
    @obiwanfisher537 3 months ago +1

    I have been doing web scraping for a while now but I like your modern approach. I think most scraping teachers just teach you outdated bs they never used, with handpicked and obscure targets, because they are tech influencers in the coding niche, not programmers who do tech videos like you.
    It's so important to find a good instructor and teacher.
    This video was a little basic and old school, they're the fundamentals after all, but I was speaking about your other videos as well.

  • @VipinKumaarr
    @VipinKumaarr 1 year ago +19

    Hi John, maybe you could create a playlist, like a course, by collating the videos in sequence. It would be great to have, as it is easier to follow and gives a rhythm for learning the basics and advanced stuff pretty fast

  • @LesvosBeaches
    @LesvosBeaches 1 year ago +1

    Hi John. Your tutorial is much better than every other video I saw. I learn the most from you!!! Looking forward to the rest of the series. Thanks a lot.

  • @coyoteden8111
    @coyoteden8111 1 year ago +1

    You are an absolute legend. I hope you enjoy the time you have before exploding into one of the top dogs of this niche on the internet, because you're def headed there

  • @GrumpyDave1
    @GrumpyDave1 1 year ago +3

    I come for the lessons. I stay for the typing skills (and the lessons). Touch type coding using Vim. RESPECT.

  • @luisemilioogando
    @luisemilioogando 1 year ago +2

    Exactly what I was looking for.. I will start tomorrow thank you.

  • @AliceShisori
    @AliceShisori 1 year ago +5

    I also like this series so much because you used a real website that ALSO has stuff that won't just work right away! I was following your steps in the video, ran into errors and tried to understand why, then resumed the video and realized you had faced the same problems.

  • @marcosoliveira8731
    @marcosoliveira8731 5 months ago +1

    I've just found your channel. I'm loving it. So much good content.
    I'm learning a lot. Thanks!

  • @patientson
    @patientson 5 months ago

    You are the best male intuitive programmer I have come across online. The Extract, Transform and Load analogy makes the material very easy to learn and understand. I can think of it while walking.

    • @Noname-ok4tf
      @Noname-ok4tf 5 months ago

      Lol that's very specific. Are there more female intuitive programmers on YouTube?

  • @MoSizzle
    @MoSizzle 1 year ago +1

    You are the GOAT. Thank you for this video

  • @Doggy_Styles_Coding
    @Doggy_Styles_Coding 1 year ago +4

    Hell, I've always wanted to make a bot that can kinda dive its way through the web using web scraping and requests to find hidden spots on the web :D Tutorial looks awesome

  • @einekleineente1
    @einekleineente1 1 year ago +1

    Perfect! Exactly what I was waiting for. 😃👍🏻

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      Great, I hope you like the rest of the mini series too. Next one is tomorrow!

  • @AhmedAl-Yousofi
    @AhmedAl-Yousofi 1 year ago +2

    Thanks for this video. I wish it was a bit longer and went deeper, extracting the link of each product and getting data from the product details page.
    Looking forward to the rest of the series!

  • @ezoterikcodex
    @ezoterikcodex 1 year ago +1

    That was very informative. Thank you so much.

  • @doncheeto7796
    @doncheeto7796 1 year ago +2

    thank you! upload as many tutorials as you can 🙏

  • @sheikhobada8305
    @sheikhobada8305 1 year ago

    Thank you John, for such helpful material

  • @anarikobi23
    @anarikobi23 1 year ago +1

    Great video. I just love the way you describe things step by step. Keep uploading, please. And if possible, please make a playlist.

  • @darrentan.6284
    @darrentan.6284 1 year ago +1

    Enjoyed the video, looking forward to more tutorials

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      Thanks for watching, glad you enjoyed it. More coming (next one today)

  • @AliceShisori
    @AliceShisori 1 year ago +4

    Thank you for creating a series. I learn a lot of cool and new things from your videos, but they mostly do not have a chronological order, so as a beginner I have trouble understanding them without the prerequisite knowledge.
    Edit: may I ask, is there a career path or position in this industry for people who are advanced with web scraping/web automation? I'm mainly learning because I find it useful, but I don't know if there are jobs that would require this skill set.

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +3

      thank you! yes there will be 4 videos I think, all leading on from each other in a mini playlist to help out!

  • @chandrasekaran2429
    @chandrasekaran2429 1 year ago +3

    Thanks 👍

  • @KrAsHeDD
    @KrAsHeDD 1 year ago +2

    Just learning about the new HTML parser. Thank you.

  • @tasfarsowad7612
    @tasfarsowad7612 11 months ago +1

    Your setup looks so organized and efficient. Do you have any tips for configuring a similar development environment?

    • @JohnWatsonRooney
      @JohnWatsonRooney 11 months ago

      keep it simple and in time you'll find what you like and don't like!

  • @truemufti
    @truemufti 10 months ago +1

    Keep posting

  • @iitsTech
    @iitsTech 8 months ago

    Great video ty!

  • @Fabricio-mq2uk
    @Fabricio-mq2uk 1 year ago +1

    Big hugs from Brazil.

  • @gracyfg
    @gracyfg 9 months ago

    Hi John, thanks for this course. Absolute life saver. If the element I see on the page cannot be found in the HTML, what would be the solution to scrape it?

  • @ram_qr
    @ram_qr 1 year ago +1

    brilliant

  • @eliah787
    @eliah787 5 months ago

    This one is working fine, but what if there is a button on the page to show more books? The HTML in that case is not fully displayed until you click "show more"... how to fix that?

  • @easypeasyph
    @easypeasyph 1 year ago +1

    +1 sub. Great content, simple explanation, top teacher.

  • @zakariaboulouarde4591
    @zakariaboulouarde4591 1 year ago +1

    Thaaaank you so much, it is very helpful. One question please: how can I deploy and host it as an API?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      Using a Python web framework like FastAPI we can turn this into a simple API easily enough, sure!

  • @talaldardgn2550
    @talaldardgn2550 1 year ago +1

    Thank you. I hope you make a tutorial on how we can dockerize Scrapy with Postgres

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      More Scrapy content is in the works, I could look at using Docker and Postgres too

    • @talaldardgn2550
      @talaldardgn2550 1 year ago +1

      @JohnWatsonRooney thank you ..

    • @samoylov1973
      @samoylov1973 1 year ago

      @JohnWatsonRooney, please do. Waiting for the continuation of this series and Docker + PostgreSQL also. THANK YOU!

  • @malwaredev33
    @malwaredev33 1 year ago

    Hi John, your video content is awesome for everyone who is learning scraping. But one thing I think everyone faces is being blocked by some websites for sending requests in bulk. In this video you mention avoiding blocks while scraping data. Can you share how to get unblocked from these types of websites? It would be very helpful for everyone. Thanks

  • @duffercat1
    @duffercat1 1 year ago

    John, thank you for the very informative videos. The products you scraped in this video came from one specific category of the store's website. How would one scrape all products without going into each category separately? Thanks again

  • @Dizmore
    @Dizmore 1 year ago

    Greetings, I'm following your tutorial and when I print the products (line 13) and run it, it just gives an empty list [ ]. What am I doing wrong?

  • @natalieleon7045
    @natalieleon7045 1 year ago

    I was able to get everything working, except it would only give me one product no matter what I did! It wouldn't give me the full list of products on the page - just the first one. any suggestions?

  • @WestSideLausanne1
    @WestSideLausanne1 1 year ago

    Hello, what if the web page has a login? I do have the credentials, but how do I make it log in in this scenario?

  • @mihgeza2000
    @mihgeza2000 1 year ago

    Hello there, I have a question. I want to scrape a website, but it gives me 403 error, when I want to connect to it. Is there any way to bypass it? I tried changing the user agent, but it did not work

  • @KontrolStyle
    @KontrolStyle 1 year ago

    ty for video 8)

  • @mecrayavcin
    @mecrayavcin 1 year ago +1

    I love you John Watson Rooney

  • @dragonore2009
    @dragonore2009 1 year ago

    I know how to scrape sites and I do it sometimes writing a Python script, but I get scared I will get IP banned or blocked. It's frustrating.

  • @abdifatahabdi3939
    @abdifatahabdi3939 1 year ago +2

    Is this a new series you are starting or just one video?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +3

      series, so far 4 parts, next one is tomorrow and there will be a playlist in order

    • @abdifatahabdi3939
      @abdifatahabdi3939 1 year ago

      @JohnWatsonRooney I would like you to create videos that go deep into Scrapy.. otherwise thank you so much

  • @LLlikeme
    @LLlikeme 11 months ago

    I have a question for anybody or John. If the response for get(url) is 403, I have read it is because the site has blocked scrapers from accessing its information, and that you need to use other libraries like Selenium. Any comment is highly appreciated.

    • @ryosukoe
      @ryosukoe 2 months ago

      I solved this problem.
      First get your User-Agent; you can find it on the Network tab in Chrome DevTools.
      In code:
      headers = {"User-Agent": "your user agent"}
      Include the headers in your request:
      httpx.get(url, headers=headers)
      If you still get the same error, try adding more specific headers such as
      "Accept", "Accept-Encoding", etc.
      If that still fails, you can set your User-Agent to that of an Android, iPhone or Linux browser, etc.
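
The steps in this reply can be put together roughly as follows. This is only a sketch: `build_headers` and `fetch` are hypothetical helper names, the User-Agent string is a placeholder to be replaced with the one copied from your own browser, and sending these headers does not guarantee that a site will stop returning 403.

```python
from typing import Dict, Optional

# Placeholder: copy the real value from the Network tab in your browser's devtools.
DEFAULT_USER_AGENT = "your user agent"


def build_headers(
    user_agent: str = DEFAULT_USER_AGENT,
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Start from the User-Agent and layer on more specific headers if needed."""
    headers = {"User-Agent": user_agent}
    if extra:
        # e.g. {"Accept": "...", "Accept-Encoding": "..."} copied from the browser
        headers.update(extra)
    return headers


def fetch(url: str, headers: Dict[str, str]):
    """Send a GET request with the given headers using httpx."""
    import httpx  # third-party; imported here so build_headers works without it

    return httpx.get(url, headers=headers)
```

For example, `fetch(url, build_headers(extra={"Accept": "text/html"}))` sends both headers; checking `response.status_code` afterwards shows whether the 403 went away.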

  • @ronarcher2523
    @ronarcher2523 1 year ago

    Can you web scrape email addresses of realtors?

  • @arpsami7797
    @arpsami7797 1 year ago +1

    I tried to install httpx for a couple of hours but it didn't go okay, at all :(

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      You can absolutely use requests for this too if you prefer. Httpx is just my preference

  • @JokeryEU
    @JokeryEU 1 year ago +1

    If only all ecommerce websites offered an endpoint to pull all the data we need from, instead of us having to scrape their website

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      shopify actually does that.. go to any store and add "/products.json?limit=250" at the end of the URL
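
A rough sketch of using that endpoint. The helper names are made up for illustration, and the response shape (a top-level "products" list whose items carry "title" and "variants" with a "price") is an assumption about what stores typically return. Some stores may disable or restrict the endpoint.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit, urlunsplit


def products_json_url(store_url: str, limit: int = 250) -> str:
    """Build <store root>/products.json?limit=<limit> from any URL on the store."""
    parts = urlsplit(store_url)
    return urlunsplit(
        (parts.scheme, parts.netloc, "/products.json", f"limit={limit}", "")
    )


def summarize(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Reduce the JSON payload to title and first-variant price per product."""
    rows = []
    for product in payload.get("products", []):
        # Fall back to an empty variant so a product without variants still lists.
        variants = product.get("variants") or [{}]
        rows.append(
            {"title": product.get("title"), "price": variants[0].get("price")}
        )
    return rows


def fetch_products(store_url: str) -> List[Dict[str, Any]]:
    """Download and summarize the product feed (needs the third-party httpx)."""
    import httpx

    response = httpx.get(products_json_url(store_url))
    response.raise_for_status()
    return summarize(response.json())
```

Keeping the URL building and JSON handling separate from the request makes them easy to check without touching the network.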

  • @bakasenpaidesu
    @bakasenpaidesu 1 year ago +2

    ....❤....

  • @paa5497
    @paa5497 9 months ago

    what do you do if you get code 302

  • @alexandrecostadev
    @alexandrecostadev 10 months ago

    First, thanks for the tutorial. I'm starting to learn about scraping and found your channel. I'm trying to follow this tutorial but I always get a timeout. Can you help me please?

    • @japhethmutuku8508
      @japhethmutuku8508 6 months ago

      Hello do you still have this problem?

    • @alexandreoutystems
      @alexandreoutystems 6 months ago

      @japhethmutuku8508 Yes, still the same problem.

    • @stephena8965
      @stephena8965 3 months ago

      For anyone having this issue, open up the Network tab in your browser's dev tools and do a hard refresh. Then:
      1. Look for "scd-deas" or do a CMD+F to find some text that exists on the page
      2. Right click on it
      3. Copy as fetch
      4. Add the extra headers to your headers dictionary

  • @Creem16
    @Creem16 1 year ago +1

    why do u use venv and not conda?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago

      conda has loads of extra stuff I don't need, it's aimed towards data analysts really

  • @mikezang2008
    @mikezang2008 1 year ago +1

    can this scrape JavaScript site without Selenium?

    • @JohnWatsonRooney
      @JohnWatsonRooney 1 year ago +1

      afraid not, to render javascript you need a browser, which is currently out of scope for this series - but I may add to it to include a selenium/playwright version

  • @jagannathishere
    @jagannathishere 1 year ago

    damn now the website in the video is giving 403 http status code (access is forbidden)... even with headers

  • @DeviceDuo-sl9rb
    @DeviceDuo-sl9rb 6 months ago

    I want to scrape data from TikTok.
    How can I do that?
    Can you help me please???

  • @sherinab770
    @sherinab770 4 months ago

    Hi John, appreciate it. Am very new to this. Very good tutorial session for us. @11:00, results is empty [ ] (for I used . instead of # for class) my sample data is : and all the children are