Web Scraping With Selenium And A Raspberry Pi - All You Need To Know

  • Published 11 Feb 2025

COMMENTS • 82

  • @SimSimsTECHcrunch
    @SimSimsTECHcrunch 3 years ago +17

    The YouTube legend has returned again!!!!

  • @UserUnknown07
    @UserUnknown07 3 years ago +15

    Can't imagine the amount of editing this video must have taken, woah! Great explanation. Thank you.

  • @cyrustakem7993
    @cyrustakem7993 2 years ago +1

    I miss your videos. I don't know why YouTube stopped recommending them; they are highly educational.

  • @timothycain8639
    @timothycain8639 3 years ago +4

    Love this project. You made many aspects of programming with Python INFINITELY MORE CLEAR TO ME.

  • @lachlanmoore2345
    @lachlanmoore2345 3 years ago +11

    Use explicit waits instead of the time module when you can. Expected conditions are great for this.
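
The suggestion above comes down to polling for a condition with a timeout rather than sleeping for a fixed time. A minimal stdlib-only sketch of that polling loop follows; the helper name `wait_until` is hypothetical, and the commented Selenium lines assume Selenium is installed and that `driver` is a running WebDriver ("quote" is a placeholder class name, not taken from the video).

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Selenium ships its own version of this idea (assumes selenium is installed
# and `driver` is a running WebDriver; "quote" is a placeholder class name):
#
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#
#   WebDriverWait(driver, 10).until(
#       EC.presence_of_element_located((By.CLASS_NAME, "quote"))
#   )
```

Compared with a fixed `time.sleep`, the script continues as soon as the condition holds and fails with a clear error if it never does.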

  • @johnbushur6080
    @johnbushur6080 3 years ago +7

    Very useful. I came across Selenium a while ago but wound up using Excel tools instead. I’ll have to give this a try for my next project.

    • @2mrRB
      @2mrRB 2 years ago

      Hey John, are you able to use Excel tools to scrape websites too? Or do you mean something else? Thanks in advance :)

    • @johnbushur6080
      @johnbushur6080 2 years ago

      @2mrRB I’ve used Excel’s Web/Power Query for this in certain cases. Check out Leila Gharani’s channel for some good tutorials. I’ve also written some scripts in VBA to do it for specific tasks. That is what I meant by Excel tools. Hope that helps.

  • @jasonbailey9139
    @jasonbailey9139 3 years ago +3

    We had a Perl script that we used to scrape data off of a website. They changed the way the login worked and Perl didn't support the new method (OK, it probably does, but I hate working with Perl scripts, so I didn't bother researching after our consultant said it didn't), so I just made the users start doing the scraping manually. Now I'm tempted to give this a try to start scraping that data again.

    • @NightRider0101
      @NightRider0101 3 years ago +2

      Python requests and Beautiful Soup are the best tools for scraping.

  • @twys124
    @twys124 3 years ago +1

    Great explanation and great video. I just learned about web scraping with BS4 and Selenium.

  • @NitishKumarIndia
    @NitishKumarIndia 2 years ago +1

    This guy belongs to the golden age of YouTube, when things were simple.

  • @thehoneyseals
    @thehoneyseals 2 years ago

    This made me so happy, thank you so much. You have no idea.

  • @georgewilliams9422
    @georgewilliams9422 4 months ago

    Thank you so much. Very helpful!

  • @AS-fj7ox
    @AS-fj7ox 3 years ago

    Good work dude.. keep it runnin!!

  • @jayataroy201
    @jayataroy201 3 months ago

    Mate where are you, these videos are inexplicably good!

  • @abrandnewcompany
    @abrandnewcompany 3 years ago

    Beautiful Soup combined with requests can do everything you want, even more than Selenium. But I didn't know about the NoSuchElementException try/except, which is really handy. I always used to write a function like that myself. Thanks!

  • @CodingWithBen
    @CodingWithBen 3 years ago +1

    I literally just watched your last video lol. How do I know whether it is allowed to scrape a website or not? Is there an easy way?
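
One common first check, though not a legal answer by itself, is the site's robots.txt file, which states which paths the site asks crawlers to avoid. A minimal sketch using Python's standard `urllib.robotparser` follows; the helper name `is_path_allowed`, the user agent string, and the robots.txt content are made up for illustration, and a site's terms of service may add further restrictions.

```python
from urllib.robotparser import RobotFileParser


def is_path_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, made up for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_path_allowed(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/quotes"))
print(is_path_allowed(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/private/data"))
```

For a live site, `RobotFileParser.set_url(...)` followed by `read()` fetches the real file instead of parsing a string.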

  • @OnixEdge
    @OnixEdge 1 year ago

    @Tinkernut Do you have any tips on how to keep the webdriver updated if you are using a PC and Chrome?

  • @mejia414
    @mejia414 3 years ago

    Thanks from Colombia, your video helped me a lot.

  • @VisesEntei
    @VisesEntei 3 years ago

    Welcome back.

  • @lukasdegle8313
    @lukasdegle8313 3 years ago +2

    Like it a lot!
    But why don't you use a context manager while writing to files? :)
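
For readers unfamiliar with the term, a context manager here means opening the file in a `with` block so it is closed automatically, even if a write raises an error. A minimal sketch follows; the function name `save_rows` and the column names in the usage note are placeholders, not the code from the video.

```python
import csv


def save_rows(path, header, rows):
    """Write a header and rows to a CSV file; the file is closed even if a write fails."""
    # newline="" stops the csv module from emitting blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

A call such as `save_rows("quotes.csv", ["quote", "author"], [["An example quote.", "Example Author"]])` would then replace a manual `open()` and `close()` pair.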

  • @d-rey1758
    @d-rey1758 1 year ago +1

    Where in the video did you mention running this on a Raspberry Pi?

  • @mmuneebahmed
    @mmuneebahmed 3 years ago

    Awesome, thanks! Will this Selenium library also work with any social media websites, or do we have to use other libraries in conjunction with Selenium?

  • @JNET_Reloaded
    @JNET_Reloaded 8 months ago

    Drivers don't work that well on the RPi 5 with the newly supported browsers, so we need real automation without the Selenium BS. Any ideas? It should be able to take screenshots, click the mouse, and use the Brave browser. I've got it doing a lot of this stuff, but it still needs work. Can you make a video about doing this on the RPi 5 using the latest Brave browser on Raspbian OS (Debian)?

  • @webslinger2011
    @webslinger2011 3 years ago +4

    For hiding usernames and passwords, I use configparser to grab them from a separate file. What I haven’t figured out is how to use proxies to avoid bot detection. Sorry for the hijack, but I need to ask. Anyone with a good tutorial? Thanks!
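
A rough sketch of the separate-file approach mentioned above, using the standard `configparser` module. The file layout, the section name `login`, and the function name `load_credentials` are assumptions for illustration; the commenter's actual setup is not shown.

```python
import configparser

# Assumed layout of the separate file (kept out of version control):
#
#   [login]
#   username = my_user
#   password = my_secret


def load_credentials(path, section="login"):
    """Read a username and password from an INI file kept outside the script."""
    # interpolation=None lets passwords contain a literal % character.
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"could not read config file: {path}")
    return config[section]["username"], config[section]["password"]
```

The script then calls `username, password = load_credentials("secrets.ini")`, so the credentials never appear in the code itself.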

    • @NightRider0101
      @NightRider0101 3 years ago

      You can use proxy cycling.

    • @leader1944
      @leader1944 3 years ago +3

      Proxies would work great to avoid detection if you are sending a large number of requests to a site very quickly. However, some sites can detect that you are using automation software by checking for a string when you send your request with webdriver. This string is $cdc_ and it’s located in the webdriver exe file. Using a hex editor, you can replace $cdc_ with any other string that has $ at the beginning, then any 3 letters, and then an _ at the end. For example, $dog_. Note: Changing $cdc_ only works if you are on Chrome; otherwise you need to change a different string. Hope this helps :)

  • @JasonOBrienThinksHeCan
    @JasonOBrienThinksHeCan 3 years ago

    Awesome!

  • @papusa9878
    @papusa9878 3 years ago

    Good video

  • @domasberulis
    @domasberulis 8 months ago

    What are your RPi specs? My RPi 3B with 1GB of RAM takes 3 minutes to launch the browser.

    • @cemk86
      @cemk86 3 months ago

      I have the same one, a big disappointment…
      Did you manage to run Selenium code on that little friend?

    • @domasberulis
      @domasberulis 3 months ago

      No, I used a normal PC.

  • @100996julen
    @100996julen 3 years ago

    I'm planning to write a Twitter web scraper program with Python. Which Raspberry Pi model is better for it? I want to buy the cheapest one I can. Thanks!

    • @randomhominid9816
      @randomhominid9816 3 years ago

      Why not just use your desktop or laptop computer? A Raspberry Pi isn't needed, but if you want one, the RPi 4 with 2GB will probably be enough. Maybe get the RPi 4 with 4GB to make sure you have enough memory, as browsers tend to use a lot of it.

    • @arjix8738
      @arjix8738 3 years ago

      It's much better to sign up for the Twitter API.

  • @paulmagu3054
    @paulmagu3054 3 years ago

    Selenium is very useful.
    Any ideas for running web scraping on the server side, preferably with Selenium? (Suggestions for other libraries in Python or Node are welcome!)
    Thx.

  • @AliAli-rj9qb
    @AliAli-rj9qb 2 years ago

    If I use bs4 it works fine, but with Selenium I get TypeError: zip argument #1 must support iteration. The program is exactly the same as yours, so why do I get this error?

  • @VikashXman
    @VikashXman 3 years ago

    Thanks man

  • @dontbelasagna5968
    @dontbelasagna5968 3 years ago

    My CSV keeps separating the string by characters. For example, the word "the" ends up in the CSV with t in one cell, h in the cell next to it, and e in the next one as well. How do I fix this?
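
The commenter's code is not shown, so this is only a guess at the cause: `csv.writer.writerow` expects a sequence of cells, and a bare string is itself a sequence of characters. A small sketch of the difference, writing to an in-memory buffer instead of a real file:

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow("the")    # a bare string is iterated character by character
writer.writerow(["the"])  # a one-item list becomes a single cell

print(buffer.getvalue())
```

If that is the cause, wrapping each scraped string in a list (or passing a list of columns per row) should keep the whole word in one cell.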

  • @AliAli-rj9qb
    @AliAli-rj9qb 2 years ago

    Sorry, I was missing an s in find_elements, so now it is working.

  • @OffGridAussiePrepper
    @OffGridAussiePrepper 3 years ago

    hahahahaha ur the pun king today :)~

  • @Sokar599
    @Sokar599 3 years ago +3

    How about Puppeteer, isn't that the standard nowadays? Good tutorial as always.

    • @Tinkernut
      @Tinkernut 3 years ago +3

      I thought Puppeteer was developed for Node.js. Is there a Python branch too? Selenium is the OG, that's why I went with it.

    • @Sokar599
      @Sokar599 3 years ago +1

      @Tinkernut Ah yes indeed, I don't often use Python I guess. Good to see you're still uploading videos! I used to watch you as a kid all the time. Thanks for educating :)

  • @jemalguillory
    @jemalguillory 3 years ago

    New drip!

  • @serhiyranush4420
    @serhiyranush4420 3 years ago

    I am running this script on a Windows 7 machine and it works beautifully. However, when running it from Thonny, no password prompt appears in Thonny's console. When launching it from the command line window, the password prompt does appear.
    How can it be fixed so that the password prompt appears in Thonny?

    • @jyvben1520
      @jyvben1520 3 years ago

      In the console window, or did you expect a GUI popup window?

    • @serhiyranush4420
      @serhiyranush4420 3 years ago

      @jyvben1520 No, I didn't expect a GUI popup window. But I did expect a console prompt, as at 6:42 in this clip.

  • @spumeeuw430
    @spumeeuw430 3 years ago

    I am running into the following issue when trying to install the chromedriver: "E: Unable to locate package chromium-webdriver". Has anybody run into this issue before?

  • @gkchimzz28
    @gkchimzz28 3 years ago

    nice

  • @mefaun
    @mefaun 3 years ago

    Yay now I can be Thomas Anderson in the Matrix

  • @Illvidri
    @Illvidri 3 years ago +1

    I see the next button and I think "He's just scraping the surface"

  • @Pod-Z
    @Pod-Z 3 years ago

    Holy shit you listened to my comment

  • @mfawzi89
    @mfawzi89 3 years ago

    Can I use this code to hack the username and password 😌

  • @myriadtechrepair1191
    @myriadtechrepair1191 3 years ago

    You can scrape my web anytime, pun man.

  • @mohmedbadr1947
    @mohmedbadr1947 3 years ago +1

    You are late to the party, my friend. Most of the websites we want to automate or scrape have some anti-bot measures.

    • @Tinkernut
      @Tinkernut 3 years ago +2

      I can see how that may be true for you, but not in general. Most popular websites (Twitter, Wikipedia, IMDb, Amazon, YouTube, etc.) have no such measures. It depends on the website and what they allow. If they have anti-bot precautions in place, then it's probably not legal to scrape that site anyway. I'm trying to avoid legal issues with this video.

    • @nibblrrr7124
      @nibblrrr7124 3 years ago

      @Tinkernut IANAL, but in the US, merely violating some corporate website's terms of service is not illegal in itself. See e.g. the EFF's reporting on Oracle v. Rimini 2018, which actually involved scraping. (Ninth Circuit Doubles Down: Violating a Website’s Terms of Service Is Not a Crime)
      Naturally, I completely understand that you'd want to steer clear of legal issues on your channel, though. (Thanks & keep up the great work, BTW!)

  • @sarthoknextt5150
    @sarthoknextt5150 3 years ago

    Have you worked as a QA in the past?

  • @thekevalpanchal
    @thekevalpanchal 3 years ago +1

    Hello

  • @1_and_only_Crjase
    @1_and_only_Crjase 3 years ago

    requests could have done this

  • @4crafters597
    @4crafters597 3 years ago

    Does anyone have a solution for sending the password without including it in the code?

    • @userz111
      @userz111 3 years ago

      Separate config file
      Or
      Save/load browser profiles
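
Another option in the same spirit, sketched here under assumptions: read the password from an environment variable and fall back to a hidden prompt. The variable name `SCRAPER_PASSWORD` and the function name `get_password` are made up for illustration.

```python
import getpass
import os


def get_password(env_var="SCRAPER_PASSWORD"):
    """Return the password from an environment variable, or prompt if it is unset."""
    password = os.environ.get(env_var)
    if password:
        return password
    # getpass hides what is typed; it needs a real terminal to prompt properly.
    return getpass.getpass("Password: ")
```

Either way the secret stays out of the source file, which matters if the script is ever shared or committed.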

  • @dudds6699
    @dudds6699 2 years ago

    Web scraping with Selenium: I know it can be done, but it's the wrong tool for the job.

  • @dunste123
    @dunste123 3 years ago

    Not enough dad jokes :P

  • @woodenbeast9337
    @woodenbeast9337 3 years ago +3

    What do you gain by scraping data? Is this useful?

    • @yetzt
      @yetzt 3 years ago +4

      Data journalist here. Yes, scraping is useful if the data you need is not provided any other way, and oftentimes it is not.

    • @TheOnlyRaichuu
      @TheOnlyRaichuu 3 years ago +1

      I'm a freelance web scraper. There are so many clients. So yes, this is useful.
      Data is knowledge you can turn into profit. Think about big data companies like Google, for example.

    • @woodenbeast9337
      @woodenbeast9337 3 years ago

      @TheOnlyRaichuu It just teaches how to strip our privacy and profit off selling very sensitive data. Running a for-profit hack.

    • @TheOnlyRaichuu
      @TheOnlyRaichuu 3 years ago +1

      @woodenbeast9337 Why are you asking when you already made up your mind beforehand? What you're saying is absolutely wrong and ridiculous.
      How does it hurt your privacy when a car dealership wants to get all the data of car listings, with their details and price tags, to optimize its own pricing? Is anyone's privacy affected by that? No.

    • @woodenbeast9337
      @woodenbeast9337 3 years ago

      @TheOnlyRaichuu weak comparison

  • @gmog7857
    @gmog7857 3 years ago

    Who do you think you are talking to? Python experts?

    • @nibblrrr7124
      @nibblrrr7124 3 years ago +3

      Curious people with access to a search engine, motivated to build something they want? :^)
      If you tell me what it is you'd like to do, what you tried, and where you got stuck or have questions, maybe I can help you or point you in the right direction.

    • @drewmillett2089
      @drewmillett2089 8 months ago

      @nibblrrr7124 Hey, I would appreciate some help if you still read these comments. I think I'm getting stuck on pointing Selenium to the correct browser driver path. If I right-click on Chrome it shows the path of the executable file, but I'm getting webdriver errors when I use this line of code: browser_driver = Service('C:\Program Files (x86)\Google\Chrome\Application\chrome.exe') . I didn't really see how Tinkernut came up with the path...

  • @yetzt
    @yetzt 3 years ago

    What's up with your sound? It sounds like it's out of sync with itself.
    Also, I'd recommend going with Puppeteer and Node if you're more comfortable with JS. It just integrates better.

  • @SeaJay_Oceans
    @SeaJay_Oceans 3 years ago

    That is very Edgey comedy...

  • @astemet
    @astemet 3 years ago

    i got discord bot ready

  • @otmw6726
    @otmw6726 10 months ago

    thanks for not explaining how you found the identifier for the log in button

  • @Dikkedimi
    @Dikkedimi 3 years ago

    dude, your audio is real bad. all over the place.