Selenium Headless Scraping For Servers & Docker

  • Published 19 Jan 2025

COMMENTS • 72

  • @Akshatgiri
    @Akshatgiri 1 year ago +12

    Good man. This was super helpful. Easily saved me 5+ hours of searching around.

  • @taz2177
    @taz2177 9 months ago +5

    I have been trying to set up Chrome and ChromeDriver for the Docker image for the past 5 hours. ChatGPT had me swinging from one command to another; finally your 16 min video helped me, thanks a ton. Alhamdulillah. AI cannot replace devs; today I finally got a taste of it.

    • @nygma6
      @nygma6 9 months ago

      As-salamu alaykum, which URL did you use to download Google Chrome? The one in the video returns a 404.

    • @knowledgedose1956
      @knowledgedose1956 4 months ago

      yeah, chatgpt😂

    • @knowledgedose1956
      @knowledgedose1956 4 months ago

      @nygma6 easily searchable

  • @roflcopter645
    @roflcopter645 1 year ago +3

    This video tutorial came at the perfect time. I'm currently working on a project that scrapes from a docker container, and I've been struggling to find out how to make it work. Thank you NeuralNine.

  • @Linuxtech65
    @Linuxtech65 10 months ago +2

    Truly one of the most useful videos on Python and Selenium. You're an ace!

  • @ВладимирГусев-п7н
    @ВладимирГусев-п7н 8 months ago +1

    Thank you from St. Petersburg! Your video helped me a lot in my automation work project. Now I can continue to create the project.

  • @martinloeffler2119
    @martinloeffler2119 9 months ago

    Thanks a lot
    struggled since yesterday to get selenium up and running inside docker.
    this works perfect

  • @po6577
    @po6577 1 year ago +7

    There is another way: you can use a remote WebDriver (set it up on a remote server with the official Selenium Docker image), then run the scraping part against that remote browser.

    • @allailqadrillah
      @allailqadrillah 10 months ago

      can you provide some reference? i want to find out more

    • @knowledgedose1956
      @knowledgedose1956 4 months ago

      @allailqadrillah the Python Selenium documentation is your reference, man
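
A minimal sketch of the remote-WebDriver approach suggested in this thread, not taken from the video. It assumes a Selenium standalone Chrome container is already running and reachable on port 4444 (for example, started with `docker run -d -p 4444:4444 --shm-size="2g" selenium/standalone-chrome`). The helper names `grid_url` and `fetch_title` are hypothetical.

```python
def grid_url(host: str = "localhost", port: int = 4444) -> str:
    """Build the endpoint of a Selenium standalone/Grid server.

    Host and port are assumptions; adjust them to wherever the container runs.
    """
    return f"http://{host}:{port}"


def fetch_title(page_url: str, remote_url: str) -> str:
    """Open page_url in the remote browser and return the page title."""
    # Imported lazily so the URL helper above works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # The browser runs inside the remote container, not on this machine.
    driver = webdriver.Remote(command_executor=remote_url, options=options)
    try:
        driver.get(page_url)
        return driver.title
    finally:
        # Always release the remote session, even if the page load fails.
        driver.quit()


if __name__ == "__main__":
    print(fetch_title("https://example.com", grid_url()))
```

With this setup the scraping script itself needs only the `selenium` package; Chrome and ChromeDriver live in the container.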

  • @amodseth8448
    @amodseth8448 1 year ago +5

    It was very helpful thank you! I'll definitely keep this in mind ❤

  • @joseantonioromeroespejo160
    @joseantonioromeroespejo160 11 months ago

    "Great video man. Very helpful and well explained. Thank you very much!!!"

  • @muhammadumer4127
    @muhammadumer4127 1 year ago

    300K subscribers ❤🖤
    Congratulations, man.
    All of your videos are always good and helpful. Keep it up.
    Thank you

  • @GeorgeChar95
    @GeorgeChar95 1 year ago +1

    Thanks for the awesome video! This is exactly what I needed for my project!

  • @bernardosilva697
    @bernardosilva697 1 year ago

    You saved me, you won a new subscriber.

  • @KhuzaimaAhmed-c3d
    @KhuzaimaAhmed-c3d 4 months ago

    Great work. I was struggling to run selenium in the docker. Thanks

  • @desouzafelipe
    @desouzafelipe 1 year ago

    Thank you so much for posting this video, it solves exactly what was blocking me!

  • @richardhoppe4991
    @richardhoppe4991 1 year ago +2

    In the main file I was getting the error "Failed to send GpuControl.CreateCommandBuffer" when I ran the script locally. Adding chrome_options.add_argument('--disable-gpu') made the error go away. Just in case anyone else is running into that error message.
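
A hedged sketch of where the `--disable-gpu` flag from this comment fits. The other flags (`--headless=new`, `--no-sandbox`, `--disable-dev-shm-usage`) are common choices for Chrome in containers and are assumptions here, not necessarily the set used in the video. The helper names are hypothetical.

```python
from typing import List


def headless_chrome_args(disable_gpu: bool = True) -> List[str]:
    """Return Chrome command-line flags for a server or container run."""
    args = [
        "--headless=new",           # run without a visible window
        "--no-sandbox",             # often needed when running as root in Docker
        "--disable-dev-shm-usage",  # avoid crashes from a small /dev/shm
    ]
    if disable_gpu:
        # The flag the commenter added to silence the GpuControl error.
        args.append("--disable-gpu")
    return args


def build_driver():
    """Create a Chrome driver with the flags above (requires Selenium 4)."""
    # Imported lazily so the flag helper works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```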

  • @devmts
    @devmts 10 months ago

    thanks! greetings from Brazil.

  • @marcelfranca5304
    @marcelfranca5304 9 months ago +1

    Since the 'new headless' mode, this is not working for me anymore. Do you know how to make it work?

  • @reaganlopezmusic
    @reaganlopezmusic 10 months ago

    Thank you. This was really helpful.

  • @DanielLima97dlcs
    @DanielLima97dlcs 1 year ago

    Thank you my bro! Works like a charm!

  • @davidlopezfelix3668
    @davidlopezfelix3668 8 months ago

    Awesome video. I run it as in the video and it worked!! thanks

  • @thandokuhlebrianmsane7043
    @thandokuhlebrianmsane7043 8 months ago

    You might want to revisit the documentation and see that some modifications have been made. Thanks.

  • @jaimesaldarriaga2910
    @jaimesaldarriaga2910 1 year ago

    Thanks, this is an incredibly useful video.

  • @vkfalan
    @vkfalan 1 year ago

    Great tutorial, thank you for your efforts !!

  • @timothyspottering
    @timothyspottering 1 year ago

    Very helpful video!
    Thank you

  • @sahil5124
    @sahil5124 1 year ago

    Thanks man, this is very helpful. Can you also create one for Scrapy? What areas should we be concerned about when deploying a service that requires Scrapy?

  • @felipe1990batista
    @felipe1990batista 1 year ago

    Thanks a lot.
    from Brazil

  • @manickpillai
    @manickpillai 1 year ago +1

    Good tutorial. Minor correction on wording at 9:55: it's building a `docker image` from a Dockerfile. Then from that image we run a container using `docker run`.
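
The image-versus-container distinction in this comment can be sketched as two separate steps. This assumes the Docker CLI is installed and a Dockerfile sits in the current directory; the tag `selenium-scraper` and the helper names are hypothetical.

```python
import subprocess
from typing import List


def build_command(tag: str, context: str = ".") -> List[str]:
    """Command that builds an IMAGE from the Dockerfile in `context`."""
    return ["docker", "build", "-t", tag, context]


def run_command(tag: str) -> List[str]:
    """Command that starts a CONTAINER from an already built image."""
    return ["docker", "run", "--rm", tag]


def build_and_run(tag: str = "selenium-scraper") -> None:
    """Build the image once, then run a throwaway container from it."""
    subprocess.run(build_command(tag), check=True)
    subprocess.run(run_command(tag), check=True)


if __name__ == "__main__":
    build_and_run()
```

One image can back any number of containers, so the build step only needs repeating when the Dockerfile or the copied source changes.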

  • @djalan84
    @djalan84 5 days ago

    Why not use a Selenium standalone Docker image?

  • @Franx570
    @Franx570 1 year ago

    Wouldn't it be better to use Selenium Grid instead? So I can use the Grid as a driver instead of doing all that?

  • @wallarichard8981
    @wallarichard8981 1 year ago +1

    Hi,
    I probably didn't catch that information, but why is Selenium necessary?
    Can't you get the HTML content with Beautiful Soup?!

    • @soul_maestro
      @soul_maestro 1 year ago +2

      Selenium is handy for JavaScript-heavy websites where you need a browser to execute the JavaScript to render parts of the site.
      With Beautiful Soup you'll pull in only the bare HTML, and you'd have to fetch all the JavaScript separately and execute it correctly yourself.
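
A small illustration of the point made in this reply, under stated assumptions: the standard library's `html.parser` stands in for Beautiful Soup (both only parse markup and never run scripts), and `PAGE` is a made-up page whose content is filled in by JavaScript.

```python
from html.parser import HTMLParser

# Hypothetical page: the <div> is empty until a browser runs the script.
PAGE = """<html><body>
<div id="app"></div>
<script>document.getElementById("app").textContent = "42 results";</script>
</body></html>"""


class DivText(HTMLParser):
    """Collect the text inside the first-level <div> with a given id.

    Nested <div> elements are not handled; this is only a sketch.
    """

    def __init__(self, target_id: str) -> None:
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def static_text(html: str, target_id: str) -> str:
    """Return the div's text as a static parser sees it (no JavaScript run)."""
    parser = DivText(target_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


if __name__ == "__main__":
    # A static parser finds the div empty; a real browser would show "42 results".
    print(repr(static_text(PAGE, "app")))
```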

  • @joseantonioromeroespejo160
    @joseantonioromeroespejo160 11 months ago

    I've followed the steps, and it works correctly on my PC, but when deploying it on AWS EC2, Selenium fails and doesn't scrape. Do you know what could be causing this?

  • @haggard17335
    @haggard17335 1 year ago

    Hi, for several days I have been dealing with a problem that I cannot solve. I have a script that obtains the profile URL, but on some profiles it does not work. I made sure that the selectors, as well as the HTML structure, are valid in both profiles. I am running my code on a DigitalOcean server with Linux and no graphical interface.

  • @raninaga835
    @raninaga835 6 months ago

    I am not getting the desired output.
    No exception.
    When I ran the Dockerfile in Docker Desktop, the container build was successful, but the logs show it only opening the Python screen.
    Can anyone suggest something on this?

  • @paulthomas1052
    @paulthomas1052 1 year ago

    Useful utility - thanks !

  • @cbacca2999
    @cbacca2999 7 months ago

    I have Python 3.11 on Windows 10. I'm just using a text editor to edit the Python program and I'm using a virtual environment in my cmd.exe shell. In this line "driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))" I get this error: "'powershell' is not recognized as an internal or external command, operable program or batch file. 'powershell' is not recognized as an internal or external command, operable program or batch file."
    It looks like Selenium supports Python 3.11, so that shouldn't be an issue. I also have Selenium 4.21.0.
    Any idea how to fix this?

  • @vitorsilva-or1dj
    @vitorsilva-or1dj 7 months ago

    thanks bro! you solved my problem

  • @aryangoel5578
    @aryangoel5578 7 months ago

    The Dockerfile isn't working: the dependencies for Chrome aren't getting installed in the Docker container.

  • @richardmarch3750
    @richardmarch3750 1 year ago

    Thank you so much for creating such helpful videos! Can you make a video on how to make an AI Spotify playlist generator where each track seamlessly transitions into the next?

  • @vihari2010
    @vihari2010 1 year ago

    Can we use Puppeteer to do this? Also, please make a video on Puppeteer.

  • @TheShox79
    @TheShox79 1 year ago

    This is great! Thanks!

  • @LegionLeague
    @LegionLeague 1 year ago

    Great video! Quick question: if you need to scrape several pages from your website, is it possible to make it async and print the results as soon as selenium is done scraping each page as opposed to printing the whole thing after every page is scraped? If so, I would love to see a video on that topic.
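
One possible way to get the behaviour asked about here, sketched with threads rather than `async` because Selenium's Python API is synchronous. `scrape_all` and `scrape_one` are hypothetical names; `scrape_one` is assumed to create and quit its own driver, since WebDriver instances are not thread-safe.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, Tuple


def scrape_all(
    urls: Iterable[str],
    scrape_one: Callable[[str], str],
    max_workers: int = 4,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, result) pairs in completion order, not submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the URL it is scraping.
        futures = {pool.submit(scrape_one, url): url for url in urls}
        for future in as_completed(futures):
            yield futures[future], future.result()


if __name__ == "__main__":
    # Stand-in scraper so the sketch runs without a browser.
    def fake_scrape(url: str) -> str:
        return f"scraped {url}"

    for url, result in scrape_all(["page-1", "page-2", "page-3"], fake_scrape):
        # Each line prints as soon as that page is finished.
        print(url, "->", result)
```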

  • @ammadkhan4687
    @ammadkhan4687 11 months ago

    You are Genius

  • @collinsikotun1436
    @collinsikotun1436 8 months ago

    This doesn't work on my M1 Mac, any suggestions?

  • @JustinK0
    @JustinK0 1 year ago

    The current setup I have works in a Docker container when I run it on Windows, but when I pull it to my EC2 instance on AWS, it doesn't work.
    It tries to go to the URL to get the data but just takes forever and then times out.

    • @Priyanka-jb5jf
      @Priyanka-jb5jf 3 months ago

      same issue. did u get the solution?

  • @UnderworldCoder
    @UnderworldCoder 1 year ago

    nice, would be nice if you did a video on seleniumbase using from seleniumbase import SB

  • @alexs7612
    @alexs7612 10 months ago +1

    No github code?

  • @werqweadcwer659
    @werqweadcwer659 4 months ago

    Why Python? Is it doable with Java in the same way?

  • @hrithiksharma2047
    @hrithiksharma2047 1 year ago

    Wouldn't it be much easier to use Firefox instead of Chrome?

  • @chillfill4866
    @chillfill4866 1 year ago

    Does anyone know any good cloud options? I want my scraping script running 19 hours/day, and obviously that's expensive.

  • @o2c4r1
    @o2c4r1 1 year ago

    Thanks man!

  • @jerick242
    @jerick242 1 year ago

    Does it work with Streamlit Cloud?

  • @digitalmachine0101
    @digitalmachine0101 7 months ago

    Good information

  • @ekopras6095
    @ekopras6095 1 year ago

    Bro, you don't need ChromeDriver? Why does it work normally?

    • @WorldMartialArt
      @WorldMartialArt 11 months ago

      Because it uses webdriver-manager, which automatically installs ChromeDriver. I think so.

  • @kanwaradnan4849
    @kanwaradnan4849 10 months ago

    Updated! Too bad it didn't work for Amazon.

  • @buzadam1144
    @buzadam1144 1 year ago

    Thx bro

  • @kanwaradnan4849
    @kanwaradnan4849 1 year ago

    Nice trick

  • @kzqils7700
    @kzqils7700 3 months ago

    --headless is not working for me; the browser still opens.

    • @bonessnap
      @bonessnap 3 months ago

      Chrome 130 bug. Just put the window at position -2500, -2500 to hide it; we are waiting for a fix.
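
A sketch of the workaround described in this reply, assuming it is applied through Chrome's `--window-position` command-line switch. The helper names are hypothetical, and whether the workaround is still needed depends on the Chrome version in use.

```python
from typing import List


def offscreen_window_args(x: int = -2500, y: int = -2500) -> List[str]:
    """Return a Chrome flag that places the window outside the visible screen."""
    return [f"--window-position={x},{y}"]


def build_hidden_driver():
    """Start headless Chrome with the window also pushed off-screen."""
    # Imported lazily so the flag helper works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    for arg in offscreen_window_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```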

  • @i4i3i360
    @i4i3i360 1 year ago +2

    First comment bro ❤

  • @AliceShisori
    @AliceShisori 1 year ago

    dude how come you know everything :D

  • @philtoa334
    @philtoa334 1 year ago

    Thx_.

  • @alex_law_codes
    @alex_law_codes 10 months ago

    Has anyone run into this error:
    executor failed running [/bin/sh -c apt install -y ./google-chrome-stable_current_amd64.deb]: exit code: 100

    • @tianniezing1941
      @tianniezing1941 10 months ago

      Yes me, have you already found a solution? I still can't figure out what the problem is.