Fantastic tutorial John, I’m looking forward to having a play with remote and grid. Keep the tutorials coming 🎉
I love the content, I'm starting out with scraping, and I'm grateful to have come across your channel. At work they're asking me to automate checking some websites that is done manually today. My question is: how do I present the scraping results in a report that says whether each website is OK or down, etc.? Thanks
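A minimal sketch of one way to approach that, assuming a plain requests-based check and a CSV report (the site list and file name are just placeholders):

```python
import csv
from datetime import datetime

import requests

# Placeholder list of sites to check -- replace with your own
SITES = ["https://example.com", "https://example.org"]


def check_site(url: str) -> dict:
    """Return a simple status record for one URL."""
    try:
        response = requests.get(url, timeout=10)
        status = "OK" if response.status_code == 200 else f"DOWN ({response.status_code})"
    except requests.RequestException as exc:
        status = f"DOWN ({exc.__class__.__name__})"
    return {"url": url, "status": status, "checked_at": datetime.now().isoformat()}


def write_report(rows: list[dict], path: str = "report.csv") -> None:
    """Write the results to a CSV file that opens cleanly in Excel."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["url", "status", "checked_at"])
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    write_report([check_site(url) for url in SITES])
```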
Great video as always, John! I think you didn't address the issue of why you use one browser for one page only and then close it right away. Would be interesting to know. Don't fresh browsers without any history and cookies seem more suspicious to target sites?
Does anyone know if it's possible to use undetected_chromedriver with Selenium Grid?
Not that I am aware of - although I wouldn’t be surprised if someone has put a solution together
Hi Sir, how can we scrape a webpage or website that returns status code 403? (Not by saving the HTML — kindly suggest another method.)
Is it one core per instance?
That's awesome!
I have two questions:
1 - Why don't you use threads for running Firefox?
2 - What's your .vim setup? :)
Do you have any paid course on Selenium or Software testing anywhere?
What about launching multiple tabs inside a single browser window for concurrency, instead of launching multiple browser windows each with a single tab?
Yes, you can do that as you would normally, but over multiple browsers with Grid. Grid allows you to remote-connect to concurrent browsers and manages them for you; inside each one you could have multiple tabs.
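For reference, a minimal sketch of driving one of those Grid browsers with Remote, assuming the hub is listening on localhost:4444 (the URL and the example page are placeholders):

```python
from selenium import webdriver

# Assumed Grid hub address -- adjust to wherever your hub is running
GRID_URL = "http://localhost:4444/wd/hub"

options = webdriver.ChromeOptions()
driver = webdriver.Remote(command_executor=GRID_URL, options=options)
try:
    driver.get("https://example.com")
    print(driver.title)
finally:
    driver.quit()
```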
nice content
How do I scrape table data with BS4 in Python?
The table data has ul and li tags nested, and taking all the li tags repeats the data. I didn't find any method to get only the main tags, from which the nested tags and data could then be obtained. All the li tags have everything in common, and no class names are given to the main li tags either.
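One way to avoid the repetition is to restrict find_all to direct children with recursive=False. A minimal sketch, assuming a nested ul/li structure like the one described (the HTML here is made up for illustration):

```python
from bs4 import BeautifulSoup

# Made-up HTML with a nested list, just for illustration
html = """
<table><tr><td>
  <ul>
    <li>Top item A <ul><li>Sub A1</li><li>Sub A2</li></ul></li>
    <li>Top item B <ul><li>Sub B1</li></ul></li>
  </ul>
</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
outer_ul = soup.find("td").find("ul")

# recursive=False returns only the li tags that are direct children,
# so the nested li tags are not repeated at the top level
for li in outer_ul.find_all("li", recursive=False):
    main_text = li.find(string=True, recursive=False)  # text of the main li only
    sub_items = [sub.get_text(strip=True) for sub in li.find_all("li")]
    print((main_text or "").strip(), sub_items)
```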
John, I have absolutely no idea how to do this; where can I get your tuition?
Great video. One question: what paid proxy do you recommend for web scraping? Which one do you usually use? Thanks, you are a mentor to me :)
Anyone got an alternative to selenium-wire? I have a site that makes the API call I want to intercept when the page loads, but it's IMPOSSIBLE to replicate the API call. Selenium-wire solves this: I scan the requests made in the background and get what I want. But now I'm stuck with Selenium. Does anyone know alternatives to selenium-wire that are reliable?
I've done something similar with Playwright; you can get into the network events from there too, but it wasn't any easier than selenium-wire.
Playwright is the best tool for intercepting network events. My only gripe is that you need to store the intercepted data in a global data structure rather than just returning it. If that were solved, there would be no alternative to Playwright in terms of both speed and stability.
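For anyone looking for an example, here is a minimal sketch of capturing responses with Playwright's sync API, assuming a placeholder "/api/" URL filter and storing results in a module-level list as described above:

```python
from playwright.sync_api import sync_playwright

captured = []  # intercepted payloads end up here, as noted above


def handle_response(response):
    # "/api/" is only a placeholder filter -- match whatever endpoint you need
    if "/api/" in response.url and response.ok:
        if "application/json" in response.headers.get("content-type", ""):
            captured.append(response.json())


with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.on("response", handle_response)
    page.goto("https://example.com")
    page.wait_for_load_state("networkidle")
    browser.close()

print(captured)
```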
Why can't you replicate the API call?
@JohnWatsonRooney Please make a more detailed video on SeleniumWire and its alternatives, including best practices.
@adnantaufique68 thanks for this
Cool
Hey, I love your videos. Can you make a video on how to use the new chromedrivers for web scraping with Selenium on Mac and Windows? That would be really helpful, as I am planning to scrape a lot of data for a machine learning project.
Webdriver Manager solves this with ease.
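A minimal sketch of what that looks like, assuming the webdriver_manager package is installed and Chrome is the target browser:

```python
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# webdriver-manager downloads a matching chromedriver and returns its path
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
driver.get("https://example.com")
print(driver.title)
driver.quit()
```

Note that recent Selenium releases (4.6+) also bundle Selenium Manager, which resolves a matching driver automatically, so on newer versions a plain webdriver.Chrome() may already work without the extra package.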
@mabdurrafeyahmed9256 God bless you, my man
Yes!!!
When I'm doing that I get "Max. Concurrency: 1" — how do I change it to 12?
It depends on your CPU cores, and if you're running it via Docker Compose you need the SE_NODE_MAX_SESSIONS line in the YAML file.