PSA - looks like eBay implemented a new captcha authentication page that will cause this scraper to error out.
If you try to run John's code, you will get an error - something like "AttributeError: 'NoneType' object has no attribute". If you print the soup, you will actually see that the page being parsed is a security page. That's because there is now a captcha verification needed. To see what I'm talking about, open up an incognito browser, search for a product, then filter by "sold".
John, it would be great to see a follow-up video on how you would advise tackling the problem - would adding some selenium to select the captcha do the trick?
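A minimal sketch of the check described above, for anyone who wants the script to stop with a clear message instead of the NoneType error. The marker phrases are taken from the security page text quoted elsewhere in this thread; the function name and the sample string are made up for illustration, and eBay may change the wording at any time.

```python
# Phrases copied from the security page text quoted elsewhere in this thread.
SECURITY_MARKERS = ("security measure", "please verify yourself to continue")


def looks_like_security_page(page_text: str) -> bool:
    """Return True if the text looks like the verification page, not results."""
    lowered = page_text.lower()
    return any(marker in lowered for marker in SECURITY_MARKERS)


# Hypothetical sample standing in for soup.get_text() on the downloaded page.
sample = "Security measureSkip to main content Please verify yourself to continue"

if looks_like_security_page(sample):
    print("Got the verification page, not search results; stop before parsing.")
```

Running a check like this right after downloading the page makes the real cause visible before any `.find(...)` call returns None.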
That's the downside of scraping - eBay has reworked their site since this video, so as you say this code no longer works. I’ll pin this so people can see it, and I'll work on an update for the future
@@JohnWatsonRooney Hey there - firstly, thanks for this rundown. It’s been a huge help. I’m new to programming and web scraping, and this is something I’ve wanted to do to automate part of my work project. I came here to see if this was mentioned, as I ran into the same issue as the pinned comment today. I know you mentioned the potential of an update video, but would you know of any resources for finding ways to bypass the update? Or for deriving that information another way? I keep hitting dead ends, so if you have any general advice, it’d be a huge help. And if not, no worries. The update definitely made things a bit of a pain
Big thanks pal! Was going nuts trying to figure out what I did wrong!
we need a solution ASAP!!
Thank you so much for posting this. I was driving myself up the wall with getting the NoneType error and not knowing what I did wrong.
Just wanted to say "Thank you" for the time and effort it takes to make these videos. They're great practical examples. Please keep them coming.
Thanks for watching them, I’m glad you guys find them useful
As a beginner in the programming world, I found this channel as "vision enhancer" and I feel myself very lucky as it taught me a lot just in two videos. Thank you very much!
Thank you for watching!
Never worked with Python nor heard of Beautiful Soup before! But I managed to scrape what I wanted in 1 day after watching this video! Thanks John!
PS: I have prior programming experience
That’s great glad I could help
Anybody figure out why using .text in the parse triggers this?
AttributeError: 'NoneType' object has no attribute 'text'
i have the same error...
Thanks for creating the video. It’s really helpful for those of us that are new to programming and web scraping in general.
As always, very clear...thanks for the effort into making these videos...
Thank you!
I got an error: item.find('h3' ...etc...) 'NoneType' object has no attribute 'text'
Me too!! I can't find a solution on Google!!
@@sebastianmt02 it is because of the new captcha identification page, which breaks this scraper. Basically, if you are not signed in, you get the captcha. To see what I'm talking about, open up an incognito browser, search for a product, then filter by "sold".
If possible, make a video on bypassing captchas. It would be very helpful...
Yes, I'd like to request this too, sir
Ya let me know too
I'll have to look at your channel, see if I can do a crash course to get up to speed w/ Python and how to use what you coded on github...
your explanations are as clear as water. love it👍
Thank you!
As clear as wootah
Can you redo this with Scrapy, with possible pagination/next_page? Thanks. As a secondary request, maybe do a follow-up video with SQLModel for the data scraped from eBay's sold items? Thanks and thanks.
Thanks. This was my first scraping script when I started learning Python and web scraping. It took me about 2 days, hehe. I built it to track PC parts prices, and then I plot a graph to see the price trend day by day. I still have a few things to fix, but it does the job.
Thanks a lot for all these awesome gifts.
Currently stuck at 12:37, it says process finished but doesn't show any output. Is there something I'm doing wrong?
Thanks much...finally...a clear step-by-step walkthrough...now I get it....testing and using it on Kijiji...works well...thanks again.
Very nice scraping tutorial!
I’m just wondering, what’s the benefit of scraping vs. the eBay API?
Bonjour, can we use the requests-html library instead in this case?
yes, that's my favourite web scraping library right now too
Please make a video on how to avoid getting blocked when scraping. I follow your videos regularly, and I asked for this in a previous comment too. I would be very grateful.
+1
Thank you John. Easy to follow tutorials as always. Would you consider good practice to make use of 'headers' or perhaps this doesn't matter on small projects?
It’s good practise to use them but for small things like this if it works without I don’t bother- perhaps I should!
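A minimal sketch of what "using headers" looks like. The header values and URL here are placeholders, not anything from the video, and the sketch uses the standard library's `urllib.request.Request` only so it can be run without sending anything; with the requests library the same dictionary would be passed as `requests.get(url, headers=HEADERS)`.

```python
from urllib.request import Request

# Hypothetical values: swap in your own browser's User-Agent and target URL.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-GB,en;q=0.9",
}
url = "https://example.com/search?q=test"

# Build the request object; nothing is sent over the network here.
request = Request(url, headers=HEADERS)

# urllib stores header names with only the first letter capitalised.
print(request.get_header("User-agent"))
```

As another commenter in this thread reports, adding headers did not get them past the captcha page, so treat this as good practice rather than a fix.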
great videos.
I'd like to caution readers - you may encounter very strange behaviour when viewing a page of ebay items in your browser, versus what you actually extract via python. My items were simply not aligning... if I had to guess, perhaps something to do with my browser repeatedly showing me the ebay items stored in cache instead of "live". Anyway, things seem to "make sense" now I view ebay in incognito mode (??)
Actually, let me revise that: I am increasingly of the view that there is a somewhat probabilistic / random element to which items eBay will display each time you request a given page (e.g. the 3rd page of results). Hence... you can find yourself pulling your hair out wondering why what you see on screen doesn't match what Python is pulling out (of the very same URL!).
Learned a lot John, thank you. I adjusted it to make it work correctly, but great video!
I'm having an error with getting the item titles and item links, and also an error with writing to the CSV file. Any chance you could help with that?
I tried scraping eBay with this code but it did not work. It says this: "Security measureSkip to main content Please verify yourself to continueerror To keep eBay a safe place to buy and sell, we will occasionally ask you to verify yourself. ". So is it not possible to do anymore?
Hello sir, is there a way for the bot to go through all the sites in Google that have a particular keyword and extract the info needed, like name and price? If yes, how can it be done?
What editor do you use? Jupyter, which I am using, doesn't highlight words the way yours does, and it looks very helpful.
It's VS Code - free and works great with Python
@John Watson Rooney how would you go about making your search term a list of search terms? Or use this to look into multiple terms?
Essentially create a list of terms and loop through them, sending each one to be scraped
@@JohnWatsonRooney thanks for the reply. I'll try it out now.
@@JohnWatsonRooney when I run the program it only records the last (search)term in pandas, and no other terms? Why is that?
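One likely cause of only the last term showing up (a guess, since the commenter's code isn't shown) is that the results variable is overwritten on every pass of the loop. A minimal sketch of accumulating instead; `scrape_term` is a hypothetical stand-in for the real scraping function and the search terms are made up.

```python
def scrape_term(term):
    """Hypothetical stand-in for the real scraper; returns a list of dicts."""
    return [{"search_term": term, "title": f"{term} example listing", "price": 0.0}]


search_terms = ["rtx 3070", "ryzen 5600x", "ddr4 16gb"]

all_results = []                 # created once, before the loop
for term in search_terms:
    rows = scrape_term(term)
    all_results.extend(rows)     # add to the list instead of replacing it

print(len(all_results))          # one row per term in this toy example
```

The combined list can then be turned into a single DataFrame after the loop has finished, rather than building a new one inside it.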
Took a noob like me a little bit to figure out what Visual Studio was but after that smooth sailing! Thanks!
Do you happen to know how to return more than 50 per page? My ebay screen shows 200 but the csv only has 50.
How would I solve for an error stating "product is not defined" ? As far as I can tell my syntax matches up with what is presented
Awesome video as always man. Can I ask you a question? I am having trouble scraping a site, I am looking for the element that contains all the divs so I can loop through each one
You can also find_all on the divs themselves if they have a matching ID or class or similar, if that helps?
Thanks John, I ended up getting it. I'm just having trouble looping through all the pages now. I'm using a for loop with the URL containing the page number, but I'm only getting the first page scraped. I couldn't get grequests or request threading to work; so far requests has been the only functional way to scrape. Going back at it today
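A common reason for getting only the first page is that the URL is built once outside the loop, so every request is identical. A minimal sketch of building a fresh URL per page; the base URL and the query parameter names (`_nkw`, `LH_Sold`, `LH_Complete`, `_pgn`) are assumptions based on what eBay search URLs have looked like, so check them against your own browser's address bar.

```python
from urllib.parse import urlencode

# Assumed search endpoint; adjust to the eBay site you are using.
BASE_URL = "https://www.ebay.co.uk/sch/i.html"


def build_page_url(search_term, page):
    """Return the search URL for one results page (parameter names assumed)."""
    params = {"_nkw": search_term, "LH_Sold": 1, "LH_Complete": 1, "_pgn": page}
    return f"{BASE_URL}?{urlencode(params)}"


# The URL is rebuilt inside the loop, so each page number produces its own URL.
urls = [build_page_url("rtx 3070", page) for page in range(1, 4)]
for url in urls:
    print(url)
```

Printing the URLs before requesting them is a quick way to confirm the page number is really changing.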
Hi John,
I am new to programming. I tried to follow your steps in PyCharm, but when I try to print the results, the run console doesn't display any information, only this: "C:\Users\Thinkpad x260\PycharmProjects\pythonProject1\venv\Scripts\python.exe" "C:/Users/Thinkpad x260/PycharmProjects/pythonProject1/main.py" Process finished with exit code 0. Can you tell me where the problem might be? I have been searching the net for 3 days and can't find any information. Thank you
Hey! Is that not what PyCharm says when it completes? Do you have any output like print statements in your code? Also I don’t think the code from this video works properly anymore as eBay updated their site
@@JohnWatsonRooney the output after 'run' is: "Process finished with exit code 0". Also, I used an up-to-date URL and web data
Is there a way to then check the prices (or the csv file) from your phone? And how would you do it
Nice job John👏
Hey,
I really liked the video, but is there a way to fix the problem with the new captcha authentication page?
Hey, I haven’t updated this since the new site and protection was launched - sorry I don’t have a solution for this method right now
Hello John, I want to ask how to find a job with web scraping skills. I am a beginner in the field of web scraping; are there any specific tips for finding a job? BTW, thanks for making this tutorial
I want to know also..?
So I think it’s more about taking the skills we learn by scraping and applying them to other dev jobs. If you scrape a lot, you know a lot about web protocols, you know coding well, APIs and frameworks. Shopify is a good place to look too - people always want things done for that platform
@@JohnWatsonRooney OK, thanks for the tips, I will try to scrape a Shopify site. What framework should I study? Thanks John
@@ferilukmansyah3037 For Python I think Django is a great framework to learn to get paid work
I got all the data (title, price, etc.) in one cell in Excel, not in separate columns. What is wrong with my code? I did everything exactly like in the video.
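Without seeing the code it's hard to say, but two usual suspects are writing each row as one string, or Excel expecting a different delimiter for your locale. A minimal sketch of writing one column per field with the standard library's `csv.DictWriter`; the field names and sample rows are invented for illustration, and `io.StringIO` stands in for a real file so the output can be inspected.

```python
import csv
import io

# Hypothetical rows; in the real script these come from the scraper.
products = [
    {"title": "Example item A", "soldprice": 120.0, "bids": 3},
    {"title": "Example item, with comma", "soldprice": 95.5, "bids": 0},
]

buffer = io.StringIO()  # stands in for open("output.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=["title", "soldprice", "bids"])
writer.writeheader()          # header row: one column per dictionary key
writer.writerows(products)    # one CSV row per product dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

If the file looks right in a text editor but still lands in one cell, importing it through Excel's text/CSV import and choosing the comma delimiter is worth a try.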
What if I want to find all the products when there is pagination? For example, there are about 60 pages; how do I get all the products on every page?
Question: how do I grab the price when there are two, one the original price and the second the sale price?
Amazing as usual. Kudos to your efforts and for making things easy for newbies in Python
Thank you!
It tells me that some attributes are missing, such as find, text and others on product. Also 'h3' isn't found either. Thanks
Yes unfortunately since writing this eBay updated their site, so the specific code doesn’t work. It is still possible to scrape though
Hey John, how do I get the data from a site that has #shadow-root? I don't know what it means. Would you please help me with it?
excellent video..! congrats.!
I'm running into the captcha issue. I tried to add headers to work around it but it's still blocking it. Any suggestions?
I don’t think this way works properly anymore, i think I need to update the video!
@@JohnWatsonRooney Please let me know when this video gets updated. Thanks
Sir, how can I use a proxy with the Edge browser in Python? Please let me know if it's possible.
Brooooo your channel is so helpful, thank you so much!!
Thank you, love your videos :)
Hi, nice video BTW, but please make a video on bypassing captchas and anti-scraping protection
How can I keep the code running all day and send the info to my Discord?
It seems like the structure of the HTML at eBay changed. I can't get the title of the products or the sold date
Yes, I'm afraid this video is outdated now
@@JohnWatsonRooney I tried changing the class tag from the current structure of html but it still isn't getting the title and sold date lol. Hoping an update in the future!
Hi John, I have a problem!!! I can't use .text because it gives me this error: AttributeError: 'NoneType' object has no attribute 'text'. Best regards!!
Hi! That means it didn't find the data you selected. Check by printing the CSS selector or bs4 selection in the code first to see what comes back
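A minimal sketch of that advice: look at what the selection returned before calling `.text` on it. To keep it runnable with no extra installs, this uses the standard library's `xml.etree.ElementTree` as a stand-in for BeautifulSoup (its `find()` also returns None when nothing matches); the markup and the helper name `text_or_default` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical snippet; note there is no <span> for the sold date.
snippet = "<div><h3 class='title'>Example item</h3></div>"
item = ET.fromstring(snippet)

title_node = item.find("h3")
date_node = item.find("span")     # nothing matches, so this is None

print(title_node, date_node)      # inspect what came back before using .text


def text_or_default(node, default=""):
    """Return the node's stripped text, or a default if the node is missing."""
    return node.text.strip() if node is not None and node.text else default


title = text_or_default(title_node)
sold_date = text_or_default(date_node, default="n/a")
print(title, sold_date)
```

If every selection prints None, the page being parsed is probably not the page you expect, which matches the captcha issue described at the top of the thread.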
Hey John, very grateful..
I need a script that gives me the rendered HTML + JavaScript for a given URL.
It should also support multithreading and rotating proxies (I have 10 target websites, each with 100k requests in 48 hours, and I am able to purchase rotating IPs). How do I do this...
Please help me...
Hi Nil. That’s very doable but will be reasonably complex - I would look to use splash as the renderer and rotate through your proxies, utilising concurrent futures to manage your multi threading. The hardest part will be the error handling as without a good way to deal with that the code will struggle. It’s a lot to cover I won’t be able to do it specifically but most of the information is on my channel for each part
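A rough sketch of the shape described above: a thread pool from `concurrent.futures`, a rotating proxy list, and error handling around each result. The proxy addresses, URLs and the `fetch` function are all hypothetical; `fetch` only simulates the render-and-download step (which in practice might go through Splash) so the structure can be run without a network.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import cycle

# Hypothetical proxies and targets, for illustration only.
PROXIES = ["http://proxy-a.invalid:8000", "http://proxy-b.invalid:8000"]
URLS = [f"https://example.com/page/{n}" for n in range(1, 7)]


def fetch(url, proxy):
    """Stand-in for the real rendering request; fails on one URL on purpose."""
    if url.endswith("/4"):
        raise ConnectionError(f"simulated failure for {url} via {proxy}")
    return f"<html>rendered {url} via {proxy}</html>"


results, failures = {}, {}
proxy_pool = cycle(PROXIES)   # hands out proxies in turn, forever

with ThreadPoolExecutor(max_workers=3) as executor:
    # Proxies are assigned here, in the main thread, before the work starts.
    futures = {executor.submit(fetch, url, next(proxy_pool)): url for url in URLS}
    for future in as_completed(futures):
        url = futures[future]
        try:
            results[url] = future.result()
        except ConnectionError as exc:
            failures[url] = str(exc)   # record and move on; retries would go here

print(len(results), "ok,", len(failures), "failed")
```

Keeping failures in their own dictionary makes it straightforward to retry just those URLs with a different proxy later.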
@@JohnWatsonRooney Thank you John, do you take freelance jobs?
A really great explanation, thank you for your effort.
What I'm mostly curious about is this: there are a number of boring daily tasks done in an office environment, such as reviewing the company's listings on different online sales platforms, analysing a listing if its performance drops, deleting and re-uploading the listing if necessary, applying drop shipping, and so on. For someone doing this kind of job, getting outside technical help isn't very efficient; after a certain point, once the system does the work, the person managing the system gets bypassed. So I'd like to start out as an amateur, and if I get results and adapt, take professional training and develop myself in my career. My research on the topic led me to the Python programming language. Let me say up front that I have no connection with the software industry, but because of my job I'm open to research and self-improvement. Do you think I could write a bot like this? The bot also must not be detected by the system. There is a lot of material about this on the internet, but it isn't advanced-level Python; I'm interested in how to write a flawless bot and how to integrate it with my work. My request is that you point me in the right direction with your suggestions and advice. Thank you in advance for your reply
Thanks so much for all of your videos, they're very easy to follow and understand! I just have one question: If, in theory, 'bids', for example, was included for some products but not others, how would you change the script so the bids are still included in the dictionary, even if there is no 'bids' to find for a certain product. Hope my question makes sense!
Hi - glad you found my videos helpful! The easiest way would be to use a "try and except" block - if no bids are found, then the code runs "bids = 0" or something similar.
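A minimal sketch of that "try and except" idea. It uses the standard library's `xml.etree.ElementTree` as a stand-in for BeautifulSoup so it runs with no extra installs; the markup and field names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing markup: the second item has no bids element.
listing_markup = """
<ul>
  <li><h3>Item with bids</h3><span class="bids">7 bids</span></li>
  <li><h3>Buy it now item</h3></li>
</ul>
"""

products = []
for item in ET.fromstring(listing_markup).findall("li"):
    try:
        # "7 bids" -> 7; raises AttributeError when find() returns None
        bids = int(item.find("span").text.split()[0])
    except AttributeError:
        bids = 0              # no bids found for this product
    products.append({"title": item.find("h3").text, "bids": bids})

print(products)
```

Every dictionary ends up with a `bids` key, so the rows line up when they are later turned into a table or CSV.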
@@JohnWatsonRooney Cool, I managed to figure it out and write a workable code! Great way to almost start the new year. Happy holidays!
i comment before i see the video, because i know it's going be very nice :D
:D
Do a video on “updating scraped prices on ebay programmatically”. Solemn request.
You really should cover pagination on eBay. As far as I can tell, the HTML seen in my browser isn't in the soup. I think they are masking it somehow
Would like to see you put up a paid course.
I have plans for one, hopefully I’ll be able to release some information about it in the new year
John, please make a video on scraping Adidas men's shoes
I want to contact you. Please tell me how to get in touch; I need help
perfect