Build A Python App That Tracks Amazon Prices!
- Published 12 Jan 2025
- Check out my courses and become more creative!
developedbyed....
🎁Get all files, projects, exclusive videos and more on my Patreon: / dev_ed
In this episode we are going to build out a project using python. If you are a python beginner this simple application will give you some good practice.
We will be using python to make requests and do webscraping on amazon.de. If you have difficulty following along, I highly recommend my python for beginners tutorial which will teach you all the basics of python in 1 hour.
🛴 Follow me on:
Twitter: / deved94
Instagram: / developedbyed
Github: github.com/Dev...
#python #webscraping
Begins with, "If you're poor, you're gonna love this episode".
Noice.
Toit
Cool cool cool cool cool
I smell pennies
AM POOR, FEED CONTENT PLS
Bored i think
Instead of price[1:5] you could have used price[:-3]. That way, the actual length of the string doesn't matter. It will just remove the last 3 characters (decimals + € symbol)
Or you can do price[1:] to get all the characters after the currency symbol and cast it to float. The resulting value can also be used to calculate total price for a list of products
@@RickyC0626 Good idea!
or use regular expressions but as for me i still dont understand them fully :D
This! if the price goes under 1000 it's gonna crash with price[1:5]
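A runnable sketch of the slicing ideas from this thread, assuming German-format price strings like "€1.699,00" (as on amazon.de). Dropping the symbol with price[1:] and the decimals with [:-3] works for any number of digits, unlike a hard-coded price[1:5]:

```python
# Sketch of the thread's suggestion: slice relative to both ends
# instead of hard-coding indices. Assumes strings like "€1.699,00".

def parse_price(price: str) -> float:
    digits = price[1:]       # drop the leading currency symbol
    digits = digits[:-3]     # drop the decimals and separator, e.g. ",00"
    return float(digits.replace(".", ""))  # "1.699" -> 1699.0

print(parse_price("€1.699,00"))  # 1699.0
print(parse_price("€59,99"))     # 59.0 -- price[1:5] would mis-parse this
```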
Noice
"It will never stop, just like my mental state." Once again, I am left with more questions than answers. Thank you Ed -jake
wonderful.
It's not working in my system
It's not working
@@ramiiii amazing too.
Will the program continue running if I close the pycharm
thank you so much for this tutorial! some things dont currently work (since this video is a few years old now and Amazon has changed your ability to scrape from them, it seems), but I still learned the basics about web scraping from this! I changed up my program to also send price differences from other websites too. Even the parts about 2-factor verification were useful and now its something extra I know! Thanks again!
Thank you! I'm trying to learn python and little projects like this are really what i was looking for!
Good to know that you are interested in Python project, I have also created a playlist for Python project. Do check it: ua-cam.com/play/PLBeeFF3JmXWCQh987TsdowLK5U8XwbSzw.html
I'm not into Python yet, but subscribed to make sure that this awesome man is still around when I need him.
thanks for cool stuff bro.
Your videos are natural and unique. You make me feel we are having one on one tutorials.
Brilliant! As a computer science teacher, I will be utilising this for a project
Keep up the great work
this was a great episode, thanks!
i'm new here but appreciate the simplicity - almost would prefer you also show yourself googling the things you googled to FIND the packages you dl'd and imported. Sounds a little goofy, but really new people often ask how you knew what packages to use, and it's so helpful to see experienced people googling it.
Thank you Dev Ed! This works!
I feel so powerful right now xD
Looking for more tuts from your side!
Cheers from India!
This dude: "Okay, if you're poor, you're going to love this episode!
Me: "I'm listening..."
Me: No shit-te!
Haha
sees while(true)
*Yells in C++*
*Never stops yelling*
I started learning c++ and I am now kinda used to it and when I look at this... why didn't he make a universal function....
I was so triggered
im new to coding, mind explaining?
@@walfranpinto80 it's just an infinite loop that will always run when the program gets to it. True is always true so it will never not run.
You usually want some kind of limit on how many times a loop can run, especially in something like the C language family. Python kinda saves you from bad memory management because it does a lot on its own, but it is a convention not to do that if others are working with or going to use your code.
Since this is more of a concept style of video tutorial, it isn't really an issue though.
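For anyone new to this: a minimal sketch of the bounded loop the replies above are describing, with a hypothetical check_price placeholder standing in for the scraping logic from the video. Instead of while True, cap the iterations (or add a real exit condition):

```python
import time

def check_price():
    # placeholder for the actual scraping logic
    print("checking price...")

# Cap the number of iterations instead of looping forever.
MAX_CHECKS = 3
for _ in range(MAX_CHECKS):
    check_price()
    time.sleep(0.1)  # the video sleeps much longer between checks
```

Same idea, but the program is guaranteed to terminate, which matters once other people run your code.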
Awesome!! Maybe you can share your setup about VSCode, sir! Your theme, fileicon theme, font or even useful extension. Thanksss
x34, deliver dude, pretty please!
Material theme
He has a video about his VS CODE config:
ua-cam.com/video/ULssP63AhPw/v-deo.html
I am a python beginner and this simple application will give me some good practice
great solution and tips bro!! thank you for your video.
as a non-tech user, I am using amazon review scraper e-scraper maybe it helps to somebody too.
Thank you, Jack, awesome tip. ESCRAPER really helps me.
awesome suggestion Jack. it helped in my case.
This guy is a legend! Just look at the way he sits behind his computer and in front of camera!
loving the python tutorials! It'd be awesome if you did a python app with some interfaces or something. 😄
Trueeee
Just write the price to a tkinter frame
Tkinter is fun...
Python isn't meant for developing GUI apps
Towelie why not?
I'm a Python beginner, your video is awesome. Please make more Python tutorial
5:18 someone at amazon has too much time
LMFAO
wtf
Duck boi
I was wondering the same thing! LMAO
.__(.)< (MEOW)
\__)
hey ed, prashant here.. i appreciate the way you are always smiling and something in the presentation takes the stress out of the coding part :) I am a subscriber to the channel now ::)
15:05 Duplicate code; you forgot that you already wrote that!
Marked as duplicate.
@@nourios6991 I understood that reference!
What do you expect. They know shit about programming.
@@saptarshisengupta5073 what
Good sample, but you have to include the directive 'Cache-Control': 'no-cache' in your headers, otherwise you will always see the same cached page and you can't get the new price.
headers = {'Cache-Control': 'no-cache', "Pragma": "no-cache"}
Are u sure? Shouldn't it check first the date of the last real update of a page before returning cached page? Hint: there is a special header for that in HTTP protocol.
i'm interested, following the conv
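Putting the two suggestions above together, a sketch of the headers dict (the User-Agent value is just a placeholder for whatever you already send, and the requests call is commented out since it needs a live URL):

```python
# Merge the cache directives from the thread into the scraper's headers.
headers = {
    "User-Agent": "Mozilla/5.0",   # placeholder; use your real UA string
    "Cache-Control": "no-cache",   # ask caches to revalidate with the origin
    "Pragma": "no-cache",          # HTTP/1.0 fallback for older proxies
}

# Then pass it to the request exactly as in the video:
# page = requests.get(URL, headers=headers)
```

As the reply notes, HTTP also has conditional-request headers (If-Modified-Since etc.), but forcing no-cache is the simplest way to be sure you are not reading a stale copy.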
just to point out, remember that price[0:5] will not be effective for smaller prices; in such a case, I would consider splitting this string by the delimiter (the dot) and then saving the first part of the outcome. This way the script won't depend on the number of digits. Nice tutorial!
Why not just .replace() the comma and currency symbol with nothing and convert to float to get the exact price? Or is that very inefficient?
Man please more videos... You are a fantastic teacher 👨🏫
I have added also 'else' with small print to inform me that price has not changed without while True loop. and I used Windows task scheduler to check the price once per day. thanks a lot, love your positive attitude!
I found a solution for the "none" return/error instead of the title (or price - depends what you want to scrape):
instead of writing
soup = BeautifulSoup(page.content, 'html.parser')
I used
soup = BeautifulSoup(page.content, 'lxml')
and it works for me. I got to the point that i can send emails :) thanks for this tutorial!
Thank you so much!!
" In python we are cooler we do this the other way" 😂😂😂
Afaik there is no Import in JavaScript. However it exists in NodeJS.
Would the program still run if I close pycharm
@@alialshah8466 i had the same doubt
13:19 why not just add the "URL" variable after that string? that way you won't have a problem if you modify only the link above and forget about the second one
came here looking for this comment :)
@@jasperdiscovers Nice :))
And literally ONE second later he uses f' ' string formatting to explain how to include a variable inside a string LOL. He also called the send_mail() function twice with two identical if statements. I guess he forgot he already wrote that piece of code two minutes earlier. Anyway, he did a really good job. Not complaining at all.
9:15 he googled python web scrapping amazon XDXDXD i love this man xd
Of all the tutorials I have seen, this explained far better. Thanks.
We can use Regex to extract the price without specifying the elements, because what if the price by magic went down to the 10s or 100s.
Here's how i would extract it ... x = re.findall("[0-9]+\.?[0-9]*", price) then use float(x)
i never programmed in python but i learned it from java, and found that the library for regex in python is re.
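A runnable version of the regex idea above. Two small fixes to note: re.findall returns a list, so float() needs the first element rather than the whole result, and the thousands comma has to be stripped before matching:

```python
import re

def extract_price(price: str) -> float:
    # findall returns a list of matches; take the first one before casting
    matches = re.findall(r"[0-9]+\.?[0-9]*", price.replace(",", ""))
    return float(matches[0])

print(extract_price("$1,699.00"))  # 1699.0
print(extract_price("$59.99"))     # 59.99
```

This handles prices of any magnitude, which is exactly the "what if it drops to the 10s or 100s" concern.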
Ok. After this video I'm definitly sure, that I want to learn Python. Damn. Easy win.
Seems python = crush JavaScript first, then forget { } exist instead just indent #ez
I am only concerned at the fact that Amazon used "MEOW" somewhere in their code...
That was a comment
jajajaja me too
someone had too much time to include that . Lmao .
Hey dude, this is 1st time I'm watching your video but loved it really I mean I'm in love with it. Yeah and subscribed it with notification bell so don't worry. See you soon.😍😍
This is so cool! I'm using it to track System of a down tour data, it will create a link-file to the page on my desktop as soon as there are new tourdates scheduled. I don't even have to check emails. Thank you so much! Want to give multiple thumbs up!
DISCOVERED YOUR CHANNEL LAST NIGHT! TOO GOOD.
Subscribed!
Didn’t know about you, saw one video and now you’re my fav! Haha.
Hello, Do check out the cool Python project video from this playlist: ua-cam.com/play/PLBeeFF3JmXWCQh987TsdowLK5U8XwbSzw.html
“HAVE YOU HEARD OF HONNEY!”
“THIS VIDEO IS SPONSORED BY HONEY”
“THIS VIDEO WOULDN’T BE POSSIBLE WITH OUT HONEY”
MisterabEAST lol
I like your project, keep sharing mind blowing projects champ
1:40 and i liked you already. Subscribed. I like how you talk about programming like it's some magic stuff.
There are 2 enhancements for this video
First one: You could pass the URL as an argument, so the function could be more generic
Second: If you were on Linux (I don't know about Windows), you could use CRON Jobs to execute the code
Nice job and good episode
Awesome man , loved this. Just now executed my code.
Thanks a lot for making this video
keep making video like this
At 16:30 , why converted_price > 1.700 in two if statements?
he forgot he did it already and didn't notice
just to be extra sure
By the way, these two conditions are not the same..
@@thebiziii What do you mean that they are not the same. He changed the second one, that's true but only because he wanted to demonstrate the function if the condition is true.
They are both the same
> is not the same as
"And dont send 1000 request to mess with their server"... pff you know us. We never wanna do things like this...
Amazon isn't very pro scraping. They'll block you real quick.
@@FlyingUnosaur True, that's way I had to develop scraper that use 50+ threads to collect thousands of products every day.
@@16bitart did you use proxies?
@@FlyingUnosaur Yes, this is only way. But I used free proxies with rotation.
@@16bitart ok thank you
I would like to program a dynamic logo generator with export and preview functions in Python. Specifically, an existing image is to be supplemented by five entries, after which the supplemented logo can be exported as ".zip", where .png, .pdf, and .png of the logo can be found.
In addition, there should be two preview areas, which should dynamically adapt to what has been entered in the above-mentioned entries. The two preview areas will be to the right of the five entries.
smtplib stands for Simple Mail Transfer Protocol LIBrary. It is used to send emails at a basic level (reading mail needs a different protocol, e.g. IMAP).
omg this is so useful, I'm gonna apply this to other things
You should explain how to make it run silently and on the start of the pc
(On windows 10. Haven't tested on other versions of windows or any other os) Make the .py file .pyw file. Then type win+r and type 'shell:startup'. It will open a folder where you move your .pyw file or shortcut to it. Next time you boot it will run script silently on the background. You're welcome
Edit: Also change the code so that if the price hasn't dropped it goes back to checkPrice() and quits when it has dropped and send the mail
@@adventune375 can you run this on a web server ?
@@ZeroPlayer119 I don't know. Haven't tested it. You maybe need to modify it a bit but it could be possible
@@ZeroPlayer119 Yes, as long as the web server has python installed.
You could also just move the script to your windows startup repository (:\Users\Username\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup). I also made a schedule system for the script. I will post the link to my github if you want it.
Awesome. I set this up on my Raspberry Pi and had it ring a bell when my price target was reached. Saved me $100 today.
did anyone else have the problem where the soup data was really short and had things like "if you want access go to some Subscription API" and "we want to make sure you're not a bot and you should enable cookies"
Started from here and now Scrapy expert, Amazing video.
I am facing this problem when I try to print title on 6:34
AttributeError: 'NoneType' object has no attribute 'get_text'
Because you have to check the name of the "id". You can't simply copy his id because it returns None if it doesn't find anything. You are basically doing None.get_text()
Same. Any advice?
He is using the .de (germany) domain and not .com (international), if you use the german or uk domain it will work.
if soup2.find(id="productTitle") is not None:
    title = soup2.find(id="productTitle").get_text()
    print(title.strip())
else:
    title = "No result"
    print(title)
at min 5:45 i tried it and it gave me this (AttributeError: 'NoneType' object has no attribute 'get_text')
what's the problem?
In the documentation for Beautiful Soup, it says that the find() method will return None if no element with the given id is found. Meaning, whatever you listed as the id was not found in the html content you pulled.
@@godihateyoutube you cant use bs4 anymore
The solution is actually quite easy: the html parser can't parse Amazon's html very well, so just use html5lib or lxml instead, my friend
it works but only every 4th or 5th time =( what can i do now?
@@juliangeiler2515 what's the error you get? I will try to help you if i can.
I need this guy in my life
Subscribered.
You can have this python script executed via crontab (if using Linux), so the script will run systematically, say, 1 or 2 times a day, and the price will be checked in the background without you even noticing it. I included logging of the price to a text file, and a plot of its temporal evolution, so as to have a nice overview of the trend; each item I track is included in a list (with its own threshold).
Keep it up, Dev Ed - nice going!
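For anyone wanting to replicate the crontab setup described above, a sketch entry (both paths are hypothetical; adjust them to your system):

```shell
# Run the price checker at 08:00 and 20:00 every day and append output
# to a log file. Add this line via `crontab -e`.
0 8,20 * * * /usr/bin/python3 /home/you/scraper.py >> /home/you/scraper.log 2>&1
```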
For whoever has problems converting the string to a float: you should replace the comma with a period
price = soup2.find(id= "priceblock_ourprice").get_text()
converted_price = float(price[1:6].replace(",","."))
print(converted_price)
Output = 1.998
I tried but when I try to find something it returns none 😅
Me also! What goes wrong?
also to me and I tried everything
If you're printing a function, it'll likely return None there. None is returned when, well, nothing is returned haha. He's printing something that has a value inside the function, which is why it worked, but if you just print a function call, it'll say None unless you return something. Kinda like, say I made a function def math(): x = 5 * 5. If I print(math()), it'll say None. If I add a return, like def math(): x = 5 * 5; return x, now the print will display 25.
Try typing requests.get(URL), instead of requests.get(URL, headers=headers). Line 9 on video.
@@dunkboyys2361 Thank you
In my VS code i don't see an explanation of the functions as you have. At 5:01 typing "title = soup.find"
How can I get this?
guess python extension for intellisense
You get that by installing the Python extension : marketplace.visualstudio.com/items?itemName=ms-python.python
But as soon as you close the program (the code in Visual Studio Code) you also close the loop, correct? So you need to have your python program continuously running?
Noname Noname yea same thoughts. Has to run on a server to be not useless i guess :D
Or use cron/a scheduler to run it occasionally
Your projects are the only ones that I see are productive. Thank you for making these videos.
Wahey - I just did this and it totally works. Now I need to use Python to automate other aspects of my life... thanks dude!!
hmmm Yahoo?... internet modem flash backs🔥🔥🔥🔥🔥
I have a question, when I try to print using "print(page.status_code)", the result is 503 which means that the service is unavailable, what does it mean?
Does it mean it is not available to handle our requests??
14:45 how can you not see that you have the same if condition and send_mail() function just 3 lines above....???
Really cool Dev Ed! Easy to scrap things better.. loved it!!!
Yo, great content bruh. Could you also share with us as to how do you research about those libraries you used in the video? Thanks very much!
Hey Rishi! You can have a look at www.tutorialspoint.com/python/python_modules.htm Specifically have a look at built-in modules
:D Good luck
5:16 How come when I print(title) it says None
R u sure your variable is named correctly?
soup = bs(page.content, 'html.parser')
soup2 = bs(soup.prettify(), "html.parser")
use soup2 instead
im having same problem
@@HansPeter-gx9ew I'm using Ubuntu OS. I'm facing same issue.. I tried your suggestion but it doesn't work for me.
Error :
The code that caused this warning is on line 11 of the file scraper.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.
soup1 = BeautifulSoup(page.content, "html")
scraper.py:12: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 12 of the file scraper.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.
soup2 = BeautifulSoup(soup1.prettify(), "html")
output :
None
@@HansPeter-gx9ew Thank you! However, I got warning to specify a parser, so I used "lxml" instead of "html." Works as expected now.
Please make more JavaScript videos.
Will make js videos too don't wory, but it can get a bit monotone for me to only do that 24/7 😀
LOVED IT BRO...UR EXPLINATION IS LIT.......
you are awesome i don't have words to thank you for being on you tube...
love you 3000
lol when you set the price to price[0:5], this is bad programming. always do a regex lol. I'm also wondering how the euro symbol disappears in that string splice. when you start at 0, it should get the first character which is the euro symbol right?
Or for this specific case, you can find the euro sign and split the string until it, I agree with you man
@@yskhcl i just find it strange lol.
I too had the same query!
"it's not stopping! It will never stop, just like my mental state" lol, lmao. ;)
Shit
if statement with parentheses?
python zen: We don't do that here my friend.
The while loop as well
Дмитрий Авдеев didn’t watch that long, as soon as i saw that, i just bailed.
subbed! Just came from suggestion by youtube page, and simply loved your content and how you explain! cant wait to see more python videos.
This is actually simpler than I originally thought, very interesting video mate
Seems as if amazon no longer easily allows scraping like this.
oh is that why it wasnt working for me? i wasted 3 hours trying to get this working...
@@mokafi7 F
I expected this and first searched the comments before wasting hours of my life. F
@@FRElHEIT I added a Referer to my header and it seem to bypass the web scraper block ex. Referer : "www.google.com"
Works for mw
why does it not work anymore?? it just says none instead of price
I ran into that problem too. I even changed the text to a-price-whole to reflect the page's current format. 🤷
when ever i use .get text this comes up please help:Traceback (most recent call last):
File "main.py", line 7, in
title = soup.find(id="productTitle").get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
try these out:
soup = BeautifulSoup(page.content, "html5lib") (you need to install html5lib)
if this doesn't work, use
New code: soup = BeautifulSoup(page.content, 'lxml')
Hi, i tried html5lib as well as lxml. Same error , could you please help.
@@tinkug2798 are you finding element by id or class?
if you are using class use:
title = soup.find("div", {"class": "_35KyD6"}).get_text()
@Ayush, hey ayush, i am finding by id just like it is shown in the video
Thanks Ayush. I think i got the issue it is varying from website to website, i tried on amazon.in same code worked perfectly fine. Amazon.com it is throwing this error .
smtplib is a library in Python that uses the SMTP(simple mail transport protocol) layer to send emails from a client
awesome video. i’m gonna use this as a base for practice and make a little web app out of it which allows the user to input an url himself :)
i have written the same code as yours and i got "None" after executing the file
help please!
soup = BeautifulSoup(page.content, 'html.parser')
soup1 = BeautifulSoup(soup.prettify(), "html.parser")
title = soup1.find(id="productTitle")
print(title)
recommended by a guy earlier
@@sibbonsshrestha3438 Thank You
that part is written in JS so it's technically not there. Use selenium
@@sibbonsshrestha3438 I LOVE YOU.
Hey there! Thanks a lot for sharing your knowledge. It helps a lot for students like me to stand apart. :)
13:27 USE THE VARIABLE OH MY GOD.
Man you are a fantastic programmer and a more awesome person!
A big hug from Monterrey Mexico
Thanks a lot for everything you do !
You can also just fetch all the html data, compare new and old data, and if they are different, then a change was made on the page. Then just pull whatever data was changed into a table. You can also use text logic: integer,integer.integer is most likely the price format, so you can discard all string results and will most likely end up with a collection of changed prices. I think this way the only changing input is the site address, but it can be used on multiple sites that use the price format integer,integer.integer.
or just use amazon api
can you make an app that buys the camera automatically whenever it reach a certain price...is it possible?
MOHAMED YUSEF i think you can make it whit selenium module. Its use in web automation.
Yeah but it's better to use Selenium for that. That being said, Amazon might stop you from connecting if you use a headless browser, so try some stuff out.
Andrei Lazar if it's online there's a way for it to be done. Amazon can do a lot to prevent it, but a determined coder can find a way around it, whether that be scraping the page for the CSRF token and sending that in the header or a hidden form value. It's impossible to prevent a public page from being used by bots completely.
A friend of mine built a bot that did exactly that. He actually gave it full access to his PayPal credentials to purchase items. It went rogue and it bought some really weird items on its own. It bought a pink selfie stick, winter gloves, and some other totally useless stuff. Too funny.
could not convert string to float:(
Mine too! use single quotes converted_price < '1,700' as a string
in my case, the output of price variabel is $1,700.00 or something like that,,
so I make the code like this: converted_price = float(price[1:6].replace(",", "."))
this will return a float 1.700
@@fadhlulfahmi5125 Even I had to StackOverflow this stuff. Pretty confusing as str_replace() was not working properly but the . operator did the perfect job. You know why str_replace() didn't work? depreciated?
@@abbyboing as far as i know, python use str.replace() not str_replace()
@@fadhlulfahmi5125 maybe str_replace() would've been some old function, as I did see alot of that on stack overflow.
Anyone else have issues with the .get_text() request when he defined title and price?
Try importing:
from django.utils.translation import gettext
Did you find a solution? Getting this as well
@@__-vq9mb exactly as you've typed it or what command should we put in the terminal?
Great Video !!!
Since always I wanted to use this Beautiful Soup and now thanks to you I can do it.
I definitely give a like !
Great stuff! Straight forward and made simple! Thanks mate!
Do you not declare the conditional twice? thanks for your video!!
He did. saw it too xD
It say None our get an error when I print title
I've tried to add the second parameter to headers and it helped "headers = ({'User-Agent':
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
'Accept-Language': 'en-US, en;q=0.5'})
@@elenatarasova9027 thanks!
Looks like Amazon updated their site. It keeps detecting me as a bot and gives me a different html.
My request returns a bunch of text that is not nearly as long as what you got when you used print(soup) at 4:00. It doesn't contain any info I can pull out like product title or price that I can find when I inspect the actual website. Does anyone know why this happens?
OMG bro it's like your brain works exactly like mine except you get the results (via hours of trial and error and study?). I love that you offer your learnings to share just like we were talking sh*t over a cup of coffee. Or a beer. Or several. You help keep this ageing brain growing. Much love =)
Theoretically you could observe the stock market using this tactic, and use an alternate email for the amount of emails you get.
ELABORATE.
When he checked the email if it was sent, it looked like he pulled up Yahoo and not Gmail, or was I tripping.
4:18 “it will never stop. Just like my mental state”
Man this was FANTASTIC
6:35 uhh unless you use === you can compare a string, for example
"1" == 1 is true
"1" === 1 is false
My man, you can't be sending an email from within a function called "check_price()". If it's called "check_price" then it should just "check a price" (whatever that may mean). It should certainly not be sending an email, or whatever else that is not related to checking a price. When you turn on your tv, you don't expect your front door to open.
Ok mr internet police, maybe instead of calling it "check_price" he could've just called it "main" or "amazon_scraper" but he did it for the sake of the video. if that's what you're getting out of this video then i feel very bad for you.