Almost 20K views 😳 Part 2: ua-cam.com/video/PMLg6Rr8fcU/v-deo.html
please upload part 2 sir
I believe this method can be used to automate certain routine processes, but only if the price of GPT-4V is reasonable. For example, say you need to send 10,000 screenshots with a resolution of 1920x1080 pixels to GPT-4V in one day - how much will it cost?🤔🤓
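For a rough estimate, assuming the tile-based image pricing OpenAI published for gpt-4-vision-preview at the time (85 base tokens plus 170 per 512px tile at high detail, $0.01 per 1K input tokens - check current rates before relying on this):

```python
import math

def vision_image_tokens(width: int, height: int) -> int:
    # Scale to fit within 2048x2048 (no upscaling)
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Scale the shortest side down to 768 (again, no upscaling)
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

tokens = vision_image_tokens(1920, 1080)   # 6 tiles -> 1105 tokens
cost = tokens / 1000 * 0.01                # ~$0.011 per screenshot
print(f"10,000 screenshots/day: ~${cost * 10_000:.0f}")  # ~$110/day
```

So on the order of $110/day for the images alone, before any prompt and output tokens.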
Didn’t expect a coding video to be this entertaining. Love the frank display of your thought process.
As a fellow dev, your tutorial helps with the excitement and the anxiety. I knew I could do this myself but kept procrastinating, and eventually some tasks end up as a mental block in WFH mode. Just forcing myself to watch a fella do something like this really helps, thank you!
This guy has superpowers. He can talk and code at the same time!
I love how much of the process of programming he includes in the demo
Seriously impressive. I'm a NodeJS API engineer and you're writing that JS code faster than me!
Thanks! Fast doesn't equal good, though 😅
I just wanted to tell you that you are doing great and I really like your format.
Thank you very much!
Use the retry library and set a low timeout; you can use a simple decorator. If the timeout needs to be high, which isn't very pleasant, consider running multiple requests concurrently and waiting only for the first result, as in the sketch below.
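A minimal sketch of the first-result-wins approach, assuming the async client from openai v1 (`ask_model` and `first_result` are just illustrative helper names):

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def ask_model(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def first_result(prompt: str, copies: int = 3) -> str:
    # Fire several identical requests; return whichever finishes first
    tasks = [asyncio.create_task(ask_model(prompt)) for _ in range(copies)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # abandon the slower requests
    return done.pop().result()
```

Note that you still pay for every request you fire, so this only makes sense when latency matters more than cost.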
A fabulous video that has been of great help in orienting our new collaborators. Your generosity is highly valued!
This was super cool! Don't mind the long format at all. Would love to see you evolve this concept in another video.
I've already filmed the next one. It'll definitely be long form 😅
With full-page screenshots, maybe create an assistant that looks at my bookmarks and their tags, and based on my question tries to get me the info from the page.
Really appreciate your information and style. Learning much!
Thanks for watching!
So for cookies, you just need to know what cookie is being set; in many cases it's likely just a matter of causing the same effect in Puppeteer. One way is to add to the cookie store directly (I'm sure Puppeteer has a way to do this). An alternative is specifying a "user data directory" for Puppeteer, so you can actually agree to things like cookies once and have it persist. Consent popups are often easy to locate using standard HTML locators, simply because the popup is usually set as a priority load event and is often a div/container with a name/id containing the word "consent" or "cookie", so a regex can find these reasonably easily. Use Puppeteer to locate the "OK" button and click it. With that reusable user data directory, for any site you only have to check once whether you have or haven't accepted consent: if not, click it; if so, just scrape.
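A minimal sketch of the cookie-store approach, written with pyppeteer (the Python port of Puppeteer; in Node the equivalents are `page.setCookie()` and the `userDataDir` launch option). The cookie name and value here are made up - copy the real ones from DevTools after accepting manually once:

```python
import asyncio
from pyppeteer import launch

async def main():
    # userDataDir persists cookies (and consent clicks) between runs
    browser = await launch(userDataDir="./scraper-profile")
    page = await browser.newPage()

    # Pre-seed a consent cookie directly (name/value are hypothetical)
    await page.setCookie({
        "name": "cookie_consent",
        "value": "accepted",
        "domain": "example.com",
    })

    await page.goto("https://example.com")
    await page.screenshot(path="screenshot.png", fullPage=True)
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
```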
Legend has it, he’s still trying to find out what the weather is like in Alaska…
Great video. I have a question: can you suggest a way to select dynamically generated element IDs using Puppeteer and OpenAI?
I'm only up to 15:00, but the issue you had at this point is that it CAN read Sam Altman's birthdate - it just doesn't know what today's date is. You can feed it the date in your request, generated with `date()` or whatever.
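For example, in Python (assuming the date goes into the system prompt; where exactly you inject it is up to you):

```python
from datetime import date

# Tell the model what "today" is so it can compute ages from birthdates
system_prompt = (
    f"Today's date is {date.today().isoformat()}. "
    "Answer questions based on the attached screenshot."
)
```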
Chain of thought is mostly meant to improve accuracy, not to fix something you could do with a proper single prompt.
It's interesting that this is exactly what I was looking for. Last night I spent a few hours asking Copilot how to implement the same libraries. Thanks for the tutorial!
No typescript and no copilot? This was a more wholesome time.
a Master in the Arts of coding!
Thanks for the video. Great work.
But where is the token authorization for using the gpt-4 preview?
I don't get the added functionality compared to Google in this demo. Help me out.
What is the weather like in Alaska?
Couldn't you use backoff to handle the error when the API is stuck?
In package.json you can set "type": "module".
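That is, in package.json (this makes Node treat .js files as ES modules, so `import` works without renaming files to .mjs):

```json
{
  "type": "module"
}
```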
So you need to use GPT-3.5 Turbo to get exact answers instead of GPT-4? Weird.
I still don’t have access to the vision API : (
This is so cool and nerdy! Maybe the best channel to follow to learn more and more about the OpenAI API. Difficult but entertaining to follow.
what is the weather like in alaska?
This is awesome. I love your videos. Please keep these videos going, especially this one. I learned so much.
Thank you! More to come :)
Very clever. Congratulations!
If you add something like "Strictly based on the information from the screenshot", you get answers based only on the information it gets from the screenshot.
A little speed-up might be to use the Python requests package to try and fetch the URL first before running Puppeteer - then short-circuit invalid domains, 404s, etc. Also, when doing a completion you can pass `request_timeout=10` or whatever and it'll kill the call. Sometimes it even works.... ;-)
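A minimal sketch of that pre-flight check (`url_reachable` is just an illustrative helper name):

```python
import requests

def url_reachable(url: str, timeout: float = 5.0) -> bool:
    # Cheap pre-flight check before spinning up Puppeteer
    try:
        response = requests.head(url, timeout=timeout, allow_redirects=True)
        if response.status_code >= 400:
            # Some servers reject HEAD; fall back to a streamed GET
            response = requests.get(url, timeout=timeout, stream=True)
        return response.status_code < 400
    except requests.RequestException:
        return False

if url_reachable("https://en.wikipedia.org/wiki/Sam_Altman"):
    ...  # only now launch Puppeteer and take the screenshot
```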
Thanks, I'll try that. Yeah, you can set the request_timeout, but you still have to handle the error by having some recursive function that retries the request if it fails. And I don't have time to implement that. It would take like a minute, lol
I replied to this and YouTube removed it (I think!) - but the Python package 'tenacity' (or the original retry) is worth a look (I'll skip the URL as I think that's what made YouTube remove/hide my comment)
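A minimal tenacity sketch, assuming the v1 openai client (the decorator settings are just an example):

```python
from openai import OpenAI
from tenacity import retry, stop_after_attempt, wait_exponential

client = OpenAI()

@retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=1, max=10))
def get_completion(messages):
    # Retried on any exception, with exponential backoff between attempts
    return client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=messages,
    )
```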
What is the current weather in the world?
Crazy good content! Thank you!
For getting Sam Altman's age, would it help if you stated that the screenshot is taken today? ChatGPT may be hesitant to assume this.
I'd like to see a video from you about navigating websites with Puppeteer. Now that you ask, I'd like a tutorial on how to make it follow links, fill out forms, crawl four or more links deep into a website, handle session cookies, automate and run loops, etc. :-)
very interesting, thanks for sharing!
Great video! Can these libraries handle auth, like the Azure OAuth flow, in order to browse to the page?
A good way is to include a timestamp in the user role message. It will help it calculate Sam Altman's age easily!
Yes, but only because it knows his birthday already (even without the Wikipedia screenshot)
I was wondering how this is different from the web-search capability of ChatGPT Plus right now.
In other words, if I asked GPT to look for an answer on the web, would it struggle to do so?
Is this a hacky way to get better web search via an API-like method because it's not enabled yet in the OpenAI dev tools?
Anyway, I really like the video. Can we use Selenium to do this also?
I wouldn't call this easy web scraping, but this was very hilarious with all the bugs
Is it possible to use Selenium? At least it's Python, so you don't need to switch between two languages.
Yes, it should work too. I just have more experience with Puppeteer (never tried Selenium)
You're already in javascript for puppeteer. Why do the gymnastics of writing your main logic in python?
That's a good point and in the next video I in fact switch to JavaScript only. I prefer Python, though
The Vision API downsamples the image... that's why it cannot recognise small fonts.
Also, I had checked out a Patreon chat (paid), but now I'm just unable to find it. Is it gone?
I'm not on Patreon but I'm on BuyMeACoffee and you can find a link in the description
@unconv Thank you for the good job. I am improving and using it.
There are some pieces that don't work as of today, and I fixed them.
Great video, dude. I'm gonna rewatch it later. I've got a project this might help with.
I appreciate your efforts mannn...
Where do you put the OpenAI key? I can't find anywhere to put it; I tried searching. I'm getting a "billing not active" error.
It grabs it from the OPENAI_API_KEY environment variable. You can set it on Linux by running "export OPENAI_API_KEY=YOUR_API_KEY" and if you're on Windows, I believe you can use "setx" or "set" instead of "export"
Great video! However, I noticed a few instances where you mentioned not having prior experience with certain tasks, but then you later showcased projects where the code was already complete. For example, at 9:29 in the video. This seems a bit contradictory and might confuse some viewers
Thanks! Which tasks did I say I didn't have prior experience with?
@unconv, this is my first time viewing a video on your channel. I observed that you started by looking through the documentation as if it was new to you, despite already having the answer in another file. This struck me as unusual, but I understand it might have been part of your process. When the documentation didn't seem to help, you referred to your existing project. I don't mean this in a negative way; it's just my personal observation from watching this video for the first time
I've used Puppeteer multiple times in the past, but I never remember the boilerplate stuff. I didn't want to jump directly to my own previously written code, because I want to do things from scratch in my videos, not leaving out any steps. And I want to show how I go about researching stuff. But I get that it might have been confusing - although I suspect it would have been even more confusing if I had directly copy-pasted my old code.
Take a screenshot (without closing the Puppeteer session) and ask ChatGPT whether the page looks loaded or not, instead of relying on networkidle0, timeouts, etc.
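A minimal sketch of that check, assuming the v1 Python client (each call costs roughly a thousand image tokens, so use it sparingly):

```python
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def page_looks_loaded(screenshot_path: str) -> bool:
    # Ask the vision model whether a screenshot looks fully rendered
    with open(screenshot_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        max_tokens=10,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Does this page look fully loaded? Answer YES or NO."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return "YES" in response.choices[0].message.content.upper()
```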
Hey man! I would need something like this hosted on a server of some sort, like AWS or Heroku. Is that possible if I build this and deploy it? I need it to scale up to 1,000 requests daily.
A lot of websites will block requests from AWS servers, so you would probably need some sort of proxy server in between.
What is the GPT-4 Vision API?
Great video!
Interesting experiments with the GPT Vision API and Puppeteer. I have a couple of questions and a suggestion:
1. Could you share some insights on the cost aspect of using the GPT Vision API for this project? I'm curious about the pricing and whether it's feasible.
2. Also, have you considered combining classical web scraping methods with the Vision API in a synergistic way? Specifically, using traditional scraping to gather initial data and then employing the Vision API to verify or correct this data where needed. I think this could potentially address some of the limitations of both methods. What are your thoughts on this approach? Looking forward to hearing your thoughts!
Thanks! On the day I filmed the video, my API costs were $0.58. The next day I maxed out the limit of 100 messages of the gpt-4-vision-preview while testing and the total cost for that day was $2.15. These costs include some other API calls as well, though.
Combining classical web scraping and Vision API seems like a good idea. I'll have to look into that when I run into an issue scraping something.
You should try the JSON response mode. You can request a response shaped like this in the system prompt: {data: ExpectedDataInterface, error: ErrorInterface | null}. Good luck!
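A minimal sketch of JSON mode with the v1 Python client. Note that `response_format` was supported on gpt-4-1106-preview but not on the vision preview model at the time, so this fits the text-only steps of the pipeline (the schema in the system prompt is just an example):

```python
import json
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    response_format={"type": "json_object"},  # forces syntactically valid JSON
    messages=[
        {"role": "system", "content": (
            'Respond in JSON: {"data": <the extracted data or null>, '
            '"error": <an error message or null>}'
        )},
        {"role": "user", "content": "Extract Sam Altman's birthdate from this text: ..."},
    ],
)
result = json.loads(response.choices[0].message.content)
```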
The LLM was wrong about what the light on the motorcycle means, since the headlight is ALWAYS on. A simple but important mistake.
BWAHAHAHAHA! the struggle (programming: errors = WTF!!!!) is real.
A day in the life of code building... Awesome video!
Also, I think this is better suited for the Assistants API. I made a private investigator that uses functions. One calls the Serper API, and if it finds a LinkedIn page, it crawls it, strips the HTML, and sends it to get summarized along with the link snippets. The other function gets details on an image URL you ask it to view, using GPT-4 Vision. And I could make those functions run in parallel.
Could you share more details, I'm trying to build similar functionality
I made a drinking game out of the word Alaska. I died.
First, thank you. And a question: how many tokens does this scraping method use?
I haven't checked exactly but it seems to be around $0.017 per scrape based on my API usage during building this
I believe it's not telling you his age because it is trying to provide a precise age, i.e. his current age given his birth date. Don't ask what his age is, but what age the page (or the author of the page) says he is.
14:50 I don't think that's a good idea, because you will burn a lot of tokens (input and output), so it's better to scrape the URLs into a vector store.
Thank you for this helpful video! Can you please try the same task with the functions tool? Thanks!
Excellent Job
Why did you mix Python + JS? I don't see a requirement for it; you could have used a single programming language, JavaScript or Python, and executed the same task within a single project.
Can this work for Instagram scraping ?
No. Instagram doesn't allow repetitive actions.
Is there a reason you don't use copilot?
It often guides me to directions I don't want to go. Also, I'm still learning Python so I'd rather practice my memorization
Remove the word "like" and ask "What is the weather in Alaska?" The question you asked leads to an answer such as "colder than a commercial freezer".
Good point 😂
"In Alaska's land, where coders seek the weather's tale,
They type and query, 'neath the aurora's bright veil.
With every line of code, they ask the sky's mood,
Hoping for sunshine, but prepared for the cold and brood."
Why not do everything in JS? So confusing.
Thank you.
Just kisses for you; so freakin' loved how you explained and debugged along with us.
GPT-4 Vision API limits?
100 requests per day
Why aren't you using the AI to help you code?🤔🤷
I think that he wants to explain the code to us by writing. I use ChatGPT to write code as I'm not a programmer myself, but I find myself learning to code anyway because I still need to understand what I actually need. It's also tiring to pass every small error to chat; it's easier to make adjustments yourself. However, to do that, you need to understand the code at some level.
I actually have Copilot but usually I disable it because it often guides me to directions I don't want to go. Especially when making videos, if Copilot suggests a different way than I was going to go, I get distracted. And I'm still learning Python, so I want to actually learn it. If I always use Copilot, I can get the job done but I probably won't memorize the syntax.
@unconv I think he meant letting ChatGPT generate the entire code, not Copilot.
Because fully AI generated code is unusable
@yungjerky Not anymore it isn't. Never used Grimoire?
I'm not going to watch an hour of you coding, but I will share that you can get an image of each element, and Selenium would probably be a good choice to use for this - something like the sketch below.
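A minimal sketch of per-element screenshots with Selenium's Python bindings (the selector is just an example):

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com")

# Selenium can screenshot a single element instead of the whole page
element = driver.find_element(By.CSS_SELECTOR, "#content")
element.screenshot("element.png")  # crops to the element's bounding box

driver.quit()
```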
Thank you for the Video.
But the way you keep re-typing the question (instead of copy-pasting it) makes me frustrated 😖
Sorry about that 😄
Nice content, but you should just copy-paste the code; we know you can code well behind the scenes, don't worry. Keep doing great!
I see that 0420 there... in 00:31:50 : )
If humans didn't all re-invent the wheel every hour, there would be a huge database of every query: response: list of problems: links to solutions (if they ever figured it out). That would save humanity unlimited man-hours... but probably put OpenAI out of business.
Cool video
I would just use the scraping way and de-HTML it. I've never seen someone with so many problems calling an API.
Great video; just one suggestion: the repetition of what you're typing literally every time is a bit much.
Thanks! I'll try to avoid that in the future (and mistakes leading to repetition in general)
To have productive programming, AI has to return what you want in 100% of cases. It has to be better than a human at deduction.
Great video; at last I see someone on YT who struggles with the API as much as I do…
I know the topic of the video is the Vision API, but you could get better results using a terminal web browser like Lynx, piping the result to a text file and giving ChatGPT that text as context.
Just an idea. 😉
I was gonna dismiss your suggestion by saying one does not simply use Lynx in 2023 since it doesn't support JavaScript, which many websites require nowadays. But testing it out just now, all the examples I showed in this video could have worked with Lynx (based on its output). I don't know how I would extract links and input fields with Lynx, though, to make it crawl subpages. Perhaps all those pages were server side rendered, so I might as well have used Curl.
"Hopefully this is not a Malware" :D :D
This could be the best kodi addon ever
👍
Seems very inefficient to do it that way. Yes, it's an interesting concept, but you can do it all in Python, and your logic can be simplified to get results.
It's not hard to make a scraper. In fact, you probably only need an HTTP request, not a full-on instance of Chrome.
Want more
bro sounds like an AI. Good video tho
Instant fork, all your code belong to us
aaaaa this was frustrating as hell
You would go bankrupt if you used the GPT-4 Vision API to scrape the web... just link your credit card and start scraping.
coding 😛
Using the seed as if it was a hyperparameter shows how little you know about the stuff you're talking about, congrats!
I mean, if you know more about it than me, you could maybe explain further or link to some more information about the subject
@hidroman1993 What a stupid reply, guide him at least if you know better...
It is intolerable how badly you prepared for the video. You can't teach people like that.
This isn't Unconventional Teaching
It is a more natural way as a developer; it is much better that way. I learnt debugging from it.
This is definitely the practical way to watch and learn. I like your style. You are showing the humanity of future coding
I love this approach - similar to how good developers actually code. Keep it up unconv
Meh, he is teaching how to troubleshoot. If you want direct directions just read the API documentation.