I have watched millions of videos on YouTube, Udemy and other platforms but this was probably the best in terms of information, fun and engagement. Incredible work man, cheers!
Puppeteer on cloud functions + the devtools protocol can allow you to turn clever devtools hacks into a full HTTP API endpoint.
New versions of Puppeteer no longer work in Google Cloud Functions due to dependency issues, so unfortunately you now need to run Puppeteer in a container. The Puppeteer docs recommend Cloud Run.
Please make more videos on this subject. Like cloud functions with parameters and issues with CORS.
Thank you Jarrod! It works for me and you saved my day! I realize that we are not always supposed to use the latest versions.
Great video. Thank you for continuing to make these!
Thanks, but could you do a video that shows how to use these functions from within say, a React project? For example, I'm working within an existing React project and used firebase init within the root dir, which creates the functions folder. The firebase deploy command looks for functions inside that folder, and the syntax you show here won't work. Any help? Would be an awesome vid, thanks!
Very interesting. Good explanations. However, I have poked at a few of your other videos and haven't found a reference to your kickstart script yet. Did you show the whole thing around 1:02 of this video, or just a snippet? Thanks!
Yeah it's very short and I have a few for different types of projects (e.g. rust, typescript). You can check it out here twitter.com/jsoverson/status/1097881605824237569?s=19
@@jsoverson cool. Thanks.
Doesn't seem to work on Debian under WSL (Windows Subsystem for Linux), but that's ok. I get the intent and idea of it and can poke around with it when I get the chance.
I could be totally wrong, but would it not be important to close the browser and context resources to keep billing costs down?
Do you need to close the browser as well when doing this, or will the function do it when the response is returned?
How long will the Puppeteer instance be running? So even if we execute many cloud functions in parallel, those all use the same puppeteer instance?
That's what it looks like. I'm currently having a problem where it runs the same request twice at about the same time. The only neat thing about that is that it seems to show that one instance is used to run both queries.
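For anyone wondering how the sharing in this thread can come about: Cloud Functions may keep module-scope state alive between invocations on a warm instance, though that is not guaranteed. Below is a minimal sketch of a lazily created, shared browser promise. `makeBrowserGetter` and the `launch` parameter are hypothetical names for illustration, not something from the video; in a real function `launch` would be something like `() => puppeteer.launch()`.

```javascript
// Hypothetical helper: wraps any async `launch` function so that it is
// called at most once per warm instance, and every caller shares the result.
function makeBrowserGetter(launch) {
  let browserPromise = null;
  return function getBrowser() {
    if (!browserPromise) {
      // If the launch fails, forget the promise so the next call can retry.
      browserPromise = launch().catch((err) => {
        browserPromise = null;
        throw err;
      });
    }
    return browserPromise;
  };
}

// Usage (assumes puppeteer is installed):
// const puppeteer = require('puppeteer');
// const getBrowser = makeBrowserGetter(() => puppeteer.launch());
```

The browser then lives as long as the instance does, so it is the per-request context or page that gets closed, not the browser itself.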
Can I use the same idea for running Puppeteer on Cloud Functions with a logged-in area like GMP (Google Marketing Platform)? I have a local Node.js script which does this job, but I need to run it manually every day.
Jarrod, have you ever run into this error when running puppeteer on a function? Timed out after 30000 ms while trying to connect to Chrome! The only Chrome revision guaranteed to work is r674921. I have tried to adjust the runtime environment between 8 and 10 and also changed the args to all the examples on the API. I get the same error no matter what. Any help would be great! Thanks again!
I have but not recently. What version of puppeteer+chrome? Is it a bundled chrome? Is this locally or on GCP?
@@jsoverson puppeteer: "1.19.0" - I am using it with a Firebase function. It looks like the internet has recently had that same issue.
@@planetmall2 No I haven't seen that yet. 1.19 was what I used in the video and that worked (I just tried again). Maybe chrome isn't starting properly for some other reason? Is this new behavior or the first time you've tried it? Try some of the following args: --disable-gpu, --disable-dev-shm-usage, --disable-setuid-sandbox, --no-first-run, --no-zygote.
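In case it helps anyone trying those flags, here is a rough sketch of how they could be passed to Puppeteer. `buildLaunchOptions` is a hypothetical helper name, and `headless: true` is an assumption; the `args` array is the standard `puppeteer.launch` option for Chrome command-line flags. No promise that this fixes the timeout, it only shows where the flags go.

```javascript
// The flags suggested in the reply above.
const CHROME_ARGS = [
  '--disable-gpu',
  '--disable-dev-shm-usage',
  '--disable-setuid-sandbox',
  '--no-first-run',
  '--no-zygote',
];

// Hypothetical helper: builds the options object for puppeteer.launch().
function buildLaunchOptions(extraArgs = []) {
  return { headless: true, args: [...CHROME_ARGS, ...extraArgs] };
}

// Usage (assumes puppeteer is installed):
// const puppeteer = require('puppeteer');
// const browserPromise = puppeteer.launch(buildLaunchOptions());
```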
@@jsoverson I tried a few different versions for the args with no luck. I will continue to diagnose.
Did you find a solution for that yet? I'm having the same problem. It seems like this is a common issue if you check on GitHub: github.com/puppeteer/puppeteer/issues/4796
Can I run it without headless mode?
You sure you don't want to close both the context and browser? My script ends in a blank tab if I just close the context.
Super informational video, great stuff!
Thanks River!
Jarrod, you are simply the best.
The github repo is not accessible. Can you please check?
How can we determine what an appropriate memory size for our function will be?
This is scary powerful.
This runs fine locally, but I get "Error: could not handle the request" when I go to the URL in the cloud.
Sorry, my mistake. I didn't have billing enabled.
@@darekdede1995 Rather than using Google Cloud services I ultimately just used AWS which I'm used to. If you're interested in the AWS Lambda route for using Puppeteer see: github.com/alixaxel/chrome-aws-lambda
You're awesome! Keep it up. Subscribed to your channel. Please post more videos on Puppeteer.
NICE! got mine to work too!
Thank you my friend
I have the browser running with headless: false so that I can see what's going on, and it seems like it is launching the browser twice using the same query:
Am I doing something wrong? If I remove the res.end()/.send() it only runs once but then I am unable to deploy...
exports.zillowScraper = async (req, res) => {
  const url = req.query.location
    ? `www.zillow.com/homes/${req.query.location}`
    : "www.zillow.com/homes/92259/"
  const browser = await browserPromise
  const context = await browser.createIncognitoBrowserContext()
  const page = await context.newPage()
  await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36')
  zillowScraper(page, url, context)
  res.empty
  res.end()
}
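One thing that stands out in the snippet above is that the scraping call is not awaited before `res.end()`, so the response can go out while the page work is still running. Whether that explains the double launch is not clear from the comment. Here is a hedged sketch of a handler that finishes the work and closes the context before responding. `makeHandler`, `getBrowser` and `scrape` are hypothetical names, and the `https://` prefix is an assumption, since `page.goto` needs a full URL.

```javascript
// Hypothetical factory. `getBrowser` resolves to a shared Puppeteer browser;
// `scrape(page, url)` is whatever async function does the actual page work.
function makeHandler(getBrowser, scrape) {
  return async (req, res) => {
    // Assumption: full URLs with a protocol.
    const url = req.query.location
      ? `https://www.zillow.com/homes/${encodeURIComponent(req.query.location)}`
      : 'https://www.zillow.com/homes/92259/';
    try {
      const browser = await getBrowser();
      // As in the comment above; newer Puppeteer versions rename this
      // method to createBrowserContext().
      const context = await browser.createIncognitoBrowserContext();
      let data;
      try {
        const page = await context.newPage();
        data = await scrape(page, url); // wait for the work to finish
      } finally {
        await context.close(); // release the context even if scraping fails
      }
      res.status(200).json(data);
    } catch (err) {
      res.status(500).send(String(err));
    }
  };
}

// Usage in a function file:
// exports.zillowScraper = makeHandler(() => browserPromise, myScrapeFunction);
```

Closing only the context keeps the shared browser available for the next request on the same instance.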
Hello Jarrod!
How do I display, in my index.html, a return message from a function (using puppeteer) from another .js file?
For example:
I have an index.html page, and an index.js.
In index.js I have a function that returns data taken from any page using Puppeteer. And I would like to display this message as a on my index.html page.
Thanks for your attention and great videos.
* Sorry for Google Translate's English.
Can you elaborate? Like via AJAX on a regular webpage?
@@jsoverson I'm trying to develop it. The Puppeteer code is ready. Now I have to display the function's return value in HTML, but I don't know how.
That's a tough problem to solve without knowing much of the app. Are you returning the data as json?
@@jsoverson I can return the data any way.
About the app, it's simple, there's only one .js file where I use puppeteer to enter, log in and extract data. And I wanted to display this data in an html page. Is there another better method of doing this?
@@leopoldocouto you can return html directly from the function if you want to. You can treat it like a normal express app so you can use any template system you want (like handlebars). If you are returning it as json you will need a separate html file to call your function via XHR or Fetch and render it that way. If I've misunderstood the question, can you upload an example of the problem to GitHub?
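To make the "return HTML directly" option concrete, here is a small sketch with no template library, assuming the scraped data is a list of strings. `escapeHtml`, `renderPage`, `makeHtmlHandler` and `getItems` are hypothetical names; `res.status(...).send(...)` is the Express-style response API that HTTP functions expose.

```javascript
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Escape scraped text so it cannot inject markup into the page.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Render a list of strings as a minimal HTML page.
function renderPage(items) {
  const rows = items.map((item) => `<li>${escapeHtml(item)}</li>`).join('');
  return `<!doctype html><html><body><ul>${rows}</ul></body></html>`;
}

// Hypothetical factory: `getItems` is the async function that runs Puppeteer
// and resolves to an array of strings.
function makeHtmlHandler(getItems) {
  return async (req, res) => {
    const items = await getItems();
    res.status(200).send(renderPage(items));
  };
}

// Usage in a function file:
// exports.report = makeHtmlHandler(scrapeMyData);
```

For the JSON route, the function would call `res.json(items)` instead, and a separate index.html would fetch that endpoint and render the result in the browser.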
Jarrod, do you have this video with subtitles in Brazilian Portuguese? Thank you very much! 😉
THANK YOU
Wow, no need for Express!?
Your subscribers (I) are hungry for content!! Haha xD
😖 I've been traveling and it's beating me up. I probably am still one week out until the next one. What should it be on? I've got puppeteer stuff, deepfake stuff, reverse engineering, GCP stuff, attack tools. There's a lot on the backlog.
@@jsoverson Good question. I'll give an answer to that question shortly.
@@jsoverson attack tools or puppeteer stuff, personally I think.
"A Gigabyte of RAM Should Do the Trick"