Scraping with Playwright 101 - Easy Mode

John Watson Rooney

16,447 views


Published 15 Jan 2025

COMMENTS • 32

@robertramirez2167 9 months ago ⁺⁶
I like that image blocking tip!
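The image-blocking tip mentioned above can be sketched with Playwright's request routing. This is a minimal sketch, not necessarily the exact code from the video: the names `BLOCKED_RESOURCE_TYPES`, `handle_route`, and `open_without_images` are hypothetical, and the choice to block only images is an assumption.

```python
# Assumption: only images are blocked; add "media" or "font" to block more.
BLOCKED_RESOURCE_TYPES = {"image"}


def handle_route(route):
    """Abort requests for blocked resource types and let the rest through."""
    if route.request.resource_type in BLOCKED_RESOURCE_TYPES:
        route.abort()
    else:
        route.continue_()


def open_without_images(url):
    """Load `url` with image requests blocked and return the page title."""
    # Imported here so handle_route can be exercised without a browser install.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Register the handler before navigating so the first load is filtered.
        page.route("**/*", handle_route)
        page.goto(url)
        title = page.title()
        browser.close()
    return title
```

Skipping images usually cuts bandwidth and load time, which matters most when many product pages are opened in a loop.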
@bigoper 7 months ago ⁺¹
This is awesome!!
As an API Security Specialist, I always start by looking at the HTTP calls, searching for an API call that might return the same info. That saves me the time of scraping the page. Most of the time this approach succeeds, especially when dealing with solid companies/websites/platforms.
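One way to look for such API calls is to log the JSON responses a page triggers while it loads. The sketch below assumes Playwright is used for this (the comment does not say which tool the author uses); `is_json_api_response` and `collect_api_urls` are hypothetical helper names.

```python
def is_json_api_response(response):
    """True for XHR/fetch responses whose content type mentions JSON."""
    content_type = response.headers.get("content-type", "")
    is_background_call = response.request.resource_type in ("xhr", "fetch")
    return is_background_call and "json" in content_type


def collect_api_urls(url):
    """Load `url` and return the URLs of JSON API calls made by the page."""
    # Imported here so the filter above can be tested without a browser install.
    from playwright.sync_api import sync_playwright

    found = []

    def on_response(response):
        if is_json_api_response(response):
            found.append(response.url)

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("response", on_response)  # listen before navigating
        page.goto(url, wait_until="networkidle")
        browser.close()
    return found
```

If one of the collected URLs returns the product data directly, it can often be requested with a plain HTTP client instead of a browser.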
@graczew 9 months ago ⁺¹
Good content as always. Enjoy your Easter break 😉👍
@alexanderkomanov4151 9 months ago ⁺²
Great one!
I think using the pytest-playwright package can save several lines of code in the initialization part, because you can just use the page: Page fixture.
@NomadicDmitry 5 months ago ⁺¹
Really great tutorial! Thanks, John!
@bgriffin5447 5 months ago ⁺¹
That split move was nice
@Extrey 9 months ago ⁺¹
Nooooo waaaay, I just found the schema on other websites. Nice trick anyway, but I find it more efficient to read the info from the category pages. Thanks for your videos, they always inspire me!!!
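Assuming the "schema" in this comment refers to schema.org data embedded as JSON-LD script tags (the comment does not spell this out), a minimal standard-library sketch for pulling it out of a page's HTML could look like this. `JsonLdParser` and `extract_json_ld` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than failing the whole page


def extract_json_ld(html):
    """Return a list of decoded JSON-LD objects found in `html`."""
    parser = JsonLdParser()
    parser.feed(html)
    return parser.items
```

The HTML can come from any source, for example `page.content()` in Playwright or a plain HTTP response body.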
@mohsinhassan88 9 months ago ⁺³
Omg why the white editor??
@рнт 9 months ago
Exactly. When I saw it I immediately remembered this video: ua-cam.com/video/XlgqZeeoOtI/v-deo.html 😂
@tendosingh5682 9 months ago ⁺²
For some it's easier on the eyes. My eyes can't stand the dark themes.
@mohsinhassan88 9 months ago
@рнт exactly how I felt, especially since John usually has amazing videos and everything is so perfectly balanced in terms of theme and ease on the eyes.
It was a super shock.
@elu1 9 months ago
Thank you John for the teaching. I seem to have an issue with Xvfb when running 'headless'. Any suggestions or resources I can learn from?
@user-wu4ip7mp3z 8 months ago ⁺¹
I'm following this exact code in VSCode and only the initial page opens; it doesn't open the subsequent pages for each product. No idea how to fix this...
@user-wu4ip7mp3z 8 months ago
Nvm, fixed it. Turns out the data-selenium=...GridView... selector has been changed to [data-selenium='miniProductPageProductNameLink']
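Selector changes like the one reported above are a common cause of a scraper silently finding nothing. A hedged sketch of one way to guard against it is to try a list of candidate selectors and fail loudly when none match. The first selector is the one quoted in the comment; the second selector and the helper names are hypothetical, added only for illustration.

```python
# The first selector is the one reported in the comment above; the second
# is a hypothetical fallback added purely for illustration.
CANDIDATE_SELECTORS = [
    "[data-selenium='miniProductPageProductNameLink']",
    "a[href*='/product/']",
]


def first_matching_selector(page, candidates=CANDIDATE_SELECTORS):
    """Return the first selector that matches at least one element, else None."""
    for selector in candidates:
        # count() does not wait, so call this after the page has loaded.
        if page.locator(selector).count() > 0:
            return selector
    return None


def product_links(page, candidates=CANDIDATE_SELECTORS):
    """Return the href of every element matched by the first working selector."""
    selector = first_matching_selector(page, candidates)
    if selector is None:
        raise LookupError(f"none of {candidates} matched; the markup may have changed")
    return [link.get_attribute("href") for link in page.locator(selector).all()]
```

Here `page` is expected to be a Playwright `Page` that has already navigated to the listing. Raising an error instead of returning an empty list makes a changed attribute show up immediately.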
@donaldandmijung 4 months ago
Really well explained! Is there a way to run the loop in the original browser? Say, if we're only interested in the first page of the pagination and only the products on page 1.
@fredde7356 9 months ago ⁺¹
Hey John, can you please continue the scraping livestream with your test site? 😃
Would love to see how to handle the drop-down menus, JavaScript, and stricter Cloudflare rules.
Would be happy to hear some news! Enjoy Easter :)
@munchcup 9 months ago
On Cloudflare: one idea is to use undetected chrome driver to avoid Cloudflare. You can add a delay while logging in to solve the captchas the first time, then save the cookies. After that you no longer need to solve captchas; it will be automatic.
@carloiurcovici 9 months ago ⁺¹
Thank you John, I've been really enjoying your videos recently and applying everything at work, where it comes in really handy. Would you consider creating a Python/scraping course on Udemy or a similar platform?
@JohnWatsonRooney 9 months ago
Thanks for watching. I have thought about creating a course, but no serious plans yet, I'm afraid.
@carloiurcovici 9 months ago
@JohnWatsonRooney thanks for the reply, if you change your mind you've got my money 😂
@IshaqKhan010 9 months ago ⁺¹
Sir, can you make a video on how to deploy a Playwright script on Google Cloud Functions / VPC, please?
@danueecitizen 9 months ago
Can this work with Amazon? 🤔
@s6yx 9 months ago
Can't you just set a viewport for the screen size and a header, and run it headless with no issue?
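The suggestion above could be sketched in Playwright as a headless browser context with an explicit viewport and user-agent header. This is only an illustration under assumptions: the helper names and the default 1920x1080 size are hypothetical, and whether a given site accepts a headless browser configured this way depends on that site's bot detection.

```python
def build_context_options(width=1920, height=1080, user_agent=None):
    """Build keyword arguments for browser.new_context()."""
    options = {"viewport": {"width": width, "height": height}}
    if user_agent:
        # Only override the header when a value is supplied.
        options["user_agent"] = user_agent
    return options


def fetch_title_headless(url, user_agent=None):
    """Open `url` in a headless browser with a fixed viewport; return its title."""
    # Imported here so build_context_options can be tested without a browser install.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(**build_context_options(user_agent=user_agent))
        page = context.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
    return title
```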
@archiee1337 7 months ago
Why not headless?
@alexdin1565 9 months ago
Thanks John, but nowadays most websites don't allow you to open links like you do; they will block you after 3 or 4 pages are open at the same time.
Another question: if you could make a video on how to use Playwright inside Docker with a proxy to make many requests at the same time, that would be very nice.
Sorry for my English, I'm not a native speaker.
@badrenanna3961 9 months ago ⁺⁵
Can you please start talking about some difficult cases:
- scraping a website that has Cloudflare protection against bots (even using proxy rotation it didn't work)
- scraping websites that have captcha protection
..
Thank you
@munchcup 9 months ago ⁺³
One idea is to use undetected chrome driver to avoid Cloudflare. You can add a delay while logging in to solve the captchas the first time, then save the cookies. After that you no longer need to solve captchas; it will be automatic.
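The "solve once, save the cookies, reuse them" idea can be sketched with Playwright's storage state. Note that this swaps in Playwright for the undetected chrome driver named in the comment, and it does not by itself bypass Cloudflare; it only persists the session. The file name `state.json`, the helper names, and the 60-second manual wait are assumptions.

```python
import os

STATE_FILE = "state.json"  # hypothetical path for the saved cookies and storage


def context_options(state_file=STATE_FILE):
    """Reuse a saved session if one exists, otherwise start clean."""
    if os.path.exists(state_file):
        return {"storage_state": state_file}
    return {}


def open_with_saved_session(url, state_file=STATE_FILE, manual_wait_ms=60_000):
    """Open `url`, pausing on the first run so a captcha can be solved by hand."""
    # Imported here so context_options can be tested without a browser install.
    from playwright.sync_api import sync_playwright

    first_run = not os.path.exists(state_file)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context(**context_options(state_file))
        page = context.new_page()
        page.goto(url)
        if first_run:
            # Assumption: a fixed delay is enough time to log in or solve the captcha.
            page.wait_for_timeout(manual_wait_ms)
        # Save cookies and local storage for the next run.
        context.storage_state(path=state_file)
        browser.close()
```

Saved sessions expire, so a later run may still hit a challenge and need the manual step again.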
@pkavenger9990 6 months ago ⁺²
Your content is good, but I think you should engage with your audience more instead of speaking as if you are talking to yourself. You will see that you get many more views. Take the GothamChess channel, for example: he is not a chess Grandmaster, but his channels have more views and subscribers than Hikaru and Magnus because of his communication skills.
@JohnWatsonRooney 6 months ago ⁺¹
Fair point, thanks for the advice.
