Comprehensive Python Beautiful Soup Web Scraping Tutorial! (find/find_all, css select, scrape table)
- Published 31 May 2024
- Practice your Python Pandas data science skills with problems on StrataScratch!
stratascratch.com/?via=keith
In this video we walk through web scraping in Python using the Beautiful Soup library. We start with a brief introduction to HTML & CSS and discuss what web scraping is. Next we start getting into the basics of the Beautiful Soup library. This includes how to load a webpage, the basic commands you need to know such as find & find_all, grabbing strings from HTML elements, etc. The final section of this tutorial is a series of exercises where you can practice your skills. In this section we scrape a webpage for links, we learn how to scrape a table and load it into a pandas DataFrame, and we see how you can scrape & download a web image. Hope you enjoy!
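As a rough illustration of the basics mentioned above (loading HTML, find, find_all, and grabbing strings), here is a minimal sketch. The HTML snippet is a made-up stand-in, not the page used in the video; with a real page the HTML would typically come from something like requests.get(url).content.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page (not the video's actual page).
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1>Heading</h1>
    <p id="intro">First paragraph</p>
    <p>Second paragraph</p>
  </body>
</html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

first_p = soup.find("p")                       # first matching element, or None
all_p = soup.find_all("p")                     # list-like collection of every match
intro = soup.find("p", attrs={"id": "intro"})  # filter by attribute value

print(first_p.get_text())                # First paragraph
print([p.get_text() for p in all_p])     # ['First paragraph', 'Second paragraph']
print(soup.title.string)                 # Demo page
```

find returns a single element while find_all returns all of them, so find is usually the safer choice when exactly one match is expected.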
I’m looking into making future videos on more complex things you can do with web scraping as well as other libraries that are helpful such as Selenium & Scrapy. Subscribe to not miss those.
Join the Python Army to get access to perks!
YouTube - / @keithgalli
Patreon - / keithgalli
---------------------
Resources used in this video
Simple webpage: keithgalli.github.io/web-scra...
Example webpage: keithgalli.github.io/web-scra...
Link to source code: github.com/KeithGalli/web-scr...
Beautiful Soup Documentation: www.crummy.com/software/Beaut...
CSS Selector Reference: www.w3schools.com/cssref/css_...
---------------------
Learn more about HTML/CSS
@Traversy Media HTML Crash Course: • HTML Crash Course For ...
@Traversy Media CSS Crash Course: • CSS Crash Course For A...
Codecademy: www.codecademy.com/catalog/la...
---------------------
Video timeline!
0:00 - Intro & Video Overview
1:09 - What is web scraping?
3:51 - Introduction to HTML
Using the beautiful soup library (5:29)
6:31 - Loading in a webpage (requests library)
8:21 - Starting to scrape
9:18 - find & find_all methods
16:00 - Finding specific text/strings in our HTML (regex)
18:38 - Select method (CSS path selections)
25:55 - Grabbing the string/text from an HTML element
28:17 - Getting a property of HTML element (href, src, id, class, etc)
29:41 - Code navigation (parents, children, siblings)
Let’s practice our skills! (33:57)
35:53 - Exercise #1: Grab all social links on webpage in 3 different ways
42:09 - Exercise #2: Scrape an HTML table into a Pandas Dataframe
53:09 - Exercise #3: Grab all fun facts that contain the word “is”
57:59 - Exercise #4: Use beautiful soup to help download an image from a webpage
1:04:20 - Exercise #5: Solve the mystery challenge!!!
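Several of the topics in the timeline above (CSS path selection, reading an attribute such as href, and moving between parents and siblings) can be sketched in a few lines. The HTML and URLs below are invented placeholders rather than the tutorial's example page, so treat this as an illustration only.

```python
from bs4 import BeautifulSoup

# Invented placeholder markup; the class names and URLs are assumptions.
html = """
<div class="links">
  <ul class="socials">
    <li><a href="https://example.com/a">A</a></li>
    <li><a href="https://example.com/b">B</a></li>
  </ul>
  <p class="note">Unrelated <b>text</b></p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Way 1: CSS path selection, every <a> nested under ul.socials
hrefs_select = [a["href"] for a in soup.select("ul.socials a")]

# Way 2: find the list first, then find_all inside it
ul = soup.find("ul", attrs={"class": "socials"})
hrefs_find = [a["href"] for a in ul.find_all("a")]

# Navigation: step up to the parent and across to the next sibling
first_li = ul.find("li")
parent_name = first_li.parent.name            # name of the enclosing tag
next_li = first_li.find_next_sibling("li")    # the following <li>

print(hrefs_select)
print(parent_name, next_li.a.get_text())
```

Indexing a tag like a dictionary (a["href"]) raises KeyError when the attribute is missing; a.get("href") returns None instead, which can be more forgiving on messy pages.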
---------------------
Follow me on social media!
Instagram | / keithgalli
Twitter | / keithgalli
---------------------
If you are curious to learn how I make my tutorials, check out this video: • How to Make a High Qua...
*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
Shouts to Keith for giving us all an MIT education without the MIT debt
Haha I took one for the team xD
haha
Haha. That was a good one.
@KeithGalli How do I start? I've never been good at math, and I'm 50 years old sitting at home. Thanks ;-)
@KeithGalli Since Breaking Bad, minivans are, you know, swag 😉
I paid a bootcamp for learning. But Keith you are way above all that. I understood the concepts from your video only. I owe you man!! Keep going and please don't stop putting up such videos.
I appreciate the support! Happy that the videos are helpful
I love that you have exercises for us to do in the videos! Learned so much from this.
This tutorial was incredibly helpful! Web scraping is something I've always found interesting but just hadn't been bothered to start learning, yet this video made it easy to understand and covered a huge range of ways to deal with potential problems. Seriously can't thank you enough for this video and will certainly be sticking around for any new tutorials you upload.
I have watched a couple of other videos on BeautifulSoup but believe me this one from Keith is the best one. Keith will take you from scratch to a decent level. Thank you so much.
Only a third of the way through this video and I already feel like I understand this better. Thank you, brand new at this
Keith, many thanks for giving us so much excellent information about hard topics. You make things seem totally simple to do. Sincerely, your tutorials are the best. Again, thank you so much for sharing all of this with us.
Your tutorials are the best, honestly. Thank you so much for doing this.
Glad you enjoy them!! You're very welcome :)
This is one of the finest videos I have ever seen on training. You are an amazing trainer and, most importantly, you explain things in very simple English, with examples and exercises that give viewers hands-on experience... thanks.
Keith, your videos are excellent. You are totally getting me through grad school just watching your tutorials. Keep it up!
Thank you Keith, amazing content, easy to follow, clear explanation, great exercises (with walkthrough) and love the funny breaks/comments during the video. Followed and liked
Keith you'll be the first one I cite when I write my nobel prize winning book or whatever it is nobel prize winners write. Golden content. Gracias!
The Best thing about your tutorial are that you start from scratch and teach basic and explain each fragment of code with concept. Love from India.
One of the best web scraping contents I've seen to date. But the ending was hilarious!
Another fabulous real-world tutorial. Thanks for the Google and Stack Overflow searches and the errors with recovery!
This is such a great tutorial ! I loved being able to pause and figure out the problems on my own. I really learned a lot! Thanks Keith, you rock!
did the same 🤓🤓
I am from India. We really don't get this quality stuff here, so thanks to YouTube and you for spreading wonderful knowledge. Keep rocking!
Thanks a lot! Your video are clear and pretty useful! And it’s a joy watching them! I’m glad that I found your channel ✨
This is the best web scraping tutorial. Thank you so much!
The best python video i have ever seen. No wasted words, dive into the important topic. Lol, great!
Wow, really impressive. One of the best channel ! Keith you are very clear with your explanations.
Thank you for sharing your knowledge :)
Hi Keith,
I'm really excited to watch this video. Actually, I have been watching all your Python-related videos, especially the Pandas one.
Keep going, and I hope to meet you one day.
THANKS, A LOT
Brilliant, amazing channel. Major kudos to you Keith!!
CV Update: Web Scraping expert.
Joke aside what an awesome tutorial. Felt so satisfying to get the secret message with what you taught!!
Brilliant work!
Haha love to hear it! I had a lot of fun putting this one together, so I'm happy to hear that you enjoyed it :)
@KeithGalli your tutorials are really appreciated! thanks man :)
one of the best beautiful soup videos, and really want to say thanks! Keith
You saved my life, I hope you're getting all the beautiful things in life you deserve
Please do a Seaborn Tutorial ! like you did with Pandas, Matplotlib etc. I watched all of them, really glad i found your channel. Simple, informative & on point.
@Lucas agree, Derek Banas has a great Seaborn tutorial at his channel!
If you know matplotlib you know most of seaborn, its a matplotlib wrapper. all matplotlib methods work in seaborn too
I am very glad that I found your videos. I learnt more from you than all other tutorials combined. Please do a tutorial on xlwings. Thank you
Value for time invested in watching your videos. Along with the subject knowledge, we understand how to practically approach a problem. Thanks a ton for sharing your knowledge.
This is by far the best tutorial I have found after searching through the internet for hours. I subscribed just because of this one great video. Please keep doing videos of practical applications of Python. Project tutorials are the best.
absolutely
The last time I tried to understand BeautifulSoup I gave up. You explain it so easy to understand. Thanks for the hard work and the time you spend on teaching us :)
Love to hear it! You are very welcome :)
Me too! It's almost like Keith is a godsend
@rahuldavid4831 he sure is :)
Thank you Keith! This is the best video that i watch about bs4. 👏👏
Beautiful video Keith. Loved it.
Well done! First class of Web Scraping! Awesome
Thanks for sharing this! I am mostly just popping in to learn, but this is helping me know how to think about data & see that there are a lot of options.
Your video really helped a lot with understanding Beautiful Soup, thank you, Keith!
As someone earlier said, Big SHOUT OUT to Keith for getting the community such amazing content!
Man you save my life. your tutorials are amazing.
You have a lifetime sub from me. Been looking for videos like this for a long time. Keep up with the great content!
Thank you so much for this wonderful tutorial Keith! Words cannot describe how much I am grateful to you for making this gem of a video that covers everything you need to successfully scrape a webpage! Trust me when I tell you that NOBODY HAS MADE A BETTER VIDEO ON BEAUTIFULSOUP than you!!! If I could have the liberty of suggesting future videos, I would love if you made a video about "Regular Expressions". Keep up the good work and God bless!!!
Very happy to hear you enjoyed!! A regex video is a great idea :)
This tutorial was incredible. I've done 2 Python courses that touched the 'Web Scraping' subject, but I wasn't able to fully understand it. This video was one of the two videos that made me fully understand it, and I couldn't be more happy about it. And finding out the secret message was amazing too :D
wanna share the other video you found helpful ? :)
Wow path navigation is so powerful! Thanks for this!
Oh man.. your tasks are excellent. It helped me to get a better confidence in working with soup..
Subscribing, coz I loved it.
Glad I found you @keith.
Exploring your channel now.
Appreciate the way you did it so perfectly making it simpler to understand for me.
Thank you for the video!
You explain things so clearly
This is a very well organized web scraping tutorial. Thanks for sharing.
By far the best tutorial on YouTube for web scraping. You are very good at dumbing it down, so even a total beginner can understand.
waiting for NLTK tutorial.
thank you
Glad you enjoyed it!
Amazing.
Thanks Keith!
Looking forward to the Selenium and scrapy series.
You're welcome!
@theduck3126 Try the John Watson Rooney channel.
He's got everything covered.
When keith do it, its perfect 🤩
Aww I appreciate that 😊
I like how you make things that were really scary simple. Bravo man.
Yes! Awesome tutorial dude. Looking forward to your next web scraping video. Cheers!
This is a fantastic tutorial. When I last tried to learn beautiful soup, we were in the awkward transition phase between python 2 and 3 and every tutorial was in python 2 because they hadn't released code for 3 yet. I learned 3 because it was "the future". Of course, I then wanted to use BS so I had to try and figure out what I wanted to do in python 2. I gave up in total frustration. This is a crystal clear guide and now I actually understand how it works and how to use it. Thanks Keith!
Happy that this tutorial could clarify the details and remove some of that frustration! :)
Hey this tutorial is great! I've been looking for a decent one like it for some time now and I can't believe it took the algorithm this long to show this on my recommended page
Great video .. and I watched A LOT videos about beautiful soup. Keep going with the series
best most concise and detailed tutorial on bs
Man watching that ending was almost like watching Jack sink, beautiful ending!! keep it up man, great content
This was a fun video! Thank you Keith Galli.
This guy deserves the world
Thanks for the tutorial Keith! Keep up the great work
Thank you Keith, love your tutorials ! I was able to solve the last exercise :D
Really learned a lot. Loved the exercises.
Thanks for great tutorial, Keith!
Love your videos im watching them nonstop...thank you❤️❤️
Very very good video and great exercises specially last one.
Thanks for such videos
Very good video Keith! Very clear and useful. Thank you.
Your videos are incredible!! Thanks Keith
You are perfect ! You know how to teach. Thank you so much man. Liked your style, and got the subject i have been struggling. Liked and subbed.
This is exactly what I was looking for. Thumbs up Keith for this awesome video :-)
Amazing Lecture. Here we understood Everything. Thanks a lot Broo 🔥👍🙂
Great video and well done. I learned a lot from it. Thanks Keith!
Hi Keith, I second many of the viewers' comments. Your tutorial on web scraping is by far one of the best ones out there. Thank you so much for producing this. I do have a question though. Hope you can help clarify, I've not had much success googling. Can you clarify what the difference is between the select function and the find_all function? When would you use one over the other?
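One way to look at the select vs. find_all question, offered as a hedged sketch rather than an answer from the video: for simple tag-plus-class lookups the two return the same elements, select is convenient when the query is about structure (for example "direct child of"), and find_all is convenient when the filter is easier to express in Python (for example a regex on the text). The markup below is a made-up example.

```python
import re
from bs4 import BeautifulSoup

# Made-up markup for illustration only.
html = """
<div id="main">
  <p class="tip">Use find_all</p>
  <div><p class="tip">Nested tip</p></div>
  <p>Plain paragraph</p>
</div>
<p class="tip">Outside main</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Equivalent queries: every <p> with class "tip", in document order
by_find = [p.get_text() for p in soup.find_all("p", class_="tip")]
by_select = [p.get_text() for p in soup.select("p.tip")]

# select: structural query, only p.tip that is a direct child of div#main
direct = [p.get_text() for p in soup.select("div#main > p.tip")]

# find_all: Python-side filter, a regex matched against each tag's string
nested = [p.get_text() for p in soup.find_all("p", string=re.compile("Nested"))]

print(by_find == by_select)
print(direct, nested)
```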
Thanks Keith. A really great video. Keep them coming, really useful videos I am learning a great deal from you. Many thanks.
wow a fun exercise !! Have a great fun , Next one is the Pandas One
Wow! This is just too good. Thanks for the video Keith
How do you know you’ve learned something ?
Completing the challenge within 1 minute no hints. Thank you so much for all your efforts :)!
so surprised to find treasure youtuber here, will go through all your perfect tut in my summer holiday, hope that u will gain more and more subscribers~
Thanks alot for the comprehensive tutorial! Really appreciate it
SEMRUSH For Beginners!! Excellent Video.
Bro you're great at these videos. Keep it up. I'm very glad I found your channel and I'm learning a lot from you.
Regarding the task of getting the "is" from fun-facts, you can get them by this simple one liner:
[li.get_text() for li in webpage.select('ul.fun-facts li') if 'is' in li.get_text()]
no regex, no extra loops... just plain string methods with list comprehension!
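One caveat worth noting about the one-liner above: 'is' in text is a substring check, so it also matches words such as "This". If the goal is the standalone word "is", a word-boundary regex is one option. The sketch below uses invented list items, not the actual fun facts from the tutorial page, and reuses the webpage variable name from the comment.

```python
import re
from bs4 import BeautifulSoup

# Invented list items for illustration; not the tutorial page's real content.
html = """
<ul class="fun-facts">
  <li>This fact has no standalone match</li>
  <li>Pizza is great</li>
  <li>Nothing here</li>
</ul>
"""
webpage = BeautifulSoup(html, "html.parser")
items = webpage.select("ul.fun-facts li")

# Substring check: also catches the "is" inside "This"
substring_hits = [li.get_text() for li in items if "is" in li.get_text()]

# Word-boundary check: only the standalone word "is"
word_hits = [li.get_text() for li in items if re.search(r"\bis\b", li.get_text())]

print(substring_hits)
print(word_hits)
```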
LOL, loved the secret message. Great work, thanks for the video
As usual.... Awesome Tutorial!!!
It was my first time learning from you and I must say it was pretty awesome:-)
The best videos! Love your videos and way to present the ideas.
i wish i had a cool teacher like you
One spot for all my Python needs. Thanks Keith! ; )
Of course I will smash that button!! You're a star, my friend, thanks for the good vibes and dedication!
The idea with the secret message was super cool!) You've got that like! Well deserved.
Glad you enjoyed it! I had fun setting that up :)
Thank you so much for your tutorials. You are doing great!
Awesome video as always! Would love to see tutorials for Selenium and Scrapy as well. Also PyTorch and Seaborn would be very interesting to learn more about! Your videos are soo easy to follow and learn from :)
Thanks for another great tutorial Keith :)
we just love your content.. u taught me pandas very well....!!!
Love to hear that! :)
That's the data scientist's way to tell "Like and Subscribe ". Thanks for sharing knowledge!!
52:30 Not sure if this was posted before but this works for the duplicates. Thanks for all the help Keith!!!
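from io import StringIO
import pandas as pd
# `soup` is the BeautifulSoup object created earlier in the tutorial
table = soup.select_one("table.hockey-stats")  # a single Tag instead of a list of matches
# Recent pandas versions warn on raw HTML strings, so wrap the markup in StringIO
df = pd.read_html(StringIO(str(table)))[0]
df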
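For comparison, the table can also be built row by row without read_html, which avoids needing an extra parser backend for pandas. This is only a sketch: the table.hockey-stats selector comes from the comment above, but the column names and cell values below are made-up placeholders, and the thead/tbody layout is an assumption about the markup.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Placeholder table; the real page's columns and values will differ.
html = """
<table class="hockey-stats">
  <thead><tr><th>Season</th><th>Team</th><th>GP</th></tr></thead>
  <tbody>
    <tr><td>2014-15</td><td>Team A</td><td>20</td></tr>
    <tr><td>2015-16</td><td>Team B</td><td>25</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
table = soup.select_one("table.hockey-stats")

# Header cells become the column names
columns = [th.get_text(strip=True) for th in table.select("thead th")]

# Each body row becomes a list of stripped cell strings
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.select("tbody tr")
]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

All values arrive as strings with this approach, so numeric columns would still need a conversion step such as pd.to_numeric.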
I learned NumPy, pandas and other things from your playlists. I was stuck on web scraping for the past 3 days. I watched a lot of YouTube videos but couldn't understand them the way I understand your content... Thank you so much brother :D. Now I hit (smashed) the bell icon too...
Awesome glad this video could help clarify some of the confusion you had. Thanks for smashing the bell icon! xD
Thank you so much. This is very helpful.
Great tutorial, thanks!
Dang! that was a good tutorial. I love you Keith, sincerely.
Thank you so much for doing this.😍😍
Are you god? I have a simple approach to your videos. I like them first, then I watch the video. Thanks a lot man, you and few others youtubers are going to put universities out of business
Amazing video! Really helped me! Thank you!