Webscraping with Python How to Save to CSV, JSON and Clean Data
- Published 10 Feb 2025
- Join the Discord to discuss all things Python and Web with our growing community! / discord
This is the fourth video in the webscraping 101 series, aimed at how to export our scraped data to JSON and CSV, along with some simple data cleaning pipelines.
This is a series so make sure you subscribe to get the remaining episodes as they are released!
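The export step described above could be sketched roughly as follows. This is a minimal illustration and not the code from the video: the function names `save_to_json` and `save_to_csv`, the file paths, and the assumption that the scraped products are a list of dicts with identical keys are all hypothetical.

```python
import csv
import json


def save_to_json(products, path):
    """Write a list of dicts to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        # indent=2 keeps the file readable; ensure_ascii=False preserves non-ASCII names
        json.dump(products, f, indent=2, ensure_ascii=False)


def save_to_csv(products, path):
    """Write a list of dicts to a CSV file with a header row (hypothetical helper)."""
    if not products:
        return  # nothing to write, and no keys to build a header from
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        # Assumes every dict has the same keys as the first one
        writer = csv.DictWriter(f, fieldnames=list(products[0].keys()))
        writer.writeheader()
        writer.writerows(products)
```

Usage would look like `save_to_json(products, "products.json")` followed by `save_to_csv(products, "products.csv")`, where `products` is whatever list of dicts the scraper produced.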
If you are new, welcome! I am John, a self-taught Python (and Go, kinda..) developer working in the web and data space. I specialize in data extraction and JSON web APIs, both server and client. If you like programming and web content as much as I do, you can subscribe for weekly content.
:: Links ::
Recommended Scraper API www.scrapingbe...
My Patrons really keep the channel alive, and get extra content / johnwatsonrooney (NEW free tier)
I host almost all my stuff on Digital Ocean m.do.co/c/c7c9...
A rundown of the gear I use to create videos www.amazon.co....
:: Disclaimer ::
Some/all of the links above are affiliate links. By clicking on these links I receive a small commission should you choose to purchase any services or items.
I really enjoy this series and will probably need to replay it in the future. It is helpful and practical, as it shows the whole process of how to approach scraping.
Thank you, John.
Thanks for another informative video!
There is one tiny concern with the append_to_csv code. The file lacks the normal (but optional per RFC 4180) header that some apps expect or that may be needed if there were more fields in the file. This small change would create the header line just once when the file is created. Before the with block simply add this little bit of code:
import os  # goes with the other imports at the top of the file

# Check if the file exists
if not os.path.exists('append.csv'):
    # Open the file in write mode to write the header line once
    with open('append.csv', 'w', newline='') as f:
        writer = csv.DictWriter(f, field_names)
        writer.writeheader()
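Folded into a single function, the suggestion above might look like the sketch below. The body of `append_to_csv` from the video is not reproduced in this comment, so the signature, the `FIELD_NAMES` list, and the default path here are assumptions made only for illustration. Checking the file size as well as its existence is an extra choice in this sketch, so that an empty file left behind by an earlier run still gets its header.

```python
import csv
import os

# Hypothetical field names, chosen only for this illustration
FIELD_NAMES = ["name", "price"]


def append_to_csv(row, path="append.csv", field_names=FIELD_NAMES):
    """Append one dict row, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings itself
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=field_names)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)
```

Opening in append mode for both the header and the row avoids opening the file twice.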
John, thanks a lot for your videos! They are really interesting and well made, and I have learnt a lot from you! Many thanks! CHEERS!
Still waiting for the Neovim setup video ❤
John,
Another great presentation!
Also, the program is very logically developed.
I liked seeing the list comprehensions.
Another idea: it could have a GUI front end where the user inputs some conditions, product categories, names, or whatever, and the program returns records matching those conditions, either one at a time or in a table on the form. Just a thought.
Thanks!
Thanks for the information, but I have a question about something similar to this topic. If I have a local web page with some graphics in JPG format, how do I scrape them or store them in a specific folder using a web scraper? Thanks a lot for all the info.
Wouldn't the clean_data function also remove the word "Item" and "$" from the name of the product too?
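One way to avoid the side effect raised in the question above is to apply the string replacements to the price field only. The `clean_data` function from the video is not shown here, so the sketch below is a hypothetical version: the field names `name` and `price`, the helper `clean_price`, and the exact strings being stripped are assumptions.

```python
def clean_price(raw):
    """Strip an 'Item' label, '$' sign and thousands separators, then return a float."""
    text = raw.replace("Item", "").replace("$", "").replace(",", "").strip()
    return float(text)


def clean_data(product):
    """Return a cleaned copy of a product dict (hypothetical field names)."""
    cleaned = dict(product)  # copy, so the scraped original is left untouched
    # The name is only trimmed, so 'Item' or '$' inside it survives
    cleaned["name"] = product["name"].strip()
    # The replacements are applied to the price field only
    cleaned["price"] = clean_price(product["price"])
    return cleaned
```

With this split, a product named "Item Bag $ Deluxe" keeps its name intact while its price string is still converted to a number.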
Thanks, bro, for sharing the great content. If you don't mind, could you make the same or another web scraping video using object-oriented programming concepts?
🧡
I am scraping an online shopping site.
For the last 10 days it hasn't worked properly.
After 3-4 scans, a scan takes about 15-20x longer.
Then it runs smoothly again for 2-3 scans, and after that it takes a long time again.
Why is this happening?
I'm using Scrapy (Python).
Can you scrape bet365?