Webscraping with Python How to Save to CSV, JSON and Clean Data

  • Published 10 Feb 2025
  • Join the Discord to discuss all things Python and Web with our growing community! / discord
    This is the fourth video in the webscraping 101 series. It shows how to export scraped data to JSON and CSV, along with some simple data cleaning pipelines.
    This is a series so make sure you subscribe to get the remaining episodes as they are released!
    If you are new, welcome! I am John, a self-taught Python (and Go, kinda..) developer working in the web and data space. I specialize in data extraction and JSON web APIs, both server and client. If you like programming and web content as much as I do, you can subscribe for weekly content.
    :: Links ::
    Recommended Scraper API www.scrapingbe...
    My Patrons really keep the channel alive, and get extra content / johnwatsonrooney (NEW free tier)
    I host almost all my stuff on Digital Ocean m.do.co/c/c7c9...
    A rundown of the gear I use to create videos www.amazon.co....
    :: Disclaimer ::
    Some/all of the links above are affiliate links. By clicking on these links I receive a small commission should you choose to purchase any services or items.

COMMENTS • 12

  • @AliceShisori
    @AliceShisori 1 year ago +1

    I really enjoy this series and will probably need to replay it in the future. This is helpful and practical, as it shows the whole process of how to approach it.
    Thank you, John.

  • @PanFlute68
    @PanFlute68 1 year ago +5

    Thanks for another informative video!
    There is one tiny concern with the append_to_csv code. The file lacks the normal (but optional per RFC 4180) header that some apps expect or that may be needed if there were more fields in the file. This small change would create the header line just once when the file is created. Before the with block simply add this little bit of code:
    import csv
    import os

    # Check if the file exists
    if not os.path.exists('append.csv'):
        # Open file in write mode to write the header line
        # (newline='' lets the csv module handle line endings itself)
        with open('append.csv', 'w', newline='') as f:
            writer = csv.DictWriter(f, field_names)
            writer.writeheader()
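    The same idea can also be folded into the append function itself. The following is only a sketch, not the code from the video: the `append_to_csv` signature, the `FIELD_NAMES` columns and the demo rows are assumptions made for illustration. It writes the header only when the file is missing or empty, then appends the row.

```python
import csv
import os
import tempfile

# Hypothetical column names; the video's actual fields may differ.
FIELD_NAMES = ["name", "price"]


def append_to_csv(row, path, field_names=FIELD_NAMES):
    """Append one dict row, writing the header only for a new or empty file."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=field_names)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)


# Demo in a temporary directory so no existing file is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "append.csv")
append_to_csv({"name": "Item A", "price": "9.99"}, demo_path)
append_to_csv({"name": "Item B", "price": "19.99"}, demo_path)

with open(demo_path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
```

    After the two calls, `rows` holds a single header row followed by the two data rows, so repeated appends never duplicate the header.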

  • @andrepereira1807
    @andrepereira1807 1 year ago

    John, thanks a lot for your videos! They are really interesting and well made, and I learnt a lot from you! Many thanks! CHEERS!

  • @bakasenpaidesu
    @bakasenpaidesu 1 year ago +5

    Still waiting for the neovim set up video ❤

  • @thebuggser2752
    @thebuggser2752 1 year ago

    John,
    Another great presentation!
    Also, the program is very logically developed.
    I liked seeing the list comprehensions.
    Another idea: it could have a GUI front end where the user inputs some conditions, product categories, names or whatever, and the program returns records based on those conditions, either one at a time or in a table on the form. Just a thought.
    Thanks!

  • @mohammedaldbag9827
    @mohammedaldbag9827 1 year ago

    Thanks for the information, but I have a question about something similar to this topic. If I have a local web page with some graphics in JPG format, how do I scrape them or store them in a specific file using a web scraper? Thanks a lot for all the info.

  • @rajatkumar35
    @rajatkumar35 1 year ago

    Wouldn't the clean_data function also remove the word "Item" and "$" from the name of the product too?
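    One way to avoid that, sketched under assumptions: the video's `clean_data` is not reproduced on this page, so the field names (`name`, `sku`, `price`) and the helper functions below are hypothetical. The idea is to apply each cleaning rule only to the field it is meant for, so the product name is never touched.

```python
def clean_price(value):
    """Strip the '$' sign and surrounding whitespace, then convert to float."""
    return float(value.replace("$", "").strip())


def clean_sku(value):
    """Drop the literal 'Item' label and surrounding whitespace."""
    return value.replace("Item", "").strip()


def clean_data(product):
    """Return a cleaned copy; only 'price' and 'sku' are modified."""
    cleaned = dict(product)  # copy so the raw record stays unchanged
    cleaned["price"] = clean_price(product["price"])
    cleaned["sku"] = clean_sku(product["sku"])
    return cleaned


# Hypothetical record whose name happens to contain both "Item" and "$".
raw = {"name": "Item Finder $aver Kit", "sku": "Item 12345", "price": "$24.99"}
cleaned = clean_data(raw)
```

    Here `cleaned["name"]` is still "Item Finder $aver Kit", while the SKU and price have been stripped and converted.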

  • @adarshjamwal3448
    @adarshjamwal3448 1 year ago

    Thanks bro for sharing the great content. If it's no trouble, could you make the same or another web scraping video using object-oriented programming concepts?

  • @chamikagimshan
    @chamikagimshan 1 year ago

    🧡

  • @lordlegendsss7776
    @lordlegendsss7776 1 year ago +1

    I am scraping an online shopping site.
    For the last 10 days it hasn't worked properly.
    After 3-4 scans it takes about 15-20x longer to scan.
    Then it works smoothly again for 2-3 runs, and after that it takes a lot of time again.
    Why is this happening?
    I'm using Scrapy (Python).

  • @theclam1338
    @theclam1338 1 year ago

    Can you scrape bet365?