How To Extract Scraped Data To Excel (Using Python)

  • Published 16 Sep 2024

COMMENTS • 58

  • @oxylabs
    @oxylabs 2 years ago +2

    Thank you for watching! We hope you find this video helpful. If you have any questions, don't hesitate to ask them in the comments. Discover more about web scraping, proxy servers, and tutorials on our channel: ua-cam.com/users/Oxylabs

  • @ravichandra2084
    @ravichandra2084 1 year ago +3

    A Big Thanks from Palakollu, West Godavari, INDIA.

  • @peterimade6025
    @peterimade6025 2 years ago +3

    The quality of this channel is dope, needs more subscribers

    • @oxylabs
      @oxylabs 2 years ago

      Thank you! We really appreciate it :)

  • @ИсломКобилов-щ7ж

    It's a very useful, high-quality video without any filler. Thank you for putting in so much effort 😊

  • @DeborahOdion
    @DeborahOdion 1 year ago +1

    Thank you for the knowledge. The content was amazing!

  • @ericzaver5063
    @ericzaver5063 2 years ago +1

    Scraping is quite a difficult process for me. Thanks for the vid, super helpful

    • @oxylabs
      @oxylabs 2 years ago

      It is complex indeed! But no worries, we're here to stay and share more insightful videos. If you have any specific questions, you're more than welcome to reach out to us via hello@oxylabs.io

  • @growlandroll
    @growlandroll 1 year ago +1

    I tried to replicate it and it worked! Thank you so much

  • @efleon9
    @efleon9 1 year ago

    From now on, I love you forever! Thanks for sharing this amazing skill!!!

    • @oxylabs
      @oxylabs 1 year ago

      We're happy you found it helpful!

  • @user-fy4rc2dp1x
    @user-fy4rc2dp1x 10 months ago

    I ran the program and got the "done" message, but when I type open books.xlsx it says that “open” is not recognized
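
A note on the error above: `open` is a macOS shell command, so the Windows command line does not recognize it. In the Windows Command Prompt, `start books.xlsx` does the same job. The file can also be opened from Python itself. The sketch below is one way to do that; the helper names `open_command` and `open_file` are made up for illustration, and `xdg-open` is assumed to be available on Linux.

```python
import os
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the shell command that opens `path`, or None on Windows."""
    if platform.startswith("win"):
        return None  # Windows has os.startfile() instead of an `open` command
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils


def open_file(path):
    """Open `path` in the default application (Excel for .xlsx, if installed)."""
    command = open_command(path)
    if command is None:
        os.startfile(path)  # this function exists only on Windows
    else:
        subprocess.run(command, check=False)
```

Calling `open_file("books.xlsx")` at the end of the script would then open the workbook on any of the three platforms.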

  • @gleovas
    @gleovas 2 years ago +1

    this saved my life, NEW SUB, thank u

    • @oxylabs
      @oxylabs 2 years ago +1

      Thank you, glad you enjoyed it!

  • @gerritsx9
    @gerritsx9 2 years ago +3

    As a beginner, this is hard to follow, as you only explain it for your example. I would appreciate a more dynamic explanation of how the libraries work, without the need for going in depth.

    • @oxylabs
      @oxylabs 2 years ago

      Thank you for your feedback, we really appreciate it :)

  • @Ariful_Islam10
    @Ariful_Islam10 1 year ago

    One of the best videos, just what I wanted... thank you so much 😍😍❤❤

    • @oxylabs
      @oxylabs 1 year ago

      We're glad you enjoyed it!

  • @aerotraveldji
    @aerotraveldji 11 months ago

    Hi, on line 14 Pylance reports that "books" is not defined, and on line 30 it reports that "export" is also not defined. Could you tell me how to fix this, please? :)

  • @JohnOmnes
    @JohnOmnes 6 months ago

    Great vid

  • @snipegodgaming6361
    @snipegodgaming6361 1 year ago

    What if the same class names are used for different text on a web page?
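
One common way to handle shared class names is to search inside each item's container rather than across the whole page. The sketch below assumes BeautifulSoup is installed; the markup is hypothetical (only the `product_pod` class name is mentioned elsewhere in this thread, the `sidebar` and `price` classes are made up).

```python
from bs4 import BeautifulSoup

# Hypothetical page: the class "price" appears in a sidebar and in each book.
html = """
<div class="sidebar"><p class="price">5.00 shipping</p></div>
<article class="product_pod"><h3>Book A</h3><p class="price">51.77</p></article>
<article class="product_pod"><h3>Book B</h3><p class="price">53.74</p></article>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching the whole page returns every element with that class, sidebar included.
all_prices = [p.get_text() for p in soup.find_all("p", class_="price")]

# Searching inside each book container returns only that book's price.
book_prices = [
    book.find("p", class_="price").get_text()
    for book in soup.find_all("article", class_="product_pod")
]
```

Here `all_prices` has three entries, while `book_prices` holds only the two book prices.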

  • @deejaysoulution
    @deejaysoulution 2 years ago +1

    Amazing ! It works !!!

    • @oxylabs
      @oxylabs 2 years ago

      Yes it does! Glad you liked it :)

  • @tarztarzs2063
    @tarztarzs2063 1 year ago

    I have a syntax error: 'return' outside function ;(

    • @oxylabs
      @oxylabs 1 year ago

      Hey, make sure to double check indentation. Here's some more info on that: bit.ly/3FUrukv :) Hope it helps!
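
To illustrate the indentation point, here is a minimal sketch of what typically triggers this error and how the fixed version looks. The function name `get_title` and the dict-based book record are hypothetical, not taken from the video's code.

```python
# A `return` that is not indented under its `def` sits outside the function.
broken_source = (
    "def get_title(book):\n"
    "    title = book['title']\n"
    "return title\n"
)

try:
    compile(broken_source, "<example>", "exec")
    error_message = None
except SyntaxError as err:
    error_message = err.msg  # the text Python shows for this mistake


# Fixed version: `return` is indented to the same level as the rest of the body.
def get_title(book):
    title = book["title"]
    return title
```

Mixing tabs and spaces can cause the same kind of problem, so it helps to let the editor convert tabs to spaces.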

  • @raffimannarelli
    @raffimannarelli 2 years ago

    Thanks for the guide!
    I am getting a NameError when running the name-main guard block of code. I'm running it in a Jupyter notebook as well, and I'm not sure if scope is any different there, but I have no idea how to get around it.

    • @oxylabs
      @oxylabs 2 years ago

      Hello there. You're very welcome! About your problem - in Jupyter you don't need to have the import guard. Remove the if __name__ line altogether. Hope this helps!

  • @return_1101
    @return_1101 2 years ago

    Nice and interesting video!

  • @gormiksoc
    @gormiksoc 11 months ago

    Can I ask how I would go about using Python as the back end and Excel as the front end, to pull data from the web and show it in Excel in the desired form when you press a macro button in Excel?
    Python:
    Requests: To make HTTP requests to fetch data from websites or APIs.
    Beautiful Soup: For parsing HTML content and extracting data from web pages.
    Pandas: For data manipulation and cleaning.
    Flask or FastAPI: To create a web service that exposes endpoints for Excel to interact with.
    openpyxl: For reading from and writing to Excel files.
    VBA (Excel):
    ActiveX Controls: To create buttons or user forms in Excel for user interaction.
    VBA Macros: To write VBA code that runs when the button is clicked.
    Excel Object Model: To manipulate Excel workbooks, worksheets, cells, and charts.
    Shell Function: To run external programs or scripts (in this case, Python scripts).
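
As a rough sketch of the Python side of the setup listed above: a script with a command-line entry point is something a VBA Shell call could run. Everything here is an assumption for illustration. `fetch_rows` is a placeholder for the real scraping step, the `--out` flag is made up, and the output is CSV (which Excel opens directly) so that the sketch needs only the standard library.

```python
import argparse
import csv


def fetch_rows():
    """Placeholder for the scraping step; returns fixed sample rows here."""
    return [
        {"Title": "Book A", "Price": "51.77"},
        {"Title": "Book B", "Price": "53.74"},
    ]


def main(argv=None):
    """Parse the command line, write the rows to a file, and return its path."""
    parser = argparse.ArgumentParser(description="Write scraped rows to a file.")
    parser.add_argument("--out", default="books.csv", help="output file path")
    args = parser.parse_args(argv)

    with open(args.out, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Title", "Price"])
        writer.writeheader()
        writer.writerows(fetch_rows())
    return args.out
```

Saved as a script with a call to `main()` at the bottom, this could be started from a macro with a command along the lines of `python scrape.py --out C:\data\books.csv`, after which the macro loads the file into a worksheet.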

  • @itsmfactor
    @itsmfactor 2 years ago

    pretty neat! Thanks a ton.

    • @oxylabs
      @oxylabs 2 years ago

      You're very welcome!

  • @adamklimt1600
    @adamklimt1600 2 years ago

    Is there an option to extract scraped data to Google Sheets instead of Excel? Or is Excel simply more "powerful" for processing the data?

    • @oxylabs
      @oxylabs 2 years ago

      Hello! You can always import the Excel files to Google Sheets. Excel file format is the most compatible because it's not stored on any cloud network and doesn't require an internet connection to read the data.
      Start a blank file, then in the top menu click "File" > "Import" and select the Excel file you would like to transfer.

  • @nikolairodriguez5147
    @nikolairodriguez5147 1 year ago

    Getting a syntax error (pyflakes E) in the code "item["Title"] = book.find( ..."
    Spyder is pointing at the equals sign... why is this happening?

    • @oxylabs
      @oxylabs 1 year ago

      Hey! Thanks for asking :) Maybe you could share your code, so we can check where the problem is?

    • @nikolairodriguez5147
      @nikolairodriguez5147 1 year ago

      @oxylabs I figured it was a typo... thanks for asking though :)
      I am trying to scrape another website, but each product listing has three different prices.
      They are under a table class. How would you scrape each table column? Any idea on this? :)
      Both the titles and the prices have the same tags:

      24h         +24h        Semana
      153,00 €    120,90 €    489,60 €

      Perhaps do a video on this too? I am stuck here for the time being :)

    • @oxylabs
      @oxylabs 1 year ago

      You could try using BeautifulSoup find_all method, which finds all corresponding elements. In your case, it would find all
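
A minimal sketch of the `find_all` approach for a price table, assuming BeautifulSoup is installed. The HTML below is hypothetical: the real page's tags and class names did not survive in the comment above, so the `prices` class and the `th`/`td` layout are guesses.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real product listing.
html = """
<table class="prices">
  <tr><th>24h</th><th>+24h</th><th>Semana</th></tr>
  <tr><td>153,00 €</td><td>120,90 €</td><td>489,60 €</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", class_="prices")

# find_all returns every matching element, in document order.
headers = [th.get_text(strip=True) for th in table.find_all("th")]
prices = [td.get_text(strip=True) for td in table.find_all("td")]

# Pair each column header with the price in the same position.
item = dict(zip(headers, prices))
```

If the header and price cells share the same tag, the two lists can instead be taken from separate rows: call `find_all("tr")` first and then `find_all` on each row.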

  • @harrystone7954
    @harrystone7954 2 years ago

    Nice video! Is it possible to extract data from a website that requires login credentials? Thx

    • @oxylabs
      @oxylabs 2 years ago

      Hello, thank you! Technically this could be possible. However, data behind a login is not considered publicly available data and usually comes with multiple scraping restrictions. You may also need permission from the website owner.
      Accordingly, before engaging in any scraping activities, we recommend that you seek appropriate professional legal advice regarding your specific situation.

  • @nikolairodriguez5147
    @nikolairodriguez5147 1 year ago

    Same thing is happening to "data.append(item)" ... :/ points to the "d" in "data"

    • @oxylabs
      @oxylabs 1 year ago

      Hello! In this case also, please send us the code, and we will check where the problem might be :)

  • @prasadjadhav3829
    @prasadjadhav3829 2 years ago

    I tried your method. My Excel file shows 5 columns and 1 row, where it should have shown 5 columns and 312 rows. Can you help me solve this?

    • @oxylabs
      @oxylabs 2 years ago

      Hello, it all depends on your code, so this is a very abstract question. Could you please share your code so we can help?

    • @prasadjadhav3829
      @prasadjadhav3829 2 years ago

      Thank you for the reply. Here is the link to my code: github.com/DolorPJ/webscraping-demo/tree/main

    • @oxylabs
      @oxylabs 2 years ago +1

      @prasadjadhav3829 The data structure you build is written out in the default way: the keys of the dict become the headers and the values become the rows below them.
      If you want multiple rows under the same headers, put multiple dictionaries into a list and make sure those dictionaries have the same keys. Hope this helps!
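
The advice above can be sketched with pandas as follows. The sample titles and prices are made up, and the single-row case is only one plausible way to end up with one row; it is not taken from the linked repository.

```python
import pandas as pd

titles = ["Book A", "Book B", "Book C"]  # hypothetical scraped values
prices = ["51.77", "53.74", "50.10"]

# One dictionary holding whole lists gives a single row with lists in its cells.
one_row = pd.DataFrame([{"Title": titles, "Price": prices}])

# One dictionary per scraped item, all with the same keys, gives one row per item.
data = [{"Title": t, "Price": p} for t, p in zip(titles, prices)]
df = pd.DataFrame(data)

# Writing the result needs an Excel engine such as openpyxl to be installed:
# df.to_excel("books.xlsx", index=False)
```

Here `one_row` has 1 row and 2 columns, while `df` has 3 rows under the same 2 headers.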

  • @navachaitanyanishithavlogs
    @navachaitanyanishithavlogs 1 year ago

    It is very good

    • @oxylabs
      @oxylabs 1 year ago

      Thanks for watching!

  • @jamest4027
    @jamest4027 2 years ago

    I don't understand what line 28 is doing... What is this line doing? if __name__ == "__main__"

    • @jamest4027
      @jamest4027 2 years ago

      What do __name__ and __main__ refer to?

    • @oxylabs
      @oxylabs 2 years ago

      @jamest4027 Hello! Hopefully this explains it in detail:
      www.geeksforgeeks.org/what-does-the-if-__name__-__main__-do/
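
In short, `__name__` is a variable Python sets for every file it loads, and `"__main__"` is the value it gets when the file is the one being run directly. A minimal sketch (the `main` function here is a stand-in, not the video's code):

```python
def main():
    """Entry point of the script; returns a message instead of scraping here."""
    return "done"


# Run directly (python script.py): __name__ is "__main__", so main() is called.
# Imported from another file: __name__ is the module's name, so main() is skipped.
if __name__ == "__main__":
    print(main())
```

The guard lets the same file work both as a script and as a module whose functions other code can import without triggering a scrape.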

  • @veiisk
    @veiisk 2 years ago

    I don't understand how you found product_pod and price so fast. I look at the F12 tools and I'm lost :-)

    • @oxylabs
      @oxylabs 2 years ago +1

      Hey Michael! It's okay, we're all learning here. If you could please specify the problem, maybe we could help you out!

    • @raffimannarelli
      @raffimannarelli 2 years ago

      Any HTML crash course can help you understand the HTML and CSS on a website within an hour or two!

    • @niviprince02
      @niviprince02 2 years ago

      Right-click on the book and click Inspect 🙂

  • @robertojunior9520
    @robertojunior9520 7 months ago

    i love you

  • @prasannakumar3340
    @prasannakumar3340 2 years ago

    i love you