Thank you for watching! We hope you find this video helpful. If you have any questions, don't hesitate to ask them in the comments. Discover more about web scraping, proxy servers, and tutorials on our channel: ua-cam.com/users/Oxylabs
A Big Thanks from Palakollu, West Godavari, INDIA.
The quality of this channel is dope, needs more subscribers
Thank you! We really appreciate it :)
It's a very useful, high-quality video without any filler, thank you for making such a big effort 😊
Thank you for the knowledge. The content was amazing!
Scraping is quite a difficult process for me. Thanks for the vid, super helpful!
It is complex indeed! But no worries, we're here to stay and share more insightful videos. If you have any specific questions - you're more than welcome to reach out to us via hello@oxylabs.io
I tried to replicate it and it worked! Thank you so much
From now on, I love you forever! Thanks for sharing this amazing skill!!!
We're happy you found it helpful!
I run the program and get the "done" message, but when I type open books.xlsx it says that "open" is not recognized
this saved my life, NEW SUB, thank u
Thank you, glad you enjoyed it!
As a beginner this is hard to follow, as you only explain your example. I would appreciate a more general explanation of how the libraries work without the need of going in depth.
Thank you for your feedback, we really appreciate it :)
One of the best videos, exactly what I wanted....thank you so much😍😍❤❤
We're glad you enjoyed it!
Hi, on line 14 Pylance reports that "books" is not defined, and on line 30 "export" is also not defined. Could you tell me how to fix this please :)
Great vid
What if there are the same class names for different text in web pages?
Amazing ! It works !!!
Yes it does! Glad you liked it :)
i have a syntax error: 'return' outside function ;(
Hey, make sure to double check indentation. Here's some more info on that: bit.ly/3FUrukv :) Hope it helps!
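To make the indentation point concrete, here is a minimal sketch of the usual cause of that error (the function and variable names are illustrative, not from the video):

```python
# The usual cause, in miniature: `return` must sit inside a def.
# This version raises SyntaxError: 'return' outside function:
#
#     def get_price():
#         price = 9.99
#     return price          # <- dedented one level too far
#
# Correctly indented, it parses and runs:
def get_price():
    price = 9.99
    return price

print(get_price())  # 9.99
```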
Thanks for the guide!
I am getting a NameError when running the name-main guard block of code. I'm running in a Jupyter notebook as well and not sure if scope is any different there, but I have no idea how to get around it.
Hello there. You're very welcome! About your problem: in Jupyter you don't need the name-main guard. Remove the if __name__ line altogether and dedent the code that was under it. Hope this helps!
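A minimal sketch of the two styles (the function name and return values are illustrative, not from the video):

```python
def scrape():
    # Pretend scraping step.
    return ["A Light in the Attic", "Tipping the Velvet"]

# Script style (.py file): the guard keeps this from running
# when the file is imported by another module.
if __name__ == "__main__":
    books = scrape()

# Notebook style: skip the guard and just call the function
# directly in a cell.
books = scrape()
print(len(books))  # 2
```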
Nice and interesting video!
Thank you!
Can I ask, how would I go about using Python as the backend and Excel as the front end to pull data from the web and show it in Excel in the desired form when you press a macro button in Excel?
Python:
Requests: To make HTTP requests to fetch data from websites or APIs.
Beautiful Soup: For parsing HTML content and extracting data from web pages.
Pandas: For data manipulation and cleaning.
Flask or FastAPI: To create a web service that exposes endpoints for Excel to interact with.
openpyxl: For reading from and writing to Excel files.
VBA (Excel):
ActiveX Controls: To create buttons or user forms in Excel for user interaction.
VBA Macros: To write VBA code that runs when the button is clicked.
Excel Object Model: To manipulate Excel workbooks, worksheets, cells, and charts.
Shell Function: To run external programs or scripts (in this case, Python scripts).
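A rough sketch of the Python side of that pipeline, using only the standard library so it runs anywhere (in practice you would swap in Requests, Beautiful Soup, and openpyxl as listed above). The VBA button would launch the script via the Shell function, e.g. Shell "python fetch_data.py" — the file names and table markup here are illustrative assumptions:

```python
import csv
from html.parser import HTMLParser

class CellParser(HTMLParser):
    """Collects the text of every <td> cell from an HTML table."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td and data.strip():
            self.cells.append(data.strip())

def extract_cells(html):
    """Return the text of all table cells, in document order."""
    parser = CellParser()
    parser.feed(html)
    return parser.cells

if __name__ == "__main__":
    # In real use, the HTML would come from urllib/requests.
    sample = "<table><tr><td>Title</td><td>9.99</td></tr></table>"
    with open("output.csv", "w", newline="") as f:
        csv.writer(f).writerows([extract_cells(sample)])
    # Excel (or the VBA macro) can then open output.csv directly.
```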
pretty neat! Thanks a ton.
You're very welcome!
is there an option to extract scraped data to Google Sheets instead of Excel? or is Excel simply more "powerful" for processing the data
Hello! You can always import the Excel files into Google Sheets. The Excel file format is very portable because it isn't tied to any cloud service and doesn't require an internet connection to read the data.
Start a blank file; in the top menu click "File" > "Import" and select the Excel file you would like to transfer.
Getting a syntax error (pyflakes E) in the code "item["Title"] = book.find( ..."
Spyder is pointing at the equals sign... why is this happening?
Hey! Thanks for asking :) Maybe you could share your code, so we could check where the problem is?
@@oxylabs I figured it was a typo... thanks for asking though :)
I am trying to scrape another website, but each product listing has three different prices.
They are under a table class, how would you scrape each table column? Any idea on this? :)
Both the title and prices have the same tags:
24h: 153,00 €
+24h: 120,90 €
Semana: 489,60 €
Perhaps do a video on this too? I am stuck here for the time being :)
You could try using the BeautifulSoup find_all method, which finds all corresponding elements. In your case, it would find all of the matching table cells, and you could then pick each column's value out of the returned list.
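A hedged sketch of that find_all approach. The markup below is a guess at the commenter's table structure (tag names and layout are assumptions, since the original HTML wasn't shared):

```python
from bs4 import BeautifulSoup

html = """
<table class="prices">
  <tr><th>24h</th><th>+24h</th><th>Semana</th></tr>
  <tr><td>153,00 €</td><td>120,90 €</td><td>489,60 €</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# find_all returns every matching element, in document order.
headers = [th.get_text(strip=True) for th in soup.find_all("th")]
prices = [td.get_text(strip=True) for td in soup.find_all("td")]
# Pair each column header with its price.
row = dict(zip(headers, prices))
print(row)  # {'24h': '153,00 €', '+24h': '120,90 €', 'Semana': '489,60 €'}
```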
Nice video! Is it possible to extract data from a website that requires login credentials? Thx
Hello, thank you! Technically this could be possible; however, data behind a login is not considered publicly available data and usually comes with multiple scraping restrictions. You may also need permission from the website owner.
Accordingly, before engaging in any scraping activities, we recommend you take appropriate professional legal advice regarding your specific situation.
Same thing is happening to "data.append(item)" ... :/ points to the "d" in "data"
Hello! In this case also, please send us the code, and we will check where might be the problem :)
I tried your method. My Excel file shows 5 columns and 1 row where it should have shown 5 columns and 312 rows. Can u help me solve this?
Hello, it all depends on your code, this is a very abstract question. Could you please share/specify your code so we could help?
Thank you for the reply. Here is the link to my code github.com/DolorPJ/webscraping-demo/tree/main
@@prasadjadhav3829 The data structure you build is printed the default way: the keys of the dict become the headers and the values become the row below them.
If you want multiple rows under the same headers, put multiple dictionaries into a list and make sure those dictionaries share the same keys. Hope this helps!
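The fix in miniature: several dicts with identical keys become several rows under one header. The sketch uses the standard library's csv module to stay self-contained; the same list-of-dicts structure is what pandas.DataFrame(data) accepts if you are exporting to Excel instead (book titles and prices below are made up):

```python
import csv
import io

# One dict per row; all dicts share the same keys.
data = [
    {"Title": "Book A", "Price": "10.99"},
    {"Title": "Book B", "Price": "12.50"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=data[0].keys())
writer.writeheader()      # keys become the header row
writer.writerows(data)    # one output row per dict
print(buf.getvalue())
```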
It is very good
Thanks for watching!
i don't understand what line 28 is doing... what is this line doing? if __name__ == "__main__":
what do __name__ and __main__ refer to?
@@jamest4027 Hello! Hopefully this explains in detail:
www.geeksforgeeks.org/what-does-the-if-__name__-__main__-do/
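Condensed to a sketch (the function name is illustrative): Python sets the module-level variable __name__ automatically. It equals "__main__" when the file is run directly, and the module's import name when the file is imported from elsewhere.

```python
def main():
    return "running the scraper"

if __name__ == "__main__":
    # Executes when the file is run as a script,
    # but not when it is imported as a module.
    result = main()
```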
i don't understand how you found product_pod and price so fast; i look at the F12 developer tools and i'm lost :-)
Hey Michael! It's okay, we're all learning here. If you could please specify the problem maybe we could help you out!
any HTML crash course can help you understand the HTML and CSS on a website within an hour or 2!
right click on the book and click on inspect🙂
i love you