Web Scraping With Python 101
- Published 17 Nov 2020
Break The Code and Win a Macbook Pro - go.tech/btckalle
Follow me on instagram: / kallehallden
LiveCoder channel: / @livecoder7639
"Clean Code Friday"
If you want to receive one short email from me every week, where I go through a few of the most useful things I have explored and discovered this week. Things like; favourite apps, articles, podcasts, books, coding tips and tricks. Then feel free to join kalletech.com/cleancode/
CONTACT: contact@kalletech.com
Follow me on:
TWITCH: / kallehallden
INSTAGRAM: / kallehallden
TWITTER: / kallehallden
GITHUB: github.com/kallehallden
VIDEO EDITOR: editingmachine.com (use coupon code KALLE to get 50% off your first month)
--------------------------------------------------------------------------------------------------------
GEAR:
kalletech.com/tech/ - Games
I was first going to just grab the title of a video and use the .text attribute of the element returned by find_element_by_xpath. But I felt it was not as interesting as clicking on a video. So for the people saying that this is "not web scraping", technically they are right, since I am not extracting data from a site. But doing that would look exactly the same, except instead of using .click() you would end it with .text. (You would also need to get the title element from the YouTube page.) So for the purposes of getting someone started from zero knowledge to being able to start doing web scraping, this difference is not important :)
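As a rough illustration of the point above, the two endings might look like this with Selenium's find_element call. This is only a sketch: TITLE_XPATH is a hypothetical selector, the helper names are made up for this example, and main() assumes Selenium, Chrome and a matching driver are installed.

```python
# Hypothetical XPath; copy the real one from your browser's inspector.
TITLE_XPATH = '//*[@id="video-title"]'


def click_element(browser, xpath=TITLE_XPATH):
    """Automation: find the element and click it."""
    browser.find_element("xpath", xpath).click()


def read_text(browser, xpath=TITLE_XPATH):
    """Scraping: find the element and return its text instead of clicking."""
    return browser.find_element("xpath", xpath).text


def main():
    # Imported here so the helpers above load even without Selenium installed.
    from selenium import webdriver

    browser = webdriver.Chrome()  # assumes Chrome and a matching driver exist
    try:
        browser.get("https://www.youtube.com/")
        print(read_text(browser))
    finally:
        browser.quit()
```

Calling main() opens a browser window, prints whatever text the first matching element has, and closes the browser again.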
You are doing a really great job. Could you make a video about yourself, I mean what you have studied, etc.?
Can you also make a video on Ubuntu and how good it is for programming
I would agree but for the sake of those who "complain" maybe adding a "disclaimer" of some sort would be better.
I do not know anything about coding, programming and do not understand a single word about it. I only watch your videos to see your beautiful smile and eyes. There, I had to say it. 😊
Bruh, just type notepad in the command prompt to open up text files. You can do this with visual studio code as well.
What video was supposed to be: Webscraping with Python
What the video actually was:
Roasting windows
Full of ads actually...borderline clickbait
LoL
@@nuwankarunaratne4772 Too good!!!!
What he probably doesn't know is that there's a Linux Subsystem feature in Windows, so you can use Linux in Windows. Try it, you'll love it!
Him: my desires are unconventional
Her: show me
Him: *runs the python script*
0:04 good heavens, what even was that
"I certainly do" in the beginning made me subscribe immediately. Weird humor is the best.
i've done quite a bit of web scraping in c# and python, but man you just made my life so much easier showing how to copy the xpath in chrome.
0:04 I absolutely love when you do something that I wouldn't expect you to do lol. I just know this channel is going to blow up!
When using Vim with a dark terminal background, add "set bg=dark" to your ~/.vimrc file. You can also just set it temporarily by typing ":set bg=dark" (without quotes) in Vim when in normal mode (not in insert mode). This will make colorized text a whole lot easier to read.
Video quality has improved 1000x in the last year. Great stuff
There was no Web Scraping. This was literally just setting up selenium
lmao
requests > selenium
and thats fax
@@WillyJL indeed, selenium is only good when there is no other way to get the result you are looking for.
@@arjix8738 exactly, I recently made a full fledged captcha bypasser with voice recognition and vpn switching using selenium, but thats the only project I used it for (apart from the first testing projects we all do to get the gist of how it works)
Lmao
Thanks a lot for the video Kalle! The one I had done is a different way for sure, and yours appears much simpler. Looking forward to your next video!
I've been loving the videos lately man, you make mundane stuff cinematic and interesting, thanks!
knowledge of python: "Check." Knowledge of web scraping:" I certainly do." hahaha that made me laugh like Muttley (aka a wheezy snicker)
0:04
LMAO
"opening files in windows sucks"
*googles how to open files in windows*
"windows actually has the simpler way of opening files"
😂😂😂
Maybe we are too hard on windows 😅
@@letusplay2296
@echo off
::-------
:: Usage
::-------
GOTO START
:USAGE
echo.
echo Usage:
echo.
echo example_spectral_index.bat inputRasterFilename spectralIndex outputRasterFilename
echo.
echo Example:
echo.
echo example_spectral_index.bat "c:/input.dat" "Normalized Difference Vegetation Index" "c:/output.dat"
echo.
exit /B 1
:START
if [%1]==[] GOTO USAGE
if [%2]==[] GOTO USAGE
if [%3]==[] GOTO USAGE
::----------------
:: Input Arguments
::----------------
:: Input arguments are strings
set "input_file=%1"
set "spectral_index=%2"
set "output_raster_uri=%3"
:: Remove double quotes from the arguments, if any
set input_file=%input_file:"=%
set spectral_index=%spectral_index:"=%
set output_raster_uri=%output_raster_uri:"=%
:: Now put double quotes around each input argument
set input_file=^"%input_file%^"
set spectral_index=^"%spectral_index%^"
set output_raster_uri=^"%output_raster_uri%^"
::---------------------------------------------
:: Build input JSON describing the task to run.
::---------------------------------------------
set json={
set json=%json% "taskName" : "SpectralIndex"
set json=%json% ,"inputParameters" : {
set json=%json% "input_raster":
set json=%json% {"url" : %input_file%, "factory":"URLRaster"}
set json=%json% ,"index" : %spectral_index%
set json=%json% ,"output_raster_uri": %output_raster_uri%
set json=%json% }
set json=%json%}
::------------------------------------------------
:: Get the location of the envitaskengine relative to
:: the location of this script.
::------------------------------------------------
set mypath=%~dp0
:: Now go up two levels
pushd %mypath%
cd ../..
set loc=%CD%
popd
::--------------------------------------------
:: Run the task.
:: Task output JSON will be written to stdout.
::--------------------------------------------
echo %json% | "%loc%/bin/bin.x86_64/envitaskengine"
i love windows and despise mac, never tried linux tho
Im glad to see this type of content back! I was scared you turned into a 'Look at my life' guy, but luckily you're back on programming!
hope you're having a good day Kalle
❤️❤️❤️
Love Your Videos
🔥🔥🔥
Webscraping 101: always try to identify an element with an id first
and most importantly don't copy the xpath expression from the chrome inspector build your own
@@torrebooksapp9508 why?
@@milanscienceacc3041 You want to make sure your selector is as accurate as possible and will work even if some changes are applied to the website structure. When you copy an XPath or CSS selector from the browser inspector, in most cases it will be prone to break on the smallest changes, because it follows the tree structure from parent to child to generate the selector. Sometimes it will generate a good enough selector, but not reliably, which is why I recommended always doing it manually.
@@torrebooksapp9508 or use css, css is much better...
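A small sketch of the argument in this thread, using Python's built-in xml.etree.ElementTree (which supports a limited XPath subset) on a made-up page rather than a real browser. The two markup strings and both paths are invented for illustration; the only difference between the pages is one extra wrapper div.

```python
import xml.etree.ElementTree as ET

# Two versions of a made-up page: the second adds one wrapper <div>.
BEFORE = '<html><body><div><a id="thumbnail">Video</a></div></body></html>'
AFTER = (
    '<html><body><div><div><a id="thumbnail">Video</a></div></div>'
    "</body></html>"
)

COPIED_STYLE = "./body/div/a"       # follows the tree from parent to child
ID_BASED = ".//a[@id='thumbnail']"  # anchors on the id, wherever it sits


def find_text(markup, path):
    """Return the text of the first match, or None if nothing matches."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


for name, path in (("copied", COPIED_STYLE), ("id-based", ID_BASED)):
    print(name, find_text(BEFORE, path), find_text(AFTER, path))
```

The parent-to-child path stops matching once the wrapper is added, while the id-based path still finds the element on both versions.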
how do I extract only text without opening up the browser in my PC
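One possible answer, sketched with only the Python standard library: download the HTML over HTTP and pull out the text, so no browser window is involved. The class and function names are made up for this example, and it will not see content that a page builds with JavaScript after loading.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url):
    """Download a page and return its text; no browser is opened."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_text(html)
```

For example, fetch_text("https://example.com/") would return the text of that page as a plain string.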
Very nice and useful. Was getting started with webscraping soon anyway, but this helps a lot
This, hacking wifi passwords, hacking into a car and a lot more are videos that inspire so many young teenagers to start programming.
Nice job.
I've been waiting for this video so long .
This has just given me sooo many ideas on how I could annoy everyone I know
how tho?
@@DavidCVdev just imagine making their Google Chrome open up some pages they wouldn't want to see...
@@eliotportevin5714 Never gonna give you up, never gonna let you down...
Funny that you posted this today!
I just dipped back into python today because I wanted to find a way to manually fuzz a URL.
After noticing that the youtube banner image cropping happens on their server side rather than in html, and that if you play around with some parameters in the URL within hex range (0-f) you can get the image to load in with different crops, I wanted to find the parameter values that would load the largest version/resolution of the image.
So I discovered selenium and got my script working with a handy little GUI (using tkinter).
But now I want to play more with selenium's webdriver library and make a URL fuzzer that will, e.g., test against 40X vs. 200 responses to automate the process of finding the limits of ambiguous parameters.
Say, speaking of opening files in the OS's GUI from the command line: on macOS you can type 'open .' to open the current directory in the GUI (e.g. if your working directory in the command line is '/Users/someone/Downloads' and you type 'open .', then Finder will open the 'Downloads' folder). On most Linux desktops, 'xdg-open .' does the same.
I find that handy when switching between command line and GUI navigation. 👍
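A very rough sketch of the fuzzer idea described above. Note the swap: it uses urllib from the standard library rather than Selenium, because plain HTTP requests give direct access to the status code. The function names and the '{value}' URL template are assumptions made up for this example.

```python
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for url (urllib raises on 4xx/5xx)."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def fuzz_hex_parameter(template, get_status=http_status):
    """Try 0-f in the '{value}' slot and return the values that answered 200."""
    accepted = []
    for value in "0123456789abcdef":
        if get_status(template.format(value=value)) == 200:
            accepted.append(value)
    return accepted
```

The get_status parameter is there so the loop can be tried against a fake function first, before pointing it at a real server.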
EXACTLY WHAT I WANTED AND NEEDED RIGHT NOW! THANK YOU KALLE :D
I kinda always use selenium for scraping + automation. Should I be combining selenium with bs4? Like selenium for automation and bs4 for html parsing? Or is my approach good and would I be more inefficient if I incorporated bs4?
The sheer number of videos You put out every week is amazing!
I found about Your channel not long ago, but I'm loving it!
Keep up the good work 👍
This was very engaging. Please do more content in this format. From a 13 year old from England :)
I've been subscribed to this channel for a really long time and it keeps getting better and better.
QUICK REQUEST: Kalle, do you have any experience with cloud computing ? If you have, can you film a video about it?
love your creativity bro, for sure you are enjoying creating videos and coming up with cool stuff. 100% would watch all kind of content from you, thanks for entertainment
Great! Thanks for making this quick video with a lot of good content! Congrats!
Hey man, thanks to your tutorials on automating things with Python, a few weeks ago I got inspired and wrote a bot for job applications in Selenium, and it got me a few interviews! :))
You can also do this using UiPath. It has GUI and the community version is free. Then dump the info into an excel file. Some knowledge on VB and C# needed for more advanced options.
Do you have to create an environment every time you want to create a python web scraping script or just once and use the same all time?
Sorry if it's a very noob question
This's like a tutorial bro!
I love your tech videos as a vlog. Your channel is also moving as a tut!!!!
I like your videos, they are of good quality and you bring very great value! 😁
I'm just wondering what happened to your investment, the one where you were using Python to auto-trade?
About the goofy Windows code at the start: it is Windows batch. It was first created in 1981, was used to run most Windows machines, and is still very useful in 2021! This was my first coding language, so I'll explain it:
echo is the equivalent of print, so
print("example") == echo example
Adding the full stop afterwards makes it print a blank line, the equivalent of print()
Then the > just outputs the result to a file of your choice. If the file exists already it will overwrite it, but if it doesn't then it will create a new one...
echo. >yourfile
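For comparison, a rough Python sketch of the same batch commands. The function names are made up, and the >> append variant is an addition that the comment above does not mention.

```python
from pathlib import Path

print("example")  # batch: echo example
print()           # batch: echo.   (prints an empty line)


def overwrite_with_blank_line(path):
    """Rough equivalent of `echo. >yourfile`: create or overwrite the file."""
    Path(path).write_text("\n")


def append_line(path, text):
    """Rough equivalent of `echo text >>yourfile`: add to the end instead."""
    with open(path, "a") as handle:
        handle.write(text + "\n")
```

Calling overwrite_with_blank_line("yourfile") replaces whatever the file held with a single empty line, just like the batch redirect.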
Great stuff Kalle, I very much like your style of teaching. Keep up the good work, and I hope you are making some money from teaching us wannabe programmers!
Cool video ! Very instructive, quick and clear. I liked it way better than other style of videos like "A day of Meetings".
I find this kind of tutorial to be more interesting than the vlog-type videos.
Keep it up 💪🏻
Is there a way to use javascript for webscraping?
Because ifttt doesn't allow filtering codes in python
Im new at this..
Can you use an IDE like VS or does it have to be on cmd?
You don't have to necessarily use the command line, but it's pretty useful. If you want to use IDE, no matter which one, you'll definitely find all answers on the internet ;)
UPDATE:
In recent Selenium updates, browser.find_element_by_xpath() is deprecated.
Instead use:
path = '//*[@id="thumbnail"]'
browser.find_element('xpath', path)
Hello, a technical question: can we web scrape an HTML table when a login is required for access to that page? (Assuming we have the credentials, of course.)
Actually, This is the content I’m looking for on a daily basis. Again I really had to laugh about the windows jokes. 😅 Hope that there will come more of this stuff in the future. Great praise!
I have actually made a project which keeps checking if any reservation slots pop-up/come free at my closest gym because it is almost always completely reserved. This way I will be the first one to reserve an upcoming free slot when someone deregisters.
this will be the class I really want to attend in all of my lockdown classes.😂👍🏻
Hello, I am pretty new to your channel. I started my journey of learning to code in Python earlier this year and was looking for easy projects to do, and this video looks really good. Definitely going to try it. Keep up the good work, and I hope we will have more 101 videos. Maybe you could do a 101 video each time you do a build in one day?
Just checking in, are you still progressing?
Hi, if you develop in a virtualbox with ubuntu/visual studio code/python/selenium, can you run the program on windows(microsoft)/python different computer?
You really mastered the art of into
Intros
That's incredible how you can do a video every day with this quality.
Does the web scraper have to open up my Chrome to scrape, or is there a way that I can just run a scraper script and it does its thing in the background?
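One common way to do this is Chrome's headless mode, where the browser runs without a visible window. A minimal sketch, assuming Selenium and Chrome are installed; the function names and the window size are made up for this example.

```python
def headless_chrome_arguments():
    """Command-line switches passed to Chrome; '--headless' hides the window."""
    return ["--headless", "--window-size=1280,800"]


def open_headless_browser():
    # Imported inside the function so this file loads without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in headless_chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

The browser returned by open_headless_browser() is used like a normal one (get, find_element, quit), just with nothing shown on screen.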
LOL what timing. Just now I was stuck with an error for 4 hrs, suddenly got the notification for this video and voilà!!!... got the solution ... THX Kalle!!!
The ad got me at the PS5😂😂
2:02 Finally! I Switched to Manjaro. Going Good!
what line of code do I need to add to make it write something?
I literally started a small webscraping python project 2 days ago lol
Yesterday 🤣🤣🤣
Same, started one yesterday to check for computer parts prices. I'm using beautifulsoup tho but I will try selenium
@@hola_chelo can u extract data without bs4 only with regular expression
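It can be done for simple, predictable markup, though regular expressions tend to break when the HTML changes even slightly. A small sketch with Python's re module; the sample HTML, the class name "price" and the pattern are all made up for illustration.

```python
import re

# Made-up snippet standing in for a downloaded product page.
SAMPLE_HTML = (
    '<li class="part">GPU <span class="price">$399.99</span></li>'
    '<li class="part">CPU <span class="price">$249.00</span></li>'
)

PRICE_PATTERN = re.compile(r'<span class="price">\$([0-9]+\.[0-9]{2})</span>')


def extract_prices(html):
    """Return every price found in html as a float."""
    return [float(match) for match in PRICE_PATTERN.findall(html)]


print(extract_prices(SAMPLE_HTML))
```

If the site adds an attribute to the span or changes the currency format, the pattern stops matching, which is the usual argument for an HTML parser.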
Great one to watch, laugh and learn at the same time!! Need more content like this!!
Use powershell, it uses a lot of syntax like unix systems
This helped a lot thank you
Ey that thumbnail code is one of my background pictures...
That intro 😂💯
Thank you so much I really needed this.
I would be happy if the video was longer and accomplished some automation project in the end. Also xpath can be a bit tricky so some explanation of xpaths in future videos would be great. I think i wont mind a web scraping 102. 😉
I love the video by the way, but when I try to open the file it immediately closes, and second, when I type in vim web it says 'vim' is not recognized as an internal or external command, operable program or batch file. How do I fix this? Also, when I install selenium it prints a huge paragraph of output, is that normal? I know this is a little late to comment on the video, but I really want to learn how to code using Python, because I tried C++ but it was very complicated and Python looked easier from a beginner's view.
WOW I REALLY LIKE THE WAY HE TEACHES. KEEP IT UP 👍 WE WANT MORE TUTORIAL VIDEOS PLEASE ♥️
How do you execute the script in the background?
Any chance of making the same tutorial for MAC users?
Hey Kalle, this was a super simple and down-to-earth tutorial! Fantastic Job!
How do you feel about Windows Kalle?
Hey, i have been web scraping with scrapy and python for a long time.
I want to get started as a freelance web scraper, but nobody gives a project to a newbie...any suggestions for me will help a lot.
What kind of jobs you can do with web scraping ? Any examples ?
@@splashoui3760 collecting data from website
Like... if you have to get data of restaurant of a particular place in a City from a website...
Hey, can you please tell me how you made that intro, pleaseeeeee
great content, and LOVE the intro !!
Hello Kalle, I must tell you that I've tried it and it was really fun. Waiting for the next videos ....
Hi!
Is it possible to webscrape some data and use that information to make a nice dashboard, tailored only to some data we are interested in?
Topic: real estate auctions
So, how can I get data that falls within a time interval? I mean, what should I do, for example, when getting data for 2 years (2017-2019) and eliminating the weekends? I want to have only weekday data.
Thanks.
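Once the rows are scraped, the filtering part can be done with the datetime module. A minimal sketch, assuming each scraped row ends up as a (date, value) pair; the sample rows and the function name are invented for this example.

```python
from datetime import date


def weekdays_only(rows, start, end):
    """Keep (day, value) rows inside [start, end] that fall on Monday-Friday."""
    return [
        (day, value)
        for day, value in rows
        if start <= day <= end and day.weekday() < 5  # 5 = Saturday, 6 = Sunday
    ]


# Made-up scraped rows: (date, value).
rows = [
    (date(2016, 12, 30), 1.0),  # Friday, but before the interval
    (date(2017, 1, 6), 2.0),    # Friday
    (date(2017, 1, 7), 3.0),    # Saturday
    (date(2019, 12, 31), 4.0),  # Tuesday
]
print(weekdays_only(rows, date(2017, 1, 1), date(2019, 12, 31)))
```

Only the 2017-01-06 and 2019-12-31 rows survive here: the first row is outside the interval and the third falls on a weekend.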
this man is really a senior developer. He uses vim instead of an actual IDE, like vs code and pycharm or something. Ur flexing my man. I use arch btw
Thanks - was exactly what I was looking for.
Web Scraping!!! Finally, my new Favorite Video!!
He didn’t show anything tho
When are you getting Apple Silicon Macs?
can you recommend us good resources to learn python online
and can you make a series where you teach us what you know about python ???
like for kalle to see it !!!
@Rahmi Acar thx so much !!!
How do I make my default version of python on mac be 3.8.3?
Hey it worked, thanks a lot Kalle, that was so cool....
I want to learn python. where should i start?
Good job to shoot every single day. You challenged yourself and you're doing amazing.
A simple question and you can answer it in any video. Could you tell me the efficient road map to get my first job? I started with Java as you recommended that before in your videos. I started 3 months ago.
Webscraping is easy. It's overcoming the 429 Errors that's the real challenge.
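One common way to cope with 429 (Too Many Requests) responses is to wait and retry with a growing delay. A minimal sketch using the standard library; the function name, the starting delay of one second and the attempt limit are arbitrary choices for this example, not a recommendation for any particular site.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, fetch=None, sleep=time.sleep, max_attempts=5):
    """Retry on HTTP 429, doubling the wait each time (1s, 2s, 4s, ...)."""
    fetch = fetch or (lambda u: urllib.request.urlopen(u).read())
    delay = 1.0
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except urllib.error.HTTPError as error:
            # Give up on other errors, or when the attempts are used up.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(delay)
            delay *= 2
```

Servers sometimes send a Retry-After header with a 429 response; honoring that value instead of a fixed schedule would be a natural refinement.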
Instead of echo you can use type nul > file_name.extension, which is simpler (note that md file_name.extension would create a directory, not a file).
This tutorial was made even better by the fact that he never does tutorials
how do i create a file in the windows command prompt
can you make a video on how to set up VIM?
Kalle, you missed a "/" at the end of that first URL. Thanks :)
2:53 what should I use for Firefox.. ? Please help.
how can i use this on mac?
HELP i'm on mac and i installed selenium every way possible, but it still says ImportError: no module named selenium. How can I fix this?
The starting sketch caught me off guard ngl hahaha
you could use PowerShell for windows command line
Hello, Kalle I have been loving the videos and long term viewer. I was wondering if you would ever try the no-code space such as Webflow, Adalo, Bubble? Also, would you be interested in making more Cyber security/hacking videos
Cannot wait to the next video!
Why does Chrome log me out when web scraping? It only happens when I automate it.
When I type in vim web it says ‘vim’ is not recognized as internal or external command, operable program or batch file.
Is that the terminal?