Very good tutorial. You could use enumerate in the for loop to avoid the counter assignment:
for counter, image in enumerate(images):
    save_as = os.path.join(path, keyword[1:] + str(counter) + '.jpg')
    wget.download(image, save_as)
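For anyone following along, here is a minimal sketch of the enumerate version of the save loop. It only builds the file names, so it runs without a browser or a network connection. `build_save_paths` is a hypothetical helper name, and the folder, keyword and links in the demo are made-up placeholders rather than the tutorial's real data.

```python
import os


def build_save_paths(path, keyword, images):
    """Pair every image URL with a numbered file name, no manual counter needed."""
    pairs = []
    for counter, image in enumerate(images):
        # keyword[1:] drops the leading "#" from a hashtag such as "#cat"
        save_as = os.path.join(path, keyword[1:] + str(counter) + ".jpg")
        pairs.append((image, save_as))
    return pairs


if __name__ == "__main__":
    demo_images = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
    for image, save_as in build_save_paths("cats", "#cat", demo_images):
        # in the tutorial, this is where wget.download(image, save_as) would go
        print(image, "->", save_as)
```

enumerate starts counting at 0, so the first file ends up as cat0.jpg, same as a counter that starts at 0.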
Great video. It'd be cool to see one on Insta scraping using GET requests instead of Selenium, it's much faster. There's a good article on Diggernaut about it. Anyway, thanks, keep em coming!
@@PythonSimplified Love to see how you go with it! I've been stuck on it for the last 12 hours :'( I'm getting different static HTML returned from requests.get() than shown in the Chrome Dev Tools. Great channel btw, looking forward to more content!
@@chris_burrows Thank you for suggesting! I'll check it out :) I'll start advertising properly sometime in the near future. For now I take it easy, trying to focus on improving my filming/editing abilities before I go down that road :D
Hi Anna, there's actually a better way than double enter! Check out my community post where I included the improved code, it's better to concatenate the url to search for your term 😉 And thank you so much! 😃
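A rough sketch of the "concatenate the url" idea, for reference. It assumes Instagram's hashtag pages live under /explore/tags/<keyword>/, which is an assumption about the site's URL layout and may change. `build_tag_url` is a made-up helper name, not something from the community post.

```python
from urllib.parse import quote

BASE_URL = "https://www.instagram.com/explore/tags/"  # assumed URL pattern


def build_tag_url(keyword):
    """Turn a hashtag like "#cat" into the address of its results page."""
    tag = keyword.lstrip("#").strip()
    # quote() escapes characters that are not safe inside a URL path
    return BASE_URL + quote(tag) + "/"


if __name__ == "__main__":
    print(build_tag_url("#cat"))
    # with Selenium you would then call driver.get(build_tag_url("#cat"))
```

Going straight to the address skips the search box entirely, so there is no ENTER key to press at all.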
Thank you! 😁 It's also available on Medium with a few improvements: medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885 I'm also currently working on a website, where I'll post an even more updated version of the Medium article, where we'll be able to scrape the full-size images rather than thumbnails, and tackle more issues with the ENTER button 😀 Stay tuned!
This video is great thanks but i haven't been able to do the first step of opening chrome and Instagram. Maria, can you please do a video explaining what you do before, with Selenium. The parts of importing etc.... I downloaded selenium but still get a traceback.
Hi Andres! :) I find that "pip install" doesn't always work, so I install everything with "conda install" when possible. If you are also using Anaconda, try: "conda install -c conda-forge selenium" If you are using another command line software, let me know what it is and what kind of traceback you are getting and we can go from there 😉
@@PythonSimplified Thanks for your help! I am still struggling. I am using Selenium and here I detailed everything and still no solution www.reddit.com/r/learnpython/comments/kfn0wm/a_unique_no_module_named_selenium_problem_its_not/
Hello, I really like your work. I just started to do some web scraping and your tutorial was a great help for me. You are organized, and your explanation is perfectly clear and easy to follow. There is just one thing I noticed and already fixed, but I want to know if there is another way around it. The problem is that the search box doesn't appear if the Chrome window is big, so we just get the side bar with the search symbol, which needs to be pressed to open the search box. I tried to figure out how to do that but couldn't, so I just set a window size (driver.set_window_size(740,500)) to make sure the search box appears automatically. If you know how to fix it the normal way, that would be nice. Thank you
Thank you for the video. Very clear and straightforward. I'm having trouble with the searchbox.send_keys(Keys.ENTER) command. I tried ENTERing twice but it still doesn't work. Any suggestions?
Thank you so much Luisgui 😁 Try the code I've just posted on my Github, you can solve it with time.sleep(seconds), it's in the "search keywords" section: github.com/MariyaSha/WebscrapingInstagram/blob/main/WebscrapingInstagram_completeNotebook.ipynb I actually just finished working on a Medium article about this where I explain everything in detail, I'm just waiting for my publication to approve it and then I'll send you a link 😉
Web scraping is perfectly legal, and many companies are actually using it to promote their business. With that said, each major website/webservice has a certain layer of protection from bots but it's not that difficult to outsmart it, even when captchas come into play. I personally find that Facebook has far more layers of protection than Instagram and I suspect it also has the ability to dynamically adjust its code to prevent you from scraping time and again using the same script. For example, my code for scraping all the images from a certain user's account became obsolete after 2 days, during which I've scraped over 650 images several times in a row. I had to turn the code into something more general: rather than targeting a very specific single element with "Xpath" & "starts-with", I had to fetch all the elements of the same kind on the page and only then narrow down the list to only include the single element I wanted to target. So I know first hand that Facebook is reacting to scraping and it's reacting quite fast. With Instagram however, the same code that worked for me 3 months ago is still relevant and works like a charm (please check out the article on Medium to see the upgraded version of the code I presented in the video): medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885 I don't think you should be worried about scraping Instagram, that's for sure 😉
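To illustrate the "fetch everything, then narrow down" approach described above, here's a small sketch. It assumes each element exposes Selenium's get_attribute method. `narrow_down` and the `FakeElement` stand-in are hypothetical names used only so the example runs without a browser, and this is not the actual Facebook script mentioned in the comment.

```python
def narrow_down(elements, attribute, must_contain):
    """Keep only the elements whose attribute value contains the given text."""
    matches = []
    for element in elements:
        value = element.get_attribute(attribute)  # None when the attribute is missing
        if value and must_contain in value:
            matches.append(element)
    return matches


class FakeElement:
    """Stand-in for a Selenium WebElement, for demonstration only."""

    def __init__(self, **attributes):
        self.attributes = attributes

    def get_attribute(self, name):
        return self.attributes.get(name)


if __name__ == "__main__":
    # with a real driver: elements = driver.find_elements(By.TAG_NAME, "a")
    elements = [FakeElement(href="/photos/1"), FakeElement(href="/about"), FakeElement()]
    print(len(narrow_down(elements, "href", "/photos/")))
```

The point of the design is that a broad lookup by tag name keeps working when class names or ids change, and the filtering rule lives in plain Python where it is easy to adjust.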
Great video. Thank you so much. Mariysha you are a very good teacher. I hope that you can learn the correct accent on the specific word ( `at-tri-bute ), emphasis on the first syllable, because it is so often when speaking about Python and objects. You routinely mistakenly swap or mispronounce the verb attribute ( at-`trib-ute ) and the noun attribute ( `at-tri-bute ). I understand that you are not a native English speaker, and please try to take note of this and improve the accent on this word. In Python a class or object `at-tri-bute is a noun, aka "method" or bound function. Great video. Keep cranking them out.
Thanks for this info. I am able to download only 58 images at one go, is it possible to download all images around 2000 with this at one go ? Can you please advise.
Hi Pranshu! Yes, you are able to download as many images as you need! 😀 Just implement the scroll event inside a loop, just like this (scrolling 10 times to the bottom of the batch):
for j in range(0,10):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)  # wait 5 seconds
The more you loop, the more images you scrape! I have a video premiering soon where I explain it in detail, check out "scroll to the bottom of the page" in the timestamps if you miss it 😉
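The scroll loop above can also be wrapped in a small reusable function. This is just a sketch: `scroll_to_bottom` is a hypothetical name, and it assumes `driver` is a Selenium WebDriver (or anything else with an execute_script method).

```python
import time


def scroll_to_bottom(driver, times=10, pause=5):
    """Scroll to the bottom of the page `times` times, pausing so new images can load."""
    for _ in range(times):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the next batch of images time to appear
```

Calling scroll_to_bottom(driver, times=10, pause=5) behaves like the loop in the comment. A shorter pause makes the bot faster but raises the chance of scrolling before the new images have loaded.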
Jupyter Notebook is a very handy interface, where we can process our code cell by cell. It is very similar to Google Colab, and we can access it directly from our Anaconda terminal with a very basic command - "jupyter notebook". If you wanna find out more about this interface, please check out one of my first tutorials, where I explain all about it in detail: ua-cam.com/video/jp_3NOKHn9c/v-deo.html So generally, when you're using a traditional IDE, it runs all your code at once, while notebooks like Jupyter allow you to separate your program into sections and run each of these sections independently. I find it to be very handy when teaching/learning, and I highly recommend to give it a try 😉
Thank you Leonardo! 😀 I used my_input.clear() to make sure the input is empty before we type anything in, it's not a necessary step with Instagram but it's very handy when sending your keys to an input which already has text in it 😊 It will remove the existing text and only include the string of your following send_keys command
Absolutely! 😀 With Python and Selenium - the sky is the limit! You can potentially even build an online service which tracks engagement across many different profiles and make a good profit out of it 😉 With Selenium you can collect any piece of data, manipulate it and evaluate it. Pandas, Numpy and Matplotlib libraries would also be very handy for such a task. Good luck, and let me know if you come up with something cool! 😊
Try targeting the anchor element with the href of "/direct/inbox/": WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[href='/direct/inbox/']"))).click() This would press the "inbox" button on your individual account, let me know if I got you right :)
@@PythonSimplified thank you so much mam, you're really so friendly, replying to a stranger without expecting anything. I am amazed mam, even my teachers won't respond like you
@@PythonSimplified
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

driver = webdriver.Chrome('C:/Users/91728/Desktop/python files/chromedriver')
driver.get("https://www.instagram.com")
username = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='username']")))
password = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='password']")))
username.send_keys("#username")
password.send_keys("#password")
button = WebDriverWait(driver, 2).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[type='submit']"))).click()
not_now = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]'))).click()
not_now_again = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]'))).click()
inbox_open = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class='xWeGp']"))).click()
recent_message_open = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class='-qQT3 rOtsg']"))).click()
message_box = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//textarea[@placeholder='Message...']")))
message_box.clear()
time.sleep(5)
message_box.send_keys("hi")
message_box.send_keys(Keys.ENTER)
# instead of recent_message_open I want to select the element using Username by adding an input field at first
Mam, please kindly contact me on +917285906544, this is my WhatsApp, or I messaged you on Instagram. Please reply to me, I'll show you the image.
@@harsha4074 You cannot reliably select an element by class='-qQT3 rOtsg', it is a temporary class. You need something a bit more constant to rely on. Try the 'href' attribute of the anchor instead. So everywhere you've targeted the 'class' property, you'll need to change it to a different property that doesn't have any weird letters in it. If it still doesn't fix it - try sending me the error message you got.
Hi Vijaykumar! 😃 to my understanding, the value 10 in the above example represents the timeout value (in seconds). Meaning - if this element was not detected within 10 seconds, you'll get a TimeoutException. You can definitely adjust it to any value you'd like! Selenium documentation keeps it at 10, but I've seen other examples with 5 and 15 seconds, it all depends on how long it takes the given page to load 😊
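To make the timeout idea concrete, here's a simplified stand-in for what a call like WebDriverWait(driver, 10).until(...) does. It is a sketch of the general polling technique, not Selenium's actual implementation: `wait_until` is a hypothetical name, and it raises Python's built-in TimeoutError where Selenium raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10, poll=0.5):
    """Call condition() repeatedly until it returns something truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # found it, hand it back to the caller
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll)  # short nap before checking again
```

With timeout=10 the function returns as soon as the condition is met, so a generous timeout only costs time when the element never shows up.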
Hey Mariyasha! First of all, this was an amazing tutorial. I was trying to do some scraping with the posts that open up in the explorer page in the Instagram desktop website. It seems that the elements for these posts are created in the DOM when they open after you click on them and are destroyed from the DOM when you close them. Hence, it is difficult to locate these elements and I've been getting the "No Such Element: Unable to locate element" exception. Is there a workaround for this situation in Selenium?
@@roshanshetty5661 Did you check out my video on image processing with Pillow? ua-cam.com/video/NSHsaG2a4WU/v-deo.html You can use the same principles and apply them to the scraped images if you're looking for general processing and simple transformations. If you're looking to classify images with Artificial Intelligence, the video I've sent you above is not gonna help 🤣 I'm working on some more Machine Learning projects, where image classification will be a very important part (that's why we need this cats/dogs database in the first place)... In the meanwhile, you can check out my Flower Image Classifier on Github: github.com/MariyaSha/FlowerImageClassifier This might give you a good example of the pre-processing we do to get the data ready for training. Either way, I hope it helps! 😁
These days, a bot like that is equivalent to winning the lottery!!! 🤣 I would probably hold off with sharing the code until at least I get my own PS5 hahaha (I'm dying to play Cyberpunk!!! and it's not the same game on PS4, it's more buggy than Goat Simulator!!) But from what I understand though, the stock runs out in-store before the websites get to update it, so your best bet is to know a guy in Best Buy and get him to put it aside for you when the stock arrives 😎 These bot people made it so much harder for regular folks to get new products, I strongly oppose this type of practice, I think it brings more harm than benefit.... but maybe I'm just old-fashioned 😊
@@PythonSimplified nicely put. I hate the scalping thing, however, since you basically need their technology to compete with them, I decided to make my own - just started running best buy, target, and gamestop bots. Will report back if it actually gets through! I just wanted to help a friend who's looking for one, rather than be a scalper grinch. Merry Christmas to you and fellow coders!
What happens if the post is not an image? IG lets you upload videos.
Hi Johan 😀
You'll need to tackle this with a conditional statement, where videos would be saved under the ".mp4" extension and images under the ".png" extension mentioned at the end of the video.
Let me know if you were able to figure it out! 😊 If not - I can film a quick tutorial showing how to do it 😉
@@PythonSimplified please do! Also, how many images should I expect in my folder? I get a TypeError on the last for loop. TypeError: cannot use a string pattern on a bytes-like object
I'm guessing it's because of the video format.
@@PythonSimplified Hi!! Congrats for the perfect content!! I've spent my day studying for a master data science project and you've been helping me a lot :)
I had the same problem today with profiles with some specific photo types, videos and reels.. I couldn't save an image and I got the same error mentioned above..
Could you help us please?
Thanks!! ;)
@@paulameneses2306 thank you so much dear! 😁 Sure, I'll look into it over the weekend and adjust the code to include a conditional statement for videos 😉 we'll be in touch!
@@TheJohanHalim Yes, you are absolutely correct Johan!😊
You get this type of error when trying to save a collection of images (video) as a single .png or .jpeg image; it's due to an incorrect format.
The number of images you should expect differs from one computer to another, depending on the size/scale of the display. The code in this tutorial gets you the number of images that results from a single scroll event.
And as Instagram loads its content dynamically - the more you scroll, the more images are loaded to the page. If you'd like to include several scroll events - check out my community post, where I include additional resources, a detailed article and code examples on how you can expand this bot: ua-cam.com/users/postUgwVQazZhNNqwdghhdh4AaABCQ
I'll get back to you after the weekend with a solution to your video question 😉
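While waiting for the proper solution, here is one way the conditional statement from this thread could look. It is only a sketch and not the channel's promised fix: `pick_extension` is a hypothetical helper, and it assumes the media link ends in a recognizable file extension, which may not hold for every Instagram link.

```python
import os
from urllib.parse import urlparse


def pick_extension(url):
    """Return ".mp4" for links that point at an mp4 video and ".png" for everything else."""
    path = urlparse(url).path  # leaves out the ?query part of the link
    ext = os.path.splitext(path)[1].lower()
    if ext == ".mp4":
        return ".mp4"
    return ".png"


if __name__ == "__main__":
    for link in ["https://example.com/clip.mp4?token=1", "https://example.com/photo.jpg"]:
        print(link, "->", pick_extension(link))
```

Inside the save loop, the result would replace the hard-coded extension when building the file name.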
I never imagined that python learning could have this much glamour.
hahaha
Very good comment
it was either fake Gamer Girl, OnlyFans, or this. But there is a lot of competition in those other areas so she went with this.
by the way the whole thing of her in the right side of the screen is planned out, she has her hair there to hide that she's wearing something hoping to appear nude so people will click and it will go viral or something 😂 Wish her the best of luck though! No hate! 😘
@@johnames6430 What a hater.
Me saw the thumbnail and click it
Me (after 10 mins) : ooh! It's a programming tutorial
hahahaha indeed! 🤣
lol 😂
I love your style of knowledge sharing. You made it simple enough to understand by someone like me who is just beginning to learn python. Thanks!
This content is really great. Thank you for sharing it. Years ago I used to do web scraping back when there was a lot less JS and interactivity but haven't done it in a long time. This video got me back into it. Keep it up!
Compliment from a fellow girl coder, this video was super informative and entertaining and you are obviously bright and talented!
Thank you so much for explaining and showing every basic step in detail!
Lots of beginners like me get stuck in setup steps that can seem obvious to experienced developers.
For example, thank you for explaining and showing all the steps to download and set up the Chrome driver.
Even some big websites pass quickly over these basic setup steps.
I was stuck but thanks to you I made it! Thanks again!!!
*Lol , I Almost Forgot I came here to Learn Python! haha Stunning Looks!*
Thank you! 😆 I may have gone a bit overboard on this video XD
@@PythonSimplified
Super hot supet smart :))
@@PythonSimplified Well, I think you were overdressed...
Thank you for this tutorial!. I am currently learning python on datacamp and haven't learned or seen any real world applications. I am definitely going to try this out and add what I learned from this video to my skillset.!
i don't know why i am watching it instead of listening to music but the way she teaches is real fun!!
OMG you're the best! I was hitting my head on the wall. In reality you showed it in a very simple way. Thank you
this and arjancodes are by far my favorite channels!
learned some good web scraping practices here like waiting for elements to be clickable, clearing the input boxes, etc. thanks!
Perfect intro to Selenium! Very nice video! Thanks again Mariya!
Thank you Chiranjeeb! I told you you gonna like Selenium! 😁
@@PythonSimplified Yep! You were correct!
Mariya, I love you. Thanks, you gave me what I had been looking for for 3 days
This channel is underrated, Change my mind!
thank you soo much you just saved me! im gonna rock that interview
That's awesome to hear Arthur! Good luck on your interview! 😀
You are the most intelligent and beautiful teacher of all.
I thought I didn't know English but now I think I do. Incredible articulation!!!
i started knowing why i am actually watching your video after the introduction :)
Very difficult. She's a world-class beauty, yet providing world class teaching on a very important topic.
GENIUS!!!!! I really liked your video, you were able to solve the concerns I had and no one else could solve.
THE BEST!!!
Great content! Very clear and useful.
Btw you don't need to add the local path of the webdriver as long as you have it in your Environment PATH. It looks over there by default.
Also, by the end of the video you can get rid of the counter variable if you use enumerate.
Wow, thank you Yaniv!! This is fantastic - we can save it there once and never worry about it again! !!👏👏👏
I'll just sit down in shame and be impressed with your super-efficient coding skills 😂😂
By the way!! I'm really happy to see Israeli folks joining the party, and with such rare tips too!! Thank you so much Yaniv, you nailed it! 😀
@@PythonSimplified Hahaha I really wasn't expecting a reply in Hebrew! But truly, thanks for the video, it helped me understand a lot of things. The PATH was just a small thing. Keep it up!
@@piriwo Thank you so much, will do! :)
I thought I was the only Israeli here😅
Wow, that's amazing Mariyasha, I like your way of teaching, it's very helpful for me. We're both in the same boat as upcoming data scientists
Thank you so much Vikram, I'm glad I could help! 😄
I was wondering why you don't make a full series on Selenium. Your teaching is perfect :) I literally enjoyed it a lot. Thank you so much.
Thank you so much Faizmohammad! 😁
I've just posted a brand new video on Selenium, this time web scraping Facebook:
ua-cam.com/video/SsXcyoevkV0/v-deo.html
It's some sort of a series! Just with a bunch of videos on other subjects in between 😄
This is the coolest thing I have learnt today
Awesome! I'm glad I could help! 😁
OMG!!! your channel is perfect , thanks for this class !!!!
Thank you so much Matheus!! Glad you liked it! 😁
This channel has super easy tutorials on how to do it: ua-cam.com/channels/YvGiDV1JfJTpphxtKd7r_A.htmlfeatured
With your videos I've deployed my first Flask app
Studying with you is exciting
This gender is always so organized.!!! A good session it was with so much clarity.
Thank you, I'm glad you found it helpful! 😃
She saved that "Purrrfect" for this moment.
Cool vid Maria! But on the days there isn't a "not now" button, the whole code grinds to a halt. It would be awesome to know how to add a function where, if the "not now" button is present, the code clicks on it, and if it isn't present, the code skips to the next step!!
Omg please!!🥹
Add a timeout when waiting for "not now" to appear, and then put it in a try/except block.
@@davidliu7246 thanks, I found it a while ago using try and except. now i just need help creating a chatbot. At the moment using nlp, spacy, textblob and a ton of for-loops
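For anyone else stuck on the missing "not now" button, the timeout plus try/except suggestion could be sketched like this. `click_if_present` is a hypothetical helper: it takes a function that waits for the element, so the same code works with Selenium's waits or with a test double.

```python
def click_if_present(wait_for_element, missing_errors=(Exception,)):
    """Click the element if it shows up, otherwise carry on with the next step."""
    try:
        element = wait_for_element()
    except missing_errors:
        return False  # the button never appeared, so there is nothing to click
    element.click()
    return True


# Possible usage with Selenium, reusing the XPath from the snippets in this thread:
# from selenium.common.exceptions import TimeoutException
# clicked = click_if_present(
#     lambda: WebDriverWait(driver, 5).until(
#         EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]'))
#     ),
#     missing_errors=(TimeoutException,),
# )
```

Passing missing_errors=(TimeoutException,) keeps the except clause narrow, so other problems (a typo in the XPath, a closed browser) still show up as errors instead of being skipped silently.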
Solid tutorial! You're great at teaching!! Thanks
Thank you Adam, glad you liked it! 😀
I'd like to say that I loved it you're amazing and please keep on it, I'm happy too because English isn't my mother language and I understood you very well 😊.
Thank you Vinicius! I'm so happy to hear that! 😁😁😁
English isn't my first language either, so I'm always trying to use simple words whenever possible (the complicated words are also much harder to pronounce, I sound very Russian when I do this hahahaha)
Thanks again and Merry Christmas!! 😊
@@PythonSimplified no problem, where are you from?
Could you please make a video explaining how to understand boxplot charts?
Merry Christmas 🙂
Tutorial was very helpful, but I did run into that same issue you mentioned about having to hit enter more than once in the search bar. Even with multiple instances of the send_keys / ENTER command, that part wouldn't work.
What I decided to do was call time.sleep(2) a couple of times between hitting ENTER, and it took a total of three instances of hitting ENTER to get by. Even then, the file grab was happening too quickly, and it took the thumbnails of the people in my Instagram stories... so I called for one more time.sleep to give the next page time to load, and it worked!
Your tutorials are great - I just found them and appreciate the concise, helpful videos!
Hi Matt, thank you so much for your amazing feedback! 😁
You totally nailed it with the time.sleep() command! I actually just published a tutorial on Medium that tackles it, as many people were running into the exact same issue with the ENTER command (and it would be very irresponsible of me to ignore that hahaha):
medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885
Or you can skip the detailed explanations and just check out the source code on GitHub (I will update it in the description of the video shortly):
github.com/MariyaSha/WebscrapingInstagram/blob/main/WebscrapingInstagram_completeNotebook.ipynb
So my solution was quite similar, however, I've done this through 2 ENTER presses and 3 time.sleep(5) waits, so maybe try to extend the wait for a bit more than 2 seconds and then you can get rid of the third ENTER command :)
Thanks again and have a fantastic week!
@@PythonSimplified that's actually really exciting that I came to the same conclusion you did. Even better still that you found a way to cut down on clicks. Thanks again for the tutorials!
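A small sketch of the "press ENTER, wait, press ENTER again" workaround discussed in this thread. `press_with_pauses` is a hypothetical helper, and it simply sleeps after every key press, which is close to but not exactly the same as either of the two solutions described above.

```python
import time


def press_with_pauses(element, key, presses=2, pause=5):
    """Send the same key several times, sleeping after each press so the page can react."""
    for _ in range(presses):
        element.send_keys(key)
        time.sleep(pause)  # let the search results (or the next page) load
```

With Selenium this would be called as press_with_pauses(searchbox, Keys.ENTER, presses=2, pause=5). The right number of presses and the pause length are guesses that depend on your connection speed.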
Incredibly great explanations 🔥🔥. loved the video.
This tutorial is so sweet like you. Thank you so much Mariya ❤️
Thank you so much Saqib! 😀
I have a new Selenium tutorial premiering in 35 minutes:
ua-cam.com/video/TXdgMkf9gP0/v-deo.html
We're expanding the LinkedIn messaging bot to seem much more human than it should be, I highly recommend checking it out! 😁
Waao.Very helpful 🙏🙏❤️More videos like these please 😊
All boys' fav teacher!
99.9% of programmers are boys anyways! 😃
@@PythonSimplified so glad to hear back from you. I have a little query regarding scraping. And its getting horrendous.
I want to scrape some details from the *info* section of the profiles in a group.
Is it doable?
Your response will be like a lifesaver
@@smalirizvi8026 the rule of thumb is - anything you can target with your mouse and keyboard - you can target with Selenium! 😃
Some platforms have more blockers than others, I suggest watching my Linkedin Bot tutorial where I show how to handle 3 very annoying blockers such as "click intercept" and hidden elements:
ua-cam.com/video/XdFUpFUDt88/v-deo.html
If you run into specific errors let me know, but as long as you're working closely with the Developer Tools and allow enough time for the elements to load with time.sleep() - the sky is the limit! 😁
@@PythonSimplified I watched your vid. It was far above my knowledge and expertise.
Actually I am trying to extract every profile's information in some groups on *Facebook* for my Machine Learning work.
So far, I think that I should watch your video of applying selenium on Facebook. And I will convert python code with java whenever I face some blockers there as well as use time.sleep().
Do u think that this should be fine?
You have literally earned my sub!
💎💎
Thank you so much for replying and taking notice of my problem.
Looking forward to you as I am pretty much done with my attempts with it
You explain each step very clearly, thanks for your effort
You are amazing in every way! Thank you for this useful tutorial.
loved your style of teaching and Accent.
Great explanation mariya👍🏻
What I mostly learned is a very good workflow to get info and use it. tnx :D
You explained Selenium very clearly. Can you also make a video on how to avoid being detected as a bot? I read many posts on Stack Overflow, but Selenium still got detected as a bot, even on the first page load.
simply amazing, big thanks. love you
Thank you so much my friend! 😁
Thanks a lot for all these gifts from you.
You're welcome, enjoy! 😀
u dont need to scrap data u actually scrapped my heart
I particularly like the background music. Great tutorial!
Very good tutorial. You could use enumerate in the for loop to avoid the manual counter assignment:
for counter, image in enumerate(images):
    save_as = os.path.join(path, keyword[1:] + str(counter) + '.jpg')
    wget.download(image, save_as)
She is the reason why i programm in Python
you are amazing, best python tutor ever :)
i want to be your student 😆 you are the best teacher i have ever seen
Great video. It'd be cool to see one on Insta scraping using GET requests instead of Selenium, it's much faster. There's a good article on Diggernaut about it. Anyway, thanks, keep em coming!
Challenge accepted!! 😎
Get requests would be the next module I'll cover in the scraping lessons!
Thank you Chris! :)
@@PythonSimplified Love to see how you go with it! I've been stuck on it for the last 12 hours :'( I'm getting different static HTML returned from requests.get() than shown in the Chrome Dev Tools. Great channel btw, looking forward to more content!
also, you should make a Discord, it's a great way to consolidate a community and seems like you're building one quickly.
@@chris_burrows Thank you for suggesting! I'll check it out :)
I'll start advertising properly sometime in the near future. For now I take it easy, trying to focus on improving my filming/editing abilities before I go down that road :D
You are amazing)) really grateful for double enter tip
Hi Anna, there's actually a better way than double enter! Check out my community post where I included the improved code, it's better to concatenate the url to search for your term 😉
And thank you so much! 😃
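As a rough sketch of the "concatenate the url" idea: the hashtag page address can be built from the keyword and opened directly, skipping the search box altogether. The /explore/tags/ pattern and the helper name hashtag_url are assumptions for illustration; the community post may do it differently.

```python
from urllib.parse import quote


def hashtag_url(keyword, base="https://www.instagram.com/explore/tags/"):
    """Build the address of a hashtag page from a keyword such as '#cat'."""
    tag = keyword.strip().lstrip("#")   # drop surrounding spaces and the leading '#'
    return f"{base}{quote(tag)}/"
```

It would then be used as driver.get(hashtag_url("#cat")), with no ENTER presses involved.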
@@PythonSimplified woohoo, thank you so much
Nice webscraping methods
Great work
Thank you! 😁
It's also available on Medium with a few improvements:
medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885
I'm also currently working on a website, where I'll post even a more updated version of the Medium article, where we'll be able to scrape the full-size images rather than thumbnails, and tackle more issues with the ENTER button 😀
Stay tuned!
😀ok
you are the best Maria Sha .LOL
When I see and listen this girl I'm fell the happiness...
Great tutorial! as always. Entertaining, useful, and a pretty teacher, as well .-) Keep up the good work...
gurl this saved my life
When I landed on this page, I first thought I'd made a mistake... Smart girls are wonderful!
Thank you, Simurg! 😃
And I'm just thrilled, how does everyone know that I speak Russian??
I guess our people recognize each other everywhere! 🤣
@@PythonSimplified Yes! It's magic xD
Wish I had known about this tool earlier, sounds very useful
Great tutorial, I always learn something new, thanks for sharing
You are so glamourous and after that the way you teach.
Wow impressed by this tutorial. Lots of respect from Pakistan 👌
Thank you Sualeh, I'm glad you liked it! Greetings from Canada! 😁😁😁
AttributeError: 'WebDriver' object has no attribute 'find_elements_by_tag_name'. I'm getting this error, what's the solution?
Selenium removed the methods in the newer versions, you can try using an older version
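An alternative to downgrading, sketched under the assumption that the goal is still to collect image addresses: newer Selenium 4 releases dropped the find_elements_by_* helpers in favor of find_elements with a By locator. The function name image_sources is hypothetical, and the fallback constant is only there so the snippet also runs where Selenium is not installed.

```python
try:
    from selenium.webdriver.common.by import By
    TAG_NAME = By.TAG_NAME
except ImportError:
    TAG_NAME = "tag name"  # the string value behind By.TAG_NAME


def image_sources(driver):
    """Collect the src attribute of every <img> element on the current page."""
    images = driver.find_elements(TAG_NAME, "img")   # replaces find_elements_by_tag_name("img")
    return [img.get_attribute("src") for img in images]
```

In a script that already imports By, the call is simply driver.find_elements(By.TAG_NAME, "img").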
This video is great thanks but i haven't been able to do the first step of opening chrome and Instagram. Maria, can you please do a video explaining what you do before, with Selenium. The parts of importing etc.... I downloaded selenium but still get a traceback.
Hi Andres! :)
I find that "pip install" doesn't always work, so I install everything with "conda install" when possible. If you are also using Anaconda, try:
"conda install -c conda-forge selenium"
If you are using another command line software let me know what it is and what kind of traceback are you getting and we can go from there 😉
@@PythonSimplified Thanks for your help! I am still struggling. I am using Selenium and here I detailed everything and still no solution www.reddit.com/r/learnpython/comments/kfn0wm/a_unique_no_module_named_selenium_problem_its_not/
Crystal clear and good pedagogy !
The most beautiful women in UA-cam who teaches python.
Can't concentrate 😂
I thought the same :D
Hello, I really like your work. I just started doing some web scraping and your tutorial was a great help to me. You are organized, and your explanation is perfectly clear and easy to follow. There's just one thing I noticed and already worked around, but I want to know if there is another way.
The problem is that the search box doesn't appear if the Chrome window is large, so we just get the sidebar with the search icon, which needs to be clicked to open the search box. I tried to figure out how to do that but couldn't.
So I just set a window size (driver.set_window_size(740,500)) to make sure the search box appears automatically.
If you know how to fix it the normal way, that would be nice. Thank you
Thank you very much, this tutorial was very helpful and very very easy to understand, Cheers!! 🚀
Thank you girl for this excellent content! U get more one subscriber👋
Thank you so much Bruno, welcome aboard! 😀
Thank you for the video. Very clear and straightforward. I'm having trouble with the searchbox.send_keys(Keys.ENTER) command. I tried pressing ENTER twice but it still doesn't work. Any suggestions?
Thank you so much Luisgui 😁
Try the code I've just posted on my Github, you can solve it with time.sleep(seconds), it's in the "search keywords" section:
github.com/MariyaSha/WebscrapingInstagram/blob/main/WebscrapingInstagram_completeNotebook.ipynb
I actually just finished working on a Medium article about this where I explain everything in detail, I'm just waiting for my publication to approve it and then I'll send you a link 😉
you can make web-scrapping look and sound fun
Love the tut, you're a smart lady, no need to use gimmicks for clicks.
I agree but if that's what she likes and It will give her subs I dont see why not.
Great channel, great tutorial. New sub.
Thank you so much! 😀
Cool very cool, You earned My Subscribe. Keep up the good work!
Thank you so much Eyosiyas, welcome aboard! 😁
Doesn't insta have some protocol to deal with web scrapers?
Web scraping is perfectly legal, and many companies are actually using it to promote their business. With that said, each major website/webservice has a certain layer of protection from bots but it's not that difficult to outsmart it, even when captchas come into play.
I personally find that Facebook has far more layers of protection than Instagram, and I suspect it can also dynamically adjust its code to prevent you from scraping time and again using the same script. For example, my code for scraping all the images from a certain user's account became obsolete after 2 days, during which I'd scraped 650+ images several times in a row. I had to turn the code into something more general: rather than targeting a very specific single element with "Xpath" & "starts-with", I had to fetch all the elements of the same kind on the page and only then narrow down the list to the single element I wanted to target. So I know first hand that Facebook is reacting to scraping and it's reacting quite fast.
With Instagram however, the same code that worked for me 3 months ago is still relevant and works like a charm (please check out the article on Medium to see the upgraded version of the code I presented in the video):
medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885
I don't think you should be worried about scraping Instagram, that's for sure 😉
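The "fetch everything of the same kind, then narrow it down" approach mentioned above could look roughly like this. It is only a sketch: narrow_down is a made-up helper, and the filter shown in the usage note is a placeholder rather than the check used in the actual Facebook script.

```python
def narrow_down(elements, attribute, keep):
    """Return only the elements whose attribute value passes the keep() check."""
    matches = []
    for element in elements:
        value = element.get_attribute(attribute)
        if value is not None and keep(value):
            matches.append(element)
    return matches
```

For example, narrow_down(driver.find_elements(By.TAG_NAME, "img"), "src", lambda src: "profile" in src) would keep only the images whose address contains the (hypothetical) word "profile". Because nothing depends on a generated class name or an exact XPath, small layout changes are less likely to break it.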
@@PythonSimplified wow more of a reply than I expected, thanks a million!
Sending an " Amazing" from Brazil here. Amazing.
Thank you so much Marcos!! Greetings from Canada! 😁
Great video. Thank you so much. Mariysha you are a very good teacher.
I hope that you can learn the correct accent on the specific word ( `at-tri-bute ), emphasis on the first syllable, because it is so often when speaking about Python and objects. You routinely mistakenly swap or mispronounce the verb attribute ( at-`trib-ute ) and the noun attribute ( `at-tri-bute ). I understand that you are not a native English speaker, and please try to take note of this and improve the accent on this word.
In Python, a class or object `at-tri-bute is a noun: any value bound to an object, including methods (bound functions).
Great video. Keep cranking them out.
How can we extract the trending images with links of the instagram post with its trending score of it.
Thanks for this info. I am able to download only 58 images at one go, is it possible to download all images around 2000 with this at one go ? Can you please advise.
Hi Pranshu! yes, you are able to download as many images as you need! 😀Just implement the scroll event inside a loop, just like this (scrolling 10 times to the bottom of the batch):
for j in range(0, 10):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)  # wait 5 seconds
The more you loop, the more images you scrape!
I have a video premiering soon where I explain it in detail, check out "scroll to the bottom of the page" in the timestamps if you miss it 😉
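One possible variation on the scroll loop above, as a sketch rather than the code from the video: it stops early once the page height no longer grows, so the loop does not keep waiting after the last batch has loaded. The name scroll_to_bottom and its parameters are assumptions.

```python
import time


def scroll_to_bottom(driver, max_scrolls=10, wait_seconds=5, sleep=time.sleep):
    """Scroll down up to max_scrolls times; stop early once the page stops growing."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    scrolls = 0
    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(wait_seconds)   # wait for the next batch of posts to load
        scrolls += 1
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break             # nothing new was loaded
        last_height = new_height
    return scrolls
```

The return value is the number of scrolls actually performed. For around 2000 images, max_scrolls would need to be much larger than 10, and the image addresses should probably be collected after every scroll in case the page unloads older posts.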
Best video seen about web scraping and automation. But i really couldn't figure out this jupiter thing. What is jupiter?????
Jupyter Notebook is a very handy interface, where we can process our code cell by cell. It is very similar to Google Colab, and we can access it directly from our Anaconda terminal with a very basic command - "jupyter notebook".
If you wanna find out more about this interface, please check out one of my first tutorials, where I explain all about it in detail:
ua-cam.com/video/jp_3NOKHn9c/v-deo.html
So generally, when you're using a traditional IDE, it runs all your code at once, while notebooks like Jupyter allow you to separate your program into sections and run each of these sections independently. I find it to be very handy when teaching/learning, and I highly recommend to give it a try 😉
And thank you!! 😁😁😁
(sorry, I should have started from this but I got carried away with the explanation hahahaha)
Great tutorial!
You are amazing!!
Great video. it's possible to get all the comments of a post??
Very good. Congratulations!
Thank you David! 😁
Very helpful ! And you are a very good teacher :-)
Thank you! :)
Hello Mariya. Congratulation for video, I liked very much. A question: why you used .clear()?
Thank you Leonardo! 😀
I used my_input.clear() to make sure the input is empty before we type anything in, it's not a necessary step with Instagram but it's very handy when sending your keys to an input which already has text in it 😊
It will remove the existing text and only include the string of your following send_keys command
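To illustrate why the order matters, here is a tiny sketch that works with any element offering clear and send_keys, as Selenium input elements do. The helper name replace_text is made up for this example.

```python
def replace_text(element, text):
    """Empty the input first so that only the new text ends up in it."""
    element.clear()           # remove whatever the field already contains
    element.send_keys(text)   # then type the new value
```

Without the clear() call, send_keys appends to the existing content, so a field already holding "old" would end up holding "old#cat" instead of "#cat".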
@@PythonSimplified Tanks Mariya.
You are Awesome!! Greetings from Cuba
Wow, thank you so much Yosdany!! Greetings from Canada! 😀😀😀
Can I use this to scrape Instagram engagement rates for a profile?
Absolutely! 😀
With Python and Selenium - the sky is the limit! You can potentially even build an online service which tracks engagement across many different profiles and make a good profit out of it 😉
With Selenium you can collect any piece of data, manipulate it and evaluate it. Pandas, Numpy and Matplotlib libraries would also be very handy for such a task.
Good luck, and let me know if you come up with something cool! 😊
Madam, how do I click on a specific user's inbox conversation by its text (the username) on Instagram?
Try targeting the anchor element with the href of "/direct/inbox/":
inbox_button = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[href='/direct/inbox/']"))).click()
This would press the "inbox" button on your individual account, let me know if I got you right :)
@@PythonSimplified thank you so much mam your really so friendly replying stranger with out expecting anything I am amazed mam even my teachers won't repsond like you
@@PythonSimplified
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
driver = webdriver.Chrome('C:/Users/91728/Desktop/python files/chromedriver')
driver.get("https://www.instagram.com")
username = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='username']")))
password = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='password']")))
username.send_keys("#username")
password.send_keys("#password")
button = WebDriverWait(driver, 2).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[type='submit']"))).click()
not_now = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]'))).click()
not_now_again = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]'))).click()
inbox_open = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class='xWeGp']"))).click()
recent_message_open = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class='-qQT3 rOtsg']"))).click()
message_box = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//textarea[@placeholder='Message...']")))
message_box.clear()
time.sleep(5)
message_box.send_keys("hi")
message_box.send_keys(Keys.ENTER)
#instead of recent_message_open I want to select the element using Username by adding an input field at first .mam please Kindly contact me on +917285906544 mam this is my whatsapp or I messaged you on instagram mam Please rey me I'll show you the image.
@@harsha4074 You cannot rely on class='-qQT3 rOtsg' in a CSS selector: it is a temporary, auto-generated class, so you need something a bit more constant to rely on.
Try the 'href' attribute of the anchor instead.
So everywhere you've targeted the 'class' property, you'll need to change it to a different property that doesn't have any weird letters in it.
If it still doesn't fix it - try sending me the error code you got.
WebDriverWait(driver, 10)
What do we mean by 10. And is it constant or subject to our choice? How to decide it.
Hi Vijaykumar! 😃 to my understanding, the value 10 in the above example represents the timeout value (in seconds). Meaning - if this element was not detected within 10 seconds, you'll get a TimeoutException.
You can definitely adjust it to any value you'd like! Selenium documentation keeps it at 10, but I've seen other examples with 5 and 15 seconds, it all depends on how long it takes the given page to load 😊
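To make the role of the timeout more concrete, here is a simplified imitation of the polling idea behind WebDriverWait. It is not Selenium's actual implementation: wait_until is a made-up name, and it raises Python's built-in TimeoutError instead of Selenium's TimeoutException. The 0.5 second poll interval matches Selenium's default.

```python
import time


def wait_until(condition, timeout=10, poll_frequency=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Call condition() repeatedly until it returns something truthy or time runs out."""
    deadline = clock() + timeout
    while True:
        result = condition()
        if result:
            return result     # found it, no need to wait out the full timeout
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(poll_frequency)
```

So the 10 is an upper bound, not a fixed delay: if the element shows up after 1 second, the wait ends after about 1 second. That makes a generous timeout cheap compared to a fixed time.sleep(10).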
thank you for this great video
i need scraping for e-commerce websites with selenium
LoL
You're welcome, I hope it helps you with your project! 😀
@@PythonSimplified juste for training LoL 😀
Thanks for the video ... super nice ... i am struggling scraping comments from instagram using selenium ... any video on that ?
Hey Mariyasha! First of all, this was an amazing tutorial. I was trying to do some scraping with the posts that open up in the explorer page in the Instagram desktop website. It seems that the elements for these posts are created in the DOM when they open after you click on them and are destroyed from the DOM when you close them. Hence, it is difficult to locate these elements and I've been getting the "No Such Element: Unable to locate element" exception. Is there a workaround for this situation in Selenium?
Stale element exception
This is such a great video!
Thank you so much Roshan, glad you liked it! :D
@@PythonSimplified Would it be possible for you to make a video on image processing using the images that we scraped in this video?
@@roshanshetty5661 Did you check out my video on image processing with Pillow?
ua-cam.com/video/NSHsaG2a4WU/v-deo.html
You can use the same principles and apply them to the scraped images if you're looking for general processing and simple transformations.
If you're looking to classify images with Artificial Intelligence, the video I've sent you above is not gonna help 🤣
I'm working on some more Machine Learning projects, where image classification will be a very important part (that's why we need this cats/dogs database in the first place)...
In the meanwhile, you can check out my Flower Image Classifier on Github:
github.com/MariyaSha/FlowerImageClassifier
This might give you a good example of the pre-processing we do to get the data ready for training.
Either way, I hope it helps! 😁
Thanks, Ma'am for this... Helps too much
Too much is my favourite quantity! 😊 Thank you, V!
Really great explanation. 👏
I'm late with this suggestion, but if you come out with a "bot to buy ps5" webscrape video right now, it could be huge!
These days, a bot like that is equivalent to winning the lottery!!! 🤣
I would probably hold off with sharing the code until at least I get my own PS5 hahaha (I'm dying to play Cyberpunk!!! and it's not the same game on PS4, it's more buggy than Goat Simulator!!)
But from what I understand though, the stock runs out in-store before the websites get to update it, so your best bet is to know a guy in Best Buy and get him to put it aside for you when the stock arrives 😎
These bot people made it so much harder for regular folks to get new products, I strongly oppose this type of practice, I think it brings more harm than benefit.... but maybe I'm just old-fashioned 😊
@@PythonSimplified nicely put. I hate the scalping thing, however, since you basically need their technology to compete with them, I decided to make my own - just started running best buy, target, and gamestop bots. Will report back if it actually gets through! I just wanted to help a friend who's looking for one, rather than be a scalper grinch. Merry Christmas to you and fellow coders!
Why I didn't have such teacher in my university for c++, I wouldn't skip any lab :)