Scrapy Course - Python Web Scraping for Beginners

  • Published 24 Dec 2024

COMMENTS • 421

  • @NiranjanND
    @NiranjanND 1 year ago +82

    14:45 source venv/bin/activate is for Mac; if you're on Windows, use
    ".\venv\Scripts\activate" in your terminal

    • @sampoulis
      @sampoulis 9 months ago +2

      on Windows you just type in the name of the venv folder, then \Scripts\activate, as long as you are in the project folder.
      Example:
      PS D:\Projects\Scrapy> .venv\Scripts\activate

    • @johnsyborg
      @johnsyborg 9 months ago +2

      wow you are my hero

    • @aneeesthesia
      @aneeesthesia 8 months ago +6

      in case of security issues you might need this too:
      Set-ExecutionPolicy Unrestricted -Scope Process

  • @johnteles1131
    @johnteles1131 6 months ago +10

    I'm on part 8 and I can't thank you enough for this course! The level of knowledge given is UNREAL!!!

  • @VasylPoremchuk
    @VasylPoremchuk 1 year ago +82

    The issue we faced in part 6 was that the values added to the attributes of our `BookItem` instance in the `parse_book_page` method were being passed as `tuples` instead of `strings`. Removing commas at the end of the values should resolve this issue. Once we fix this problem, everything should work perfectly without needing to modify the `process_item` method.
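
    A minimal sketch of the bug being described, in plain Python (not the course's exact code): a trailing comma after an assignment turns the value into a one-element tuple.

    # Trailing-comma bug in miniature:
    title = "A Light in the Attic",   # the comma makes this ('A Light in the Attic',)
    print(type(title))                # <class 'tuple'>

    title = "A Light in the Attic"    # no trailing comma: a plain string
    print(type(title))                # <class 'str'>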

  • @ruzhanislam9191
    @ruzhanislam9191 6 months ago +18

    FYI for those who want to scrape dynamic websites: dynamic websites need Selenium, which is not included in this course.
    But no cap, this is a great course.

    • @isaacmontero6577
      @isaacmontero6577 4 months ago

      No 🧢

    • @jamo6857
      @jamo6857 4 months ago

      Is it hard to add Selenium into the web scraping project from this video? Not too sure if that is a dumb question or not, still learning.

    • @codetohack8547
      @codetohack8547 3 months ago

      @@jamo6857 Same question, did you get the answer?

    • @destinyovirih7593
      @destinyovirih7593 2 months ago

      So have you found out how to scrape dynamic websites?

    • @ruzhanislam9191
      @ruzhanislam9191 2 months ago

      @@jamo6857 not sure... but for me, I could not use the Selenium driver on my PC.

  • @v_iancu
    @v_iancu 7 months ago +6

    at 52:00 you don't need to check for "catalogue"; you can just follow the URL in the tag, and it gives me 1000 items

  • @TriinTamburiin
    @TriinTamburiin 1 year ago +35

    Note for Windows users:
    To activate virtual env, type venv\Scripts\activate

    • @gilangdeatama4436
      @gilangdeatama4436 1 year ago +1

      very useful for Windows users :)

    • @entrprnrtim
      @entrprnrtim 1 year ago +1

      Didn't work for me. Can't seem to get it to activate

    • @jawadlamin4047
      @jawadlamin4047 1 year ago

      @@entrprnrtim in the terminal, switch from PowerShell to cmd

    • @KrishanuDebnath-vv9cs
      @KrishanuDebnath-vv9cs 10 months ago +1

      The actual one is .\virtualenv\Scripts\Activate

    • @Sasuke-px5km
      @Sasuke-px5km 10 months ago

      venv/Scripts/Activate.ps1

  • @greypeng3486
    @greypeng3486 4 months ago +2

    I am a Python newbie without any experience in coding. With the help of this guide I am able to write a spider and fully understand the architecture. Really helpful 👍👍👍 They also have other guides to help you polish and fine-tune your spider. Highly recommended!

  • @omyeole7221
    @omyeole7221 8 months ago +8

    This is the first coding course I followed through to the end. Nicely taught. Keep it up.

  • @falcongold2024
    @falcongold2024 1 year ago +15

    13:37 creating venv
    17:45 create scrapy project
    29:31 create spider
    33:38 shell

  • @pkavenger9990
    @pkavenger9990 1 year ago +6

    1:34:58 instead of using a lot of if statements, use a mapping.
    For example:

    # saving the rating of the book as an integer
    ratings = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}
    rating = adapter.get("rating")
    if rating:
        adapter["rating"] = ratings[rating]

    This is not only faster but it also looks cleaner.

  • @leolion516
    @leolion516 1 year ago +39

    Amazing tutorial. I've only gone through half of it, and I can say it's really easy to follow along and it does work! Thanks a lot!

  • @flanderstruck3751
    @flanderstruck3751 1 year ago +26

    Thank you for the time you've put into this tutorial. That being said, you should make clear that the setup is different on Windows than on Mac. No bin folder, for example.

  • @hxxzxtf
    @hxxzxtf 8 months ago +2

    🎯 Key Takeaways for quick navigation:
    00:00 *Scrapy Beginners Course*
    01:51 *Scrapy: Open Source Framework*
    03:12 *Scrapy vs. Python Requests*
    04:24 *Scrapy Benefits & Features*
    05:21 *Course Delivery & Resources*
    06:18 *Course Outline Overview*
    08:20 *Setting Up Python Environment*
    16:38 *Creating Scrapy Project*
    20:05 *Overview of Scrapy Files*
    26:07 *Understanding Settings & Middleware*
    27:13 *Settings and pipelines*
    28:22 *Creating Scrapy spider*
    30:24 *Understanding basic spider structure*
    33:32 *Installing IPython for Scrapy shell*
    34:27 *Using Scrapy shell for testing*
    36:35 *Extracting data using CSS selectors*
    38:23 *Extracting book title*
    39:43 *Extracting book price*
    40:49 *Extracting book URL*
    41:18 *Practice using CSS selectors*
    42:02 *Looping through book list*
    43:15 *Running Scrapy spider*
    47:29 *Handling pagination*
    53:52 *Debugging and troubleshooting*
    56:12 *Moving to detailed data extraction*
    Update Next Page
    Define Callback Function
    Start Fleshing Out
    Data cleaning process: Remove currency signs, convert prices, format strings, validate data.
    Standardization of data: Remove encoding, format category names, trim whitespace.
    Pipeline processing: Strip whitespace, convert uppercase to lowercase, clean price data, handle availability.
    Converting data types: Convert reviews and star ratings to integers.
    Importance of data refinement: Iterative process of refining data and pipeline adjustments.
    Saving data to different formats: CSV, JSON, and database (MySQL).
    Different methods of saving data: Command line, feed settings, and custom settings.
    Setting up MySQL database: Installation, creating a database, installing MySQL connector.
    Setting up pipeline for MySQL: Initialize connection and cursor, create table if not exists.
    01:56:31 *Create MySQL table*
    02:04:42 *Understand user agents*
    02:13:03 *Implement user agents*
    02:25:01 *Scrapy API request*
    02:26:11 *Fake user agents*
    02:27:20 *Middleware setup*
    02:33:00 *Robots.txt considerations*
    02:40:19 *Proxies introduction*
    02:42:34 *Proxy lists overview*
    02:52:17 *Proxy ports alternative*
    02:52:32 *Proxy provider benefits*
    02:53:12 *Smartproxy overview*
    02:54:44 *Residential vs. Datacenter proxies*
    02:55:27 *Smartproxy signup process*
    02:56:19 *Configuring Smartproxy settings*
    02:58:07 *Adjusting spider settings*
    03:00:23 *Creating a custom middleware*
    03:01:21 *Setting up middleware parameters*
    03:03:02 *Fixing domain allowance*
    03:04:17 *Successful proxy usage confirmation*
    03:05:00 *Introduction to proxy API endpoints*
    03:06:29 *Obtaining API key for proxy API*
    03:07:54 *Implementing proxy API usage*
    03:10:36 *Ensuring proper function of proxy middleware*
    03:12:10 *Simplifying proxy integration with SDK*
    03:13:25 *Configuring SDK settings*
    03:14:47 *Testing SDK integration*
    03:17:56 *Upcoming sections on deployment and scheduling*
    03:21:22 *Scrapyd: Free, configuration required.*
    03:21:35 *ScrapeOps: UI interface, monitoring, scheduling.*
    03:22:02 *Scrapy Cloud: Paid, easy setup, no server needed.*
    03:49:42 *Dashboard configuration guide.*
    03:51:21 *Set up ScrapeOps account.*
    03:52:48 *Install monitoring extension.*
    03:55:24 *Server setup instructions.*
    04:00:51 *Job status and stats.*
    04:01:47 *Analyzing stats for optimization.*
    04:02:42 *Integration with ScrapeOps.*
    04:18:05 *Scheduler Tab Options*
    04:19:14 *Job Comparisons Dashboard*
    04:20:15 *Scrapy Cloud Introduction*
    04:21:36 *Scrapy Cloud Features*
    04:22:20 *Scrapy Cloud Setup*
    04:25:33 *Cloud Job Management*
    04:28:57 *Scrapy Cloud Summary*
    Made with HARPA AI

  • @ialh1
    @ialh1 7 months ago +3

    Thanks!😀

  • @SpiritualItachi
    @SpiritualItachi 11 months ago +4

    For PART 8 if anybody is having trouble with the new headers not being printed to the terminal, make sure in your settings.py file that you enable the "ScrapeOpsFakeUserAgentMiddleware" in the DOWNLOADER_MIDDLEWARES and not the SPIDER_MIDDLEWARES.
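
    A minimal sketch of the relevant settings.py entry (the module path and priority are assumptions based on the course's bookscraper project):

    # settings.py -- the middleware must go under DOWNLOADER_MIDDLEWARES,
    # not SPIDER_MIDDLEWARES, or its process_request will never be called:
    DOWNLOADER_MIDDLEWARES = {
        "bookscraper.middlewares.ScrapeOpsFakeUserAgentMiddleware": 400,
    }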

    • @jonwinder6622
      @jonwinder6622 10 months ago +2

      He explained that in the video.

    • @SpiritualItachi
      @SpiritualItachi 10 months ago +1

      @@jonwinder6622 Yeah after going through it again I realized I missed that detail..

    • @jonwinder6622
      @jonwinder6622 10 months ago +2

      @@SpiritualItachi I don't blame you, it's so easy to overlook since he literally goes through so much lol

  • @lemastertech
    @lemastertech 1 year ago +12

    Thanks for another great video FreeCodeCamp! This is something I've wanted to spend more time on for a long time with python!!

  • @kaanenginsoy562
    @kaanenginsoy562 1 year ago +7

    for Windows users: if you get an error, first type Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy Unrestricted -Force
    and after that type venv\Scripts\activate

  • @seaondao
    @seaondao 11 months ago +5

    This is so cool! I was able to follow until Part 6, but from Part 7 I couldn't, so I will come back in the future after I have basic knowledge of MySQL and databases. (Note to self.)

  • @jonwinder1861
    @jonwinder1861 10 months ago +2

    I wasted 30 bucks on Udemy courses and they are not nearly as good as this tutorial, thanks man

  • @nickeldan
    @nickeldan 5 months ago +2

    When selecting a random user agent from your list, you can do random.choice(self.user_agents_list).
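
    A minimal sketch of that simplification (the middleware class and the hard-coded list are assumptions, not the course's exact code):

    # Pick a random user agent with one call instead of manual randint/indexing:
    import random

    class RandomUserAgentMiddleware:
        def __init__(self):
            # stand-in list; in the course the list is fetched from an API
            self.user_agents_list = [
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
            ]

        def process_request(self, request, spider):
            # random.choice draws one element uniformly at random
            request.headers["User-Agent"] = random.choice(self.user_agents_list)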

  • @aladinmovies
    @aladinmovies 8 months ago

    Thanks Joe Kearney! Nice course, of course. You are a good teacher, love it.

  • @jackytsui422
    @jackytsui422 1 year ago +3

    I just finished part 7 and want to say thanks for the great tutorial!!

  • @terraflops
    @terraflops 1 year ago +8

    this tutorial really needed the code aspect to help make sense of what is going on and fix errors. thanks

  • @Felipe-ib9cx
    @Felipe-ib9cx 1 year ago +1

    I'm starting this course now and I'm very excited! Thanks for the effort of teaching it.

  • @codewithguillaume
    @codewithguillaume 1 year ago +1

    Thanks for this crazy course !!!

  • @ThanhNguyen-rz4tf
    @ThanhNguyen-rz4tf 1 year ago +2

    This is gold for beginners like me. Tks.

  • @Autoscraping
    @Autoscraping 11 months ago +1

    A wonderful video that we've used as a reference for our recent additions. Your sharing is highly appreciated!

  • @Ka-kz3he
    @Ka-kz3he 1 year ago +1

    Part 4, 54:07
    if you're wondering why 'item_scraped_count' is still only 40: the href is probably already a full URL, so don't prepend the domain to it again
    teach yourself to improvise💪
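
    A minimal sketch of a pagination callback along those lines (a sketch assuming the course's books.toscrape.com spider; the selector is an assumption). response.urljoin resolves relative hrefs and leaves absolute ones untouched, so the domain is never duplicated:

    import scrapy

    class BookSpider(scrapy.Spider):
        name = "bookspider"
        start_urls = ["https://books.toscrape.com/"]

        def parse(self, response):
            # ... yield book items here ...
            next_page = response.css("li.next a ::attr(href)").get()
            if next_page is not None:
                # safe whether the href is relative or already a full URL
                yield scrapy.Request(response.urljoin(next_page), callback=self.parse)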

  • @Faybmi
    @Faybmi 1 year ago +2

    if you have problems at 1:17:32 running >>> scrapy crawl bookspider -o bookdata.csv
    try writing >>> scrapy crawl bookspider -o bookdata.csv -t csv instead (the spider name is still required)

  • @utsavaggrawal2697
    @utsavaggrawal2697 1 year ago +7

    make a course on blocking the crypto spammers
    btw thanks for the Scrapy course, I was searching for this for a while😃

  • @renbash2137
    @renbash2137 6 months ago +1

    such a complete course...

  • @yooujiin
    @yooujiin 1 year ago +4

    the course I needed months ago 😭

    • @_Clipper_
      @_Clipper_ 1 year ago +1

      did you try some other course?

    • @yooujiin
      @yooujiin 1 year ago +1

      @@_Clipper_ bought two Udemy courses. the tutorials on YouTube are limited. so is this one.

    • @_Clipper_
      @_Clipper_ 1 year ago

      @@yooujiin are you in data science? I need some recommendations for ML and web scraping. I tried Jose Portilla's course and it wasn't very in-depth, so I refunded it. Please recommend only if you are in the same field or have been suggested the same by someone you know in DS/AI/ML.

    • @yooujiin
      @yooujiin 1 year ago +1

      @@_Clipper_ I'm currently doing my master's in software development. I would love some recommendations myself. I recommend the Scrapy course by Ahmed Rafik.

  • @akademiker5788
    @akademiker5788 1 year ago +2

    44:33 my spider doesn't give back the data from the HTML; it crawls but stops without having selected any data. I rewrote the code multiple times but it doesn't change.
    *just solved it: I had to save the bookspider code first

  • @johnnygoffla
    @johnnygoffla 1 year ago +3

    Thank you so much for providing this content for free. It's truly incredible that anyone with an internet connection can get a free coding education, and it's all thanks to people like you!

  • @WanKy182
    @WanKy182 1 year ago +1

    1:24:48 don't forget to remove the commas after book_item['url'] = response.url and all the others when we add the BookItem import, because otherwise some values end up as tuples instead of strings

    • @minhazulislam683
      @minhazulislam683 1 year ago

      Please help me, I got 2 errors from this line: from bookscraper.items import BookItem (errors detected in items and BookItem). Has anyone faced the same issue?

  • @gamerneutro3245
    @gamerneutro3245 8 months ago +1

    They need to make a certification option for those who finish the courses. It'd be so interesting.

  • @Code_Play_com
    @Code_Play_com 9 months ago

    Very practical and helpful video with very detailed explanation!

  • @jean-mariecarrara7226
    @jean-mariecarrara7226 1 year ago +2

    Very clear explanation. Many thanks

  • @renantrevisan2406
    @renantrevisan2406 10 months ago +3

    Nice video! Unfortunately part 6 has a lot of code without debugging, so it's really hard to fix errors. Something is going wrong with my code, but I can't identify it.

  • @TradeandoData
    @TradeandoData 1 year ago +2

    thx! very hard to follow; you need solid knowledge of Python

  • @michaelday6987
    @michaelday6987 1 year ago +5

    In case anyone is having a problem activating venv on Windows, use the following command: venv\Scripts\activate

  • @grzegorzkossowski
    @grzegorzkossowski 3 months ago

    02:47:00 - the fun part is that you could... scrape Geonode for IPs 🙂

  • @MortyVerse-2
    @MortyVerse-2 1 year ago

    oh man, he was just showing me how good his code is!!!!!

  • @Rodrigo.Aragon
    @Rodrigo.Aragon 1 year ago +1

    Great content! It helped me a lot to understand some concepts better. 💯

    • @bubblegum8630
      @bubblegum8630 1 year ago

      CAN SOMEONE HELP ME!!!!!???? At part 3 when you create bookscraper, I don't have bookspider.py created for me. What do I do for it to be generated???? I AM CONFUSED

  • @elfincredible9002
    @elfincredible9002 1 year ago +1

    1:49:03 I am getting a blank JSON file with only [ ] inside the file... the terminal returns something to do with process_item... How does one solve this?
    sorry, I'm new to this.

  • @eduardabramovich1216
    @eduardabramovich1216 1 year ago +1

    I learned the basics of Python and now I want to focus on something to get a job. Is web scraping a skill that can get you a job on its own?

  • @felicytatomaszewska
    @felicytatomaszewska 8 months ago +4

    I watched it twice and I think it can be shortened quite a lot and better organized.

  • @pawelk7302
    @pawelk7302 7 months ago +1

    If you are wondering why 'process_request' does not work in part-8 make sure that you enabled 'downloader middlewares' in settings.py, instead of 'spider middlewares'...

  • @flanderstruck3751
    @flanderstruck3751 1 year ago +6

    Note that I copied the code from the tutorial page for the ScrapeOpsFakeUserAgentMiddleware, and when trying to run it I get the following error: (...) AttributeError: 'dict' object has no attribute 'to_string'.
    SOLUTION: copy the process_request function exactly as it is in the video, not as on the tutorial page.

  • @_REcon_
    @_REcon_ 7 months ago +1

    13:49 you created a folder named 'part-2' without showing us every single detail... please show us everything exactly.

  • @Michael-dx8qz
    @Michael-dx8qz 5 months ago +1

    I think you forgot to remove the comma in parse_book_page, which is why you needed to convert the tuples

  • @rwharrington87
    @rwharrington87 1 year ago +2

    Looking forward to this. A MongoDB/pymongo section would be nice for data storage though!

  • @albint3r532
    @albint3r532 8 months ago +1

    I have a question: do all the changes of agents and proxies, once we implement them in our code, also apply in the shell?

  • @ChristopherFabianMendozaLopez
    @ChristopherFabianMendozaLopez 7 months ago

    This was very helpful, thank you so much for sharing all this knowledge for free!

  • @iammkullah
    @iammkullah 6 months ago +2

    if anyone got an error during the SQL part, keep in mind to comment out the previous feed setting (the one we set to CSV format); that was creating the error for me.
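
    A minimal sketch of that idea in settings.py (the pipeline path, priority, and file name are assumptions, not the course's exact code):

    # Comment out the earlier CSV feed export so it doesn't run alongside the
    # database pipeline:
    # FEEDS = {
    #     "bookdata.csv": {"format": "csv"},
    # }
    ITEM_PIPELINES = {
        "bookscraper.pipelines.SaveToMySQLPipeline": 300,
    }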

  • @anthonyrodz7189
    @anthonyrodz7189 10 months ago +2

    I have an error with Pylance; it shows a warning when I'm importing bookscrap.items. I guess I did something wrong creating the environment.

  • @chaule7528
    @chaule7528 1 year ago +1

    I can't load data from Scrapy into SQL tables like he did at 2:02:01. I got the column names, but the data is empty, and no errors. Anyone know why?

  • @bratadippalai
    @bratadippalai 1 year ago +1

    Exactly what I wanted at this moment, Thank you

  • @ndungukaranja913
    @ndungukaranja913 10 months ago +2

    In part 6, at the start of the process_item function, despite having the exact same code as the tutorial, my value = adapter.get(field_name) returns the exact value and not a tuple, so it was unnecessary to add the index in the following line. Does anyone know why this is happening?

  • @bryancapulong147
    @bryancapulong147 10 months ago +1

    Can scrapy get data from Cloudflare-protected websites? I just want to extract a list of holidays from our country's government websites to automatically store them in a table, but they don't have an API for it.

  • @flanderstruck3751
    @flanderstruck3751 10 months ago +1

    Ok, so you schedule your spiders using ScrapeOps. But how do you consume the product of such scraping? As far as I know it's just being stored on the virtual server. Can you retrieve the results with ScrapeOps?

  • @T-Json
    @T-Json 1 month ago

    Great course and thank you for your efforts! But in part 11, aren't you publishing your private scrapeops-api-keys to the public? Isn't that a little bit dangerous? Or to ask differently, what would be a good way to do this instead?
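
    One common way to handle this (an assumption on my part, not something shown in the video) is to read the key from an environment variable so it never lands in the repository:

    # settings.py -- keep the API key out of version control
    import os

    SCRAPEOPS_API_KEY = os.environ.get("SCRAPEOPS_API_KEY")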

  • @sarfrazjahan8615
    @sarfrazjahan8615 1 year ago +7

    Overall a good video, I learned a lot of things, but I think you should discuss CSS and XPath selectors more thoroughly. I am having problems with them.

  • @benjamunji1
    @benjamunji1 1 year ago +15

    For anyone having errors in Part 8 with the fake headers:
    You need to import this:
    from scrapy.http import Headers
    and then in the process_request function you need to replace the header assignment with this line:
    request.headers = Headers(random_browser_header)
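
    A minimal sketch of where that line lands (the middleware class name and the example header dict are assumptions; the course fetches real browser headers from the ScrapeOps API):

    from scrapy.http import Headers

    class FakeBrowserHeaderMiddleware:
        def process_request(self, request, spider):
            # stand-in dict; in the course this comes from the ScrapeOps API
            random_browser_header = {"User-Agent": "Mozilla/5.0", "Accept": "text/html"}
            # wrapping the plain dict in Headers avoids the
            # "'dict' object has no attribute 'to_string'" error
            request.headers = Headers(random_browser_header)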

  • @MahdiOsali-u4d
    @MahdiOsali-u4d 1 year ago

    I definitely recommend it to everyone 👌👌👌

  • @AhmedKhan-bv6gg
    @AhmedKhan-bv6gg 7 months ago +1

    How can I make my Scrapy spider scrape a specific book on the website? If the user types a title name, they will find info only about that book.

  • @uzeyirktk6732
    @uzeyirktk6732 9 months ago +1

    I don't understand the IPython shell activation part, because we don't have the Scrapy settings file

  • @debott1111
    @debott1111 1 year ago +1

    Thank you, thank you, and once again, thank you!

  • @stuartk716
    @stuartk716 1 year ago +2

    Please help, I keep getting "Crawled 0 pages" and the output files are always empty

  • @olivermusgrove7616
    @olivermusgrove7616 6 months ago

    Great content mate really appreciate it!

  • @priyanshu-y9k
    @priyanshu-y9k 1 year ago +2

    Thanks for such a wonderful web scraping tutorial. Please make a video tutorial on how to download thousands of PDFs from a website and perform PDF scraping with Scrapy. In general, please make a tutorial on PDF scraping as well.

    • @haleygillenwater8971
      @haleygillenwater8971 1 year ago +1

      You don't do the PDF scraping with Scrapy; it isn't designed for parsing PDFs. You can download the PDFs using Scrapy (at least I imagine you can), but you have to use a PDF scraper module in order to parse the contents of the PDF.

  • @lilmentor3
    @lilmentor3 1 year ago +3

    hard to follow if you are on a Windows machine. 15 mins in and I am already lost. There's no bin folder?

    • @cotsrock9914
      @cotsrock9914 1 year ago

      same with me

    • @BT-te9vx
      @BT-te9vx 1 year ago

      on windows you would need to activate the virtual environment by venvName\Scripts\activate where venvName is the name of the virtual environment you created.

    • @BT-te9vx
      @BT-te9vx 1 year ago

      @@cotsrock9914 I know, it's always frustrating when you can't even set up the environment (no pun intended) before getting to the code. On Windows, you would need to use Scripts\activate instead. Let me know if I can help; after the hours/days of frustration that I've had, I can totally understand ;)

    • @madhuraggarwal819
      @madhuraggarwal819 1 year ago +1

      Use the command below to activate:
      .\Scripts\activate

  • @shana4430
    @shana4430 3 months ago

    Can you please explain why we take [1] in li[1] at 1:05:28?

  • @salimtlemcani4122
    @salimtlemcani4122 4 months ago

    best course ever

  • @rahmonpro381
    @rahmonpro381 1 year ago +2

    thanks for the tutorial. I have a question: which is the better choice for scraping websites, Python or Node?

    • @jonwinder1861
      @jonwinder1861 10 months ago

      python

    • @rahmonpro381
      @rahmonpro381 10 months ago

      @@jonwinder1861 I am using Node.js, it's much faster ^^

  • @Alex-kz1rw
    @Alex-kz1rw 1 year ago +1

    Hey 👋🏻 when I use the Scrapy shell and run the view(response) command, I cannot see all of the HTML from the website. I just see the "accept cookies" window; I can accept it in the browser, and after that I have a blank browser. What can I do to fetch the whole HTML code?

  • @Dinesh-BK-24
    @Dinesh-BK-24 4 days ago

    If anyone is running into a "programming error for set" while writing data to psql, simply change the price handling to adapter[price_key] = float(value.replace('£', '')) in process_item. I suppose this would work for the previous issues too.

  • @chetanmaurya8557
    @chetanmaurya8557 6 months ago

    33:15 what to do if fetch(url) is coming back as ['partial']? I think it is not giving me all the HTML elements. Is there any way to handle this?

  • @ram_qr
    @ram_qr 1 year ago

    I don't get the output of all the URLs at 53:07 (only 20 items)

  • @deograsswidambe7803
    @deograsswidambe7803 1 year ago +1

    Amazing tutorial, I've really enjoyed watching and it helped me a lot with my project.

  • @zee_designs
    @zee_designs 1 year ago

    Great tutorial, thanks a lot!

  • @briyarkhodayar5986
    @briyarkhodayar5986 1 year ago +2

    I have a question about part 4: at first you just scraped one page, but later, when we modify the spider to get all the next pages, it still shows me only the first page. I'm not sure what the reason is. Can you help me with that please?
    Thank you

    • @MarwanBahgat
      @MarwanBahgat 8 months ago +1

      I'm facing the same issue, did you find a solution?

  • @DibyanshuPandey-dg5hh
    @DibyanshuPandey-dg5hh 1 year ago +1

    Thanks a lot, freeCodeCamp, for another amazing tutorial ❤️.

  • @JBR7655
    @JBR7655 2 months ago

    In part 2, where you typed the activate command with bin in the path: this is not correct for Windows installations. Issuing the command .\activate.bat worked for me.

  • @CollinChrisComedy
    @CollinChrisComedy 4 months ago +1

    Hi, I'm trying this in VS Studio, and in part 4, after running scrapy crawl bookspider, I'm not getting any results. I even tried going to the guide and copying the exact code, but it's still not yielding any results. Anyone know what the issue is?

  • @ram_qr
    @ram_qr 1 year ago +1

    (1:18) I followed all the instructions but my output includes only title, price and link.

    • @kissaspde
      @kissaspde 9 months ago

      same, did you solve it?

  • @milchreis9726
    @milchreis9726 1 year ago +1

    Thank you very much for the good work! Really appreciate the tutorial.
    I need to point out that the MySQL I installed with the dmg somehow couldn't be used from the terminal, so I ended up reinstalling MySQL using the terminal.

  • @jervintravis3002
    @jervintravis3002 1 year ago +2

    in part 4 I have followed your code word for word, but on my side, instead of getting 1000 items scraped, I am only getting 20. help pls

  • @JYY-b4u
    @JYY-b4u 1 year ago +1

    I want to know how to learn Python Scrapy

  • @AnnaClothes
    @AnnaClothes 3 months ago

    omg, so complicated, but I'll sit through it!

  • @franciscojuarez8085
    @franciscojuarez8085 1 year ago +2

    Is anyone else getting the line 21 error "NoneType object is not subscriptable" even after fixing the code? I can't seem to get around it, not even by deleting the upc line in both the bookspider and items. I don't really know what to do lol

    • @Kofenzz
      @Kofenzz 1 year ago

      I've had this problem too. In my case the problem was that in the spider I had book_item["price"] = response.css("p.price-color ::text").get(), and the correct way was book_item["price"] = response.css("p.price_color ::text").get(), because with the hyphen the price would not return anything.

    • @Mohitsingh-lk7ez
      @Mohitsingh-lk7ez 1 year ago

      me too

    • @Mohitsingh-lk7ez
      @Mohitsingh-lk7ez 1 year ago

      after solving that bug... I'm also getting the same error as you...

  • @milckshakebeans8356
    @milckshakebeans8356 1 year ago

    When you save to the database at 2:02:00: I had the error because the url was a tuple and 'cannot be converted'. If someone has a similar problem, they can just index into the value like this: str(item["description"][0]) (instead of the code provided, which is str(item["description"])) in the execute call in the process_item function.
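
    A minimal sketch of that workaround in plain Python (the item dict is hypothetical, not the course's exact pipeline code):

    # If a field was accidentally stored as a one-element tuple (trailing-comma
    # bug), index into it before converting to a string for the INSERT:
    item = {"description": ("A cozy mystery set in a small town.",)}
    description = str(item["description"][0])
    print(description)  # A cozy mystery set in a small town.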

    • @ibranhr
      @ibranhr 1 year ago

      I’m still having the errors bro

    • @milckshakebeans8356
      @milckshakebeans8356 1 year ago

      @@ibranhr I found the error by looking at what was being processed when the error happened. I saw that it was a tuple and fixed it. Try something similar if you know the error is with converting values.

  • @robertcasey1708
    @robertcasey1708 6 months ago

    is there a way to extract the data in a table when the rows don't always correspond to the same fields? do we have to make some sort of mapping table?

  • @yogeshpatil1586
    @yogeshpatil1586 1 year ago

    54:36 - 18/05
    1:23:16 - 26/05
    1:44:19 - 14/06

  • @negonifas
    @negonifas 1 year ago +3

    that's what i need! 👍👍👍

  • @Alien-cr1zb
    @Alien-cr1zb 1 year ago +1

    Is this course enough to do scraping tasks on freelancing websites?
    If it's not, could anyone mention what I should do after I finish this?

  • @M0hamedElsayed
    @M0hamedElsayed 1 year ago +1

    Thank you very much for this great course. I really learned a lot.
    ❤❤❤

  • @kenjohn-ls8ct
    @kenjohn-ls8ct 1 year ago

    god bless the internet and freecodecamp! thanks !

  • @Khan-At-Large
    @Khan-At-Large 10 months ago +1

    Can we scrape dynamic JavaScript webpages with Scrapy?

  • @ismailgrira7924
    @ismailgrira7924 1 year ago +2

    just in time, thnx tho
    I didn't know what I would do for a project I'm working on till I watched the video
    life saver