Cleaning up 1000 Scraped Products with Polars

COMMENTS • 27

  • @JohnWatsonRooney
    @JohnWatsonRooney  6 months ago +2

    To try everything Brilliant has to offer, free for a full 30 days, visit brilliant.org/JohnWatsonRooney/ . You’ll also get 20% off an annual premium subscription.

  • @JohnSmith-y8o
    @JohnSmith-y8o 6 months ago +3

    I'm parsing my scraped data into Pydantic models. I've never looked into dataframes, pandas, etc. Should I? :D

    • @JohnWatsonRooney
      @JohnWatsonRooney  6 months ago +1

      definitely, it's great for analysis and transforming data

  • @Rice0987
    @Rice0987 6 months ago +2

    I got the following message in VS:
    Missing required CPU features.
    Install the `polars-lts-cpu` package instead of `polars` to run Polars with better compatibility.
    I did, AND it started working!

    • @Rice0987
      @Rice0987 6 months ago

      This is an additional option if you run into this issue.
      If you do, you need to install that second package after installing polars.

    • @historiadeunahora6383
      @historiadeunahora6383 3 months ago +1

      this also helped me, thanks!!
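
For anyone hitting the same error, a minimal sketch for checking which Polars distribution is actually installed in the active environment. The error message quoted above names `polars-lts-cpu` as the package to install in place of `polars` (for example with `pip install polars-lts-cpu`). The helper name `installed_polars_distributions` is made up for this illustration, and it uses only the standard library, so it runs even when Polars itself fails to import.

```python
from importlib import metadata


def installed_polars_distributions(candidates=("polars", "polars-lts-cpu")):
    """Return {distribution name: version} for each candidate that is installed.

    Hypothetical helper for illustration; it only reads package metadata
    and never imports Polars, so the CPU feature check is not triggered.
    """
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    dists = installed_polars_distributions()
    if not dists:
        print("No Polars distribution found in this environment.")
    for name, version in dists.items():
        print(f"{name} {version}")
```

If both names show up, both packages are installed side by side, which matches the setup described in the reply above.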

  • @mishmohd
    @mishmohd 6 months ago

    Cleaning data is so passe, we up here dry cleaning data

  • @Septumsempra8818
    @Septumsempra8818 6 months ago

    Has anyone built an engine to mimic human behavior? How does one move the mouse, scroll, etc. to mimic human behavior? I have a site that requires a form to be filled in to change location, but it triggers a captcha without fail. Does anyone have tips on mimicking human behavior, or on how to type something into a captcha form?

  • @fredde7356
    @fredde7356 6 months ago

    Hey John, can you please continue the scraping livestream with your test site?
    Would love to see how to handle the drop-down menus, JavaScript, and stricter Cloudflare rules.
    Would be happy to hear some news if you plan to continue.

  • @DM-py7pj
    @DM-py7pj 6 months ago +2

    Looks very similar to PySpark.

  • @bakasenpaidesu
    @bakasenpaidesu 6 months ago +2

    .

  • @crissydogg
    @crissydogg 6 months ago +1

    Brilliant as always

  • @prashantbhosale6745
    @prashantbhosale6745 6 months ago

    Hi, can you show us how to extract all the data related to a title/field from a PDF file?

  • @pypypy4228
    @pypypy4228 6 months ago +1

    What are the advantages over Pandas?

    • @JohnWatsonRooney
      @JohnWatsonRooney  6 months ago +4

      It’s going to be faster, but for what I do there aren’t really any. I just wanted to try it out.

  • @einekleineente1
    @einekleineente1 6 months ago

    0:20 you promised to link to the other video where you got the data... now I am sad that the link is not there, and I would never get the idea to look at your channel to see which videos you posted in the last few days ;-)

    • @DM-py7pj
      @DM-py7pj 6 months ago

      Might be the video titled "Website to Dataset in an instant", based on a quick scan of the field names.

  • @gokulyc
    @gokulyc 6 months ago

    Code / repo link?

  • @abdulrahmanharoon3165
    @abdulrahmanharoon3165 6 months ago +1

    Is it faster than pandas?