Use These Data Cleaning Helpers for R from the janitor package

  • Published 27 Nov 2024

COMMENTS • 33

  • @rappa753
    @rappa753  5 months ago +1

    If you enjoyed this video and want to level up your R skills even further, check out my latest video courses:
    📍Data Cleaning Master Class at data-cleaning.albert-rapp.de/
    📍Insightful Data Visualizations for "Uncreative" R Users at arapp.thinkific.com/courses/insightful-data-visualizations-for-uncreative-r-users

  • @siddheshpujari25
    @siddheshpujari25 27 days ago +1

    Thank you for exposing me to this library. Huge time saver. Will start implementing this right away

    • @rappa753
      @rappa753  26 days ago

      Awesome! Happy to hear that :)

  • @TheDataDigest
    @TheDataDigest 6 months ago +5

    Everything on tabyl() is blowing my mind. I used to do group_by() %>% count() %>% pivot_wider() to produce these 2x2 tables, but totals were always a bit tricky and needed manual joins or mutates. This could have saved me so much time at work :) but very cool that I learned it now! Excellent tips!

    • @ambhat3953
      @ambhat3953 5 months ago

      Same here, never knew about tabyl()... and always used group_by, pivot_wider, etc.
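
A minimal sketch of the crosstab-with-totals workflow praised in the comments above, assuming the janitor package is installed and R is version 4.1 or later (for the native pipe). The toy data frame and its column names are made up for illustration and are not the data from the video.

```r
library(janitor)

# Hypothetical toy data, not the data set used in the video
df <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  status = c("yes", "no", "yes", "yes", "no")
)

# Two-way frequency table with row and column totals in one pipeline,
# instead of group_by() + count() + pivot_wider() + manual totals
crosstab <- df |>
  tabyl(group, status) |>
  adorn_totals(where = c("row", "col"))

print(crosstab)
```

adorn_totals() appends a "Total" row and a "Total" column, which is the part that otherwise needs extra joins or mutates.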

  • @AlbertoFCabreraCasillas
    @AlbertoFCabreraCasillas 6 months ago +2

    A most concise presentation on how to unlock the power of janitor and tabyl. It would allow me to stop using Stata to create crosstabs. Many thanks.

    • @rappa753
      @rappa753  6 months ago

      Nice! Glad that you got so much value out of this, Alberto 🥳

  • @CaribouDataScience
    @CaribouDataScience 4 months ago +1

    Thanks, that was very helpful.

    • @rappa753
      @rappa753  4 months ago

      Happy to help out :)

  • @muhammedhadedy4570
    @muhammedhadedy4570 6 months ago

    Amazing as usual. Can't wait to launch your data cleaning course.
    Greetings from Egypt.
    ❤❤❤❤

    • @rappa753
      @rappa753  6 months ago +1

      Thank you 🤗 happy to have you onboard 🥳

  • @cleandata_sk
    @cleandata_sk 6 months ago

    Very nice, up until now I used the janitor package solely to clean names :D
    If I may ask, what editor theme do you use? I like how the indentation is visually highlighted

    • @rappa753
      @rappa753  6 months ago

      Yeah that's my main use case too 😁 I use the rainbow indent option from RStudio
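
For readers who have not seen the name-cleaning use case mentioned above, here is a minimal sketch, assuming the janitor package is installed. The messy column names are hypothetical examples, not taken from the video.

```r
library(janitor)

# Hypothetical data frame with spreadsheet-style column names;
# check.names = FALSE keeps the spaces so there is something to clean
messy <- data.frame(
  "First Name" = c("Ana", "Ben"),
  "Hire Date"  = c("2020-01-01", "2021-06-15"),
  check.names  = FALSE
)

# clean_names() converts the names to snake_case by default
cleaned <- clean_names(messy)
names(cleaned)
```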

  • @Adeyeye_seyison
    @Adeyeye_seyison 5 months ago

    As always,
    Value loaded content!

    • @rappa753
      @rappa753  5 months ago

      Glad that you think so 🤗

  • @Adeyeye_seyison
    @Adeyeye_seyison 3 months ago +1

    Good morning sir Rapp,
    I checked out your blog post for this video. The full code is there, but there is no link to download the qmd file or, above all, the practice Excel data you used.
    Could you share the Excel file so that following along stays in sync and produces the same results?
    Thanks a million sir

    • @rappa753
      @rappa753  3 months ago +1

      Hi there, sorry my bad. I forgot to update the link in the description. The blog post is albert-rapp.de/posts/07_janitor_showcase/07_janitor_showcase and the corresponding Excel file can be found in the mentioned GitHub repo: github.com/sfirke/janitor/blob/main/dirty_data.xlsx

    • @Adeyeye_seyison
      @Adeyeye_seyison 3 months ago +1

      @rappa753
      Thanks a million sir for your content and support... they are loaded with value

  • @taiwankyh
    @taiwankyh 6 months ago

    It really helps, thanks

    • @rappa753
      @rappa753  6 months ago

      Glad to hear that 🤗

  • @WahranRai
    @WahranRai 6 months ago +1

    In all programming languages, arguments/parameters are inside functions/methods.
    Why should I put the data frame / data outside via the pipe?
    Tidyverse and the pipe are only there to satisfy the dictates of Wickham and his team

    • @rappa753
      @rappa753  6 months ago +1

      Try using a Unix command line. The pipe concept has been there for decades. 🤷🏽‍♂️ Also have a look at Python or OOP in general. In terms of syntax, there you also have the object you're changing (e.g. a data frame) outside the method call.

    • @WahranRai
      @WahranRai 6 months ago

      @rappa753 No! Python, MATLAB, Java, JavaScript... are clear and easy to understand, as R was in the beginning (before Wickham):
      ggplot(data = myDataframe, aes...)
      lmHeight = lm(height ~ age, data = ageHeight)

    • @Adeyeye_seyison
      @Adeyeye_seyison 5 months ago

      Just to add to what Albert said: Dr. Hadley and his team didn't invent the pipe operator; they only adopted it from the {magrittr} package.
      Its introduction was so successful that the base R team had no option but to create its own version. Thus we have %>% and |> for R users

    • @Adeyeye_seyison
      @Adeyeye_seyison 5 months ago

      Secondly,
      Why worry and kick over operational or programming semantics rather than efficiency and effectiveness?
      Does the pipe operator make code more understandable? Yes!
      Does it make code cleaner and more maintainable? Yes!
      Does it make code scalable? Yes!
      Don't worry over its "weirdness", but enjoy its "wonders".
      Blessings...

    • @WahranRai
      @WahranRai 5 months ago

      @Adeyeye_seyison I knew that. That is why I said R was easier and simpler before Wickham (in one of his presentations he spoke about magrittr).
      I WROTE more than 1000 scripts without using the pipe and
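
To make the disagreement in this thread concrete, here is a small base R sketch (R 4.1 or later, no extra packages) showing that the native pipe is only a different way of writing the same nested call. The data frame reuses the ageHeight name quoted above, but its values are made up for illustration.

```r
# Hypothetical toy data; only the name ageHeight comes from the thread
ageHeight <- data.frame(
  age    = c(2, 4, 6, 8),
  height = c(86, 102, 115, 128)
)

# Classic style: the data frame is passed inside each call
nested <- head(subset(ageHeight, age > 2), 2)

# Pipe style: x |> f(y) is parsed as f(x, y),
# so the data frame is still the first argument of each function
piped <- ageHeight |>
  subset(age > 2) |>
  head(2)

identical(nested, piped)
```

Both styles call the same functions with the same arguments, so the choice between them is about readability rather than capability.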

  • @SergioUribe
    @SergioUribe 6 months ago +1

    As always, nice and useful videos, but it hurts my eyes to see setwd() in 2024 when here() exists

    • @rappa753
      @rappa753  6 months ago

      Glad that you find the content useful 😊 How come you dislike my combination of setwd and here? 🤔 Always happy to hear about nicer workflows

    • @SergioUribe
      @SergioUribe 6 months ago +1

      @rappa753 wait, wait, I didn't see you have a here inside the setwd... checking... in the meantime: how is this better than only here()?

    • @rappa753
      @rappa753  5 months ago

      AFAIK, here doesn't change the working directory. So without setwd() I'd have to use here() for all of my file paths. At the end of the day this is not much different though.
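
A short sketch of the point made in the last reply, assuming the here package is installed. The "data" subfolder is a hypothetical location; only the file name dirty_data.xlsx comes from the thread.

```r
library(here)

wd_before <- getwd()

# here() only builds a path relative to the detected project root;
# the "data" folder is an assumed location for illustration
xlsx_path <- here("data", "dirty_data.xlsx")

# The working directory is unchanged after calling here()
identical(getwd(), wd_before)

# The combination discussed in the thread would instead be:
# setwd(here())   # relative paths then resolve from the project root
```

With here() alone, every file path has to be wrapped in here(); with setwd(here()) called once, plain relative paths work afterwards. As the reply notes, the two approaches end up in much the same place.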