Tidying Data in R with pivot_longer()

  • Published 2 Dec 2024

COMMENTS • 30

  • @EquitableEquations
    @EquitableEquations  1 year ago

    You can find materials supporting this vid (and others) at github.com/equitable-equations/youtube.

  • @filipenunesvicente7872
    @filipenunesvicente7872 2 years ago +6

    Extremely helpful, it got me out of a pickle concerning a dataframe with multiple names on it! Thanks for the quality content.

  • @ingridtello-lopez1144
    @ingridtello-lopez1144 2 months ago +1

    Wow! Thank you very much. Very helpful and well explained :D

  • @edwardvasquez4288
    @edwardvasquez4288 2 years ago +1

    thank you this was very straight-forward

  • @Kinglium
    @Kinglium 2 years ago +1

    thank you so much for your clear explanation!

  • @cjspear
    @cjspear 1 year ago

    Fantastic video, thank you for your help.

  • @ignvzinho
    @ignvzinho 2 years ago

    Thanks for the simple and precise explanation.

  • @j.knetsch3413
    @j.knetsch3413 8 months ago +1

    Thanks a lot! Good explanation!

  • @ronvave2997
    @ronvave2997 10 months ago

    Thankful for this video. Question: in my data, I've used pivot_longer on columns 3:6, but I also need columns 7:8 of the same dataset pivoted into another variable column and value column. How can I do this in the same code chunk?
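One possible reading of this question can be sketched by chaining two pivot_longer() calls. The data frame and every column name below are hypothetical, not taken from the video. Note that this approach pairs every long row from the first pivot with every value from the second, which may or may not be the shape wanted.

```r
library(tidyr)

# Hypothetical data: columns 3:6 hold one variable, columns 7:8 another
df <- data.frame(
  id = 1:2,
  group = c("x", "y"),
  q1 = c(1, 2), q2 = c(3, 4), q3 = c(5, 6), q4 = c(7, 8),
  pre = c(10, 20), post = c(30, 40)
)

# First pivot: columns 3:6 become a name column and a value column
step1 <- pivot_longer(df, cols = 3:6, names_to = "quarter", values_to = "sales")

# Column positions shift after the first pivot, so the second pivot
# selects its columns by name rather than by position
both <- pivot_longer(step1, cols = c(pre, post), names_to = "phase", values_to = "score")
```

With a pipe, the two calls can sit in one chunk: the output of the first pivot_longer() is passed straight into the second.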

  • @richardmusonda3404
    @richardmusonda3404 2 years ago

    Quality content and Quality Professor.

  • @MegaSesamStrasse
    @MegaSesamStrasse 3 years ago +2

    Thanks for the helpful introduction! What can I do if I face the following problem:
    - variables are spread across multiple columns
    and
    - observations are scattered across multiple rows

    • @EquitableEquations
      @EquitableEquations  3 years ago +1

      Hi! pivot_longer is your basic tool for dealing with variables spread across multiple columns. If each observation uses multiple rows, the first tool I would consider is pivot_wider.
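A minimal sketch of the two situations in this reply, using small made-up data frames (all names and values here are hypothetical):

```r
library(tidyr)

# Hypothetical data: one variable (score) spread across two columns
wide <- data.frame(
  id = c("a", "b"),
  test_1 = c(10, 20),
  test_2 = c(30, 40)
)

# pivot_longer() gathers the spread-out columns into name/value pairs
long <- pivot_longer(
  wide,
  cols = c(test_1, test_2),
  names_to = "test",
  values_to = "score"
)

# Hypothetical data: each observation occupies several rows, one per measure
scattered <- data.frame(
  id = c("a", "a", "b", "b"),
  measure = c("height", "weight", "height", "weight"),
  value = c(150, 50, 160, 60)
)

# pivot_wider() collects those rows into one row per observation
one_row_each <- pivot_wider(
  scattered,
  names_from = measure,
  values_from = value
)
```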

  • @danaetapia9286
    @danaetapia9286 1 year ago

    Thank you so much for taking the time to explain this. 😍😍

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago +1

    Or you could use names_pattern = "day_?(.*)_(.*)" to split your "DAY" column into day and time with this type of regex. I have not figured out how to get rid of the "am", but it should not be too hard; I just have to fiddle with the regex. By the way, I prefer snake_case, which can be done with janitor::clean_names(df). Nice presentation and great source of data.
    Thanks

  • @kevindave277
    @kevindave277 3 years ago +2

    Exceptional video. I would be very glad if you could provide the link for the dataset so I can work with it locally. Much thanks.

    • @EquitableEquations
      @EquitableEquations  3 years ago

      This is set #3 from Triola's Elementary Stats. You can download it from www.triolastats.com/es13-datasets

  • @dabinjeong9560
    @dabinjeong9560 1 year ago

    very useful video! thank you

  • @romanvasiura6705
    @romanvasiura6705 1 year ago

    Thank you!
    P.S. It's definitely hard to remember all the features, but at least I'll know where I can find good tips)) and refresh my knowledge...
    You've done amazing work 😃

    • @EquitableEquations
      @EquitableEquations  1 year ago +1

      For sure! I'm constantly googling and checking help files for functions I don't use every day.

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago +1

    Following up on my last comment: names_pattern = "day_?(.*)_(\\d+)" works! We need the extra '\' so that \d+ works. A single \ is enough in ordinary regex, but inside an R string the backslash itself has to be escaped with a second one.
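A small sketch of how this pattern behaves. The column names below are hypothetical stand-ins chosen to fit the regex; the video's actual data may be named differently.

```r
library(tidyr)

# Hypothetical column names of the form day_<label>_<number>
pulse <- data.frame(
  id = 1:2,
  day_mon_1 = c(70, 65),
  day_tue_2 = c(72, 66)
)

long <- pivot_longer(
  pulse,
  cols = !id,
  # one new column per capture group in names_pattern
  names_to = c("day", "reading"),
  # "\\d+" in an R string is the regex \d+ (one or more digits)
  names_pattern = "day_?(.*)_(\\d+)",
  values_to = "value"
)
```

Here the first group captures the text between "day_" and the final underscore, and the second group captures the trailing digits, so each original column name is split across the day and reading columns.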

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago

    Andrew,
    Nice presentation. I could not find the FQA data no matter where I looked. Please provide a link to the data when you use external data. I would recommend using data from two sources in these exercises: first, built-in data (available in a package or in one of the common R data sets), and second, external data. The external data needs to be referenced properly. You could also turn it into an RData file that can then be downloaded from, for instance, GitHub.
    The blocks data does not have to be downloaded. It is already available in the "GLMsData" R package.
    Thanks and keep up the good work!
    P. S. I attended grad school in the Chicago area (that is a small university in Hyde Park).

    • @EquitableEquations
      @EquitableEquations  2 years ago

      Hi! The pre-loaded data sets are lovely but very well-explored elsewhere, especially the tidyr and dplyr sets, so I chose to avoid them here. You can find FQA data here: universalfqa.org/

    • @haraldurkarlsson1147
      @haraldurkarlsson1147 2 years ago

      @EquitableEquations Thanks.

  • @PaulYoung-r8g
    @PaulYoung-r8g 1 year ago

    This was very helpful

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago

    P.S. If you are looking for "messy" data then the billboard data that comes with tidyr is perfect.

    • @EquitableEquations
      @EquitableEquations  2 years ago

      That's true! Anyone interested can see how to pivot this one with ?pivot_longer.
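For reference, the billboard pivot can be sketched along the lines of the example in the pivot_longer help file. The name billboard_long is just an illustrative choice.

```r
library(tidyr)

# billboard ships with tidyr; weekly ranks are spread across columns wk1, wk2, ...
billboard_long <- pivot_longer(
  billboard,
  cols = starts_with("wk"),
  names_to = "week",
  names_prefix = "wk",      # strip the "wk" prefix from the new week values
  values_to = "rank",
  values_drop_na = TRUE     # drop weeks in which a song had no rank
)
```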

  • @hugobarrera771
    @hugobarrera771 2 years ago

    Love u men!!!