Hadley Wickham's "dplyr" tutorial at useR 2014 (2/2)

  • Published Feb 10, 2025
  • Part 2/2 of the dplyr workshop held at UCLA during the useR 2014 conference.
    dplyr is the premier data manipulation tool for data analysts who work in the R language. This package makes it easier than ever to sort, manage, and clean your dirty data with speed and efficiency.
    Visit Hadley's GitHub at github.com/had... for more information, and also check out other related packages at www.Rstudio.com.
    Topics covered: Grouped Mutate/Filter, Joins, Do, Databases
    Scripts and data from this tutorial can be accessed here: www.dropbox.co...

COMMENTS • 5

  • @sethchandler7903
    @sethchandler7903 10 years ago

    Excited about this video, but I am having two initial problems. The nycflights2013 data.frame does not have a plane column. It has a tailnum column, which appears to be the same thing, but some renaming needs to be done. Also, when I run the code at 1:30 I get the error "Error in n() : This function should not be called directly". I'm not sure what this is about. I am running R 3.1.1 in RStudio 0.98.1028 on OS X.
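The renaming mentioned above can be sketched as follows. This is a minimal illustration and not the tutorial's own code: `flights_small` and its values are made up, and only the `tailnum` and `plane` column names come from the comment. It also shows `n()` used the way dplyr intends, inside a verb such as `summarise()`.

```r
library(dplyr)

# Made-up stand-in for the flights data; only the column names matter here.
flights_small <- data.frame(
  tailnum   = c("N101", "N101", "N202"),
  arr_delay = c(5, 12, -3)
)

# rename() takes new_name = old_name, so code that expects a `plane`
# column can then run against this data frame.
flights_small <- flights_small %>%
  rename(plane = tailnum)

# n() is meant to be called inside a dplyr verb such as summarise(),
# mutate(), or filter(), where it returns the current group's row count.
counts <- flights_small %>%
  group_by(plane) %>%
  summarise(n_flights = n())
```

Calling `n()` on its own at the console, outside any dplyr verb, is one way to end up with an error of the kind quoted above.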

  • @ondrejplachy297
    @ondrejplachy297 9 years ago +2

    The Dropbox link does not work, and I also could not find the airports dataset.

    • @senthilramalingam9500
      @senthilramalingam9500 9 years ago +1

      Malory Knox It's here: www.dropbox.com/sh/i8qnluwmuieicxc/AAAgt9tIKoIm7WZKIyK25lh6a and you can find all the datasets there.

  • @sethchandler7903
    @sethchandler7903 10 years ago

    Darn. Ignore second problem. User error. Very sorry I posted before checking more carefully.

  • @zakkyang6476
    @zakkyang6476 6 years ago

    I think the z-score part is not correct.
    It should be like this:
    planes_z <- flights %>%
      filter(!is.na(arr_delay)) %>%
      group_by(plane) %>%
      filter(n() > 30) %>%
      mutate(z_delay =
        (arr_delay - mean(arr_delay)) / sd(arr_delay)) %>%
      filter(z_delay >= 3) %>%
      select(plane, z_delay) %>%
      arrange(desc(z_delay))
    View(planes_z)
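A small check of the grouped z-score idea, as a sketch on made-up data. The `toy` data frame, its tail numbers, and its delays are all hypothetical. Because `mutate()` runs after `group_by(plane)`, `mean()` and `sd()` are computed per plane, so each plane's z-scores should average to 0 with a standard deviation of 1.

```r
library(dplyr)

# Made-up arrival delays for two hypothetical planes.
toy <- data.frame(
  plane     = rep(c("N101", "N202"), each = 4),
  arr_delay = c(0, 10, 20, 30, 5, 5, 5, 45)
)

# Grouped mutate: mean() and sd() are evaluated within each plane.
toy_z <- toy %>%
  group_by(plane) %>%
  mutate(z_delay = (arr_delay - mean(arr_delay)) / sd(arr_delay)) %>%
  ungroup()

# Per-plane summary of the z-scores, used to verify the standardisation.
check <- toy_z %>%
  group_by(plane) %>%
  summarise(mean_z = mean(z_delay), sd_z = sd(z_delay))
```

Without the `group_by()` step, the same `mutate()` would standardise against the overall mean and standard deviation instead of each plane's own.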