Normal Probability Plots Explained (OpenIntro textbook supplement)

  • Published 16 Oct 2024
  • Our accompanying textbooks are available on openintro.org/b..., and all of them are free to download. Hard copies are also priced to be affordable for students. (We price our books in a way that we hope ensures all students can get a hard copy if they want one.)
    Topics covered in this video:
    Probability basics
    Disjoint / mutually exclusive
    Probability Distributions
    Complement
    Independence and probability
    Video author, voice, and editor: David Diez.

COMMENTS • 41

  • @navjotsingh2251
    @navjotsingh2251 1 year ago +4

    I loved this video. A nice follow-up would be a video where you go much deeper into the theory and explain the math behind these kinds of plots. Thank you.

  • @Mahmoud-li2xn
    @Mahmoud-li2xn 3 years ago +9

    Best explanation on YouTube for this topic, thank you.

  • @rffairchild
    @rffairchild 3 years ago +1

    I agree with the other comments. This is the best explanation of this topic on YouTube.

  • @aCllips
    @aCllips 5 years ago +6

    Right vs. left skewness is depicted the opposite way. The picture on the left is skewed to the left, and the picture on the right is skewed to the right.

    • @OpenIntroOrg
      @OpenIntroOrg 5 years ago +2

      Are you talking about the plots at about 8:30? The left plot has fewer observations strung out at higher values, which corresponds to right skewed (skew goes in the direction of the long tail). The reverse is true for the plot on the right.

    • @aCllips
      @aCllips 5 years ago +1

      @OpenIntroOrg Thanks for the response. I am sorry, I was wrong. It seems one cannot read skewness directly off a histogram drawn from the first examples in this video, because the value axis in those histograms runs from high values to low. They would need to be "mirrored" first in order to decide skewness.

    • @maxrkmrose
      @maxrkmrose 5 years ago +1

      @OpenIntroOrg Skewness usually shows up as the MEAN of the data set differing from the MEDIAN of the data set. Side note for others: on the histogram, lower values of the data are to the left and higher values are to the right.
      So a RIGHT skewed data set typically means that the MEAN of the data set is higher than the MEDIAN of the data set. There will be a higher density of observations to the left on the histogram. This concept seems opposite of what the histogram looks like, but the skew is named for the direction of the long tail, which pulls the mean toward it.
      A LEFT skewed data set typically means that the MEAN is lower than the MEDIAN. There will be a higher density of observations to the right on the histogram.
      In a perfectly normal data set, the MEAN and MEDIAN will be approximately equal.
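The mean-versus-median rule of thumb in this reply can be checked on a small made-up sample. The sketch below is only an illustration: the values in `right_skewed` are hypothetical, and the rule is a heuristic that can fail for some distributions rather than a definition of skewness.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20, 35]

# Mirroring the values about zero moves the long tail to the left.
left_skewed = [-x for x in right_skewed]

# Long right tail: the mean is pulled above the median.
print("right skewed:", mean(right_skewed), median(right_skewed))

# Long left tail: the mean is pulled below the median.
print("left skewed: ", mean(left_skewed), median(left_skewed))
```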

    • @ilanlivne4472
      @ilanlivne4472 4 years ago

      @aCllips Thanks for this explanation

  • @riccardomattea1240
    @riccardomattea1240 4 years ago +3

    Probably the best explanation video out there

  • @gunning6407
    @gunning6407 7 years ago +8

    Recipe for QQ-plot (quantile-quantile) in R:
    ## In R, a key observation is that the "pnorm" and "qnorm" functions are inverses of each other.
    ## To construct a QQ-plot of N observations (random samples here):
    ##
    ## Number of observations
    nn

    • @RobvanMechelen
      @RobvanMechelen 1 year ago +1

      Here is exactly what I was looking for. Thank you very much!

  • @Outlier_G
    @Outlier_G 1 year ago

    I can't explain how good this video was. Thanks 😊

  • @shokhrukhabduahadov3985
    @shokhrukhabduahadov3985 5 years ago +3

    So why is that? Why don't you explain the reason the points don't fit the line?

  • @ankmeyester
    @ankmeyester 7 years ago +4

    So the x-axis here shows the Z-score values and the y-axis shows the actual values? And by plotting them against one another as seen here, we see how well they line up? The better the linearity, the more 'normal' the distribution?

    • @OpenIntroOrg
      @OpenIntroOrg 7 years ago +9

      Basically yes :) The x-values are the Z-scores we would expect if the population and sample were as perfectly normal as they could be. So the straighter the line, the more encouraging it is that the data are nearly normal. That said, no population is perfectly normal, and even a sample from a truly normal distribution will not be perfectly straight, just due to random sampling. The main goal of this type of plot is often a basic check to ensure nothing too wonky is going on and the population is roughly normal.

    • @ricardofraser4243
      @ricardofraser4243 6 years ago

      seems like it

  • @Valerie-ws3zr
    @Valerie-ws3zr 3 years ago

    Just what I was searching for... Nice job!!

  • @gunning6407
    @gunning6407 7 years ago +2

    In the textbook, I found the QQ-plot explanation to be lacking. Here, too, a number of key attributes are missing. First off, we must order the empirical observations (y-axis), as noted in previous comments. An explicit definition of "quantile" in earlier lectures would set the stage here, motivating "theoretical quantiles": the quantiles of the standard normal associated with the empirical probabilities (e.g. regularly spaced probabilities).
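The construction described in this comment (order the observations, then pair them with standard normal quantiles at regularly spaced probabilities) can be sketched in a few lines. This is a minimal illustration, not the method used in the video or the textbook: the function name `normal_qq_points`, the sample values, and the plotting positions (i - 0.5) / n are all assumptions, and other spacing conventions exist.

```python
from statistics import NormalDist

def normal_qq_points(observations):
    """Return (theoretical quantile, observed value) pairs for a normal probability plot."""
    ordered = sorted(observations)  # empirical quantiles (y-axis)
    n = len(ordered)
    std_normal = NormalDist()       # standard normal: mean 0, SD 1
    # Assumed plotting positions: regularly spaced probabilities (i - 0.5) / n.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    # Theoretical quantiles (x-axis): the Z-scores matching those probabilities.
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return list(zip(theoretical, ordered))

# Hypothetical sample, made up for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4, 5.0]
points = normal_qq_points(sample)
for z, y in points:
    print(f"{z:6.2f}  {y:4.1f}")
```

Plotting the second column against the first gives the normal probability plot; a roughly straight pattern suggests the sample is close to normal.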

    • @DavidDiez
      @DavidDiez 7 years ago +2

      Hi Gunning, thanks for the feedback. In short, this is a "special topic" that isn't covered in most intro stat courses (though some do cover it), so we breeze through on theory here and get to the practical application of the method. We don't expect anyone to walk away from this video able to reconstruct this type of plot -- only be able to read one.

    • @gunning6407
      @gunning6407 7 years ago +1

      I updated my comment to put the "recipe" in a separate comment for curious readers. For context, I'm currently using the text to teach intro stats. This is my first semester with the department, but the department has used this text for several semesters.
      I absolutely understand the concern about special topics and coverage. My *personal* feeling is that the text should either include a discussion of QQ-plot along with 3-4 sentences of discussion of construction, or omit it altogether. That said, I would argue that understanding how the plot is constructed is critical to correctly reading it!

  • @kittyxing
    @kittyxing 4 years ago

    Thanks for the video. How do you generate the line for non-normally distributed data? I understand that for normally distributed data the line has a slope equal to the standard deviation and an intercept equal to the mean, the x-axis value is the Z-score, and the y-axis value is the actual data value. But what about a non-normal data set? How exactly do you calculate the x-axis value for each data point, and how do you calculate the y-values for the straight line?
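One possible answer to this question, offered as a sketch rather than as what the video does: the x-values are standard normal quantiles whether or not the data are normal, and one common convention for the reference line (the default in R's `qqline`) is to draw it through the first- and third-quartile points. The function name `quartile_reference_line` and the `skewed` sample are hypothetical, and `statistics.quantiles` uses its own default quartile method, so the numbers may differ slightly from other software.

```python
from statistics import NormalDist, quantiles

def quartile_reference_line(observations):
    """Slope and intercept of a line through the first- and third-quartile points."""
    q1, _, q3 = quantiles(observations, n=4)  # sample quartiles (y-values)
    z1 = NormalDist().inv_cdf(0.25)           # standard normal quartiles (x-values)
    z3 = NormalDist().inv_cdf(0.75)
    slope = (q3 - q1) / (z3 - z1)
    intercept = q1 - slope * z1
    return slope, intercept

# Hypothetical right-skewed sample, made up for illustration.
skewed = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 4.1, 6.5, 11.0, 25.0]
slope, intercept = quartile_reference_line(skewed)
print(f"reference line: y = {intercept:.2f} + {slope:.2f} * z")
```

With skewed data the points in the tails drift away from this line, which is the pattern the plot is meant to reveal.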

  • @rishisingh6111
    @rishisingh6111 2 years ago

    Simply awesome! Thanks for sharing this!

  • @robert8552
    @robert8552 4 years ago

    So, my data is skewed and non-normally distributed - What's to be done?
    Do I perform some transformation to force normality, or do I rather just perform non-parametric tests?

    • @OpenIntroOrg
      @OpenIntroOrg 4 years ago +2

      Unfortunately, it's easier to say "something might be risky or broken here" than it is to say "this is how to fix it". What is required will be highly dependent on the circumstances, both the data and what the goals are of the analysis:
      - If the sample is large enough and / or the skew isn't severe enough, then non-normality will not matter for some statistical methods. For example, if all your observations are within ~4 SDs of the mean, there are 30+ observations, and the method being applied is a t-confidence interval for the mean, then the skew isn't much of a concern because the Central Limit Theorem will have kicked in to the point the skew won't matter much.
      - A more robust method might help. However, be aware that "nonparametric" does not automatically mean "robust". For instance, the bootstrap percentile method is less robust than t-distribution methods when the sample size is relatively small (

  • @m7mdsaleh523
    @m7mdsaleh523 3 years ago

    Can we use the slope of the probability plot to measure the population variance of a sample?

    • @OpenIntroOrg
      @OpenIntroOrg 3 years ago

      The line doesn't quite represent this, especially when the distribution has longer tails than a normal distribution, so it is good to calculate the sample variance separately.
      Also, sorry to nitpick, but a clarification to avoid confusion for others: we'd describe "population variance of a sample" as "sample variance", and to further remove any ambiguity, we divide by (n-1) when computing the sample variance (while population variance is often computed by dividing by n).
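The (n-1) versus n distinction in this reply can be made concrete with a short calculation. The `data` values below are hypothetical; the comparison against `statistics.variance` and `statistics.pvariance` simply confirms that the standard library follows the same two conventions.

```python
from statistics import pvariance, variance

# Hypothetical data, made up for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
data_mean = sum(data) / n
sum_sq_dev = sum((x - data_mean) ** 2 for x in data)

sample_var = sum_sq_dev / (n - 1)  # sample variance: divide by n - 1
population_var = sum_sq_dev / n    # population variance: divide by n

print("sample variance:    ", sample_var, variance(data))
print("population variance:", population_var, pvariance(data))
```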

  • @mustafizurrahman5699
    @mustafizurrahman5699 1 year ago

    Simply splendid

  • @StellaNimas
    @StellaNimas 3 years ago

    I'm doing my thesis right now, and the data is not normal. What should I do about this? 😭😭

    • @OpenIntroOrg
      @OpenIntroOrg 3 years ago

      Data is never perfectly normal, so you're in good company. Check out OpenIntro Statistics Section 7.1, which offers a couple of rules of thumb on the bottom of the first page of that section. The book is free online as a PDF from our website, see:
      www.openintro.org/book/os

  • @sunilkumarsamji8871
    @sunilkumarsamji8871 4 years ago

    Well, the name is "normal probability plot". a) Why are they called probability plots? b) Why is the plot of the observed data against the Z-scores supposed to be a straight line? I can understand that if the data fit the line well, it is a measure of goodness of fit. However, I don't understand why it has to be a straight line.
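A short derivation may help with part (b). It is a sketch under the assumption that the data come from a normal distribution with mean $\mu$ and standard deviation $\sigma$; the symbols $x_p$ and $z_p$ (the quantiles at probability $p$ of the data distribution and of the standard normal) are notation introduced here, not taken from the video.

```latex
% Assumption: X is normal with mean \mu and standard deviation \sigma.
X \sim N(\mu, \sigma^2)
\quad\Longrightarrow\quad
Z = \frac{X - \mu}{\sigma} \sim N(0, 1)

% The quantile of X at probability p is the rescaled standard normal quantile:
P(X \le x_p) = p
\;\Longleftrightarrow\;
P\!\left(Z \le \frac{x_p - \mu}{\sigma}\right) = p
\;\Longleftrightarrow\;
\frac{x_p - \mu}{\sigma} = z_p

% Hence the quantiles of X are a linear function of the standard normal quantiles:
x_p = \mu + \sigma \, z_p
```

The ordered observations estimate $x_p$, so plotting them against $z_p$ gives points scattered around a line with slope $\sigma$ and intercept $\mu$ when the data are roughly normal, and a curved pattern otherwise.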

  • @vladimirtorres1181
    @vladimirtorres1181 2 years ago +1

    Very useful!! Thank you

  • @tule9213
    @tule9213 3 years ago

    so touching for an excellent video

  • @harrygroundwater2590
    @harrygroundwater2590 4 months ago

    Very Helpful

  • @mxaraujo
    @mxaraujo 3 years ago

    Very instructive, thanks

  • @allanmuganga7075
    @allanmuganga7075 4 years ago

    Thanks for the video, it's been helpful. Kudos

  • @pandesal2022
    @pandesal2022 11 months ago

    thank you

  • @maxtok414
    @maxtok414 4 years ago

    Thank you!

  • @bensonmathew8679
    @bensonmathew8679 6 years ago

    Very helpful!

  • @gentle2005phir
    @gentle2005phir 5 years ago

    Good one