Python for Data Analysis: Hypothesis Testing and T-Tests

  • Published Dec 11, 2024

COMMENTS • 26

  • @martyzeenyc1210 3 years ago +19

    I want to thank you for these videos as I'm struggling in my college data science course. This has helped me massively!

  • @grainofsalt2113 2 years ago +7

    You explained this concept in the simplest way I have ever seen.

  • @florenciaortega6543 3 years ago +2

    Thank you very much!!! I couldn't find this test explained as well anywhere else. Such an accurate explanation. Thank you! A+!

  • @michaelolubode6168 2 years ago

    Thanks for this video. I knew the theory of hypothesis testing and could do it on paper, but it wasn't easy to do the same in Python until I saw your video. Thanks for the simplicity.

  • @marinastolet7799 2 years ago +1

    This is exactly what I needed, thank you.

  • @kits1111 1 year ago

    You are so awesome!! You explained it so well...

  • @neelroy3 2 years ago +1

    Which statistical test can be used to find the difference between two groups' percentage values?

  • @jongcheulkim7284 2 years ago +1

    Thank you so much. This is very helpful.

  • @arashkashefian1735 10 months ago

    Thank you, very useful video. Just wondering: for two-sample or paired tests, is there a way to test a null hypothesis where the difference is not 0 but some non-zero value? For example, if S1 is the first sample and S2 is the second sample, how do we test the hypothesis that S1 - S2 > 1?
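
One common way to approach a question like this is to fold the hypothesized difference into the data and then run an ordinary one-sided test. The sketch below is only an illustration under stated assumptions: the samples `s1` and `s2` are synthetic, `delta` is a hypothetical name for the hypothesized difference, and the `alternative` argument requires SciPy 1.6 or newer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic, made-up samples used only to illustrate the call pattern
s1 = rng.normal(loc=12.0, scale=2.0, size=60)
s2 = rng.normal(loc=10.0, scale=2.0, size=60)

delta = 1.0  # hypothesized difference under H0 (illustrative value)

# Paired case: H0 is mean(s1 - s2) = delta, H1 is mean(s1 - s2) > delta
paired = stats.ttest_1samp(s1 - s2, popmean=delta, alternative="greater")

# Independent case: shift s1 by delta, then test mean(s1 - delta) > mean(s2)
# Welch's version (equal_var=False) is an assumption made for this sketch
indep = stats.ttest_ind(s1 - delta, s2, equal_var=False, alternative="greater")

print("paired:      t =", paired.statistic, " p =", paired.pvalue)
print("independent: t =", indep.statistic, " p =", indep.pvalue)
```

Shifting one sample by `delta` leaves its variance unchanged, so the test statistic simply measures how far the observed difference sits from `delta` instead of from zero.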

  • @durgabhavanikonamarthi6448 3 years ago

    What do we do to our model if we accept an alternative hypothesis?

  • @atom8926 1 year ago

    How did you learn statistics? Please mention some good resources to learn from.

  • @valda313 4 years ago +4

    Shouldn't normality testing be done before performing t-tests?
    (Otherwise, great video, thanks 👍🏻)

    • @DataDaft 4 years ago +4

      That is a good point, Valda. The t-test assumes the data are normally distributed, which can be checked informally by inspecting a histogram or normal Q-Q plot, or more formally with a test like scipy.stats.shapiro(). If the sample is large enough though, say 50+, the t-test may still be adequate because the sampling distribution of the mean is approximately normal via the central limit theorem, but I'm not sure there's a good hard-and-fast rule as to when things are "not normal enough." If normality is questionable, it is probably a good idea to also run a non-parametric test: the Mann-Whitney test for independent samples or the Wilcoxon signed-rank test for paired samples.
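
A minimal sketch of that workflow, assuming two synthetic samples (`group_a` and `group_b` are made-up names and data, not taken from the video): check each sample with Shapiro-Wilk, then run the parametric test next to its non-parametric counterpart and compare the conclusions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, made-up samples used only for illustration
group_a = rng.normal(loc=50.0, scale=5.0, size=40)
group_b = rng.normal(loc=52.0, scale=5.0, size=40)

# Shapiro-Wilk test: a small p-value is evidence against normality
shapiro_a = stats.shapiro(group_a)
shapiro_b = stats.shapiro(group_b)

# Independent samples: t-test alongside the Mann-Whitney U test
# (Welch's variant, equal_var=False, is a choice made for this sketch)
t_ind = stats.ttest_ind(group_a, group_b, equal_var=False)
mwu = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Paired samples (equal lengths required): paired t-test alongside
# the Wilcoxon signed-rank test
t_rel = stats.ttest_rel(group_a, group_b)
wil = stats.wilcoxon(group_a, group_b)

print("Shapiro p-values:", shapiro_a.pvalue, shapiro_b.pvalue)
print("Independent: t-test p =", t_ind.pvalue, " Mann-Whitney p =", mwu.pvalue)
print("Paired:      t-test p =", t_rel.pvalue, " Wilcoxon p =", wil.pvalue)
```

If the parametric and non-parametric tests lead to the same decision, the normality question matters less for that particular comparison.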

    • @valda313 4 years ago +1

      @@DataDaft Thanks for the response! When I run these kinds of statistical tests, I always do a normality test first (scipy.stats.shapiro). Based on the result, I choose either a parametric t-test or a non-parametric test (like Wilcoxon or Mann-Whitney).

    • @DataDaft 4 years ago +2

      @@valda313 Thanks for the input! It is helpful to have knowledgeable viewers fill in gaps (or make me aware of errors). Helps everyone learn.

  • @ayush9psycho 3 years ago

    quality material!!

  • @svitirur1665 3 years ago

    Do I need hypothesis testing in machine learning modeling? Or, let's say, when should I do hypothesis testing on a dataset as a data scientist?

    • @DataDaft 3 years ago +3

      Hypothesis testing is a core statistical idea that plays a role in many other concepts in data science and machine learning. Basically, any time you want to investigate whether one sample of data differs from another (or from a population), hypothesis testing is something to consider. For example, it is at the core of A/B testing, which is used to choose between two different options, like which version of an ad or website attracts more clicks.
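
As a rough illustration of the A/B testing idea, the sketch below simulates click data for two ad versions and compares the click rates with a two-sample t-test. Everything here is an assumption made for the example: the click probabilities, the sample sizes, the names `clicks_a` and `clicks_b`, and the 0.05 significance level. A test designed for proportions would be an equally reasonable choice for 0/1 data like this.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated 0/1 click outcomes per visitor (made-up click rates)
clicks_a = rng.binomial(n=1, p=0.10, size=5000)  # ad version A
clicks_b = rng.binomial(n=1, p=0.12, size=5000)  # ad version B

# H0: both versions have the same mean click rate
# Welch's t-test (equal_var=False) avoids assuming equal variances
result = stats.ttest_ind(clicks_a, clicks_b, equal_var=False)

alpha = 0.05  # significance level chosen for this sketch
reject_h0 = result.pvalue < alpha

print("click rate A:", clicks_a.mean(), " click rate B:", clicks_b.mean())
print("t =", result.statistic, " p =", result.pvalue, " reject H0:", reject_h0)
```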

  • @janabark6415 3 years ago

    Please, what if I have a different number of records in each testing group?
    For instance, 2000 records for the control group and 2050 for the test group. Can I use this Python function:
    t_stat, p_val = ss.ttest_ind(df_cnt.exp_rev, df_trt.exp_rev)?
    I got this result:
    T-score = 0.16434444604672976
    # There is a 16% deviation from the H0 mean
    # p-value = 0.8694662602367074
    # p-value is > the significance level, i.e. 0.05
    # Therefore I am rejecting H1: the treatment did not perform better than the control
    Can I interpret it like this? Thank you very much in advance.
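
On the unequal group sizes: scipy.stats.ttest_ind does not require both samples to have the same length. The sketch below uses synthetic data with the same group sizes (2000 and 2050) and, as an assumption of this example, Welch's variant (equal_var=False) instead of the default pooled-variance test. It also recomputes the t-statistic by hand to show what the number measures: the difference in sample means expressed in standard errors, not a percentage deviation. A p-value above the significance level means the test fails to reject the null hypothesis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-ins for the two revenue columns (made-up values)
control = rng.normal(loc=100.0, scale=20.0, size=2000)
treatment = rng.normal(loc=100.0, scale=20.0, size=2050)

# Unequal sample sizes are fine for an independent two-sample t-test
t_stat, p_val = stats.ttest_ind(control, treatment, equal_var=False)

# Manual Welch t-statistic: difference in means over its standard error
std_err = np.sqrt(control.var(ddof=1) / control.size
                  + treatment.var(ddof=1) / treatment.size)
manual_t = (control.mean() - treatment.mean()) / std_err

print("scipy t =", t_stat, " manual t =", manual_t, " p =", p_val)
```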

  • @johnnybastos3390 2 years ago

    Can I say that a p-value = false positive probability?

  • @kartiksharma-yw7qf 3 years ago +1

    You are damn good, I'm loving studying with you.

  • @iddymanhunter1 3 years ago

    Amazing!!!

  • @atom8926 1 year ago

    Awesome

  • @forbesavila8006 2 years ago

    Why do you set the degrees of freedom to 49?