Maths: Simpson's Paradox

  • Published 20 Mar 2010
  • A back to basics video.
    Simpson's Paradox is more than just a curiosity; it illustrates how important it is to interpret your data correctly. If you are not careful it is still possible to have good data and bad conclusions.

COMMENTS • 1K

  • @ZhouJi
    @ZhouJi 7 years ago +183

    I am not the only one hearing him saying "kills" all the time

    • @mrplayitcoolowski349
      @mrplayitcoolowski349 7 years ago +13

      I heard him once saying "cures", but I thought that was a mistake, so I accepted "kills" was what he wanted to say.

    • @DavidJJames
      @DavidJJames 7 years ago +5

      Yeah. He has these jugs that kill people.

    • @AndSawMir
      @AndSawMir 7 years ago +4

      I do not speak English, so I enabled captions with the automatic English translator, and it said "kills". As Google knows everything, he definitely must have said that word ;)

    • @icusawme2
      @icusawme2 7 years ago

      lol, not till you pointed it out!

    • @deebadubbie
      @deebadubbie 7 years ago

      Yes, me too!

  • @Catlord98765
    @Catlord98765 9 years ago +354

    Am I the only one who occasionally hears him say kill instead of cure?

    • @ChairyCrasher
      @ChairyCrasher 9 years ago +6

      no. I heard it too.

    • @raydeen2k
      @raydeen2k 9 years ago +21

      Am I the only one who occasionally hears him say joke instead of drug? I'm thinking 'The Simpsons' so I'm inclined to hear 'joke'. And I'm drunk so I really can't tell my ass from my elbow.

    • @BeeBumper
      @BeeBumper 9 years ago

      No, I thought he said it too, did a double take, and then realized it was cure, not kill.

    • @Igor_054
      @Igor_054 9 years ago +3

      raydeen2k This explains a lot. I was hearing joke all the time, and never realized it was supposed to be drug.

    • @awawpogi3036
      @awawpogi3036 6 years ago

      Nope.

  • @Kaldor-Draigo-h6q
    @Kaldor-Draigo-h6q 10 years ago +155

    So basically, if you don't know how to interpret data correctly, you'll be wrong and noob it up badly.

    • @Criterionx1
      @Criterionx1 9 years ago +11

      Or if you don't examine studies closely you can be easily misled because of the way percentages are manipulated.

    • @ChilledfishStick
      @ChilledfishStick 4 years ago +2

      @Criterionx1 I think that if a study is published in a reputable journal, you'd probably not find such idiotic mistakes. At least I hope not.

    • @m136dalie
      @m136dalie 3 years ago +2

      @ChilledfishStick FYI the study that started the entire anti-vax movement was published in The Lancet

    • @ChilledfishStick
      @ChilledfishStick 3 years ago

      @m136dalie The paper didn't say that people should be wary of being vaccinated. It merely showed a correlation with autism which, for anyone who knows anything, doesn't mean jack. At best it's something to look into, which scientists did and found nothing.
      People misuse and misunderstand papers all the time. Even people who should know better. Especially the media.
      Unless something goes terribly wrong in the peer review process (which can happen), a paper in a reputable journal won't make any conclusive claims based on correlation alone. At best they'd say that they found something that should probably be looked at.

    • @m136dalie
      @m136dalie 3 years ago +1

      @ChilledfishStick Your point on correlation is valid; however, it has to be within the context of proper scientific research.
      The paper in question wasn't an example of that. Essentially the authors just took a sample of some 20-30 children who exhibited gastro/neuro symptoms who happened to receive a vaccination some weeks prior. The symptoms starting from vaccination according to... The parents. Should be obvious why this doesn't qualify as proper evidence.
      I think it's a good example of how extremely low quality research can still make it through the peer review process, even in big publisher journals.

  • @ffggddss
    @ffggddss 7 years ago +33

    "If you are not careful it is still possible to have good data and bad conclusions."
    Yes!
    The flip side of which is, how to lie with statistics.

    • @andrewharrison8436
      @andrewharrison8436 3 years ago +1

      "How to Lie with Statistics" by Darrell Huff is an excellent book on how statistics is abused (and hence how to do it properly)

  • @apeirogonInvesting
    @apeirogonInvesting 8 years ago +60

    Hi James,
    Love your videos but on this one the explanation suffers from Bertrand's paradox.
    1. If each test is taken by 100 people split unevenly over 2 days then the results are the ones you give.
    2. If the test is taken over 2 days and the fractions shown are just a ratio of success, it becomes combinatorial and the result is the opposite.
    Once you give your result we know it's case 1 but better context upfront removes any ambiguity, which we know leads to some of the comments here such as "you can make statistics say anything".
    Thanks again for all you do, it's really wonderful.

    • @NetAndyCz
      @NetAndyCz 7 years ago +3

      I still think the data can be interpreted as the treatment being less effective on day 2 (a possible explanation is quickly developing resistance, though it could be that the pathogen responds to weather or something). If that is the case, then treatment B is more effective and the uneven distribution of patients is what makes drug A look better.

    • @dylansmith5627
      @dylansmith5627 1 year ago

      @NetAndyCz It is certainly helpful to think about how conditions or people samples might be different between the two days! This explanation only holds if the populations and conditions are equivalent on Day 1/2 and in treatment groups A and B.

  • @atsay714
    @atsay714 9 years ago +95

    Was that a swing at acupuncturists? lol

    • @U014B
      @U014B 8 years ago +36

      I think that was the point.
      (teehee)

    • @strangelyjamesly4078
      @strangelyjamesly4078 8 years ago +31

      +atsay714 It seemed to me he was taking a poke at them.

  • @gibbytravis
    @gibbytravis 10 years ago +24

    Nice little jab at acupuncture at the end. lol

  • @andrewweirny
    @andrewweirny 7 years ago +3

    I'm not sure if it's considered the same paradox, but I ran into a similar issue years ago when I was developing a school website with built in grading and averaging. I built it so that the teacher could configure their class like, "Tests are 50% of the final grade, Quizzes 20%, Projects 30%." But I didn't require them to say how many of each they'd have by the end of the term, so it was impossible to tell how much a particular project would be worth - just the project average. So a kid had an average of 85, got a 90 on a project, and saw his average go down to 82, which led the teacher to report it as a bug. But it was because he'd gotten 100 on his previous project, so the 90 brought his projects average down.
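
A minimal sketch of the grading effect described in the comment above. The category weights (50/20/30) come from the comment; the individual marks are hypothetical, chosen only so that the starting average is 85, and `final_grade` is an illustrative name rather than anything from the original website.

```python
from fractions import Fraction

# Category weights from the comment, as percentages of the final grade.
WEIGHTS = {"tests": 50, "quizzes": 20, "projects": 30}

def final_grade(scores):
    """Weighted average of per-category means; scores maps category -> list of marks."""
    total = Fraction(0)
    for category, weight in WEIGHTS.items():
        marks = scores[category]
        category_mean = Fraction(sum(marks), len(marks))
        total += Fraction(weight, 100) * category_mean
    return total

# Hypothetical marks (not from the comment): one test, one quiz, one project.
before = {"tests": [80], "quizzes": [75], "projects": [100]}
# The same student after scoring 90 on a second project.
after = {"tests": [80], "quizzes": [75], "projects": [100, 90]}

grade_before = final_grade(before)
grade_after = final_grade(after)
print(float(grade_before), float(grade_after))
```

With these made-up marks the new 90 is above the overall average, yet the final grade falls, because the 90 lowers the projects mean from 100 to 95.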

  • @leOkwardGuy
    @leOkwardGuy 8 years ago +4

    That's a very interesting paradox! I love learning new ways to see through the trickery some people and companies are trying to play on you! That's definitely a good one to look out for, because other statistical tricks are relatively obvious once you actually take a moment to look into them, but this paradox would easily go unnoticed if you didn't know about it. I really hope they are not making this mistake with grading exams though, because I doubt they would listen to you and change it if it is one of the bigger exams that are taken nationwide. Looking forward to finding one of those paradoxes in the real world.

  • @enochliu8316
    @enochliu8316 4 years ago +5

    I think this video shows a Reverse Simpson Paradox (a name I coined), where if you segment the data with an irrelevant factor, the aggregated data is often better for drawing a conclusion than the segmented data is. In this case, which day the treatment was administered was an irrelevant factor to the test of the drug, since we assume the success rate does not depend on the day, as opposed to the Berkeley example which had the departments as a relevant factor, since the admission rate did depend on the department.
    I think that the Reverse Simpson Paradox is not discussed very often on YouTube; in fact most videos I watch about Simpson's paradox do not discuss the Reverse Simpson Paradox.

  • @singingbanana
    @singingbanana 13 years ago

    @RUL1S88 Quickly. I am not adding the fractions which is why I do not say the word 'add'.

  • @TheZombiegoth
    @TheZombiegoth 10 years ago +17

    I always go for the answer that I think is wrong, and I'm 100% right.
    Now there is a paradox.

  • @kimkatsu1453
    @kimkatsu1453 9 years ago +7

    It's called Simpson's Paradox, but it's not a paradox (it's just unintuitive) and it was not discovered by Simpson. Always like those little details.

    • @mathaha2922
      @mathaha2922 3 years ago +2

      A so-called "veridical paradox".

  • @idlingdove
    @idlingdove 9 years ago +9

    Simply put, percentages are not comparable if they use different amounts as populations or bases.
    A dramatic example of this is the investor’s paradox (I just invented that term...): If your investment of $100 goes up 20% one year and then down 20% the next, do you have the same amount that you started with? NO! Because the increase of 20% is based on $100, and the decrease of 20% is based on $120 (so you end up with $100 + $20 - $24 = $96)...
    So in this example, we must be careful not to try to compare percentages from different populations. If we compare percentages of the LARGEST BASE (OR POPULATION) in each case (90) then we immediately see that drug A is better than drug B (70% vs. 50%). We can almost ignore the other two cases where the base (or population) is tiny (only 10), because these percentages will influence the overall result very little.

    • @tgwnn
      @tgwnn 9 years ago

      idlingdove a different example of the same concept is let's say we both have 100 bucks, on day 1 I give you 10 and on day 2 you give me 10. Then you will have gained 10% on day 1 and lost 9% (1/11=9.0909%) on day 2 whilst I will have lost 10% on day 1 and gained 11% (1/9=11.1111%) on day 2 so we both apparently got richer! :)

    • @Musicrafter12
      @Musicrafter12 9 years ago

      tgwnn Again, percents are deceiving. Giving the other person $10 would result in you giving them 10% of their current amount. 10% of $100 is $10. The other person gives you $10 out of your now $90, which might be 11%, but 11% of 90 is $10. I assume you already knew that though.

    • @tgwnn
      @tgwnn 9 years ago

      Musicrafter | Minecraft yes I did! but it's such a cool example, I always like to bring it up.
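
Both money examples in this thread can be checked with a short sketch. The function names are illustrative assumptions; the dollar amounts are the ones given in the comments by @idlingdove and @tgwnn.

```python
def apply_changes(start, percent_changes):
    """Apply successive percentage changes, each relative to the current amount."""
    amount = start
    for pct in percent_changes:
        amount += amount * pct / 100
    return amount

def percent_change(old, new):
    """Percentage change from old to new, measured against the old amount."""
    return (new - old) / old * 100

# @idlingdove's example: $100 goes up 20%, then down 20%.
final_amount = apply_changes(100, [20, -20])

# @tgwnn's example: both people start with $100, $10 changes hands, then comes back.
receiver = [percent_change(100, 110), percent_change(110, 100)]  # gains first, then loses
giver = [percent_change(100, 90), percent_change(90, 100)]       # loses first, then gains

print(final_amount)
print([round(p, 2) for p in receiver], [round(p, 2) for p in giver])
```

Naively summing each person's two percentages gives a positive number for both, even though each ends with the $100 they started with, because the two percentages in each pair are taken against different bases.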

  • @Emberr9
    @Emberr9 11 years ago +1

    I love how you always say "if you have been, thanks for watching". Ofc we're watching, this stuff is awesome!

  • @singingbanana
    @singingbanana 14 years ago

    @theboombody Meaning?

  • @chrissytinalalala
    @chrissytinalalala 13 years ago +5

    I really enjoyed this one! I can't tell you how many medical based statistics I've seen over this past year. This video put a smile on my face. =) Thanks!

  • @chrisanderson1513
    @chrisanderson1513 8 years ago +7

    Pages 49-51 of this: web4.cs.ucl.ac.uk/staff/D.Barber/textbook/181115.pdf seems to suggest that the overall one isn't necessarily the best choice, right?

    • @singingbanana
      @singingbanana 8 years ago +8

      Correct.

    • @ffggddss
      @ffggddss 7 years ago +1

      The nature of statistics/stochastic processes.
      There's always some chance that if you do it again, it'll turn out differently.

    • @gorgolyt
      @gorgolyt 6 years ago

      It depends on whether the two days were the same or different. If they were different then these data show that drug B is better in either situation and therefore the better drug to take. This video is a pretty bad/incomplete explanation.

  • @austinbryan6759
    @austinbryan6759 8 years ago

    Hi James! Really enjoyed this video and all your ones on Numberphile, thanks for all the knowledge:)

  • @johnd.5601
    @johnd.5601 3 years ago

    Your past efforts are still making positive progress. Thank you for sharing

  • @BrentDeJong
    @BrentDeJong 10 years ago +4

    Our stats book introduced this with a different example where Simpson's Paradox worked the other way. In a certain semester, 75% of men and 45% of women were accepted as graduate students to a certain university, and that university was accused of sexual discrimination. However, in program A, 3 out of 10 men and 37 out of 90 women were accepted, while in program B 72 of 90 men and 8 of 10 women were accepted. Was there really any sexual discrimination?

    • @RipleySawzen
      @RipleySawzen 10 years ago +1

      Yes, this is the proper way to do this paradox.

    • @Violet-tb8xo
      @Violet-tb8xo 5 years ago +1

      I'm going to be happily surprised if that stats book is still read in 2019 (because opinions > literal math now)

    • @snowfloofcathug
      @snowfloofcathug 5 years ago

      Violet explain?

    • @ratlinggull2223
      @ratlinggull2223 2 years ago

      Yeah, this is the better example than the medicine one. That one doesn't make sense at all.
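
The admissions figures quoted by @BrentDeJong at the top of this thread can be tabulated with a small sketch. The variable names are assumptions made for illustration; the counts are exactly those in the comment.

```python
# (accepted, applied) per program, as quoted in @BrentDeJong's comment.
admissions = {
    "men":   {"A": (3, 10),  "B": (72, 90)},
    "women": {"A": (37, 90), "B": (8, 10)},
}

# Acceptance rate within each program.
by_program = {
    group: {prog: accepted / applied for prog, (accepted, applied) in programs.items()}
    for group, programs in admissions.items()
}

# Acceptance rate after pooling both programs.
overall = {
    group: sum(a for a, _ in programs.values()) / sum(n for _, n in programs.values())
    for group, programs in admissions.items()
}

for group in admissions:
    print(group, "overall:", overall[group], "by program:", by_program[group])
```

On these numbers men are accepted at a higher rate overall, yet women do at least as well in each program separately: better in program A and level in program B. Most men applied to the program with the high acceptance rate, while most women applied to the one with the low rate.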

  • @pisser98
    @pisser98 10 years ago +4

    well, this is not really a mathematical problem, it's just deceiving.
    if you put same-sized samples next to each other (90 to 90 and 10 to 10) it becomes pretty obvious right away

    • @rogerwilco2
      @rogerwilco2 10 years ago +2

      This is a relatively simple example, but the point is that people make these kinds of mistakes all the time because a lot of people get fooled as soon as percentages enter the discussion.

  • @tomschang2225
    @tomschang2225 7 years ago +1

    This also works the other way around.
    Hospital A
    Mild illnesses cured: 63/90
    Serious illnesses cured: 4/10
    Overall patients cured: 67/100
    Hospital B
    Mild illnesses cured: 8/10
    Serious illnesses cured: 45/90
    Overall patients cured: 53/100
    Which hospital would you like to go to? The answer in this situation is clearly B. A did cure more people overall, but that's because more of its patients had a mild illness. The success rates for curing mild and serious illnesses separately are both higher at Hospital B.
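
The hospital comparison above can be reproduced with a short sketch. The counts are taken from the comment; `cure_rates` is a hypothetical helper name.

```python
# (cured, treated) by illness severity, from the hospital example above.
hospitals = {
    "A": {"mild": (63, 90), "serious": (4, 10)},
    "B": {"mild": (8, 10),  "serious": (45, 90)},
}

def cure_rates(strata):
    """Return the cure rate for each severity level plus the pooled overall rate."""
    rates = {name: cured / treated for name, (cured, treated) in strata.items()}
    total_cured = sum(cured for cured, _ in strata.values())
    total_treated = sum(treated for _, treated in strata.values())
    rates["overall"] = total_cured / total_treated
    return rates

results = {name: cure_rates(strata) for name, strata in hospitals.items()}
for name, rates in results.items():
    print(name, {level: f"{rate:.0%}" for level, rate in rates.items()})
```

Hospital B has the higher rate at each severity level, while Hospital A has the higher pooled rate, because 90 of A's 100 patients were mild cases.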

  • @dheyabsalehasadaliabubaker540
    @dheyabsalehasadaliabubaker540 3 years ago +1

    This is literally the only video about Simpson's paradox I understood. Nice job :)

  • @Siuwajansiwa
    @Siuwajansiwa 8 years ago +3

    Doesn't the very first example have an easier explanation? The data could be rearranged into 63/90 vs 45/90 and 4/10 vs 8/10, which would then become obvious.

  • @thecsslife
    @thecsslife 9 years ago +3

    You can't add up the day 1 and 2 groups together and include them as if it's one big group. The day is a factor, otherwise the cure rates wouldn't change. If you want to have one big group, you must take an average of the percentages to produce a hypothetical group 3 that tested 100 people. Do this and you see group B is better.

    • @seanlegge3854
      @seanlegge3854 9 years ago +2

      Imagine that you had the choice of being one of the 100 people who took drug A over the two day period or one of the 100 people who took drug B over the two day period, and that you didn't get to choose the day on which you were given the drug. To what group would you rather have belonged?

    • @basrikartal
      @basrikartal 9 years ago +1

      I don't agree with your conclusion but that is a very good point. I mean indicating "day" as a factor. It is the key here in this paradox!!! If we know that "day" is a factor which has a particular effect on the number of people cured, your method and conclusion are right; B would be better. However, it's not given. We don't know if it is because of the day factor. Maybe it is just because the samples are different, and indeed day has nothing to do with the number of cures. In that case A would be better.
      So to make a conclusion, you need to have an assumption. Which factor are you testing for? Without this assumption, saying A or B is better is senseless.
      On the other hand, overall, since I think this small example research is trying to compare drug A and B based on multiple observations, and nothing is indicating day as a factor, only the drug factor should be considered as the test factor. Then, by assuming that, an overall sum shows that drug A is better.
      Thank you for pointing out the factor issue. It was inspiring for me.
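
The two ways of combining the days discussed in this thread can be put side by side. This is only a sketch: the per-day counts are the ones quoted elsewhere in these comments (63/90 and 4/10 for drug A, 8/10 and 45/90 for drug B), and both function names are assumptions.

```python
# (cured, treated) for day 1 and day 2, as quoted elsewhere in this comment section.
trials = {
    "A": [(63, 90), (4, 10)],
    "B": [(8, 10), (45, 90)],
}

def mean_of_daily_rates(days):
    """Weight each day equally, however many people were treated on it."""
    return sum(cured / treated for cured, treated in days) / len(days)

def pooled_rate(days):
    """Weight each patient equally: total cured over total treated."""
    return sum(cured for cured, _ in days) / sum(treated for _, treated in days)

for drug, days in trials.items():
    print(drug, round(mean_of_daily_rates(days), 2), round(pooled_rate(days), 2))
```

Averaging the daily percentages favours B (65% against 55%), while pooling favours A (67% against 53%). Which weighting is appropriate depends on whether the day is assumed to matter, which is the assumption the reply above asks for.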

  • @juliakim1208
    @juliakim1208 4 years ago

    What is the lurking, or confounding, variable for this distribution of data?

  • @paulbottomley42
    @paulbottomley42 10 years ago +2

    Your t-shirt. Awesome.
    Also I just discounted the studies with 10 participants because they're far too tiny to show meaningful data, and got that drug A was best. :)

  • @rikschaaf
    @rikschaaf 7 years ago +5

    I know this is a made up statistic, but when looking at the numbers I would ask myself: why does the drug work better on day one? There seems to be a hidden variable that changed between days that isn't taken into account. Assuming that either the individual days or the total result is correct would be a mistake before looking at this hidden variable.
    It is also better to design an experiment with groups of the same size.

    • @JGMeador444
      @JGMeador444 7 years ago +2

      Rik Schaaf I would think that day 1 was simply an unlikely result, and that with many more trials on day 2, the results came much closer to evening out. Same for A on day 1. It was unlikely that A would have such a low success rate on day 2, but due in part to a much smaller sample size, and it being essentially random if the drug works or doesn't work (think of generating a random number from 1 to 100: if 1

    • @htmlguy88
      @htmlguy88 7 years ago

      Or sets of the same cardinality, in math speak; a group is another object in math with a different meaning.

    • @NetAndyCz
      @NetAndyCz 7 years ago

      Well, sadly the point of this video is very misleading without more data. The real point should have been: if you want to test something, make the sample sizes comparable to account for hidden variables.

    • @philippenachtergal6077
      @philippenachtergal6077 7 years ago

      Well, you often design experiments with groups of the same size at the start but then you have to remove some people from the experiment for whatever reason. Admittedly, that wouldn't lead to a 9/1 ratio.
      And then often, you don't actually get to choose your data set.

    • @slycordinator
      @slycordinator 5 years ago

      He doesn't explain the paradox well here at all. It is possible (and common) for the data that is separated into groups to represent what is correct instead of the unseparated data. You have to analyze/determine which stats/data best explain the situation, figuring out which of the two (separated VS unseparated) makes more sense given what is known.
      Ex:
      NYC has data they produce every year on lots of stuff related to public health. One study looked at over a decade of numbers for the mortality rate VS pollution levels. Over that time period, they produced a graph of the mortality rate VS pollution levels. The graph slope went down, meaning that as the pollution levels got higher, fewer people died, which goes against what you'd expect. When the same data was separated by season (fall, winter,...), each season's graph went up, indicating that more people died when the pollution went up.
      And unlike the video's example, the data separated into groups was correct. It more accurately represented/explained reality.

  • @fenderify1592
    @fenderify1592 8 years ago +4

    The key to understanding how the paradox appears is to realize that on day 2, BOTH drugs fared much worse than on day 1. Drug B appeared to get better results because its sample size was much smaller on the ill-fated day 2. In other words, you can interpret the results incorrectly if you fail to see that the difference between day 1 and day 2 is much more significant than the difference between drug A and drug B.
    Great video!

    • @klmnopq
      @klmnopq 4 years ago +1

      Actually, drug B is better than drug A because drug B's sample size is larger on the ill-fated day, i.e. drug A got lucky: more of its sample is on the first day (a good day).

    • @donalobyrnewells
      @donalobyrnewells 2 years ago

      I think the explanation is poor. Saying "these things are weighted, and you're not comparing like with like" is really just half an explanation. It assumes the audience can work out what exactly this means, in which case they probably would have figured it out on their own. This also has the effect of being very discouraging for anyone who doesn't understand your explanation.
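
One way to make "weighted" concrete is to compare the gap between days with the gap between drugs, and to see how each drug's patients were split across the days. A minimal sketch, assuming the per-day counts quoted elsewhere in these comments; all names are illustrative.

```python
# (cured, treated) indexed by (drug, day), as quoted elsewhere in this comment section.
data = {
    ("A", 1): (63, 90), ("A", 2): (4, 10),
    ("B", 1): (8, 10),  ("B", 2): (45, 90),
}

def pooled(keys):
    """Cure rate over all the listed (drug, day) cells combined."""
    cured = sum(data[key][0] for key in keys)
    treated = sum(data[key][1] for key in keys)
    return cured / treated

# Cure rate per day (both drugs combined) and per drug (both days combined).
by_day = {day: pooled([k for k in data if k[1] == day]) for day in (1, 2)}
by_drug = {drug: pooled([k for k in data if k[0] == drug]) for drug in ("A", "B")}

# Fraction of each drug's patients who were treated on day 1.
day1_share = {
    drug: data[(drug, 1)][1] / (data[(drug, 1)][1] + data[(drug, 2)][1])
    for drug in ("A", "B")
}

print(by_day, by_drug, day1_share)
```

On these counts day 1 has a 71% cure rate and day 2 has 49%, a bigger gap than the one between the drugs overall (67% against 53%). Drug A had 90% of its patients on the good day and drug B only 10%, so each drug's overall rate mostly reflects a different day.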

  • @NoriMori1992
    @NoriMori1992 8 years ago

    Oh my God your t-shirt is ADORABLE! I think this is the first time I've seen you wear something that casual! So cute!

  • @myshowsanchez
    @myshowsanchez 1 year ago

    Excellent explaining with a great pace. All other videos were convoluted along with their images. Your explanation was straightforward. Thank you!

  • @albertoacquaroni4734
    @albertoacquaroni4734 7 years ago +5

    there's no paradox, it's just the fact that you can't add percentages

    • @GillesF31
      @GillesF31 5 years ago

      Yes you can, here is an example:
      a) 2/3 = 67%
      b) 1/2 = 50%
      c) 2/3 + 1/2 = 4/6 + 3/6 = 7/6 = 117%
      d) 67% + 50% = 117%
      e) conclusion: 2/3 + 1/2 = 67% + 50% = 117%
      Simpson's paradox could be summarized as the following (my opinion): (a/b) + (c/d) is not, mathematically speaking, equal to (a+c)/(b+d)
      And Simpson said: don't make confusion between > and >.
      This confusion may create a paradox (false interpretation on results).
      Am I right? :-))
      (I'm French, hoping my English is ok)

    • @shreyashdeogade8869
      @shreyashdeogade8869 5 years ago +1

      @GillesF31 fractions and 'percentage of something' are two different things. E.g., 60% of a substance + 60% of the remaining substance does not equate to 120% of the substance. It actually is 60% + 24% = 84%.

    • @GillesF31
      @GillesF31 5 years ago

      ADDING PERCENTAGES
      both percentages and fractions are ratios
      a percentage is just a fraction ... with 100 at the denominator:
      - 5/100 = 5%
      - 4/25 = 16/100 = 16%
      - 1/3 ≈ 33.33/100 ≈ 33% (rounded)
      - so percentages behave the way fractions do
      the addition of 2 percentages:
      - you eat 25% of an apple pie (= 1/4)
      - then, still hungry, you eat another 25% of the apple pie
      - at the end, you ate 25% + 25% = 50% (= 1/2) of the apple pie
      - we can add percentages

      a percentage on a percentage is NOT an addition:

      - you cut 25% of an apple pie (= 1/4)
      - you are not so hungry, then only you eat 80% of the cut part
      - at the end, you ate 80% of 25%: 80%*25% = (80/100)*(25/100) = 2000/10000 = 20/100 = 20%
      - verification:
      1) 80%*25% = 20% then 80% = 20%/25%
      2) 20%/25% = (20/100)/(25/100) = (20/100)*(100/25) = 2000/2500 = 80/100 = 80%
      :-))
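
The distinction drawn in this thread between adding two fractions and pooling two samples can be checked exactly. A minimal sketch using Python's standard `fractions` module; the function names are assumptions, and the numbers are those used in the comments above.

```python
from fractions import Fraction

def fraction_sum(a, b, c, d):
    """Ordinary addition of fractions: a/b + c/d."""
    return Fraction(a, b) + Fraction(c, d)

def pooled_ratio(a, b, c, d):
    """Pooling two samples: (a + c) / (b + d), sometimes called the mediant."""
    return Fraction(a + c, b + d)

# @GillesF31's example: 2/3 and 1/2.
total = fraction_sum(2, 3, 1, 2)
pooled = pooled_ratio(2, 3, 1, 2)

# @shreyashdeogade8869's example: 60% of a substance, then 60% of the remainder.
first = Fraction(60, 100)
removed = first + first * (1 - first)

print(total, pooled, removed)
```

The sum is 7/6, while the pooled ratio is 3/5, which lies between 1/2 and 2/3. Overall cure rates such as 67/100 are pooled ratios of this second kind, not sums of the daily rates.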

  • @emp5352
    @emp5352 10 years ago +3

    ... Was it just me or did this not seem like a paradox at all?

    • @alxjones
      @alxjones 10 years ago +1

      The 'paradox' is that Drug B did better every day, and yet Drug A did better overall.

    • @emp5352
      @emp5352 10 years ago

      Maybe it's immediate hindsight bias on my part, but I still couldn't see it as a paradox, knowing that the conditions for day one and two were not equal for both drugs. Oh well.

    • @edugal03
      @edugal03 10 years ago

      Enoch Park Now take that to literally every other statistic you see that isn't being tested in a controlled lab.

    • @elliottmcollins
      @elliottmcollins 10 years ago

      "Paradox" is a widely abused word. It starts out meaning a problem with no apparent solution. Then when a solution is found, people mislabel it as a "Solved Paradox" rather than as a "Momentary Confusion".

    • @KasabianFan44
      @KasabianFan44 10 years ago +1

      You are right. A paradox is a puzzle in which *all the possible answers contradict themselves*. For instance: "This sentence is false. True or False?" is a proper paradox because if your answer is True, then the sentence is actually false and so it's false, too. Likewise, if you say False, then the sentence is *not* false and so it's true. So the two answers contradict themselves.
      Simpson's paradox, by contrast, has two possible answers which do not contradict *themselves*. They contradict *each other*. Drug A is better because its overall result is higher. Drug B is better because its daily result is higher. The two answers are not self-contradicting in any way. Therefore it is not a paradox.

  • @EN-od8tn
    @EN-od8tn 9 years ago +1

    This seems to be related to the fact that the experimental probability approaches theoretical probability with higher number of trials, thus explaining why drug b had an abnormally higher percentage on day 1 and drug a had an abnormally lower percentage on day 2 due to the lower number of trials
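
A rough way to gauge how noisy the small samples are is the standard error of a proportion, sqrt(p(1 - p)/n). The sketch below assumes a simple binomial model with independent patients, which nothing in the video or the comments confirms, and uses the per-day counts quoted elsewhere in this comment section. It is not a significance test.

```python
import math

def standard_error(cured, treated):
    """Approximate standard error of an observed proportion under a binomial model."""
    p = cured / treated
    return math.sqrt(p * (1 - p) / treated)

# (cured, treated) for each drug and day, as quoted elsewhere in these comments.
samples = {
    "A day 1": (63, 90), "A day 2": (4, 10),
    "B day 1": (8, 10),  "B day 2": (45, 90),
}

for label, (cured, treated) in samples.items():
    rate = cured / treated
    print(label, f"{rate:.0%} +/- {standard_error(cured, treated):.0%}")
```

Under that assumption the two 10-person samples carry an uncertainty of roughly 13 to 15 percentage points, against about 5 for the 90-person samples.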

  • @BadL15
    @BadL15 14 years ago +1

    "And if you happen, thanks for watching"
    Same ending like your past videos but it's still an epic ending !!

  • @kroon275
    @kroon275 10 years ago +11

    how da f . . . is this a paradox??
    it's just simple maths!!

    • @singingbanana
      @singingbanana 10 years ago +11

      Because if you know they came in on a specific day you would recommend drug B, but if you don't know what day they came in you have to recommend drug A. The paradox is more stark if I had used Men and Women instead of Day 1 and Day 2. If I know the gender of the patient I recommend drug B for Men and Women, but if I don't know the gender I recommend drug A.

    • @kroon275
      @kroon275 10 years ago +1

      firstly, what sort of clinical drug tests would ever be based on findings over 2 days of trials? secondly, it is not the data relating to days that is important, it is the data relating to the different drugs. thirdly, it IS simple mathematical interpretation, and not complicated. it is secondary school stuff, not some 'mystical' paradox

    • @kroon275
      @kroon275 10 years ago +1

      singingbanana p.s. the overall data is deliberately vague, e.g. day 2 isn't stated as being the second day of drug consumption, or the 2nd batch of first day trials, or whether either possibility is using the same trialists as day 1.
      starting any statistical data gathering process on such unclear grounds is destined (at best) to lead to statistical vagaries at the end.
      the whole 'paradox' exercise is data which has been self-sabotaged for effect, and is really quite boring now

    • @Garfie489
      @Garfie489 10 years ago +5

      K Roon It's a paradox because it has neither a correct nor an incorrect answer. Remember, to become a Doctor you effectively only need a Pass in GCSE Maths (it's not that much of an ask).
      Yes, most trials aren't taken over 2 days, but this is a much simplified example. The usual problem in these environments is a form of tunnel vision. Doctor goes "Look at the rate on this day, and this day. They are much better every day we tested it" yet he doesn't look at the bigger picture. It's actually quite common human behavior to ignore sound logic due to the tunnel vision of the data you are presented with - there's lies, god damn lies, and statistics.
      An example from Wiki is that babies of smokers are more likely to survive being underweight than those of non-smokers. To someone with no medical knowledge that fact alone seems good enough reason to smoke during pregnancy. However, the reason the mortality rate is lower is that otherwise healthy babies are being born underweight, and it's only when you consider all babies born that you realise that smoking increases mortality.
      If there was a research paper on mortality of underweight infants, it would recommend that women smoke. Because it's too narrow a range to actually get the full truth. Think this doesn't happen? Big companies do it all the time in order to try and make scientific claims to endorse their products. A common example is energy drinks, whose makers sponsor research into whether their drinks are better than water. By ignoring the data which disagrees with them, they can manipulate the result to give the finding that they want. All the data they present is truthful, but it's not the full truth.

    • @shune84
      @shune84 9 years ago

      K Roon The idea is to promote a way of thinking about maths and any instance presented in the video to me seems purely hypothetical.
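
@singingbanana's reply in this thread (recommend B when the day is known, A when it is not) can be sketched as a naive rule that picks whichever drug has the higher observed rate given the available information. The counts are those quoted elsewhere in these comments; `observed_rate` and `naive_recommendation` are hypothetical names, and the rule deliberately ignores sample size and any reason the days might differ.

```python
# (cured, treated) per drug and day, as quoted elsewhere in this comment section.
data = {
    "A": {1: (63, 90), 2: (4, 10)},
    "B": {1: (8, 10),  2: (45, 90)},
}

def observed_rate(drug, day=None):
    """Cure rate for one day, or the pooled rate when no day is given."""
    if day is not None:
        cured, treated = data[drug][day]
    else:
        cured = sum(c for c, _ in data[drug].values())
        treated = sum(n for _, n in data[drug].values())
    return cured / treated

def naive_recommendation(day=None):
    """Pick the drug with the higher observed rate for the information available."""
    return max(data, key=lambda drug: observed_rate(drug, day))

print(naive_recommendation(1), naive_recommendation(2), naive_recommendation())
```

The rule returns B for either known day and A when the day is unknown, which is the apparent contradiction the thread is arguing about.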

  • @smiguli8851
    @smiguli8851 7 years ago +14

    So... why is this a paradox?

    • @danielrocks00
      @danielrocks00 7 years ago

      Mihkel Majas the definition of a paradox is two contradicting statements and the two contradicting statements are drug B is better when comparing each day's percentage but drug A is better overall, thus resulting in a paradox

    • @smiguli8851
      @smiguli8851 7 years ago +1

      Daniel Nielsen "this paint is blue and red". that definition doesn't hold up for anything unless you agree that I just created a new paradox.
      Also effectiveness on day scale and overall effectiveness are two different things so they can't be contradicting when they don't even talk about the same thing.

    • @htmlguy88
      @htmlguy88 7 years ago

      they do effectiveness ....

    • @goose_clues
      @goose_clues 1 month ago

      @smiguli8851 ooooooohhhhhhh!!!
      *Huge GALAXY BRAIN TIME*
      Mom get the camera!!!

  • @CalvinHikes
    @CalvinHikes 11 years ago

    Is he saying cure or kill?

  • @teckyify
    @teckyify 1 year ago

    I'm not sure if I understood this. Even if the ratios are different, why isn't it an approximation of the final result? Do I just need to rescale/change weight of each term to fix this?

  • @jmdj530
    @jmdj530 10 years ago +4

    ACUPUNCTURE BURN!

    • @Aycheffe
      @Aycheffe 10 years ago

      pretty sure he was being sarcastic. if anything I'd say he's pro holistic and/or natural healing

  • @pipuk3
    @pipuk3 10 years ago +21

    Hummmmmmmmmmmmmm (3 minutes 29 seconds later) mmmmmmmmmmm.

  • @31173x
    @31173x 11 years ago

    Just learned about this the other day in my stats class, so glad I already found out about it from you Dr. Grime.

  • @sanakhanam9394
    @sanakhanam9394 3 years ago +1

    This 10 year old video saved me in my job interview!! THANK YOU!😄

  • @bonclay7763
    @bonclay7763 7 years ago +23

    all i hear is ... DUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
    ...
    background op ...

  • @sloandaddyslim
    @sloandaddyslim 8 years ago +3

    poorly worded. not too fun sadly

    • @NoriMori1992
      @NoriMori1992 8 years ago

      How so?

    • @sloandaddyslim
      @sloandaddyslim 8 years ago

      should have made it clearer that the number of people being given the drug was not even on day 1 (and 2).

  • @redrooster241
    @redrooster241 11 years ago

    how do you add fractions with different denominators together?

  • @singingbanana
    @singingbanana 13 years ago

    @uschi17 That's normal in the UK. I believe 7 with a strikethrough is more common on the continent.

  • @AlchemistOfNirnroot
    @AlchemistOfNirnroot 9 years ago +6

    A. 63/90+4/10 = 7/10 + 4/10 = 11/10 or 99/90
    B. 8/10+45/90 = 8/10 + 5/10 = 13/10 or 117/90

  • @Eyrok
    @Eyrok 9 years ago +6

    It's logical: a high % of a large number of ppl vs. a higher % of a small number of ppl...
    20% of 100 ppl is still more than 100% of 10 ppl :)

    • @RFC3514
      @RFC3514 9 years ago +3

      If that was your _only_ sample, then the drug that cured 100% of 10 people would still be the winner. Simply giving the drug to more people doesn't mean it's more _efficient_ (which is what you're trying to determine). The key to the test in this video is that you have _multiple_ samples, and therefore must look at the totals (not each day individually).

    • @NetAndyCz
      @NetAndyCz 7 years ago

      But what if the drug/pathogen depends on weather conditions and on day 2 it was less effective because of the rain and lower ambient temperature or something...

  • @CannedVideos225
    @CannedVideos225 10 years ago +2

    Is this the new way to add fractions?

  • @aleksandarkaraivanov4934
    @aleksandarkaraivanov4934 9 years ago

    you always make the best explanations! Thank you!

  • @Will14295
    @Will14295 11 years ago

    Ha! loved the acupuncture joke, i wish there were more people around like you James, keep up the fantastic work. :)

  • @andrewfullerton1379
    @andrewfullerton1379 8 years ago +1

    I'm feeling marginally proud that I figured out the answer before it was explained.
    ...Then I realized that I'm lying in bed watching videos about math when I have class in 20 minutes, and my pride dissolved like it was submerged in acetone.

  • @shruggzdastr8-facedclown
    @shruggzdastr8-facedclown 4 years ago

    My favorite part of this video, James, is the Danger Mouse logo t-shirt that you're wearing in it!

  • @OlalOop
    @OlalOop 9 years ago

    If you had numbers that were 100 times as big, i.e. 6300/9000, 800/1000 etc. would you still believe that you should look at the total?

  • @antoniomonreal3199
    @antoniomonreal3199 9 years ago

    I think I understand this, but what happens if we change the problem a little bit?
    Let's assume that there are two types of this specific illness: type 1 and type 2.
    Let's use the same numbers you used for the sake of simplicity, but this time the results of day one are the results for illness type 1, and the results of the second day are for illness type 2.
    Now you know that you have the illness but don't know which type.
    Which one would you take?
    If we look at the percentages for a specific type we would obviously go for drug B, but if we instead look at the overall number of lives saved it makes more sense to take A.
    Now, you don't know which type you have, so it makes sense to pick A, but you also have to have either type 1 or type 2, and drug B was shown to be better in both cases.
    Which one would you take?

  • @shune84
    @shune84 9 years ago

    this is why statistics are most often presented in percentage form in order to convince you of an idea just because "the maths adds up", i've learned so much more about maths since i left school.

  • @WhatforNameIsThat
    @WhatforNameIsThat 14 years ago

    Good to see a video about the basics. For my job I'm only using basic maths.
    Also nice about this video is seeing I was doing it ok all along.

  • @2wicetheWise
    @2wicetheWise 7 years ago

    how did you add two unlike fractions

  • @llwk62
    @llwk62 10 years ago

    Could we use statistical significance to evaluate each drug on each day? O.0

  • @singingbanana
    @singingbanana  14 years ago

    @miniwars123 Haha. You're right though. Percentages can be misleading :)

  • @tgwnn
    @tgwnn 9 years ago +2

    There's more to this - sometimes the point is actually reversed. Actually sometimes drug B is better if instead of day 1 and day 2 you have subcondition 1 and subcondition 2 (in this example, condition 1 would be easier to cure). Maybe your subconditions are not easy to identify and you will just take the overall % you will wrongly get to the conclusion that A is better than B even though A is performing worse than B on both subconditions. Of course normally you would have to be quite unlucky to get such a disproportionately different sample of subconditions for the two drugs (if you assign the patients randomly) but ok that's one of the necessities for Simpson's paradox to work.

    • @gorgolyt
      @gorgolyt 6 years ago

      There's also the very real possibility that day 1 and day 2 differed in some important way. The explanation in this video ignores that possibility and is therefore wrong.

  • @Aznkrusdr
    @Aznkrusdr 11 years ago

    If the days for drug B were flipped, would you have then chosen B as the better drug since you would be comparing like denominators?

  • @Akarashl
    @Akarashl 11 years ago

    shouldn't you have to get common denominators before adding the 2 trial results together?

  • @supergreatsuper
    @supergreatsuper 12 years ago

    Is it possible to make all sets of data give the wrong conclusion? If not, which ones can't be?

  • @JerenVelletri
    @JerenVelletri 11 years ago

    I know :\ setting it to HD helps a little, but there is still a buzzing and the balance feels off in my headphones.

  • @ntdonat
    @ntdonat 10 years ago

    so it just depends on where you focus then?

  • @singingbanana
    @singingbanana  11 years ago

    You've mis-quoted me. I said infinity is not a number, and that you can't write 1/0 = infinity = 2/0 because that usually implies 1=2.

  • @tariqkhasawneh4536
    @tariqkhasawneh4536 5 years ago +1

    Can't we simply say that we don't have sufficient data on the performance of drug B to extract a meaningful probability based on the frequency approach?

  • @singingbanana
    @singingbanana  14 years ago

    @theboombody That is what it's called. If you made a snap judgement based on the title and the video wasn't the same as the thing you made up in your head, that is nothing to do with me.

  • @UberDragon
    @UberDragon 10 years ago

    it's hard to tell which is the better drug, does it cure or kill them?

  • @Pete-Prolly
    @Pete-Prolly 6 years ago +1

    You do not add denominators in fractions.
    We wouldn't say
    63/90 + 4/10 = 67/100 in any other scenario, but it's acceptable here because they are "like terms" and they were tallied at the end.

    • @singingbanana
      @singingbanana  6 years ago

      The fractions 63/90 + 4/10 = 99/90. But that is clearly not the correct answer in this situation. In this case 90 and 10 represent 100 different people. 67 were cured.

    • @htmlguy88
      @htmlguy88 6 years ago

      look up farey addition there are scenarios for it.
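
For readers following the exchange above, here is a rough Python sketch contrasting ordinary fraction addition with the Farey-style "addition" (the mediant), using drug A's counts as quoted in this thread. The function name `mediant` and the variable names are illustrative assumptions, not anything shown in the video.

```python
from fractions import Fraction

def mediant(a, b, c, d):
    """Farey-style 'addition' of a/b and c/d: (a + c) / (b + d).

    Takes raw counts rather than Fraction objects, because reducing
    a fraction before combining changes the result.
    """
    return Fraction(a + c, b + d)

# Ordinary addition of drug A's two daily fractions (more than 100%).
ordinary_sum = Fraction(63, 90) + Fraction(4, 10)

# Mediant of the raw counts: 67 cured out of 100 people.
pooled_a = mediant(63, 90, 4, 10)

# Reducing 63/90 to 7/10 first gives a different, misleading answer.
reduced_first = mediant(7, 10, 4, 10)

print(ordinary_sum, pooled_a, reduced_first)
```

The mediant only matches "adding people" when it is fed the actual head counts, which is why the raw 63 and 90 matter here and the reduced 7/10 does not work.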

  • @willidoex
    @willidoex 10 years ago

    This is an interesting crossover between math, statistics, and experimental methodology. 1. You are right, inferential statistics tells us that 10 cases are small; however, you cannot just disregard these 10 cases. Weight them accordingly. 2. Methodology tells us that circumstances are vital in attributing causality / which drug is the better one, or if the day matters.
    So all in all, any causal conclusion given this data is probably debatable both on statistical as well as on methodological grounds.

  • @jgallantyt
    @jgallantyt 8 years ago +2

    Love the Danger Mouse shirt. I used to love that show. Nickelodeon used to import it to the states in the 80s with all its whacky British phrases, half the fun was trying to understand them.

  • @blah8168
    @blah8168 2 years ago

    Thank you! Your breakdown is perfect.

  • @AceMan721
    @AceMan721 11 years ago

    Thank you, you made it so simple to understand. You helped with my maths assignment :)

  • @dorospatz
    @dorospatz 13 years ago

    Thank you sooo much! You explain the Paradox so easily, I understood it immediately. Neither my book nor Google was able to do that ;)

  • @Error081688
    @Error081688 14 years ago

    This was a pretty simple one, but fun just the same. It's a good illustration though to show the importance of weighting components in a calculation. Comparing apples to apples so to speak.
    Great vid again. Thanks!

  • @Quintinohthree
    @Quintinohthree 12 years ago

    @Quintinohthree
    I must then ask the question, what about people not cured the first day? They'll come back, and they could survive that long, so why are they not counted?

  • @thomasmo90
    @thomasmo90 2 years ago

    I don't quite understand.
    What if they tested the cure on the same number of people on day 1 and 2?
    Then on day 1 A = 63/90 = 70% and B = 72/90 = 80%.
    On day 2 A = 4/10 = 40% and B = 5/10 = 50%.
    Then B would be the best choice?
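
The what-if in this comment can be sketched in Python. This is only a hypothetical: it assumes each drug's daily cure rate would stay exactly the same if both drugs were given to the same group sizes (90 people on day 1, 10 on day 2), which the real trial does not tell us. The names `rates`, `sizes`, and `expected_overall` are made up for illustration.

```python
from fractions import Fraction

# Daily cure rates taken from the counts quoted in this thread.
rates = {
    "A": {"day1": Fraction(63, 90), "day2": Fraction(4, 10)},
    "B": {"day1": Fraction(8, 10), "day2": Fraction(45, 90)},
}

# Hypothetical: both drugs are tested on identical group sizes.
sizes = {"day1": 90, "day2": 10}

def expected_overall(drug_rates, sizes):
    """Overall cure rate if the daily rates held for the given group sizes."""
    cured = sum(drug_rates[day] * n for day, n in sizes.items())
    return cured / sum(sizes.values())

for drug in rates:
    print(drug, expected_overall(rates[drug], sizes))
```

Under that assumption the reversal disappears and B comes out ahead overall, which suggests the paradox in the video depends on the two drugs being tested on very differently sized groups each day.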

  • @singingbanana
    @singingbanana  11 years ago

    It's not a trick. I am not adding fractions, I am adding people. 63 out of 90 people, plus 4 out of 10 people equals 67 out of 100 people.
    Taking an average percentage just gives you the same, wrong, conclusion as B beats A on Day 1 and B beats A on Day 2.
    Don't feel bad if it caught you out. That's why it's a famous problem.
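
A minimal Python sketch of "adding people, not fractions", using the counts quoted in this thread. The dictionary layout and the helper names `rate` and `pooled` are illustrative assumptions.

```python
# (cured, treated) per day for each drug, as quoted in the comments.
trials = {
    "A": {"day1": (63, 90), "day2": (4, 10)},
    "B": {"day1": (8, 10), "day2": (45, 90)},
}

def rate(cured, treated):
    """Fraction of treated people who were cured."""
    return cured / treated

def pooled(days):
    """Add people, not fractions: total cured and total treated."""
    cured = sum(c for c, _ in days.values())
    treated = sum(t for _, t in days.values())
    return cured, treated

for drug, days in trials.items():
    per_day = {day: f"{rate(*counts):.0%}" for day, counts in days.items()}
    cured, treated = pooled(days)
    print(drug, per_day, f"overall {cured}/{treated} = {cured / treated:.0%}")
```

B has the higher percentage on each day, yet A has the higher percentage once the people are pooled, which is the reversal being discussed.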

  • @devarshsheth6299
    @devarshsheth6299 3 years ago +1

    Even after over 10 years
    This Man saves our Students Career

  • @JuliePhelan
    @JuliePhelan 14 years ago

    WOW!
    I never thought of it that way. That was extremely amazing. Keep up the good work!
    MJGL99

  • @sheher1994
    @sheher1994 11 years ago

    Where does the name Simpson's Paradox come from?
    And who came up with the Paradox?

  • @singingbanana
    @singingbanana  14 years ago

    @CogitoErgoCogitoSum There are loads of studies disproving it. Shed-loads. And then some more. It's been done.

  • @rohan6206
    @rohan6206 3 years ago

    Someone correct me if I’m wrong. Isn’t it just because fewer people are treated on day one for B? Both are better on day one, therefore if you treat equal numbers of people on day 1 & day 2 then B is better. It’s just because you are testing more people on the first day with A that it seems better. It’s an inherently biased sample, no?

  • @gazzyw85
    @gazzyw85 11 years ago

    To calculate the fractions given in B, I would suggest 80% + (50% x 20%) ???

  • @johnbarron4265
    @johnbarron4265 11 years ago

    Sample size plays a role in Simpson's Paradox.
    Drug A's 1st day proportion of cured subjects (.70) carries 90% of the weight in its overall proportion whereas Drug B's 2nd day proportion of cured subjects (.50) carries 90% of the weight in its overall proportion. What this means is that although Drug B had higher daily proportions of cured subjects, Drug A's overall proportion reflects the fact that Day 1's proportions were far better than Day 2's. A lurking variable may also be at work.
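
The weighting described in this comment can be written out as a short Python sketch: each drug's overall proportion is its daily proportions averaged with weights equal to each day's share of that drug's patients. The function names `weights` and `overall_rate` are illustrative assumptions.

```python
# (cured, treated) per day, as quoted in this thread.
drug_a = [(63, 90), (4, 10)]
drug_b = [(8, 10), (45, 90)]

def weights(days):
    """Share of a drug's patients treated on each day."""
    total = sum(treated for _, treated in days)
    return [treated / total for _, treated in days]

def overall_rate(days):
    """Overall cure rate as a weighted average of the daily rates."""
    return sum(w * (cured / treated)
               for w, (cured, treated) in zip(weights(days), days))

print(weights(drug_a), overall_rate(drug_a))  # day 1 dominates for A
print(weights(drug_b), overall_rate(drug_b))  # day 2 dominates for B
```

Because A's weight sits almost entirely on the day when both drugs did well, and B's on the day when both did badly, the weighted averages can end up in the opposite order to the daily rates.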

  • @Blignorance2
    @Blignorance2 12 years ago

    Could someone explain this to me? I simply can't understand it. Isn't the percentage a representative of the overall population?

  • @JoelFeinstein
    @JoelFeinstein 12 years ago +1

    I still find Simpson's paradox amazing. I use it as an example that intuitively obvious statements can be false when trying to motivate proof.
    I like the example here (partly because the percentages come out nicely without a calculator!). But with such inconsistent performances, I'd want to know if there was some hidden reason why the drugs had both done so much better on day 1 than on day 2. If there was, then the overall figures would be unfair on drug B.

  • @daeronb
    @daeronb 11 years ago

    Where are Cronbach's alpha and statistical power in all of this? Assuming they are all the same for all 4 experimental trials, then when distributed over a large population of people, Type B would clearly be advantageous. Am I missing something? I sincerely want to know.

  • @sjuas690
    @sjuas690 9 years ago

    Was that video shot in Cambridge?

  • @bansheeflier1015
    @bansheeflier1015 12 years ago

    @Brandon101Realize As he explains in the video, it's not about each individual day, it's about both days together.

  • @djsyntic
    @djsyntic 2 years ago

    I'm confused here... sometimes I hear him saying it Cures people other times I hear him saying it Kills people. So are these drugs curing people or killing people?

  • @singingbanana
    @singingbanana  11 years ago

    Are you sure about that? Let's add the figures for drug B the way you describe: 8/10 + 45/90 = 72/90 + 45/90 = 117/90 = 130%. Do you really think drug B cures 130% of people?
    Adding fractions is not the correct thing to do here (and I never said we were adding fractions). Instead, we are looking at the total people cured (53), divided by the total people in the study (100). This is the correct figure to use.

  • @jamesl8640
    @jamesl8640 3 years ago

    Loved the two references at the start and one at the end

  • @m3m3sis
    @m3m3sis 10 years ago

    I'd say don't listen with studio monitors either. The same effect of partial deafness.

  • @DFPercush
    @DFPercush 11 years ago

    What's the DM logo?

  • @PvblivsAelivs
    @PvblivsAelivs 11 years ago

    The first thing that comes to my mind is: Was there some difference in procedure that might account for the differing results on the different days? Also, clearly, the sample size is too small. But, if there is some unknown factor that caused both drugs to perform better in the first trial drug B may still be better.