7. Confidence Intervals

  • Published May 21, 2024
  • MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
    View the complete course: ocw.mit.edu/6-0002F16
    Instructor: John Guttag
    Prof. Guttag continues discussing Monte Carlo simulations.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

COMMENTS • 61

  • @leixun
    @leixun 3 years ago +44

    *My takeaways:*
    1. Generating normally distributed data in code 0:45
    2. Probability density function for distribution 8:10
    3. Not everything is normally distributed 20:29
    4. The central limit theorem 22:50
    5. Pi is calculated using Monte Carlo Simulation 32:21
    - Standard deviation gets better with more samples 42:00

  • @andreafavero71
    @andreafavero71 4 months ago

    "This is really amazing that this is true, and dramatically useful."
    This statement, at 25:19, is true not only of the CLT ... but also of this course: Thank you!

  • @annakh9543
    @annakh9543 5 years ago +7

    Statistics is basic indeed, but this course really helps me learn to do this stuff in Python. Thank you, MIT.

  • @Syncromatic
    @Syncromatic 6 years ago +88

    Hmm, of the 43k people who watched "6. Monte Carlo Simulation", only 6k bothered to watch confidence intervals.
    Estimate the amount of the 43k who are gamblers trying to beat the system.

    • @shivanshraj6571
      @shivanshraj6571 6 years ago

      Hahahahahah

    • @MeggaMortous
      @MeggaMortous 4 years ago +3

      **commences needle dropping**

    • @andersduck
      @andersduck 4 years ago

      Which is a shame since CI is wrongly explained in that lecture

    • @AaronBrand
      @AaronBrand 3 years ago

      How can one use the Archimedes method for calculating pi in a Monte Carlo simulation and find a CI? It seems like this method is a more straightforward way of finding pi.

  • @waseemislam6646
    @waseemislam6646 5 years ago +1

    Great lecture. Pretty sure

  • @sharonchan5052
    @sharonchan5052 5 years ago +2

    Thank you! This course teaches soooo much better than the lecture provided in my university!

  • @jongcheulkim7284
    @jongcheulkim7284 2 years ago

    Thank you.

  • @idragonb
    @idragonb 2 years ago +4

    Just thought you might be interested - the example that you gave from the book of Kings that seems to estimate pi as 3 has an interesting tradition associated with it. There is a concept of 'written' and 'read' in the reading of the scriptures and in this case 'line' is read and 'the line' is written. Hebrew has values associated with each letter and if we use the value of 'the line' you get 111 and 'line' is 106. If you use this as a factor to multiply the apparent 3 - you get a pretty good estimate of pi...
    הקו = 111
    קו = 106
    111/106*3= 3.1415

  • @madinasaidova3648
    @madinasaidova3648 5 years ago +1

    37:07 I am confused by the equation: needles in circle / needles in square = area of circle / area of square
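
    The equation holds because needles land uniformly at random, so the share that falls inside the circle approaches the share of the square's area that the circle covers. A minimal sketch, assuming a unit circle inside the square [-1, 1] x [-1, 1]; the function name, seed, and needle count are illustrative choices, not the lecture's code:

```python
import math
import random


def estimate_pi(num_needles, rng):
    """Drop points uniformly on [-1, 1] x [-1, 1] and count hits in the unit circle."""
    in_circle = 0
    for _ in range(num_needles):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            in_circle += 1
    # area(circle) / area(square) = pi * 1**2 / 2**2 = pi / 4,
    # so pi is approximately 4 * (needles in circle / needles in square).
    return 4.0 * in_circle / num_needles


rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
pi_estimate = estimate_pi(100_000, rng)
print(pi_estimate, math.pi)
```

    The estimate is only approximate, and how close it gets depends on the number of needles.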

  • @sibinh
    @sibinh 7 years ago +15

    Great lecture! Like professor's humor, particularly this 34:35 :)

    • @andrei-un3yr
      @andrei-un3yr 6 years ago +2

      I didn't get the joke. Could you tell me the context?

    • @rpaddy93
      @rpaddy93 6 years ago +3

      Mike Pence is a fundamentalist

    • @Guinhulol
      @Guinhulol 1 year ago +1

      @andrei-un3yr Well, Mike Pence Voted Against Recognizing Pi back in 2009, that is why.

  • @RupertBruce
    @RupertBruce 2 years ago +4

    The weights in that Python code will be 1.0 always. From the description, it ought to be the count of each 'x' in a bin, divided by the number of values in the bin (which could be zero...). Discard the weights!

    • @isaacshuman5962
      @isaacshuman5962 2 years ago +1

      Thanks man, I thought I was going crazy.

    • @geegee1014
      @geegee1014 2 years ago +5

      If you are talking about this line of code:
      weights = [1/numSamples]*len(dist)
      weights is actually equal to a list 1,000,000 items long of 1/1,000,000
      ie: [1e-06, 1e-06, 1e-06, . . . ,1e-06]
      As in python [5]*5 == [5, 5, 5, 5, 5], not [25]
      you can test it by adding a print statement for part of the weights list after it is defined, like this:
      print(weights[:10]) #Prints first 10 items in weights list.
      (Don't try to print the whole list, it's 1 million items long!)
      If you were talking about something else, I'm sorry!
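
      The list replication described above can be checked directly. A minimal sketch, assuming Python 3 and a much smaller `num_samples` than the lecture's 1,000,000; `dist` here is only a placeholder list, not real sampled data:

```python
num_samples = 1000           # hypothetical, smaller than the lecture's 1,000,000
dist = [0.0] * num_samples   # placeholder for the sampled values

# A one-element list repeated len(dist) times: one weight per sample.
weights = [1 / num_samples] * len(dist)

print(len(weights))   # number of weights, same as len(dist)
print(weights[:3])    # first few weights, each equal to 1/num_samples
print(sum(weights))   # close to 1.0, up to floating-point rounding
```

      Because the weights add up to about 1, a histogram drawn with them shows the fraction of samples in each bin instead of a raw count.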

  • @standman007
    @standman007 1 year ago +1

    I would like to know why we use the Normal Inverse function in Excel when doing a Monte Carlo simulation.

  • @rastislavsvoboda4363
    @rastislavsvoboda4363 3 years ago

    8:55
    PDF formula in red rectangle is missing /
    in code, factor2 is correct

  • @DoNotBeASIMP
    @DoNotBeASIMP 7 years ago +2

    I did not get the weight parameter in the formula shown at the beginning. It says [1/numSamples]*len(dist). However, numSamples is 1000000 and dist has always a length of 1000000 as well, so the weight will end up as 1. Am I missing something?

    • @mohamedelsawi5646
      @mohamedelsawi5646 6 years ago +1

      What is missing in the formula is to use float 1.0 instead of just 1 in the expression [1.0/numSamples]*len(dist). Otherwise you will get zeros for all weights list members.
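
      Whether the weights come out as zeros depends on the Python version. A minimal sketch, assuming Python 3, where `/` is true division; under Python 2, `/` on two integers behaved like `//` below, which is the case the comment above describes. The value of `num_samples` is hypothetical:

```python
num_samples = 4  # hypothetical small value for illustration

true_div = 1 / num_samples    # Python 3: true division, result is a float
floor_div = 1 // num_samples  # floor division, what Python 2's "/" did for two ints

print(true_div, floor_div)
```

      Writing `1.0/numSamples` gives a float under either version.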

    • @absolutelyharmlesss
      @absolutelyharmlesss 6 years ago +11

      mind the square brackets around [1/numSamples] - this is a list of length =1
      Multiplying this by len(dist) gives you a list of length = len(dist). Example:
      [.2] * 5 = [.2, .2, .2, .2, .2]

    • @DoNotBeASIMP
      @DoNotBeASIMP 6 years ago

      absolutelyharmlesss Ah, got you! Thank you!

  • @newbie8051
    @newbie8051 1 year ago

    It would be great, sir, if you could also show the plots for a larger number of trials, so that we could observe the trends becoming Gaussian :)

  • @stephenadams2397
    @stephenadams2397 4 years ago +5

    Didn't you get a better estimate going from 1000 needles to 2000? Isn't 3.139 closer to pi than 3.148, so it's an improvement, isn't it? But it still looks to be true from your samples that the simulations are not monotonically getting better.

    • @SKyrim190
      @SKyrim190 3 years ago

      Yes, 3.139 is closer to pi than 3.148. He was probably just truncating at the last correct digit, and that is why he thought it was the other way around, because 3.139 has "two correct digits" and 3.148 has "three correct digits". Of course, that is not the same as being closer to pi, as this very example demonstrates.

  • @bibop224
    @bibop224 5 years ago +1

    46:31 The slide says "both are factually correct". But I don't understand how the 2nd statement is true. Is it correct to say that the value of pi is between X and Y with probability 0.95, when in fact we know that the value of pi is between X and Y with a probability of 1? The 2nd statement implies that the value of pi is not between X and Y with a probability of 0.05, which is false.

    • @devdew6407
      @devdew6407 3 years ago

      Once the confidence interval for an unknown parameter is constructed, the probability that the confidence interval contains the true value of the parameter is either 0 or 1. It cannot be 0.95.
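
      One way to read the 95% is as a property of the procedure, not of any single interval: if the experiment is repeated many times, roughly 95% of the intervals built this way cover the true value, while each individual interval either covers it or does not. A minimal sketch, assuming normally distributed data with a known standard deviation; the function name, parameters, and seed are all hypothetical:

```python
import math
import random


def coverage(num_experiments, sample_size, mu, sigma, rng):
    """Fraction of repeated 95% intervals that contain the true mean mu."""
    # Half-width of the interval around each sample mean (sigma assumed known).
    half_width = 1.96 * sigma / math.sqrt(sample_size)
    hits = 0
    for _ in range(num_experiments):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(sample_size)) / sample_size
        # Each single interval either contains mu or it does not.
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / num_experiments


rng = random.Random(1)  # fixed seed (an assumption) for repeatability
covered = coverage(2000, 30, mu=10.0, sigma=2.0, rng=rng)
print(covered)
```

      The printed fraction should land near 0.95, though not exactly on it.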

  • @AmanPratapSinghBITsindri
    @AmanPratapSinghBITsindri 3 years ago

    What is a bin? 3:15

  • @o3bvv
    @o3bvv 3 years ago

    Could somebody please explain why the precision is chosen to be .005 for the estimation of pi? And what did he mean by saying "should probably use 1.96 instead of 2"? There are two "2"s in the code; which one did he mean? The whole lecture is titled "Confidence Intervals", but the actual topic is just skimmed in a couple of sentences 😳

    • @xplodnow
      @xplodnow 3 years ago

      You should probably watch the 6th lecture. All the questions you have are answered there.
      The Empirical Rule states that :
      68% of the data is within 1 stdev of the mean
      95% of the data is within 1.96 stdev of the mean (he used 2 instead of 1.96 for simplicity)
      99.7% of the data is within 3 stdev of the mean
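
      These percentages can be checked by sampling. A minimal sketch, assuming a standard normal distribution generated with `random.gauss`; the sample size and seed are arbitrary choices:

```python
import random

rng = random.Random(2)  # fixed seed (an assumption) for repeatability
mu, sigma = 0.0, 1.0
samples = [rng.gauss(mu, sigma) for _ in range(100_000)]


def fraction_within(k):
    """Fraction of samples no more than k standard deviations from the mean."""
    return sum(1 for x in samples if abs(x - mu) <= k * sigma) / len(samples)


# Compare 1.96 and 2 side by side: both give roughly 95%.
fractions = {k: fraction_within(k) for k in (1, 1.96, 2, 3)}
for k, frac in fractions.items():
    print(k, round(frac, 4))
```

      Using 2 instead of 1.96 gives a slightly wider interval, covering a little more than 95% of the data.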

    • @EOh-ew2qf
      @EOh-ew2qf 3 years ago

      0.005 is number he chose as an acceptable range of error
      (Since exact value of pi = 3.141~ , we want estimates to lie between 3.136 ~ 3.146 with high confidence)
      Consider one simulation result where
      Estimate = 3.141556
      Std.dev. = 0.0021
      by the empirical rule
      there is a 95% chance that the actual value of pi will lie between
      3.141556 - 2*0.0021 ~ 3.141556 + 2*0.0021
      (there is a 95% chance the estimate is correct within 0.0042)
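
      The arithmetic above can be written as a small helper and then applied to simulated estimates. A minimal sketch, not the lecture's code: `interval` and `throw_needles` are hypothetical names, the trial and needle counts are deliberately small, and the population standard deviation (`statistics.pstdev`) is an assumed choice:

```python
import math
import random
import statistics


def interval(estimate, std, z=1.96):
    """Return (low, high) = estimate -/+ z standard deviations."""
    return estimate - z * std, estimate + z * std


# The numbers quoted in the comment, with z = 2 as used there.
low, high = interval(3.141556, 0.0021, z=2)
print(low, high)


def throw_needles(num_needles, rng):
    """Estimate pi from points dropped on the unit square (quarter circle)."""
    in_circle = sum(
        1 for _ in range(num_needles)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * in_circle / num_needles


rng = random.Random(3)  # fixed seed (an assumption) for repeatability
estimates = [throw_needles(2000, rng) for _ in range(50)]
mean = statistics.mean(estimates)
std = statistics.pstdev(estimates)  # spread of the estimates across trials
sim_low, sim_high = interval(mean, std)
print(mean, std, sim_low, sim_high)
```

      With only 2000 needles per trial the simulated interval is far wider than the 0.005 precision target; narrowing it requires many more needles per trial.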

  • @logosfabula
    @logosfabula 6 years ago +1

    11:44 could you expand on "the probability of any particular point is 0"?

    • @diogosesimbra
      @diogosesimbra 6 years ago +14

      (Finally, my time to shine has arrived :) ) In a continuous variable, any real value inside an interval is possible. For example, between 0 and 1 we have infinite real numbers. The probability of sampling any of those particular values is 0 because there is an infinity of them.
      I hope it was clear.
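
      The same point can be seen numerically: for a continuous variable, probability comes from the area under the density over an interval, and that area shrinks to zero as the interval shrinks to a single point. A minimal sketch, assuming a standard normal distribution and using `statistics.NormalDist` from the Python standard library (3.8+); the interval widths are arbitrary:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def prob_within(center, width):
    """P(center - width/2 < X < center + width/2) for a standard normal X."""
    return standard_normal.cdf(center + width / 2) - standard_normal.cdf(center - width / 2)


widths = [1.0, 0.1, 0.001, 0.0]
probs = [prob_within(0.0, w) for w in widths]
for w, p in zip(widths, probs):
    print(w, p)
```

      A width of zero corresponds to one exact value, and its probability is zero even though the density at that value is not.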

    • @alizasiff
      @alizasiff 1 year ago

      Because there are an infinite number of possibilities

  • @lee_badda
    @lee_badda 2 years ago

    Can someone tell me what the code v[0][30:70] means? 6:14

    • @lee_badda
      @lee_badda 2 years ago

      The total area of 40 bins is what I have concluded, but why is that the "fraction within ~200 of mean"?
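
      A possible explanation, sketched under stated assumptions: if the histogram has 100 equal-width bins spanning roughly -500 to 500 for data with mean 0 and standard deviation 100, then each bin is about 10 wide and bins 30 through 69 cover about -200 to 200. The parameters and the fixed bin edges below are assumptions for illustration, and `heights` stands in for `v[0]`:

```python
import random

rng = random.Random(4)           # fixed seed (an assumption) for repeatability
mu, sigma = 0.0, 100.0           # assumed distribution parameters
num_samples = 100_000
num_bins, low, high = 100, -500.0, 500.0   # assumed: 100 bins, each 10 wide
bin_width = (high - low) / num_bins

heights = [0.0] * num_bins       # plays the role of v[0]: weighted bar heights
for _ in range(num_samples):
    x = rng.gauss(mu, sigma)
    if low <= x < high:          # the rare samples beyond 5 sigma are skipped
        index = min(int((x - low) // bin_width), num_bins - 1)
        heights[index] += 1 / num_samples   # same weighting as 1/numSamples

# Bins 30..69 span [-200, 200), i.e. within about 2 standard deviations.
fraction = sum(heights[30:70])
print(fraction)
```

      The printed fraction should be close to 0.95, matching the share of a normal distribution within two standard deviations of the mean.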

  • @ShaunPatterson
    @ShaunPatterson 3 years ago

    Did anyone call Pence and verify?

  • @adiflorense1477
    @adiflorense1477 3 years ago

    24:23 in conclusion the subset is in the set

  • @seanpitcher7150
    @seanpitcher7150 6 years ago +6

    I'm going to have my tutoring students watch these videos. You, sir, are an amazing teacher. And you are wrong about Mike Pence thinking pi is 3. He would never defile his mind with thinking of the value of pi. He knows this kind of unnatural fiddling with numbers is the devil's work and would never participate in knowing any part of it.

    • @NazriB
      @NazriB 2 years ago

      Lies again? Cock it

  • @jshellenberger7876
    @jshellenberger7876 4 months ago

    #POW

  • @ronaldvalenta493
    @ronaldvalenta493 4 years ago +8

    34:40
    "3, and I'm sure that's what Mike Pence thinks it is..."
    ...statistics can be fun too!
    (Religion as a Question of Precision, nice...)

  • @quocvu9847
    @quocvu9847 11 months ago

    20:23

  • @nallisanketh
    @nallisanketh 10 months ago +1

    This lecture is not about confidence intervals

  • @goe54
    @goe54 4 years ago +2

    A lot more knowledge can be transmitted about the subject, and it can be explained much better, using only the chalk and the blackboard. We are upgrading computers and software, but we are downgrading our mind and intellect.

  • @MrArmas555
    @MrArmas555 2 years ago

    ++

  • @user-zd6tu9zw2z
    @user-zd6tu9zw2z 2 years ago

    Ohh gosh, why do all statistics teachers look and act the same boring way with a hint of attitude? It was the same in my university: I could never follow the lecture because of complete boredom. I know it's my fault, not the teacher's, but does anyone agree? I watched lectures on analysis 1, 2, and complex analysis for many hours with no breaks and passed the exams no problem. This lecture I can never focus on; it's torture. However, I'm very thankful because it's free and I appreciate that.

  • @aminsalehi290
    @aminsalehi290 5 years ago +2

    "...named after the astronomer Carl Gauss...".
    Carl Gauss was a major mathematician and physicist, as significant as Isaac Newton. This MIT professor clearly does not know who Carl Gauss is. Get your facts straight, MIT.

    • @1flovera
      @1flovera 5 years ago +1

      minor mistake though

    • @AndCaffeine
      @AndCaffeine 4 years ago +20

      Gauss was an astronomer. He's a great mathematician, but worked as a professor of astronomy and was the director of an astronomical observatory. Do you seriously think John Guttag, former head of MIT EECS, doesn't know who Gauss is?

    • @nelkilimo
      @nelkilimo 1 year ago

      These guys were Polymaths...

  • @jonathanstudentkit
    @jonathanstudentkit 6 years ago +2

    wow this is so basic the MIT should be ashamed to post this!

    • @jbrittsun
      @jbrittsun 6 years ago +26

      Jonathan, the problem with most schools is that they give way too much theory and not enough practical application. I went through an entire master's program in probability and statistics, and out of school couldn't analyze a simple data set. I'm not de-emphasizing the theory part, but I wish schools would teach more like this and have separate academic tracks for those who want to focus solely on theory.

    • @FrostyAUT
      @FrostyAUT 5 years ago +28

      It's that kind of arrogance that leads to those situations where an entire class of "master" students can be asked "What is a confidence interval? How do we calculate it?" and not a single one of them raises a hand. A university should NEVER be afraid to review the basics. The time that is "wasted" on basics pays off exponentially when you finally get to the advanced stuff.

    • @QuentinAndres06
      @QuentinAndres06 2 years ago +2

      come on.. Jonathan