Introduction to Bayesian data analysis - Part 2: Why use Bayes?

  • Published 8 Sep 2024
  • Try my new interactive online course "Fundamentals of Bayesian Data Analysis in R" over at DataCamp: www.datacamp.c...
    ----
    This is part two of a three-part introduction to Bayesian data analysis. This second part aims to explain why Bayesian data analysis is useful. If you haven't watched part one, I really recommend that you do that first: • Introduction to Bayesi...
    More Bayesian stuff can be found on my blog: sumsar.net. :)

COMMENTS • 81

  • @pedrocolangelo5844
    @pedrocolangelo5844 3 years ago +8

    Rasmus, you should really start making videos again. Seriously, the way you teach is, by far, one of the best I've seen on YouTube (and I have watched plenty of videos here, since I am a self-taught student of economics).
    I am amazed how tricky and deep concepts seem so simple when explained by you. Thank you for this series on Bayesian statistics.

    • @qqq_Peace
      @qqq_Peace 1 year ago

      I can't agree more with what you said, so I second!

  • @antonsuess3238
    @antonsuess3238 7 years ago +20

    Really like your 3 part introduction on Bayesian modelling! Clearly structured, focussed and entertaining - thank you!

    • @pascaltorvic6246
      @pascaltorvic6246 6 years ago

      Nothing to add. You summed up my feelings perfectly.

  • @MyDarkestFlower
    @MyDarkestFlower 6 years ago +29

    "I think he is smoking tobacco, but I don't know" hahahaha

  • @jiayupeng3515
    @jiayupeng3515 6 years ago +1

    Great introduction to Bayesian thinking. Much clearer than textbooks.

  • @jrdetka
    @jrdetka 5 years ago +1

    Thank you Rasmus!
    A very clear and accessible introduction with lots of opportunities to actually apply what we learned. Thank you for all your work on this.

  • @jenyasidyakin8061
    @jenyasidyakin8061 5 years ago +1

    Answer to the question "Why accept both A & B at the same time": I think I got this one :). For both A & B you need a normalizer, often called the evidence or the marginal likelihood, P(data).
    So it must be the same for A & B; that's why, when you want to compare A and B, you must accept both. It is the evidence for the joint distribution of A & B.

  • @schinkelaner93
    @schinkelaner93 5 years ago +1

    All three parts are super helpful. Thanks a lot!

  • @DavenH
    @DavenH 3 years ago

    This was so clearly and amusingly demonstrated. Great video!

  • @yairgs9899
    @yairgs9899 4 years ago +1

    ¡Breathtaking! ¡Bravo! No words, just thank you for sharing.

  • @hellojeyjey
    @hellojeyjey 4 years ago +2

    I would be happy to delve further into interpretations of ML algorithms from a Bayesian perspective, which you talk about at around 20 minutes into the video. I get the linear regression, but I'm curious to learn more.

  • @randomized6105
    @randomized6105 3 years ago

    Finally Bayesian is making sense

  • @kjyfhjjj
    @kjyfhjjj 5 years ago

    Very clear and helpful! Best resource I've seen so far.

  • @Apolozx4
    @Apolozx4 7 years ago

    Thanks for posting these videos, man. I'm an economics student and I am very interested in this kind of thing.

  • @alvarosalgado3121
    @alvarosalgado3121 5 years ago +5

    Great explanations, thanks! I just didn't quite understand why, in the A/B testing, you should accept only results that match "both" datasets at the same time. Is it simply so you can apply arithmetic to the resulting generated data (like you did for getting the difference between A and B)?
    Thanks!

  • @angeld5093
    @angeld5093 6 years ago +1

    Great introduction, thank you!

  • @buffalobill212
    @buffalobill212 6 years ago

    Great video!
    For the decision analysis, why do you need Bayesian analysis? You could just use the maximum likelihood estimate with the cost equation and see that the brochure alone is better (0.38*1000 - 30 > 0.63*1000 - 300). If you're making a decision based on this, it doesn't seem beneficial to do the Bayesian analysis.
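
One thing the point estimate cannot show is how uncertain that ranking is. The following is a minimal Python sketch, not the code from the video. It assumes the data were 6 of 16 sign-ups for A and 10 of 16 for B (consistent with the 0.38 and 0.63 quoted above) and a uniform prior, in which case the posteriors are Beta(7, 11) and Beta(11, 7). It reuses the cost equation from the comment.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Assumed data: 6 of 16 sign-ups for A, 10 of 16 for B.
# Assumed prior: uniform, so each posterior is Beta(1 + successes, 1 + failures).
n_draws = 100_000
rate_a = [rng.betavariate(1 + 6, 1 + 10) for _ in range(n_draws)]
rate_b = [rng.betavariate(1 + 10, 1 + 6) for _ in range(n_draws)]

# Average profit per recipient, using the cost equation quoted in the comment.
profit_a = [r * 1000 - 30 for r in rate_a]
profit_b = [r * 1000 - 300 for r in rate_b]
profit_diff = [a - b for a, b in zip(profit_a, profit_b)]

mean_diff = sum(profit_diff) / n_draws
prob_a_better = sum(d > 0 for d in profit_diff) / n_draws

print(f"mean profit difference (A - B): {mean_diff:.1f} kr")
print(f"P(A is more profitable than B): {prob_a_better:.2f}")
```

Under these assumptions the posterior agrees with the point estimate on which method looks better on average, but it also reports the probability that the ranking is the other way round, which is the extra information a decision maker might want.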

  • @andrewkostandy9510
    @andrewkostandy9510 4 years ago +4

    Thank you for the excellent introduction series! Shouldn’t the profit calculation at 17:58 also take into consideration the cost of the campaigns for the non-respondents? So it would be profitA = (rateA*(1000-30))+(1-rateA)*-30. And then profitB = (rateB*(1000-300))+(1-rateB)*-300

    • @BiancaAguglia
      @BiancaAguglia 4 years ago +3

      I too was confused a little about the results, then I realized Rasmus calculated the average profit per person.
      Think of it this way: let's say you use campaign A for n people and your signup rate is r. Then:
      1. your total_profit is:
      total_profit = n*r*1000 - n*30
      2. your average_profit is
      average_profit = total_profit / n = (n*r*1000 - n*30) / n = r*1000 - 30
      Using Rasmus's notation, your average_profit = rateA*1000 - 30 😊 His numbers are right. I hope this helps.
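
The two ways of writing the profit in this thread can be checked against each other numerically. A small Python sketch follows; the function names are made up for illustration, and the rates and costs are the ones quoted in the comments.

```python
def profit_per_person(rate, cost, revenue=1000):
    """Average profit per recipient: everyone costs `cost`, a fraction `rate` pays `revenue`."""
    return rate * revenue - cost


def profit_split_by_outcome(rate, cost, revenue=1000):
    """The same quantity written as responders plus non-responders."""
    return rate * (revenue - cost) + (1 - rate) * (-cost)


for rate, cost in [(0.38, 30), (0.63, 300)]:
    short_form = profit_per_person(rate, cost)
    long_form = profit_split_by_outcome(rate, cost)
    print(rate, cost, round(short_form, 6), round(long_form, 6))
```

Expanding the longer form gives rate*revenue - rate*cost - cost + rate*cost, so the cost terms that depend on the rate cancel and both expressions reduce to rate*revenue - cost.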

  • @ruthsindie2660
    @ruthsindie2660 4 years ago

    Finally I have an idea of how to apply Bayesian analysis to optimization.

  • @rezaghaiumy5415
    @rezaghaiumy5415 4 years ago

    At 17:57, I think profit B should be = rateB*1000 - (300 for salmon + 30 for brochure)

  • @markstrong3018
    @markstrong3018 5 years ago +2

    Please, could you provide a method for calculating the alpha and beta parameters of a beta distribution given a particular probability? I noticed that in your example the probability of success ranges between 5% and 15%, and you generated a distribution with params Beta(3,28). So the question is: how did you arrive at alpha = 3 and beta = 28, respectively? Thanks!

    • @nickjames1066
      @nickjames1066 4 years ago

      Would also like to see explicit code for how one creates beta distributions in R. Currently trying with dbeta() but I can't get the same distribution you get for alpha = 3, beta = 25.

    • @ElGtheTS
      @ElGtheTS 4 years ago

      @nickjames1066 Use rbeta, just like runif and rbinom. E.g. hist(rbeta(n = 10000, shape1 = 3, shape2 = 25), col='darkgreen')
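
For readers working outside R, here is a rough Python equivalent using only the standard library. It draws from Beta(3, 25) and checks how much of the prior falls between 5% and 15%. It also shows one common recipe for picking alpha and beta, the method of moments from a desired mean and standard deviation. How the parameters were actually chosen in the video is not stated in this thread, so the helper `beta_from_mean_sd` and its example inputs are illustrative assumptions.

```python
import random


def beta_from_mean_sd(mean, sd):
    """Method-of-moments shape parameters (alpha, beta) for a Beta distribution."""
    var = sd ** 2
    if not 0 < mean < 1 or var >= mean * (1 - mean):
        raise ValueError("need 0 < mean < 1 and sd**2 < mean * (1 - mean)")
    k = mean * (1 - mean) / var - 1  # k equals alpha + beta
    return mean * k, (1 - mean) * k


rng = random.Random(1)  # fixed seed for reproducibility

# Counterpart of rbeta(n, shape1 = 3, shape2 = 25) in R.
draws = [rng.betavariate(3, 25) for _ in range(100_000)]
prior_mean = sum(draws) / len(draws)
share = sum(0.05 <= d <= 0.15 for d in draws) / len(draws)
print(f"Beta(3, 25): mean {prior_mean:.3f}, share in [0.05, 0.15] = {share:.2f}")

# Hypothetical target: a prior centred on 10% with a standard deviation of 5 points.
alpha, beta = beta_from_mean_sd(mean=0.10, sd=0.05)
print(f"method of moments: alpha = {alpha:.2f}, beta = {beta:.2f}")
```

In R, dbeta() evaluates the density at given points while rbeta() produces random draws, which is why a histogram of rbeta() output is what resembles the prior shown in the video.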

  • @karannchew2534
    @karannchew2534 2 years ago

    Notes for my future revision.
    Why Bayesian Data Analysis?
    0:29 How easy it is to change a Bayesian model while the computation stays the same.
    0:32 You have great flexibility when building Bayesian models, and can focus on that rather than on computational (algorithmic) issues.
    There are often computational (processing) issues in fitting a Bayesian model.
    But since there is a clean separation between specifying and fitting a model in a Bayesian framework, you often don't have to focus too much on how your model is computed when you construct it. That means you can focus on which assumptions are reasonable and what information you should use, rather than on the algorithm, when doing the actual modelling. There are many tools to help fit Bayesian models (Stan, PyMC), so just specifying the model might be enough.

  • @nicolapsychotica3350
    @nicolapsychotica3350 5 years ago +2

    Do you make the slides available? Such great explanations. Love your presentations. Thank you so much 🙏

  • @avidreader100
    @avidreader100 5 years ago

    At 11:50 you say the CEO suggested the rate of sign up is usually between 5% and 50%. But a little after 9:12, he was quoted as saying 'between 5% and 15%'. I guess the accuracy does not matter so much. You are perhaps interested in getting closer than the initial model with uniform distribution.

  • @donolegario
    @donolegario 6 years ago +1

    Awesome video!
    But I think the cost of the brochure is missing in method B. Only the cost of the salmon has been considered, so profitB would be (1000rateB-330) instead of (1000rateB-300).
    Anyway, the whole idea is perfectly explained.
    Cheers!

    • @rasmusab
      @rasmusab 6 years ago +1

      Ah! So when you pay for shipping the salmon, due to the postage system in Scandinavia, you can slip in a brochure at no extra cost in postage :)

    • @nickjames1066
      @nickjames1066 4 years ago +1

      @rasmusab Do they also give you the paper and print it for free? :p - Great Bayes series btw, thank you!

  • @ashleyjones1054
    @ashleyjones1054 6 years ago

    Thanks Rasmus, damn good!

  • @sakkariyaibrahim2650
    @sakkariyaibrahim2650 11 months ago

    Great lecture

  • @angelf.escalante7825
    @angelf.escalante7825 6 years ago

    Dude, you're a hero!

  • @HansPeterSloot
    @HansPeterSloot 5 years ago +3

    One question: at t=268, why don't you keep the 72% rate2 for model B instead of throwing both tries away?

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 5 years ago

    really like the practical example

  • @danroche8014
    @danroche8014 4 years ago

    Excellent content!

  • @CalzOmon
    @CalzOmon 6 years ago

    Thanks for the great videos! Learnt a lot

  • @panagiotissourtzinos9175
    @panagiotissourtzinos9175 7 years ago +4

    Great videos!!! Thank you very much. I have a question, though, that troubles me. I didn't understand why, while calculating the posteriors for both the A and B methods, we have to take one into consideration with the other. Why do we need to keep the probability values only when both model responses agree with the observed responses? Couldn't we produce the posteriors by running the models independently?

    • @rasmusab
      @rasmusab 7 years ago +4

      Yes, for this specific model, that is correct! If you can figure out which parameters are independent of each other then you can run those parts independently. In general, however, you would have to run it all at the same time. :)

    • @jenyasidyakin8061
      @jenyasidyakin8061 5 years ago

      You need the normalizer P(data) to be the same for both A & B in order to compare them, so you need to accept the parameters for both at the same time.
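
The point made in the reply above, that joint and separate filtering agree for this particular model, can be tried out with a small rejection-sampling sketch in Python. It is not the code from the video: the data (6 of 16 sign-ups for A, 10 of 16 for B) and the uniform priors are assumptions, and the helper `simulate_signups` is a made-up name.

```python
import random

rng = random.Random(7)  # fixed seed for reproducibility
N_SENT = 16
OBS_A, OBS_B = 6, 10  # assumed observed sign-ups for methods A and B


def simulate_signups(rate, n=N_SENT):
    """Generative model: each of n recipients signs up with probability `rate`."""
    return sum(rng.random() < rate for _ in range(n))


def mean(xs):
    return sum(xs) / len(xs)


n_draws = 100_000
prior_a = [rng.random() for _ in range(n_draws)]  # uniform prior on rate A
prior_b = [rng.random() for _ in range(n_draws)]  # uniform prior on rate B
sim_a = [simulate_signups(r) for r in prior_a]
sim_b = [simulate_signups(r) for r in prior_b]

# Joint filtering: keep a draw only when BOTH simulated data sets match.
joint = [(ra, rb) for ra, rb, sa, sb in zip(prior_a, prior_b, sim_a, sim_b)
         if sa == OBS_A and sb == OBS_B]

# Separate filtering: each rate is kept whenever its own simulation matches.
post_a = [r for r, s in zip(prior_a, sim_a) if s == OBS_A]
post_b = [r for r, s in zip(prior_b, sim_b) if s == OBS_B]

print("kept jointly:", len(joint), "| kept separately:", len(post_a), len(post_b))
print("mean rate A, joint vs separate:", mean([ra for ra, _ in joint]), mean(post_a))
print("mean rate B, joint vs separate:", mean([rb for _, rb in joint]), mean(post_b))
```

Because neither rate influences the other's simulated data here, both approaches target the same posteriors, and separate filtering simply keeps many more draws. In a model where the parameters are linked, only the joint filter would be valid.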

  • @KeisOhtsuka
    @KeisOhtsuka 2 years ago

    17:34 Method B: Don't you send out a brochure (30 kr) with a salmon (300 kr), unless it is already included (fish 270 kr, brochure 30 kr)? Don't forget to include a QR code. :)

  • @jamesmburu886
    @jamesmburu886 7 years ago

    very informative. Many Thanks

  • @victoriaeshelby766
    @victoriaeshelby766 4 years ago

    @rasmusab, there is a small mistake in your tutorial at 4:40. The second rate is not incorporated into the model so 64 =/= 72 (which was from the previous draw).

  • @fissehaberhane7341
    @fissehaberhane7341 7 years ago

    Simply great!

  • @karannchew2534
    @karannchew2534 2 years ago

    17:30 should be
    profitB = rateB x 1000 - (300 + 30)

  • @BharathAllu
    @BharathAllu 7 years ago +4

    Is there an R tutorial for this video?

  • @taylorallred
    @taylorallred 5 years ago

    Nice work!

  • @luigineri4364
    @luigineri4364 7 years ago

    Congrats on the videos. They are amazing. I really like the way you explain. It's brilliant. Also, your voice is a good fit for videos.
    When you calculate the profit for method B, shouldn't you take out 300 for the salmon and still 30 for the mail, since you send both?

    • @rasmusab
      @rasmusab 7 years ago +1

      Ah, but the postage system works in mysterious ways. If you're already sending a salmon, then you don't need to pay postage for the brochure, which was the main cost of the 30 kr. :)

  • @josuesdf
    @josuesdf 4 years ago

    Very good explanation.
    Shouldn't profitB = rateB*1000-300-30?

  • @KeisOhtsuka
    @KeisOhtsuka 2 years ago

    7:50 I am confused by rate_diff. Simulated subscription rates for Method A and B are not related in any way other than the order in which they are generated (cf. unlike repeated-measures pre- and post-test scores). Is it meaningful to calculate difference scores between two numbers that are not related? Can we just look at confidence intervals?

  • @clapdrix72
    @clapdrix72 4 years ago

    Why do we need to discard samples when only one draw matches the real data (A:4, B:10)? Why not just throw away the sample for the non-matching method or even just sample them separately? It isn't a bivariate distribution so each method's draws are independent of the other method's.

  • @wahabfiles6260
    @wahabfiles6260 5 years ago

    Please, please make a video on Gaussian processes... None of the videos on YouTube give intuition like your videos do.

  • @RottenMonkeyderAffenkopf
    @RottenMonkeyderAffenkopf 6 years ago

    really really good

  • @alebachewtaye7454
    @alebachewtaye7454 4 years ago

    It is very interesting!
    How do I import data from other software into WinBUGS for analysis?

  • @alexlev4631
    @alexlev4631 7 years ago

    Congratulations! Perfect video and brilliant explanation! I wonder, what's wrong with your BayesianAid project? I missed it on cran.r-project.org?! In fact I use it and even have made a little fix for RNG.

  • @LisaHoving
    @LisaHoving 6 years ago

    Thanks!

  • @hannahlj93
    @hannahlj93 7 years ago

    When will part three be posted?? These videos are good!

    • @rasmusab
      @rasmusab 7 years ago +1

      Sorry for the delay, here it is: ua-cam.com/video/Ie-6H_r7I5A/v-deo.html :)

  • @mohammadmehd
    @mohammadmehd 7 years ago

    Thanks

  • @paulmihalyov7799
    @paulmihalyov7799 6 years ago

    Thanks for the upload, your explanations are great. Just FYI, during the 13th minute, your x axis label on the "Informative" histogram might be a mistake? It caused a little confusion for me.

    • @rasmusab
      @rasmusab 6 years ago

      Yep! Definitely a mistake, nicely spotted. Both axes should read "Posterior on the rate of signup". :)

  • @avidreader100
    @avidreader100 5 years ago

    Can we use the posterior of one round of computation as the informative prior of the next round of improved estimate? When it has no relation to reality such as an expert opinion, and still has its origin from a wild guess of uniform distribution, would it be a better estimate?
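
On the first part of this question: with a Beta prior and binomial data, using the posterior from one batch as the prior for the next gives the same result as analysing all the data at once. A minimal Python sketch of that standard conjugate update follows. The first batch (6 of 16) is assumed, and the second batch (4 of 16) is purely hypothetical.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data gives a Beta posterior."""
    return alpha + successes, beta + failures


prior = (1, 1)  # uniform prior, i.e. Beta(1, 1)

# Round 1: assumed data, 6 sign-ups out of 16.
round1 = update_beta(*prior, successes=6, failures=10)

# Round 2: hypothetical NEW data, 4 sign-ups out of 16, with round 1's posterior as prior.
round2 = update_beta(*round1, successes=4, failures=12)

# All data analysed in one go, starting from the original prior.
all_at_once = update_beta(*prior, successes=6 + 4, failures=10 + 12)

print("after round 1:", round1)
print("after round 2:", round2)
print("all at once:  ", all_at_once)
```

This only holds when the second round uses new data. Feeding the same observations in twice would count them double and make the posterior look more certain than it should.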

  • @stefanstojanovic1735
    @stefanstojanovic1735 7 years ago

    Shouldn't we deduct 330 from expected profit of B since we are sending both salmon and a pamphlet?

  • @frashertseng9426
    @frashertseng9426 6 years ago

    Hi, thanks for the great talk. May I ask a question: do I need to follow the order of rateA and rateB when computing the diff, or can I randomly draw values from A and B to compute the diff?

    • @rasmusab
      @rasmusab 6 years ago

      Generally order matters, but in this specific case it doesn't as the two rates are completely unrelated. :)

    • @JohnDraper1993
      @JohnDraper1993 4 years ago

      @rasmusab A great video, thank you. I do have a question though ;). If my arrays of rates A and B are of different sizes, then how can I calculate the rate_diff distribution? Or have I made a mistake and they should be of equal size?

    • @benjaminthomas1369
      @benjaminthomas1369 2 years ago

      @JohnDraper1993:
      Nice lecture!
      I actually came across the same problem - do you have a solution for that?
      Thanks so much for your teachings - will check out your new course for sure!!!

    • @benjaminthomas1369
      @benjaminthomas1369 2 years ago

      @rasmusab
      Nice lecture!
      I actually came across the same problem as John - do you have a solution for that?
      Thanks so much for your teachings - will check out your new course for sure!!!
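
One possible way to handle posterior samples of different lengths is sketched below in Python. It leans on the statement earlier in this thread that the two rates are completely unrelated in this model, so any draw of A can be paired with any draw of B. The sample sizes and the Beta(7, 11) and Beta(11, 7) stand-ins for the filtered draws are assumptions for illustration only.

```python
import random

rng = random.Random(3)  # fixed seed for reproducibility

# Hypothetical posterior samples of unequal length, e.g. from filtering A and B separately.
rate_a = [rng.betavariate(7, 11) for _ in range(5_000)]
rate_b = [rng.betavariate(11, 7) for _ in range(8_000)]

# Option 1: truncate both to the shorter length and pair them up in order.
n = min(len(rate_a), len(rate_b))
diff_truncated = [b - a for a, b in zip(rate_a[:n], rate_b[:n])]

# Option 2: resample each with replacement to a common number of pairs.
n_pairs = 20_000
diff_resampled = [rng.choice(rate_b) - rng.choice(rate_a) for _ in range(n_pairs)]

mean_truncated = sum(diff_truncated) / len(diff_truncated)
mean_resampled = sum(diff_resampled) / len(diff_resampled)
print(f"mean rate_diff (B - A), truncated: {mean_truncated:.3f}")
print(f"mean rate_diff (B - A), resampled: {mean_resampled:.3f}")
print("P(B > A), truncated:", sum(d > 0 for d in diff_truncated) / len(diff_truncated))
```

If the draws come from a joint filter, where a pair is kept only when both simulations match, the two arrays have equal length by construction and this issue does not arise. For models with dependent parameters, pairing draws arbitrarily like this would not be valid.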

  • @tukmyjob
    @tukmyjob 7 years ago

    Please answer a question. How do you generate informative rates using beta(3,25) for n draws?

    • @rasmusab
      @rasmusab 7 years ago

      Hi tukmyjob. I'm sorry, but I'm not sure I understand the question... :)

    • @ralphdamico5627
      @ralphdamico5627 6 years ago

      If you go to time point 11:20, the bottom distribution shows a discrete graph of the continuous Beta distribution. The values are randomly created from this distribution. The fastest way to generate random values from a specific distribution is to take uniform random numbers and plug them into the inverse of the cumulative distribution curve. What does the inverse of the Beta distribution look like? Alternative methods also exist. This question assumes a pre-supplied package of functions is not being used.
      Have you had any experience with Polynomials (see Peter Fleischmann, 1978), which were enhanced by Todd Headrick (2002), for generating random values from non-Gaussian distributions? Sadly, this is something that I just recently became acquainted with. Thank you for your videos!!!
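
The inverse-CDF idea described above can be sketched in plain Python for Beta(3, 25). The Beta distribution has no simple closed-form inverse, so this sketch inverts the CDF numerically by bisection. For integer shape parameters the CDF can be written as a binomial tail sum, which is the shortcut used here. The function names are made up for illustration, and nothing in this thread says the video generated its draws this way.

```python
import math
import random


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) for integer a and b, using the identity
    I_x(a, b) = P(Binomial(a + b - 1, x) >= a)."""
    n = a + b - 1
    below = sum(math.comb(n, k) * x**k * (1 - x)**(n - k) for k in range(a))
    return 1.0 - below


def beta_inv_cdf_int(u, a, b, tol=1e-10):
    """Numerically invert the CDF by bisection on the interval [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


rng = random.Random(11)  # fixed seed for reproducibility

# Inverse-transform sampling: push uniform(0, 1) draws through the inverse CDF.
draws = [beta_inv_cdf_int(rng.random(), 3, 25) for _ in range(5_000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean {sample_mean:.3f} vs theoretical mean {3 / 28:.3f}")
```

Library routines such as rbeta in R or random.betavariate in Python may use different algorithms internally, so this is only one way of producing such draws.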

  • @ayushpandey5261
    @ayushpandey5261 7 years ago

    Unable to find part 3. Could you help me with it?

    • @rasmusab
      @rasmusab 7 years ago +1

      ua-cam.com/video/Ie-6H_r7I5A/v-deo.html

  • @Shapeguydude
    @Shapeguydude 5 years ago

    9:20 my man

  • @ivan_toriya
    @ivan_toriya 2 years ago

    "How much we should trust our CEO? I think he is smoking tobacco, but I don't know"

  • @megis127
    @megis127 5 years ago

    hail papous with tsimpouk

  • @siqizhang
    @siqizhang 6 years ago

    Do you realize that your Nordic accent is very sexy?