Causal Inference with Machine Learning - EXPLAINED!

  • Published 15 Jan 2025

COMMENTS • 69

  • @scitechtalktv9742
    @scitechtalktv9742 3 years ago +16

    Very instructive and well-made video!
    I have one question: where can I find your video on calibration? I'm very curious about that!
    I also have one small remark: at 12:08 in the video there is a mistake in the ITE formula. Both terms of the formula use W=1, but I think they should be W=1 (treated) and W=0 (not treated) respectively. Do you agree?
    It is just a minor remark; the rest is outstanding 👍

    • @CodeEmporium
      @CodeEmporium  3 years ago +4

      Wow very good eye! I didn't notice this. And you are correct - Wi for the second term in that equation should be 0 and not 1 since it represents "not treated". Thanks for catching that!
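
For reference, the corrected expression agreed on in this thread can be written out with the symbols used in the comments ($X_i$ for customer features, $W_i$ for the treatment flag, $Y_i$ for the purchase outcome). This is a reconstruction from the discussion, not a copy of the on-screen formula:

```latex
\mathrm{ITE}_i \;=\; P(Y_i = 1 \mid X_i,\, W_i = 1) \;-\; P(Y_i = 1 \mid X_i,\, W_i = 0)
```

The first term is the purchase probability when the customer is treated, the second the purchase probability when the customer is not treated.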

    • @CodeEmporium
      @CodeEmporium  3 years ago +4

      And the video on calibration: you can go to the Videos tab on my channel and look for "Model Calibration - EXPLAINED". Sorry I can't paste the link here. YouTube isn't good with links.

    • @scitechtalktv9742
      @scitechtalktv9742 3 years ago +1

      @@CodeEmporium I find the content of your videos to be of VERY good quality! So when a new one arrives, I watch it with full focus, and spotting a mistake is not that hard for me. I hope you will continue making these videos!
      Can you point me to that video about CALIBRATION?

    • @scitechtalktv9742
      @scitechtalktv9742 3 years ago +1

      @@CodeEmporium OK, I will search for that

    • @scitechtalktv9742
      @scitechtalktv9742 3 years ago +2

      @@CodeEmporium I think it is this video:
      MODEL CALIBRATION - EXPLAINED ! - Why Logistic Regression DOESN'T return probabilities?!
      ua-cam.com/video/5zbV24vyO44/v-deo.html

  • @gutihernandez7868
    @gutihernandez7868 2 years ago +7

    Great video! Thanks!
    FYI: at 12:20 the ITE formula is wrong. Both terms are the same (probability the customer purchased given emails), but the latter should be the probability the customer purchased NOT given emails.

    • @CodeEmporium
      @CodeEmporium  2 years ago +1

      Thank you! And yep, thanks for pointing that lil mistake out. Some others did too, and I hearted a comment for more visibility. Will pin too.

  • @ritvikmath
    @ritvikmath 3 years ago +3

    Cool topic! One confusion I had on the class transformation approach:
    If W=0 and Y=0 how can we be sure this is a "persuadable" and not a "lost cause"? If I understand correctly, both groups can take on these values. Similar question when W=1 and Y=1, can we be sure this is a "persuadable" and not a "sure thing"?

    • @CodeEmporium
      @CodeEmporium  3 years ago +1

      Amazing question Ritvik. Yeah, I think you are correct. Z=1 doesn't just target persuadables, but a superset of them (with some lost causes and some sure things). The main objective of this class transformation approach, I believe, is to separate the sleeping dogs (we lose money advertising to these people) from the persuadables. I could have been clearer with this, but thanks for pointing this out!
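
The point made in this reply can be sketched in a few lines. The function name `class_transform` and the table of compatible customer types are illustrative assumptions, based on the usual four-segment picture (persuadable, sure thing, lost cause, sleeping dog) rather than on code from the video:

```python
def class_transform(w: int, y: int) -> int:
    """Z = 1 when (treated and converted) or (control and not converted)."""
    return 1 if w == y else 0


# For each observable (W, Y) cell, the customer types that could have produced it.
# Assumption: each type has fixed behaviour (e.g. a persuadable buys only if emailed).
COMPATIBLE_TYPES = {
    (1, 1): ["persuadable", "sure thing"],
    (0, 0): ["persuadable", "lost cause"],
    (1, 0): ["sleeping dog", "lost cause"],
    (0, 1): ["sleeping dog", "sure thing"],
}

for (w, y), types in COMPATIBLE_TYPES.items():
    print(f"W={w}, Y={y} -> Z={class_transform(w, y)}: {', '.join(types)}")
```

Under these assumptions no sleeping dog can land in a Z=1 cell and no persuadable can land in a Z=0 cell, which is why Z=1 is a superset of the persuadables rather than the persuadables alone.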

    • @ritvikmath
      @ritvikmath 3 years ago +2

      Ahh thanks for the clarification. Makes total sense as a sleeping dog vs. not sleeping dog classifier!

    • @heavybreathing6696
      @heavybreathing6696 2 years ago

      Sleeping dogs are generally a very small percentage of the total population, so with this Z=1 class transformation we're not really gaining much, tbh... You should pin this clarification / correction or update your video description. I had the exact same thought and rewatched that part of the video multiple times before I saw this correction comment.

  • @maraffio72
    @maraffio72 1 year ago +2

    The two-model approach assumes your models have no error, which is optimistic at best.
    Subtracting the two values from the treated model and the control model completely disregards that these point estimates are not accurate (unless you have perfect and therefore overfitting models, which is bad anyway).
    The two-model approach should only be used as an example to show why modeling and measuring uplift is not trivial and therefore requires specific tools like uplift modeling techniques.
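
A toy sketch of the two-model approach makes this concern concrete. Everything here is an assumption for illustration: the "models" are plain conversion-rate lookup tables over a single binary feature, and the true rates in `simulate` are made up.

```python
import random


def fit_rate_model(rows):
    """Toy 'model': observed conversion rate per feature value."""
    counts = {}
    for x, y in rows:
        n, s = counts.get(x, (0, 0))
        counts[x] = (n + 1, s + y)
    return {x: s / n for x, (n, s) in counts.items()}


def two_model_uplift(data):
    """data: list of (x, w, y). Returns {x: P(Y=1|x,W=1) - P(Y=1|x,W=0)}."""
    data = list(data)
    treated = fit_rate_model((x, y) for x, w, y in data if w == 1)
    control = fit_rate_model((x, y) for x, w, y in data if w == 0)
    return {x: treated[x] - control[x] for x in treated if x in control}


def simulate(n, rng):
    """Randomized experiment with made-up true rates keyed by (x, w)."""
    true_rate = {(0, 0): 0.10, (0, 1): 0.15, (1, 0): 0.30, (1, 1): 0.28}
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        w = rng.randint(0, 1)
        y = 1 if rng.random() < true_rate[(x, w)] else 0
        rows.append((x, w, y))
    return rows


# True uplift is +0.05 for x=0 and -0.02 for x=1.
for n in (200, 200_000):
    print(n, two_model_uplift(simulate(n, random.Random(0))))
```

Each model's error carries straight into the difference, so with a small sample the estimated uplift can be far from the small true effect and can even have the wrong sign. The estimates settle near the true values only once both groups are large.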

  • @jingwangphysics
    @jingwangphysics 2 years ago +2

    Intuitively, since z can only have value 1 or 0, the treatment effect on the persuadable group p(z=1) wrt other groups p(z=0) is p-(1-p) = 2p-1. Great work, thanks!

  • @jeffreagan2001
    @jeffreagan2001 3 years ago

    Hi, I am in drug discovery and the things you talk about are directly translatable to my work (potential customer vs. patient who will respond best to my drug). Thanks so much!!

  • @TheMrKingplays
    @TheMrKingplays 3 years ago +2

    Keep up your great work! You are such a good teacher (+ entertainer sometimes ;)).

  • @soumen_das
    @soumen_das 1 year ago

    Great explanation! Please continue making videos on "causal inference" topics, as there are very few informational videos.

  • @CodeEmporium
    @CodeEmporium  3 years ago +7

    Here is a video on how causal inference can go a long way with machine learning. It's a fun video that builds the concept from its foundations, with some important math. Hope I laid this out right and it's easy to understand. Any thoughts? Let me know in the comments or on Discord (link in description). Cheers!

    • @heyalchang
      @heyalchang 2 years ago

      What are you using to make this video? The text and animations. It's really simple and effective.

    • @CodeEmporium
      @CodeEmporium  2 years ago +1

      Camtasia Studio:)

    • @heyalchang
      @heyalchang 2 years ago +1

      @@CodeEmporium Thanks. Keep up the videos!

    • @linhphung313
      @linhphung313 3 months ago

      Can you share the document for this video?

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 3 years ago +1

    Looking forward to it. Causal inference is one of the coolest ML ideas I've been able to use.

    • @CodeEmporium
      @CodeEmporium  3 years ago

      Thank you! And super true.

    • @idkwiadfr
      @idkwiadfr 2 years ago

      Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with it? It would be a great help.
      Thank you

    • @ChocolateMilkCultLeader
      @ChocolateMilkCultLeader 2 years ago

      @@idkwiadfr I'll be looking into it soon if you're interested

    • @idkwiadfr
      @idkwiadfr 2 years ago

      @@ChocolateMilkCultLeader Sure

  • @davidpearce1583
    @davidpearce1583 1 year ago

    12:34 Can anyone explain in this chat setting how the product rule made all those changes to the ITE equation? I've been staring at it too long and it's not clicking.
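
One way to spell out those steps is the sketch below. It assumes a randomized experiment in which every customer is treated with probability 1/2, independently of $X_i$, and it uses the class-transformation label $Z_i = 1$ when ($W_i = 1$, $Y_i = 1$) or ($W_i = 0$, $Y_i = 0$). It is a reconstruction and may not match the video line for line:

```latex
\begin{aligned}
\mathrm{ITE}_i
  &= P(Y_i = 1 \mid X_i, W_i = 1) - P(Y_i = 1 \mid X_i, W_i = 0) \\
  &= \frac{P(Y_i = 1, W_i = 1 \mid X_i)}{P(W_i = 1 \mid X_i)}
   - \frac{P(Y_i = 1, W_i = 0 \mid X_i)}{P(W_i = 0 \mid X_i)}
   && \text{(product rule)} \\
  &= 2\,P(Y_i = 1, W_i = 1 \mid X_i) - 2\,P(Y_i = 1, W_i = 0 \mid X_i)
   && \text{(both denominators are } \tfrac12\text{)} \\
  &= 2\,P(Y_i = 1, W_i = 1 \mid X_i) - 2\left[\tfrac12 - P(Y_i = 0, W_i = 0 \mid X_i)\right] \\
  &= 2\left[P(Y_i = 1, W_i = 1 \mid X_i) + P(Y_i = 0, W_i = 0 \mid X_i)\right] - 1 \\
  &= 2\,P(Z_i = 1 \mid X_i) - 1
\end{aligned}
```

The fourth line uses $P(Y_i = 1, W_i = 0 \mid X_i) + P(Y_i = 0, W_i = 0 \mid X_i) = P(W_i = 0 \mid X_i) = \tfrac12$. Note that $P(Z_i = 1 \mid X_i)$ is a sum of joint probabilities of $(Y_i, W_i)$ given $X_i$, not of probabilities conditioned on $W_i$.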
    Great vid, thanks.

  • @dmtree__
    @dmtree__ 1 year ago

    12:08 shouldn't it be:
    P(Zi = 1 | Xi) = P(Yi = 1 | Xi, Wi = 1) + P(Yi = 0 | Xi, Wi = 0)
    Instead of Wi = 1 and Wi = 0 being on the left, they should be on the right, or what am I missing?

  • @srijanmishra1747
    @srijanmishra1747 3 years ago +6

    You said ITE is in the range [0,1]. Can it not be negative for the "sleeping dogs" category you defined?
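
A small sketch of the idealized four-segment picture shows where a negative value would come from. The assumption here (not stated in the thread) is that each customer type has deterministic behaviour, written as a pair (buys if emailed, buys if not emailed):

```python
# Potential outcomes per customer type: (buys if emailed, buys if not emailed).
SEGMENTS = {
    "persuadable": (1, 0),
    "sure thing": (1, 1),
    "lost cause": (0, 0),
    "sleeping dog": (0, 1),
}


def ite(buys_if_treated: int, buys_if_control: int) -> int:
    """Individual treatment effect as a difference of the two outcomes."""
    return buys_if_treated - buys_if_control


effects = {name: ite(*outcomes) for name, outcomes in SEGMENTS.items()}
for name, effect in effects.items():
    print(f"{name:12s} ITE = {effect:+d}")
```

As a difference of two probabilities, the ITE lies in [-1, 1], and in this idealization the sleeping dogs sit at the negative end.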

    • @idkwiadfr
      @idkwiadfr 2 years ago

      Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with it? It would be a great help.
      Thank you

  • @hassenhadj7665
    @hassenhadj7665 3 years ago +1

    Please, can you make a video explaining dual aspect collaborative attention?

  • @kesun852
    @kesun852 8 months ago +1

    Good course, but I don't get how the data for Zi is collected, because Zi = 1 when the sample is (in the treatment group and converts) OR (in the control group and does not convert). Shouldn't "persuadable" be the AND of those two conditions?

  • @dubeya01
    @dubeya01 2 years ago

    Great video...keep it up...all the best

  • @hameddadgour
    @hameddadgour 9 months ago

    Great video. Thank you for sharing!

  • @fangzheng4030
    @fangzheng4030 2 years ago +1

    Very good video. I have a question: what if the probability of treatment is not 0.5 but some other value, such as 1/4? Then you cannot fully merge the terms using the probability that person Xi got the email, as done at 13:52. In this scenario, ITE = -1 + (4/3)·P(Zi=1|Xi) + (8/3)·P(Yi=1, Wi=1|Xi). What should we do with the unmerged (8/3)·P(Yi=1, Wi=1|Xi)?
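
The expression in this comment can be checked by repeating the product-rule derivation with a general treatment probability. The symbol $e = P(W_i = 1 \mid X_i)$ is introduced here for illustration and assumes treatment is randomized independently of $X_i$:

```latex
\begin{aligned}
\mathrm{ITE}_i
  &= \frac{P(Y_i = 1, W_i = 1 \mid X_i)}{e}
   - \frac{P(Y_i = 1, W_i = 0 \mid X_i)}{1 - e} \\
P(Z_i = 1 \mid X_i)
  &= P(Y_i = 1, W_i = 1 \mid X_i) + (1 - e) - P(Y_i = 1, W_i = 0 \mid X_i) \\
\Rightarrow\quad
\mathrm{ITE}_i
  &= -1 + \frac{P(Z_i = 1 \mid X_i)}{1 - e}
   + \left(\frac{1}{e} - \frac{1}{1 - e}\right) P(Y_i = 1, W_i = 1 \mid X_i)
\end{aligned}
```

With $e = \tfrac14$ the coefficients are $\tfrac43$ and $4 - \tfrac43 = \tfrac83$, matching the comment. The last term vanishes only when $e = \tfrac12$, which is the case where the result collapses to $2\,P(Z_i = 1 \mid X_i) - 1$.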

  • @yinyangai_app
    @yinyangai_app 4 months ago

    How does one determine if a lead passed into the z function results in 1? Where does that training data for the model come from?

  • @arresteddevelopment2126
    @arresteddevelopment2126 10 months ago

    Awesome!! I have a question. What if I don't have randomized data? How can I then estimate ITE and do uplift modelling?

  • @pablocalvache-nr5wr
    @pablocalvache-nr5wr 8 months ago

    Amazing work! Congrats!

  • @nirliptapande
    @nirliptapande 2 years ago +1

    While we can understand the effect of a new policy by modeling it through uplift models, how can we check whether a certain factor is a cause in environments which we can only model, not change?

    • @idkwiadfr
      @idkwiadfr 2 years ago

      Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with it? It would be a great help.
      Thank you

  • @kasunthalgaskotuwa5631
    @kasunthalgaskotuwa5631 2 years ago +1

    Nice explanation. Have you created any videos about dynamic causal inference with machine learning?

    • @idkwiadfr
      @idkwiadfr 2 years ago

      Hi, I am starting my master's thesis on the same topic. Could you please help me find the best resources to get going with it? It would be a great help.
      Thank you

  • @mehul4mak
    @mehul4mak 10 months ago

    10:17 How do we get values for Z?

  • @muchidariyanto
    @muchidariyanto 1 year ago

    What are the definitions of
    1. Model 1
    2. Model 2
    3. Z
    ?

  • @pamg2628
    @pamg2628 2 years ago

    Great video. Where can I find your video on calibration?

  • @RafaelSilva-k3o
    @RafaelSilva-k3o 1 year ago

    This is very cool. Is it possible to use the two-model approach to measure incremental orders in an A/B test?

  • @won20529jun
    @won20529jun 2 years ago +1

    Loved it!! Thank you for the derivation; simple when you explain it. I would've just accepted it as something handed down from the gods otherwise

  • @sirabhop.s
    @sirabhop.s 1 year ago

    I was wondering why we need uplift modeling in the first place. Can't statistics achieve the same goal?

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 years ago

    It seems like ITE measures how much higher the probability is with the treatment than without it. So is the way to prove causality simply to show a higher probability? Doesn't correlation also show a high probability?

  • @rayallinkh
    @rayallinkh 2 years ago

    Thanks, but what if the probability of treatment is not 0.5? Let's say we only assign 20% to the treatment group. Does the class transformation still hold?
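
A quick numeric check is one way to explore this question. The sketch assumes treatment is randomized with a known probability `e`, and it compares the 50/50 shortcut 2·P(Z=1) − 1 with the expression the product rule gives for general `e`, namely P(Z=1)/(1−e) − 1 + (1/e − 1/(1−e))·P(Y=1, W=1). The function names and the example rates (0.15 treated, 0.10 control) are made up for illustration:

```python
import math


def p_z1(rate_treated, rate_control, e):
    """P(Z=1) = P(Y=1, W=1) + P(Y=0, W=0) under randomized treatment."""
    return e * rate_treated + (1 - e) * (1 - rate_control)


def uplift_shortcut(pz):
    """Class-transformation result, derived for e = 0.5."""
    return 2 * pz - 1


def uplift_general(pz, p_y1_w1, e):
    """Product-rule result for a general treatment probability e."""
    return pz / (1 - e) - 1 + (1 / e - 1 / (1 - e)) * p_y1_w1


rate_treated, rate_control = 0.15, 0.10  # true uplift = 0.05
for e in (0.5, 0.2):
    pz = p_z1(rate_treated, rate_control, e)
    shortcut = uplift_shortcut(pz)
    general = uplift_general(pz, e * rate_treated, e)
    print(f"e={e}: shortcut={shortcut:.4f}, general={general:.4f}")
```

In this example the shortcut matches the true uplift only at e = 0.5. At e = 0.2 it is dominated by the imbalance between the groups, while the general expression still recovers the true value.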

  • @tntcrunch602
    @tntcrunch602 2 years ago

    What is the video where you show how to calibrate the uplift model? I am not able to find it.

    • @CodeEmporium
      @CodeEmporium  2 years ago

      I have made 3 videos in the series on causal inference. I think this is the second one. You can check out my "Causal Inference" playlist for all these videos (though I am not sure if I got to exactly the calibration of uplift modeling; if not, maybe a future video).

    • @tntcrunch602
      @tntcrunch602 2 years ago

      @@CodeEmporium thank you, I just found it, great video by the way!

  • @1xwxu
    @1xwxu 3 years ago

    Very good video! Could you please share your slides?

    • @CodeEmporium
      @CodeEmporium  3 years ago

      Thanks for watching! I created these animations as a video; it isn't a slide deck. Sorry about that.

  • @bassamry
    @bassamry 2 years ago

    Good video. I'm wondering how the causal ML approach could mitigate biases or clashes in the experiment. For example, suppose the treatment group was sent an email, but while the experiment was running, another experiment was conducted on the content of the email or on the UX of the landing experience from the email.
    How can we analyze whether the email is "good" or "bad" while accounting for all those external effects?

  • @kawsarahmed9849
    @kawsarahmed9849 2 years ago

    Very good. Please upload your next video.

  • @zaheenwani102
    @zaheenwani102 3 years ago +1

    Interesting

  • @peterszilvasi752
    @peterszilvasi752 2 years ago

    Hi @CodeEmporium, first of all thank you for the video, very well explained.
    What tool do you use for the animation?

    • @CodeEmporium
      @CodeEmporium  2 years ago +1

      Glad you like it! I use Camtasia Studio

  • @karannchew2534
    @karannchew2534 9 months ago

    Xi ∈ ℝ^D
    "Xi" represents a specific customer, where "i" is an index referring to a particular customer.
    "∈" denotes membership, meaning "Xi" belongs to, or is an element of, the set on the right.
    "ℝ^D" is D-dimensional real space: the set of all vectors made of D real numbers, where "D" is the dimensionality of the feature space. Each dimension might correspond to a specific feature or attribute of the customer, such as age, income, spending habits, etc.
    So "Xi ∈ ℝ^D" means that each customer "Xi" is represented as a vector of D real numbers.
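
In code, this notation corresponds to a fixed-length list of numbers per customer. The feature names and values below are hypothetical and chosen only to illustrate the idea:

```python
# Hypothetical feature order; D is simply the number of features.
FEATURE_NAMES = ["age", "income", "orders_last_90_days"]


def to_feature_vector(customer: dict) -> list:
    """Map a customer record to a vector of D real numbers."""
    return [float(customer[name]) for name in FEATURE_NAMES]


customer_i = {"age": 34, "income": 52000, "orders_last_90_days": 3}
x_i = to_feature_vector(customer_i)
D = len(x_i)
print(x_i, D)
```

Here `x_i` plays the role of Xi and `D` is the dimensionality of the feature space.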