Keynote: Judea Pearl - The New Science of Cause and Effect

  • Published 21 May 2024
  • PyData LA 2018
    The talk will explain why data science should embrace an engine for processing cause-effect relationships. I will describe the structure of this engine, how it has revolutionized the data-intensive sciences, and how it is about to revolutionize machine learning.
    ---
    www.pydata.org
    PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
    PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
    0:00 Speaker Introduction
    1:00 Introduction
    1:18 Talk Proverb
    2:13 Talk Outline
    5:11 Causal Models and the Cognitive Revolution
    10:50 Typical Causal Questions & The Limitation of Standard Grammar of Science
    18:15 The Ladder of Causation (3 Level Hierarchy)
    26:36 Simpson's Paradox
    31:00 Explainability Deep-Learning Style
    34:26 Distinguish Seeing from Doing
    36:19 The Two Fundamental Laws of Causal Inference
    38:15 Reading Independencies
    40:02 Structural Causal Model (SCM) Inference Engine
    42:33 The Seven Pillars of Causal Wisdom
    44:44 Pillar 5: External Validity and Sample Selection Bias
    45:51 Pillar 5: The Problem in Real Life
    47:06 Pillar 5: The Problem in Mathematics
    49:16 Conclusion
    51:56 Q&A 1: Opinion on Natural Experiments to Discover Causal Connections in Data
    57:38 Q&A 2: Opinion on the Popularization of Statistics in News Media
    1:01:30 Q&A 3
    S/o to github.com/trfore for the video timestamps!
    Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/UA-camVi...
  • Science & Technology

COMMENTS • 13

  • @loo2k 1 year ago +15

    Note to future organisers: Please give Judea an attached mic rather than a hand held =)

  • @crypticnomad 1 year ago +3

    I have noticed that if we frame a statement in terms of conditional probabilities instead of correlation, we can infer some information about the causal structure. If P(B|A) != P(B), or equivalently P(A and B) != P(A)*P(B), then one of the following must be true: (1) A causes or prevents B; (2) B causes or prevents A; (3) some other factor C, or set of factors, causes or prevents both A and B; or (4) it is random chance. There is an almost linear relationship between the distance of P(B|A) from P(B) and the probability of being in the first two states. For example, as P(B|A) grows larger than P(B), so does the probability that we are in either state 1 or 2. In many cases I can say with relative confidence whether we are in states 1 or 2 or in states 3 or 4.
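A minimal sketch of the dependence check discussed in the comment above. Independence of A and B means P(B|A) equals P(B), so the sketch compares those two quantities. The function name `dependence_gap` and all counts are hypothetical, chosen only for illustration. A nonzero gap indicates statistical dependence; on its own it does not say which of the listed explanations (A affects B, B affects A, a common cause, or chance) applies.

```python
from fractions import Fraction


def dependence_gap(n_ab, n_a_not_b, n_not_a_b, n_neither):
    """Return (P(B|A), P(B), P(B|A) - P(B)) from the four cells of a 2x2 count table.

    Hypothetical helper for illustration; the arguments are joint counts of
    (A and B), (A and not B), (not A and B), (neither).
    """
    total = n_ab + n_a_not_b + n_not_a_b + n_neither
    p_b = Fraction(n_ab + n_not_a_b, total)          # marginal P(B)
    p_b_given_a = Fraction(n_ab, n_ab + n_a_not_b)   # conditional P(B|A)
    return p_b_given_a, p_b, p_b_given_a - p_b


# Hypothetical counts where A and B are dependent: P(B|A) = 3/4, P(B) = 1/2.
dependent = dependence_gap(30, 10, 20, 40)

# Hypothetical counts where A and B are independent: P(B|A) = P(B) = 1/2.
independent = dependence_gap(20, 20, 30, 30)

print("dependent gap:", dependent[2])
print("independent gap:", independent[2])
```

Exact fractions are used instead of floats so that the independent case yields a gap of exactly zero. With real data, a small nonzero gap is expected from sampling noise alone, so a statistical test would be needed before treating the gap as meaningful.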

  • @NickGeo25 2 years ago +3

    Brilliant!!!

  • @lukemacomber4792 1 year ago +2

    35:49---Back propagation makes counterfactuals possible.

  • @tndgu 1 year ago

    Science simplified. Thanks! :)

  • @MrKrtek00 1 year ago

    great talk, i really like his books

  • @lukemacomber4792 1 year ago +2

    23:53---We want the "software" to modify itself. Can a neural network posit counterfactuals? A neural net can predict, and then measure how closely that prediction matches a training set or constraining parameters. But when is one prediction of value, and when is that same prediction not of value? When is one prediction more "correct" than another, or all others, and when is it more useful to replace it with another prediction? So the mechanism we're looking for is how to decide between predictions, how to govern them in order to achieve something closer to a human level of counterfactual positing, and more importantly, when, rank, order, context. What we're looking for is a pattern recognition that isn't constrained by resembling objects, but by resembling functions.

    • @lukemacomber4792 1 year ago

      When do we activate one prediction over another? That can be programmed as a function, and more interestingly, as a series of functions.

  • @lukemacomber4792 1 year ago +1

    13:45--Predicate logic, symbolic logic are forms of math...or, math is a form of logic. Predicate logic COULD be represented with numbers, obviously. But without the presence of functional arrays, all you can get the computational platform to do is recognize co-occurrences. Yet...Siri and Alexa use predicate logic to mimic language understanding to great effect (NLP). So what's missing?

  • @lukemacomber4792 1 year ago +2

    28:29---Simpson's Paradox...when to go with one data set or another, isn't that answerable by regression models? When age is relevant to a prediction and when it is not? Leave out age and you get one answer. Leave it in, get another. That's the difference between the two graphs, no? A variable? And the relationship between variables can be addressed by function(s). It's when we apply different functions to variables, and between variables, and create new variables this way, that we are often able to enhance and surpass the original model. Can't we set a neural net, a governing or discriminating entity, to determine which pattern of functions, variable sets, and interactions is more explanatory of the data under consideration, the new occurrences of patterns that we wish the AI to recognize? So the counterfactual isn't what matches the data, but rather the series or occurrences of functions. The causal relationships can all be represented by functions or series of functions and can be data-agnostic: patterns of functions can be recognized across disparate data sets despite context, or depending on context. These are both possible. No?

    • @lukemacomber4792 1 year ago +1

      Our first function is association and it comes from algebra. Our second function is back propagation/chain rule and that comes from calculus. All of our current power in data science and AI comes from these two functions. Why stop there?

    • @hashiromer7668 4 months ago +2

      I think the overall message is the danger of an unknown variable which, if conditioned on, reverses the pattern observed in the data.

    • @annay3963 3 months ago

      We can observe and measure age. But there are things that are hard to observe and measure, such as human intelligence, perseverance, etc., for example.
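The reversal discussed in the Simpson's paradox thread above can be made concrete with a small sketch. All counts below are hypothetical, invented only to produce the effect, and the names `counts`, `rate`, and `pooled_rate` are illustrative. The treatment looks better within each age group but worse once the groups are pooled. The numbers alone do not say which comparison to trust; that depends on the assumed causal role of age, which is the point the talk makes.

```python
from fractions import Fraction

# Hypothetical (recovered, total) counts per age group and treatment arm.
counts = {
    "young": {"treated": (8, 10), "untreated": (70, 100)},
    "old": {"treated": (20, 100), "untreated": (1, 10)},
}


def rate(recovered, total):
    """Recovery rate as an exact fraction."""
    return Fraction(recovered, total)


def pooled_rate(arm):
    """Recovery rate for one arm after ignoring (pooling over) age."""
    recovered = sum(counts[group][arm][0] for group in counts)
    total = sum(counts[group][arm][1] for group in counts)
    return Fraction(recovered, total)


# Within each age group: is the treated rate higher than the untreated rate?
treated_better_within = {
    group: rate(*counts[group]["treated"]) > rate(*counts[group]["untreated"])
    for group in counts
}

# After pooling the groups: is the treated rate still higher?
treated_better_pooled = pooled_rate("treated") > pooled_rate("untreated")

print("treated better within groups:", treated_better_within)
print("treated better when pooled:", treated_better_pooled)
```

The reversal arises because most treated patients in this made-up table are old (and recover less often regardless of treatment), while most untreated patients are young.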