Judea Pearl
Student of causal inference, human reasoning, and history of ideas, all viewed through the sharp lens of artificial intelligence.
Apr 3, 2023 4 tweets 1 min read
Got my first session with GPT-4, amazing! Though it failed its first causal understanding test.
Me: "Is it possible that smoking causes grade increase on the average and, simultaneously, smoking causes grade decrease in every age group?"
GPT-4: It is theoretically possible
1/4
for smoking to cause an average grade increase overall while causing a grade decrease within every age group. However, this scenario would likely involve some form of Simpson's paradox, where the overall relationship between two variables reverses when accounting for
2/4
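The distinction behind this test can be sketched numerically. The numbers below are made up for illustration, and the sketch assumes that age is not itself affected by smoking and that the age-specific grade differences can be read causally (neither assumption is stated in the excerpt). Under those assumptions the average causal effect is a P(age)-weighted mean of the age-specific effects, so it cannot be positive when every age-specific effect is negative. The associational contrast, by comparison, weights each arm by its own age mix and can flip sign, which is Simpson's paradox.

```python
# Hypothetical expected grades by age group and smoking status.
# In both age groups smoking lowers the expected grade by 5 points.
grade = {
    ("young", "smoke"): 60.0, ("young", "no_smoke"): 65.0,
    ("old", "smoke"): 85.0, ("old", "no_smoke"): 90.0,
}
p_age = {"young": 0.5, "old": 0.5}  # P(age)
p_age_given = {                      # P(age | smoking status), observational
    "smoke": {"young": 0.1, "old": 0.9},
    "no_smoke": {"young": 0.9, "old": 0.1},
}


def causal_effect():
    # Both arms are averaged with the SAME weights P(age).
    return sum(
        p_age[a] * (grade[(a, "smoke")] - grade[(a, "no_smoke")]) for a in p_age
    )


def associational_difference():
    # E[grade | smoke] - E[grade | no_smoke]: each arm uses its own age mix.
    e_smoke = sum(p_age_given["smoke"][a] * grade[(a, "smoke")] for a in p_age)
    e_no = sum(p_age_given["no_smoke"][a] * grade[(a, "no_smoke")] for a in p_age)
    return e_smoke - e_no


print("average causal effect:", causal_effect())
print("associational difference:", associational_difference())
```

The sign reversal appears only in the associational difference; the averaged causal effect stays negative because it is a convex combination of negative numbers.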
Jan 30, 2023 5 tweets 2 min read
To understand how trialists, under additivity, can get away without causal calculus, it is instructive to see how selection bias cures itself in linear systems. Let's start with
Eq. (10) of ucla.in/2LcpmHz which, for any 3 variables, X,Y,W gives:
1/5
(10) beta_yx = beta_yx.w ( 1-beta_xw^2) + beta_xw * beta_yw
It says that the regression of Y on X can be written in terms of other regression coefficients, each and every one of which is conditioned on w. If we now think of W=1 as the index of selecting units into the study,
2/5
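A quick numerical check of Eq. (10) as quoted above. This is a minimal sketch that assumes all three variables are standardized (zero mean, unit variance), which is what makes the factor (1 - beta_xw^2) appear; the excerpt does not state that assumption, and the data-generating coefficients below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Hypothetical linear system; any three correlated variables would do.
w = rng.normal(size=n)
x = 0.6 * w + rng.normal(size=n)
y = 0.5 * x - 0.4 * w + rng.normal(size=n)


def standardize(v):
    return (v - v.mean()) / v.std()


x, y, w = (standardize(v) for v in (x, y, w))


def slope(target, predictor):
    # Simple regression coefficient of target on predictor.
    return np.polyfit(predictor, target, 1)[0]


def partial_slope(target, predictor, control):
    # Coefficient of predictor when target is regressed on (predictor, control).
    design = np.column_stack([predictor, control, np.ones_like(predictor)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0]


beta_yx = slope(y, x)
beta_xw = slope(x, w)
beta_yw = slope(y, w)
beta_yx_w = partial_slope(y, x, w)

lhs = beta_yx
rhs = beta_yx_w * (1 - beta_xw**2) + beta_xw * beta_yw
print(lhs, rhs)
```

With standardized data the two sides agree up to floating-point error, since the identity is an algebraic property of least squares rather than a large-sample approximation.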
Dec 27, 2021 5 tweets 2 min read
As the year draws to a close, many are looking back on the moments from 2021 that gave them hope and encouragement. Kenneth Marcus lists his 10 Most Inspiring Moments in the fight against #antisemitism: jewishjournal.com/commentary/opi…
I would like to add another, missed by
1/5
the media, suppressed by our leaders, and hush-hushed by university administrators. Yes, I'm coming back to the USC scandal, and the 65 of its top professors who scored a huge victory through this letter: usc-faaz-12-2021.org. Here they defined in effect a new
2/5
Dec 4, 2021 4 tweets 1 min read
Is it dumbness or deliberate blindness that prevents USC officials from listening to their students and faculty? Death threats were disseminated against Zionists. Incriminating statements were made against the very being of Israel. 60 distinguished professors are pleading
1/4
with USC leadership to explicitly de-criminalize Zionist and Israeli identities [quoting from their Letter]:
"Most importantly, Jewish, Zionist, and Israeli students, as well as those who support the right of the State of Israel to exist need to hear from our leaders that
2/4
Nov 27, 2021 4 tweets 1 min read
A new book, "Causation in Science" by Yemima Ben-Menahem, makes the point that, in ordinary scientific practice, conservation constraints often serve as explanations. For example: "Why did the roller coaster slow down?" "Because energy must be conserved."
watermark.silverchair.com/fzab078.pdf?to…
1/4
To include such constraints as "causal explanations", Ben-Menahem advocates abandoning the paradigm that causation is a relation between events, or variables. I hesitate! Considering the fact that conversational utterances are in themselves products of language constraints,
2/4
Nov 11, 2021 5 tweets 3 min read
1/5 Finding a do-operator in a @DeepMind article is tectonic progress that deserves a welcoming blessing. The "delusions" treated in this article are endemic to "Evidential Decision Theory", which Causality (ch 4.1.1 ucla.in/38bmhnO)
summarizes in a mnemonic limerick:
2/5
- Whatever evidence an act might provide
- On what could have caused the act,
- Should never be used to help one decide
- On whether to choose that same act.
Typical real-life ramifications of these delusions are:
(1) patients should avoid going to the doctor “to reduce the
Nov 1, 2021 4 tweets 2 min read
1/ To appreciate what I mean by "assumptions whose plausibility you cannot judge" I often ask readers to examine how Imbens and Rubin (2015) define "unconfoundedness", the key concept needed for all causal inference. Quoting from their page 479, we find (fasten your seat belts):
2/: First, "the conditional distribution of the outcome under the control treatment, Y_i(0), given receipt of the active treatment and given covariates, is identical to its distribution conditional on receipt of the control treatment and conditional on covariates, and second,
Oct 19, 2021 5 tweets 2 min read
1/ I'm compelled to retweet this thread because so often I see well-intentioned people assuming that a "happy ever after" 1-state solution is inevitable because Palestinians' rejection of Jewish self-determination (in ANY borders) is so total and deeply entrenched that any other
2/ arrangement amounts to endless bloodshed. As one born in Israel and tuned daily to the country's pulse, let me mention another factor which is often ignored in conversations about the 1-state fantasy. Israelis' resistance to a 1-state solution is at least as total and deeply
Sep 23, 2021 4 tweets 1 min read
jpost.com/international/… It's now 34 countries boycotting the Durban Conference, but my eyes are still on Norway. Oh Norway, Norway! How could your Gov't face its people: Sorry, we fell asleep, and found ourselves swimming in the cesspool of civilization. To honor readers whose Governments remained
1/2
Sep 21, 2021 4 tweets 2 min read
1/ Summarizing our discussion of "demand" via "ceteris paribus" (CP), we've seen that, once formalized, CP amounts to comparing Y under two settings of X, say X=x and X=x', while leaving other variables in the structural equation for Y unchanged. The beauty of formal definitions
2/ is that they hold for all models and are independent of the meanings of X, Y, Z, etc.,
or the procedure by which we estimate things. Leveraging these beauties, we come to realize that the resultant CP definition of "demand" is none other than the counterfactual definition of
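The formalized CP comparison can be sketched in a few lines. The structural equation below (its functional form, its coefficients, and the names `demand`, `income`, and `taste`) is purely hypothetical and is not taken from the thread; the only point is that the contrast changes X while every other argument of the equation for Y is held at the same value.

```python
def demand(price, income, taste):
    # Hypothetical structural equation for quantity demanded (Y).
    return 100.0 - 2.0 * price + 0.5 * income + taste


def ceteris_paribus_contrast(f, x, x_prime, **others):
    # Y under X = x' minus Y under X = x, with the other arguments held fixed.
    return f(x_prime, **others) - f(x, **others)


effect = ceteris_paribus_contrast(demand, 10.0, 12.0, income=50.0, taste=3.0)
print(effect)
```

Because the definition refers only to the structural equation, the same helper applies unchanged to any other equation and any other meaning of the variables.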
Aug 5, 2021 4 tweets 1 min read
1/ This just in. A new successful paradigm for building AI systems has emerged, called "Foundation Model". According to its inventors
crfm-stanford.github.io, it works as follows: "Train one model on a huge amount of data and adapt it to many applications." Not only is it
2/ seen "as the beginnings of a sweeping paradigm shift in AI", but a whole Center has been erected in its honor, dozens of prominent researchers, post-docs and PhD students have joined its staff, and an interdisciplinary symposium has been announced. We, foot-soldiers in the
Jul 31, 2021 4 tweets 1 min read
1/ Can "traditional statistics" handle "effect sizes?" If we include Neyman-Rubin in "traditional statistics" and interpret "Can" to mean "Can, in principle", the answer is Yes. However, if we take "traditional statistics" to be represented by: Pearson, Fisher, Cochran, Tukey,
2/ Breiman, Friedman,...+deceased presidents of ASA, RSS...+authors of stat texts+..., and if we interpret "Can" to mean "Capable of handling a simple problem in 2 weeks' time," I would bet 100:1 on "NO!". Reason: They lacked a language to articulate the assumptions needed for
Jul 5, 2021 4 tweets 2 min read
1/ Readers ask: What's the simplest problem in which a combination of experimental and observational studies can be shown to be better than each study alone?
Ans. Consider X ---> Z ---> Y
with an unobserved confounder between X & Z.
Query Q: Find P(y|do(x))
We have 2 valid estimands:
2/ ES1 = P(y|do(x)), estimable from the experiment
ES2 = SUM_z P(z|do(x)) P(y|z); the first term is estimable from the experiment, the second from the observational study.
ES2 is better than ES1 for 3 reasons:
1. P(y|z) can rest on a larger sample
2. ES2 is composite (see
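That ES1 and ES2 target the same quantity can be checked by exact enumeration. The sketch below uses a hypothetical binary model with an unobserved U affecting both X and Z; all probability tables are invented for illustration. It computes P(z|do(x)) as the experiment would see it, P(y|z) as the observational study would see it, and compares the composite ES2 with the direct ES1.

```python
from itertools import product

# Hypothetical binary model: U -> X, U -> Z, X -> Z, Z -> Y (U unobserved).
p_u = {0: 0.5, 1: 0.5}
p_x1_given_u = {0: 0.2, 1: 0.8}             # P(X=1 | u)
p_z1_given_xu = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.4, (1, 1): 0.9}  # P(Z=1 | x, u)
p_y1_given_z = {0: 0.3, 1: 0.7}             # P(Y=1 | z)


def bern(p1, value):
    return p1 if value == 1 else 1.0 - p1


def joint(u, x, z, y):
    # Observational joint P(u, x, z, y).
    return (p_u[u] * bern(p_x1_given_u[u], x)
            * bern(p_z1_given_xu[(x, u)], z) * bern(p_y1_given_z[z], y))


def p_z_do_x(z, x):
    # Experimental quantity: X is set by intervention, so U keeps its marginal.
    return sum(p_u[u] * bern(p_z1_given_xu[(x, u)], z) for u in p_u)


def es1(y, x):
    # P(y | do(x)) as the experiment on X would report it directly.
    return sum(p_u[u] * bern(p_z1_given_xu[(x, u)], z) * bern(p_y1_given_z[z], y)
               for u in p_u for z in (0, 1))


def p_y_given_z(y, z):
    # Observational conditional P(y | z), computed from the joint.
    num = sum(joint(u, x, z, y) for u, x in product((0, 1), repeat=2))
    den = sum(joint(u, x, z, yy) for u, x, yy in product((0, 1), repeat=3))
    return num / den


def es2(y, x):
    # Composite estimand: experimental P(z | do(x)) times observational P(y | z).
    return sum(p_z_do_x(z, x) * p_y_given_z(y, z) for z in (0, 1))


for x in (0, 1):
    print(x, es1(1, x), es2(1, x))
```

The two estimands coincide in the population; the practical differences listed in the thread concern how well each can be estimated from finite samples, which this exact calculation does not address.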
Jul 5, 2021 4 tweets 2 min read
1/ It might be useful to look carefully at the logic of proving an "impossibility theorem" for NN, and why it differs fundamentally from Minsky/Papert perceptrons. The way we show that a task is impossible in Rung-1 is to present two different data-generating models (DGM) that
2/ generate the SAME probability distribution (P) but assign two different answers to the research question Q. Thus, the limitation is not in a particular structure of the NN but in ANY method, however sophisticated, that gets its input from a distribution, lacking interventions.
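The argument can be illustrated with the smallest possible pair of models. The sketch below is an assumption-laden toy, not the construction used in any particular proof: it takes one arbitrary joint distribution over binary X and Y, reads it once as X -> Y and once as Y -> X, and evaluates the query Q = P(Y=1 | do(X=1)) under each reading.

```python
# One joint distribution over binary (X, Y), chosen arbitrarily for illustration.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def marginal_x(x):
    return P[(x, 0)] + P[(x, 1)]


def marginal_y(y):
    return P[(0, y)] + P[(1, y)]


def joint_model_a(x, y):
    # Model A: X -> Y, parameterized by P(x) and P(y | x).
    return marginal_x(x) * (P[(x, y)] / marginal_x(x))


def joint_model_b(x, y):
    # Model B: Y -> X, parameterized by P(y) and P(x | y).
    return marginal_y(y) * (P[(x, y)] / marginal_y(y))


def q_model_a():
    # X causes Y: setting X=1 propagates through P(y | x).
    return P[(1, 1)] / marginal_x(1)


def q_model_b():
    # Y causes X: setting X=1 leaves Y at its marginal distribution.
    return marginal_y(1)


print("same P:", all(abs(joint_model_a(x, y) - joint_model_b(x, y)) < 1e-12
                     for (x, y) in P))
print("Q under A:", q_model_a(), "Q under B:", q_model_b())
```

Any learner whose only input is P receives identical input from the two models, yet the correct answers to Q differ, so no such learner can be right in both cases.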
Jun 28, 2021 4 tweets 2 min read
1/ This post reinforces my support of the Myth of the Lone Genius, as tweeted here: 6.12.2021 (1/ ) Why I refuse to "cancel" or "decolonialize" Euclidean Geometry, Archimedes' Rule and Newton's Law, despite peer pressure? Because, as explained here ucla.in/2Qg0Rfs (p.8),
2/ putting a human face behind theorems and discoveries makes science "not a book of facts and recipes, but a struggle of the human mind to unveil the mysteries of nature." Personalizing science education makes each student "an active participant in, not a passive recipient of,
Feb 1, 2021 4 tweets 1 min read
1/ Sharing an interesting observation from Frank Wilczek's book "Fundamentals."
In the 17th Century, while the entire scientific world was pre-occupied with planetary motion and other grand questions of philosophy, Galileo made careful studies of simple forms of motion, e.g.,
2/ how balls roll down an inclined plane and how pendulums oscillate. To most of Galileo's contemporaries such measurements must have appeared trivial, if not irrelevant, to their speculations on how the world works. Yet Galileo aspired to a different kind of understanding.
Jan 15, 2021 4 tweets 1 min read
A letter I wrote to the California Board of Education:

I strongly oppose the 2021 California Ethnic Studies Model Curriculum.

I am particularly alarmed by its attempt to depict inter-ethnic relationships as an irreconcilable struggle between racially-defined "oppressed"
1/4
and "oppressors" and by the way it associates "whiteness" with "oppression" and "colonialism".

I am a "white" Jewish American, and I believe that the history of my people is a model of emancipation from oppression and colonialism, culminating in the State of Israel which is
2/4
Nov 6, 2020 6 tweets 2 min read
This question annoys ALL students (and professors) of ML, but they are afraid to ask. Thanks for raising it in this "no hand waving" forum. Take two causal diagrams:
X-->Y and X<--Y, and ask a neural network to decide which is more probable, after seeing 10 billion samples.
1/n
The answer will be: No difference; each diagram scores the same fit as the other. Let's be more sophisticated: assign each diagram a prior and run a Bayesian analysis on the samples. Lo and behold, the posteriors will equal the priors no matter how we start. How come?
2/n
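The "same fit" claim can be checked on simulated data. The sketch below is a simplified stand-in for the analysis described in the thread: it uses a small sample rather than 10 billion, fits each diagram by maximum likelihood rather than with a neural network, and updates a prior over the two diagrams using the fitted likelihoods. The sampling mechanism and its probabilities are invented for illustration.

```python
import math
import random
from collections import Counter

random.seed(0)
# Hypothetical data: binary pairs drawn from an X -> Y mechanism.
samples = []
for _ in range(10_000):
    x = 1 if random.random() < 0.3 else 0
    y = 1 if random.random() < (0.8 if x else 0.2) else 0
    samples.append((x, y))

n = len(samples)
n_xy = Counter(samples)
n_x = Counter(x for x, _ in samples)
n_y = Counter(y for _, y in samples)


def loglik_x_to_y():
    # Fit P(x) and P(y | x) by maximum likelihood, then score the data.
    return sum(c * (math.log(n_x[x] / n) + math.log(c / n_x[x]))
               for (x, y), c in n_xy.items())


def loglik_y_to_x():
    # Fit P(y) and P(x | y) by maximum likelihood, then score the data.
    return sum(c * (math.log(n_y[y] / n) + math.log(c / n_y[y]))
               for (x, y), c in n_xy.items())


def posterior_x_to_y(prior):
    # Posterior probability of X -> Y given a prior and the two fitted likelihoods.
    log_ratio = loglik_y_to_x() - loglik_x_to_y()
    return prior / (prior + (1 - prior) * math.exp(log_ratio))


print(loglik_x_to_y(), loglik_y_to_x())
print(posterior_x_to_y(0.9))
```

Both factorizations reproduce the same empirical joint, so the likelihood ratio is 1 and the posterior returns the prior even though the data were generated by X -> Y. A fully Bayesian treatment would integrate over the parameters instead of plugging in fitted values; the two marginal likelihoods then coincide only when the parameter priors treat the two factorizations alike.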
Oct 27, 2020 4 tweets 1 min read
When I see a paper on explainability, the first question I ask is: "What does it explain?" The data-fitting strategy of the fitter, or real-life events such as death or survival?
I believe this paper arxiv.org/pdf/2010.10596…
is mostly about the former, as can be seen from the
1/
equations and from the absence of any world-model.
While it is sometimes useful to explain the data-fitting system (e.g., for debugging), it is also important to distinguish this kind of counterfactual explanation from the kind generated in the causal inference literature.
2/3
Jun 20, 2020 4 tweets 2 min read
1/4 Comments on your Front-Door paper:
* The expression "a single, strictly exogenous mediator variable" is problematic: (1) Causality p. 82 defines FDC as "A set of variables", not "a single variable". (2) "exogenous mediator" is an oxymoron. I originally called it (1973):
2/ "Mediating Instrumental Variables" ucla.in/2pJzGNK, best described as an "exogenously-disturbed mediator".

* "The first application of FDC" sounds too pessimistic. Situations involving exogenously-disturbed mediators are at least as plausible as "exclusion-restricted
Mar 28, 2020 4 tweets 2 min read
1/ I'm glad, Sean, that our brief exchange has resulted in your great clarification of the issues, from which I have learned a lot. Two thoughts come immediately to mind: (1) It is a blessing that we can enjoy a division of labor between CI and statistics, the former generates
2/ causal estimands, the latter estimates them. Note though that the former is not totally oblivious to the type of data available. Different types of data will result in different estimands, e.g., experimental vs. observational, corrupted by missingness or by proxies or by