What is predictable? A commentary on Tetlock et al. (2023)

Daniel Treisman
{"title":"What is predictable? A commentary on Tetlock et al. (2023)","authors":"Daniel Treisman","doi":"10.1002/ffo2.166","DOIUrl":null,"url":null,"abstract":"<p>Is prediction possible in world politics—and, if so, when? Tetlock et al. (<span>2023</span>) report some of the first systematic evidence on long-range political forecasting. Asked to guess which countries would get nuclear weapons within 25 years and which would undergo border changes due to war or secession, both experts and educated generalists outperformed chance. On nuclear proliferation—but not border changes—the experts beat the generalists, and the difference grew as the time scale increased from 5 to 25 years. What are we to make of this? The authors see messages for both “skeptics,” who consider the political future irreducibly opaque, and “meliorists,” who acknowledge the difficulties but think expertise can still improve predictions. Moreover, they suggest progress could be made through adversarial collaboration between scholars of the two persuasions, which would push both to specify their priors and adopt falsifiable positions.</p><p>It's hard not to admire a research paper that has been more than 25 years in the making—and one can only rejoice that picky referees did not insist the experiment be rerun from scratch. The results prompt two broader questions. First, what makes something easier or harder to predict? Second, when does expertise help? At the risk of restating the obvious, let me offer a few thoughts.</p><p>For clarity, consider a task like those in the article.<sup>1</sup> Respondents at time <i>t</i> must guess the value of a variable <math>\n <semantics>\n <mrow>\n <mrow>\n <msub>\n <mi>Y</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n <mo>+</mo>\n <mi>N</mi>\n </mrow>\n </msub>\n <mo>∈</mo>\n <mo>{</mo>\n <mi>Yes</mi>\n <mo>,</mo>\n <mi>No</mi>\n <mo>}</mo>\n </mrow>\n </mrow>\n <annotation> ${Y}_{i,t+N}\\in \\{\\mathrm{Yes},\\mathrm{No}\\}$</annotation>\n </semantics></math>, <i>N</i> years in the future, for <i>I</i> countries indexed by <i>i</i>. The “success rate” is the proportion of countries for which the respondent chooses correctly. A task of this kind, <i>A</i>, is “easier” for a given individual than another task, <i>B</i>, if that individual's success rate on <i>A</i> tends to be higher than his success rate on <i>B</i>.</p><p>When will that be the case? The authors give a few examples of easy and difficult tasks. That New Zealand and Norway will not fight a war is “trivially obvious” (p. 1). That anyone could guess who will be US president in 25 years is “far-fetched” (p. 2). They sought challenges for their respondents that fell within the “Goldilocks zone of difficulty” (p. 2), but they do not say what principles or heuristics guided this choice.</p><p>Prediction involves the marriage of information to causal models, explicit or intuitive. This suggests a three-way division of determinants. Difficulty of prediction should depend on: (1) the nature of the underlying causal process, (2) the quality of available models, and (3) the supply of available information.</p><p>Getting a high success rate will be easier when <i>the causal process is regular</i>. 
Most simply, that requires that <math>\n <semantics>\n <mrow>\n <mrow>\n <msub>\n <mi>Y</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n <mo>+</mo>\n <mi>N</mi>\n </mrow>\n </msub>\n </mrow>\n </mrow>\n <annotation> ${Y}_{i,t+N}$</annotation>\n </semantics></math> have a well-defined mean or linear trend, to which it regresses over time (the process is stationary, in the sense that: <math>\n <semantics>\n <mrow>\n <mrow>\n <msub>\n <mi>Y</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n <mo>=</mo>\n <mi>α</mi>\n <mo>+</mo>\n <mi>γ</mi>\n <msub>\n <mi>Y</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n <mo>−</mo>\n <mn>1</mn>\n </mrow>\n </msub>\n <mo>+</mo>\n <mi>β</mi>\n <msub>\n <mi>X</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n <mo>+</mo>\n <msub>\n <mi>ε</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n </mrow>\n </mrow>\n <annotation> ${Y}_{i,t}=\\alpha +\\gamma {Y}_{i,t-1}+\\beta {X}_{i,t}+{\\varepsilon }_{i,t}$</annotation>\n </semantics></math> where <math>\n <semantics>\n <mrow>\n <mrow>\n <mn>0</mn>\n <mo>≤</mo>\n <mi>γ</mi>\n <mo>&lt;</mo>\n <mn>1</mn>\n <mo>;</mo>\n <mi>E</mi>\n <mo>(</mo>\n <msub>\n <mi>ε</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n <mo>)</mo>\n <mo>=</mo>\n <mn>0</mn>\n <mo>;</mo>\n <mi>V</mi>\n <mi>a</mi>\n <mi>r</mi>\n <mo>(</mo>\n <msub>\n <mi>ε</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n <mo>)</mo>\n <mo>=</mo>\n <msup>\n <mi>σ</mi>\n <mn>2</mn>\n </msup>\n </mrow>\n </mrow>\n <annotation> $0\\le \\gamma \\lt 1;E({\\varepsilon }_{i,t})=0;Var({\\varepsilon }_{i,t})={\\sigma }^{2}$</annotation>\n </semantics></math> and <math>\n <semantics>\n <mrow>\n <mrow>\n <msub>\n <mi>X</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n </mrow>\n </mrow>\n <annotation> ${X}_{i,t}$</annotation>\n </semantics></math> is a vector of exogenous variables that may include a time trend). Extrapolation from past to future is then possible. With variables that follow a random walk, the value of past information depreciates rapidly. Prediction is also easier if the respondent has <i>a well-tested causal model</i> (in the authors' phrase, “relevant and durable principles of causality… to guide forecasters” (p. 18)). The more tests the model has undergone, the more confidence one can generally have in estimated parameters. Finally, applying a valid model requires <i>information about the values of causal variables</i>, <math>\n <semantics>\n <mrow>\n <mrow>\n <msub>\n <mi>X</mi>\n <mrow>\n <mi>i</mi>\n <mo>,</mo>\n <mi>t</mi>\n </mrow>\n </msub>\n </mrow>\n </mrow>\n <annotation> ${X}_{i,t}$</annotation>\n </semantics></math>.</p><p>These points also suggest when expertise will make a difference. Experts are those who have studied similar cases to develop causal models. When a causal process is highly irregular, study will not help, so regularity is a condition not just for predictability but also for expertise to matter. However, if processes are too simple, even laymen may decode them, so a certain degree of complexity should boost the expertise premium. Models will be more reliable when there has been plenty of opportunity to test them. So another condition is a rich history of similar cases. 
Finally, expertise should also increase familiarity with relevant information, so experts will have an advantage when information is accessible, but not too easily available to laypeople.</p><p>These expectations about when expert analysis will improve prediction overlap with conditions under which subconscious intuition should be most trustworthy. “When do judgments reflect true expertise?” Kahneman (<span>2011</span>) asks. Answer: When the environment is regular, and when the experts have had opportunity “to learn these regularities through prolonged practice.” His example of reliable intuitive judgment is chess; by contrast, “stock picking and long-term political forecasting” are “zero-validity environments.”<sup>2</sup> Psychology also suggests another type of expertise that may improve forecasting—familiarity with common biases and practice in “taming intuitive predictions.”</p><p>Does time frame, <i>N</i>, affect predictability? That we know more about tomorrow than about the distant future is a cliché. Often that's right. Whether a woman will win the 2024 US presidential election is easier to guess than the gender of the 2044 winner. But short-run is not always easier. Whether Joe Biden will be president in 2025 is harder to predict than whether he will be in 2030. If change tends to occur in one direction, the odds of transition may cumulate, eventually reducing uncertainty about the outcome (“in the long run, we are all dead”). Measurable structural factors may over time outweigh unmeasurable contingencies. Estimating the impact of economic development on democratization is easier in 20-year than in 1-year periods (Treisman, <span>2015</span>). In autocracies with income close to $5000, the best prediction of regime type in year <i>t</i> + 30—“autocracy”—would be right 57% of the time. The best prediction in year <i>t</i> + 60—“democracy”—would have an 81% success rate.<sup>3</sup></p><p>What about complexity? While this tends to make things harder and increase the expert advantage, the outcomes of complex macro-systems are sometimes easier to predict than microchoices. The level of demand for cars in the US next year may be easier to forecast than which individuals will buy one. Sometimes emergent properties of complex systems are more regular than the individual actions that comprise them.</p><p>Some think human actions are less predictable than physical phenomena because of subjectivity. This, too, is not always true. Some human behavior is highly structured, while some physical phenomena are extremely irregular. Among the games humans play, some have single equilibria, making outcomes easy to forecast. Others—often involving asymmetric information and beliefs about the beliefs of others—have multiple equilibria even for the same observable parameters. When many sets of mutually consistent beliefs are possible, it is hard to know which will be “selected.” Expertise <i>might</i> help—for instance, to identify relevant “focal points”—but often it will not. These considerations are summarized in Table 1.</p><p>Might these ideas explain the difference in results for nuclear proliferation and border change? Although both causal processes are complex, that for nuclear proliferation involves fewer key players (state governments), and their identities are known, as opposed to secessionist groups that might emerge in future. In both cases, experts will be more familiar with key information than generalists. 
But whereas nuclear experts tend to know something about all potential nuclear powers, scholars who publish on secession often specialize in limited geographical areas. Few have expertise on both the Kuril Islands and the Ethiopia–Somalia border. Both questions may turn on beliefs about beliefs. But secession—like most processes involving mass mobilization—is particularly prone to multiple equilibria. Few want to join a movement that is too small to be effective, even if many would join one large enough to succeed. Models of such processes go by names like “tipping” or “prairie fires” that evoke the speed with which one equilibrium can replace another (Kuran, <span>1989</span>; Schelling, <span>1978</span>).</p><p>All these factors suggest why border changes may be harder to predict than nuclear proliferation and less subject to expertise. On the other hand, there are far fewer past instances of the latter from which to learn. Nine states have acquired nuclear weapons, while at least 817 border changes occurred between 1816 and 1996 (Tir et al., <span>1998</span>). Here, the greater number of cases seems to have been more than offset by the first three considerations. Although prediction in politics is bound to remain difficult, perhaps thinking along these lines may improve our predictions about what can and cannot be predicted.</p>","PeriodicalId":100567,"journal":{"name":"FUTURES & FORESIGHT SCIENCE","volume":"6 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ffo2.166","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"FUTURES & FORESIGHT SCIENCE","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ffo2.166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Is prediction possible in world politics—and, if so, when? Tetlock et al. (2023) report some of the first systematic evidence on long-range political forecasting. Asked to guess which countries would get nuclear weapons within 25 years and which would undergo border changes due to war or secession, both experts and educated generalists outperformed chance. On nuclear proliferation—but not border changes—the experts beat the generalists, and the difference grew as the time scale increased from 5 to 25 years. What are we to make of this? The authors see messages for both “skeptics,” who consider the political future irreducibly opaque, and “meliorists,” who acknowledge the difficulties but think expertise can still improve predictions. Moreover, they suggest progress could be made through adversarial collaboration between scholars of the two persuasions, which would push both to specify their priors and adopt falsifiable positions.

It's hard not to admire a research paper that has been more than 25 years in the making—and one can only rejoice that picky referees did not insist the experiment be rerun from scratch. The results prompt two broader questions. First, what makes something easier or harder to predict? Second, when does expertise help? At the risk of restating the obvious, let me offer a few thoughts.

For clarity, consider a task like those in the article.1 Respondents at time $t$ must guess the value of a variable $Y_{i,t+N} \in \{\mathrm{Yes}, \mathrm{No}\}$, $N$ years in the future, for $I$ countries indexed by $i$. The “success rate” is the proportion of countries for which the respondent chooses correctly. A task of this kind, $A$, is “easier” for a given individual than another task, $B$, if that individual's success rate on $A$ tends to be higher than his success rate on $B$.
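
To make the definition concrete, here is a minimal sketch of the success-rate comparison. The country names, guesses, and outcomes are invented purely for illustration; they are not drawn from the article.

```python
# Minimal sketch of the success-rate comparison defined above.
# Countries, guesses, and outcomes are invented for illustration.

def success_rate(guesses: dict, outcomes: dict) -> float:
    """Share of countries for which the respondent's Yes/No guess was correct."""
    correct = sum(guesses[c] == outcomes[c] for c in outcomes)
    return correct / len(outcomes)

# Task A (e.g., nuclear status at t + N) and task B (e.g., border change by t + N)
guesses_A  = {"Alpha": "No",  "Beta": "Yes", "Gamma": "No"}
outcomes_A = {"Alpha": "No",  "Beta": "No",  "Gamma": "No"}
guesses_B  = {"Alpha": "No",  "Beta": "No",  "Gamma": "Yes"}
outcomes_B = {"Alpha": "Yes", "Beta": "No",  "Gamma": "No"}

print(success_rate(guesses_A, outcomes_A))  # ~0.67: task A is "easier" for this respondent
print(success_rate(guesses_B, outcomes_B))  # ~0.33
```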

When will that be the case? The authors give a few examples of easy and difficult tasks. That New Zealand and Norway will not fight a war is “trivially obvious” (p. 1). That anyone could guess who will be US president in 25 years is “far-fetched” (p. 2). They sought challenges for their respondents that fell within the “Goldilocks zone of difficulty” (p. 2), but they do not say what principles or heuristics guided this choice.

Prediction involves the marriage of information to causal models, explicit or intuitive. This suggests a three-way division of determinants. Difficulty of prediction should depend on: (1) the nature of the underlying causal process, (2) the quality of available models, and (3) the supply of available information.

Getting a high success rate will be easier when the causal process is regular. Most simply, that requires that $Y_{i,t+N}$ have a well-defined mean or linear trend, to which it regresses over time (the process is stationary, in the sense that $Y_{i,t} = \alpha + \gamma Y_{i,t-1} + \beta X_{i,t} + \varepsilon_{i,t}$, where $0 \le \gamma < 1$, $E(\varepsilon_{i,t}) = 0$, $\mathrm{Var}(\varepsilon_{i,t}) = \sigma^{2}$, and $X_{i,t}$ is a vector of exogenous variables that may include a time trend). Extrapolation from past to future is then possible. With variables that follow a random walk, the value of past information depreciates rapidly. Prediction is also easier if the respondent has a well-tested causal model (in the authors' phrase, “relevant and durable principles of causality… to guide forecasters” (p. 18)). The more tests the model has undergone, the more confidence one can generally have in estimated parameters. Finally, applying a valid model requires information about the values of causal variables, $X_{i,t}$.
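
The difference between a mean-reverting process and a random walk can be seen in a short simulation. This is only a sketch under assumed parameter values (they are arbitrary, not estimates of anything): for the stationary series, the error of the best forecast at a 25-year horizon stays close to the process's own spread, while for the random walk it keeps growing with the horizon.

```python
# Sketch: N-step-ahead forecast error for a stationary AR(1) process vs a random walk.
# All parameter values are arbitrary illustrations, not estimates.
import numpy as np

rng = np.random.default_rng(0)
sigma, N, T, reps = 1.0, 25, 200, 2000

def mean_abs_error(alpha: float, gamma: float) -> float:
    """Mean absolute error of the best linear forecast of Y_{t+N} given Y_t."""
    errors = []
    for _ in range(reps):
        y = np.zeros(T)
        for t in range(1, T):
            y[t] = alpha + gamma * y[t - 1] + sigma * rng.standard_normal()
        y_now, y_later = y[-N - 1], y[-1]
        if gamma < 1:   # stationary: forecast reverts toward the long-run mean
            mu = alpha / (1 - gamma)
            forecast = mu + gamma ** N * (y_now - mu)
        else:           # random walk: extrapolate the current value (plus any drift)
            forecast = y_now + alpha * N
        errors.append(abs(y_later - forecast))
    return float(np.mean(errors))

print(mean_abs_error(alpha=1.0, gamma=0.6))  # stays near the process's own spread (~1)
print(mean_abs_error(alpha=0.0, gamma=1.0))  # grows roughly as sigma * sqrt(N) (~4)
```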

These points also suggest when expertise will make a difference. Experts are those who have studied similar cases to develop causal models. When a causal process is highly irregular, study will not help, so regularity is a condition not just for predictability but also for expertise to matter. However, if processes are too simple, even laymen may decode them, so a certain degree of complexity should boost the expertise premium. Models will be more reliable when there has been plenty of opportunity to test them. So another condition is a rich history of similar cases. Finally, expertise should also increase familiarity with relevant information, so experts will have an advantage when information is accessible, but not too easily available to laypeople.

These expectations about when expert analysis will improve prediction overlap with conditions under which subconscious intuition should be most trustworthy. “When do judgments reflect true expertise?” Kahneman (2011) asks. Answer: When the environment is regular, and when the experts have had opportunity “to learn these regularities through prolonged practice.” His example of reliable intuitive judgment is chess; by contrast, “stock picking and long-term political forecasting” are “zero-validity environments.”2 Psychology also suggests another type of expertise that may improve forecasting—familiarity with common biases and practice in “taming intuitive predictions.”

Does time frame, $N$, affect predictability? That we know more about tomorrow than about the distant future is a cliché. Often that's right. Whether a woman will win the 2024 US presidential election is easier to guess than the gender of the 2044 winner. But the short run is not always easier. Whether Joe Biden will be president in 2025 is harder to predict than whether he will be in 2030. If change tends to occur in one direction, the odds of transition may cumulate, eventually reducing uncertainty about the outcome (“in the long run, we are all dead”). Measurable structural factors may over time outweigh unmeasurable contingencies. Estimating the impact of economic development on democratization is easier in 20-year than in 1-year periods (Treisman, 2015). In autocracies with income close to $5,000, the best prediction of regime type in year $t+30$—“autocracy”—would be right 57% of the time. The best prediction in year $t+60$—“democracy”—would have an 81% success rate.3
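
The cumulation logic can be illustrated with a back-of-the-envelope calculation. The 2% annual transition probability below is hypothetical and held constant (ignoring reversals and income growth), so it only mimics the qualitative pattern, not the figures from Treisman (2015).

```python
# Sketch: how a small, one-directional annual chance of transition cumulates over time.
# The 2% annual democratization hazard is hypothetical; real hazards vary with income,
# and transitions can reverse, which this toy calculation ignores.
p_annual = 0.02

def prob_still_autocracy(years: int) -> float:
    """Probability that no transition has occurred after the given number of years."""
    return (1 - p_annual) ** years

for horizon in (1, 30, 60):
    p_auto = prob_still_autocracy(horizon)
    best_guess = "autocracy" if p_auto > 0.5 else "democracy"
    print(f"t+{horizon:>2}: P(still autocracy) = {p_auto:.2f} -> best guess: {best_guess}")

# t+ 1: P = 0.98 -> "autocracy" is almost certainly right
# t+30: P = 0.55 -> "autocracy", but barely better than a coin flip
# t+60: P = 0.30 -> "democracy", right about 70% of the time
```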

What about complexity? While this tends to make things harder and increase the expert advantage, the outcomes of complex macro-systems are sometimes easier to predict than microchoices. The level of demand for cars in the US next year may be easier to forecast than which individuals will buy one. Sometimes emergent properties of complex systems are more regular than the individual actions that comprise them.
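
Why the aggregate can be more forecastable than the microchoices that produce it: individual decisions are noisy, but their sum concentrates tightly around its expected value. A toy simulation with a made-up purchase probability:

```python
# Sketch: individual car purchases are hard to call, but their aggregate concentrates.
# The population size and 6% purchase probability are made-up illustrations.
import numpy as np

rng = np.random.default_rng(1)
n_households, p_buy = 1_000_000, 0.06

buys = rng.random(n_households) < p_buy   # True if the household buys a car

# For any single household, even the best rule ("predict no purchase") is wrong
# about 6% of the time and says nothing about who actually buys.
# For the total, the realized share lands almost exactly on p_buy.
print(buys.mean())                                  # ~0.0600
print(np.sqrt(p_buy * (1 - p_buy) / n_households))  # std. error of the share: ~0.00024
```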

Some think human actions are less predictable than physical phenomena because of subjectivity. This, too, is not always true. Some human behavior is highly structured, while some physical phenomena are extremely irregular. Among the games humans play, some have a unique equilibrium, making outcomes easy to forecast. Others—often involving asymmetric information and beliefs about the beliefs of others—have multiple equilibria even for the same observable parameters. When many sets of mutually consistent beliefs are possible, it is hard to know which will be “selected.” Expertise might help—for instance, to identify relevant “focal points”—but often it will not. These considerations are summarized in Table 1.

Might these ideas explain the difference in results for nuclear proliferation and border change? Although both causal processes are complex, that for nuclear proliferation involves fewer key players (state governments), and their identities are known, as opposed to secessionist groups that might emerge in the future. In both cases, experts will be more familiar with key information than generalists. But whereas nuclear experts tend to know something about all potential nuclear powers, scholars who publish on secession often specialize in limited geographical areas. Few have expertise on both the Kuril Islands and the Ethiopia–Somalia border. Both questions may turn on beliefs about beliefs. But secession—like most processes involving mass mobilization—is particularly prone to multiple equilibria. Few want to join a movement that is too small to be effective, even if many would join one large enough to succeed. Models of such processes go by names like “tipping” or “prairie fires” that evoke the speed with which one equilibrium can replace another (Kuran, 1989; Schelling, 1978).
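
The tipping logic can be made concrete with a stylized threshold model in the spirit of Schelling (1978) and Kuran (1989): each person joins once the share already mobilized exceeds a personal threshold. The threshold distributions below are invented; the point is only that two societies with nearly identical preferences can settle at opposite equilibria.

```python
# Sketch: a stylized threshold ("tipping") model of mobilization, in the spirit of
# Schelling (1978) and Kuran (1989). Threshold distributions are invented for illustration.
import numpy as np

def equilibrium_share(thresholds: np.ndarray, start: float = 0.0, iters: int = 1000) -> float:
    """Iterate participation: next share = fraction whose threshold is already met."""
    share = start
    for _ in range(iters):
        share = float(np.mean(thresholds <= share))
    return share

rng = np.random.default_rng(2)
n = 10_000
# Society A: nobody joins until at least 5% are already mobilized.
society_a = rng.uniform(0.05, 1.0, n)
# Society B: the same thresholds, except a small core (6%) joins unconditionally.
society_b = np.concatenate([np.zeros(600), rng.uniform(0.05, 1.0, n - 600)])

print(equilibrium_share(society_a))  # ~0.0: the movement never starts
print(equilibrium_share(society_b))  # ~1.0: the tiny core tips nearly everyone in
```

Which equilibrium is selected here depends on a detail (the small unconditional core) that may be invisible in advance, which is one way to read the claim that mass-mobilization processes are especially hard to forecast.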

All these factors suggest why border changes may be harder to predict than nuclear proliferation and less subject to expertise. On the other hand, there are far fewer past instances of the latter from which to learn. Nine states have acquired nuclear weapons, while at least 817 border changes occurred between 1816 and 1996 (Tir et al., 1998). Here, the greater number of cases seems to have been more than offset by the first three considerations. Although prediction in politics is bound to remain difficult, perhaps thinking along these lines may improve our predictions about what can and cannot be predicted.
