What is predictable? A commentary on Tetlock et al. (2023)
Daniel Treisman
Futures & Foresight Science, vol. 6, no. 1 (2023)
DOI: 10.1002/ffo2.166
Published online: 2023-07-26
Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ffo2.166
Abstract
Is prediction possible in world politics—and, if so, when? Tetlock et al. (2023) report some of the first systematic evidence on long-range political forecasting. Asked to guess which countries would get nuclear weapons within 25 years and which would undergo border changes due to war or secession, both experts and educated generalists outperformed chance. On nuclear proliferation—but not border changes—the experts beat the generalists, and the difference grew as the time scale increased from 5 to 25 years. What are we to make of this? The authors see messages for both “skeptics,” who consider the political future irreducibly opaque, and “meliorists,” who acknowledge the difficulties but think expertise can still improve predictions. Moreover, they suggest progress could be made through adversarial collaboration between scholars of the two persuasions, which would push both to specify their priors and adopt falsifiable positions.
It's hard not to admire a research paper that has been more than 25 years in the making—and one can only rejoice that picky referees did not insist the experiment be rerun from scratch. The results prompt two broader questions. First, what makes something easier or harder to predict? Second, when does expertise help? At the risk of restating the obvious, let me offer a few thoughts.
For clarity, consider a task like those in the article.1 Respondents at time t must guess the value of a variable $Y_{i,t+N}\in \{\mathrm{Yes},\mathrm{No}\}$, N years in the future, for I countries indexed by i. The “success rate” is the proportion of countries for which the respondent chooses correctly. A task of this kind, A, is “easier” for a given individual than another task, B, if that individual's success rate on A tends to be higher than his success rate on B.
When will that be the case? The authors give a few examples of easy and difficult tasks. That New Zealand and Norway will not fight a war is “trivially obvious” (p. 1). That anyone could guess who will be US president in 25 years is “far-fetched” (p. 2). They sought challenges for their respondents that fell within the “Goldilocks zone of difficulty” (p. 2), but they do not say what principles or heuristics guided this choice.
Prediction involves the marriage of information to causal models, explicit or intuitive. This suggests a three-way division of determinants. Difficulty of prediction should depend on: (1) the nature of the underlying causal process, (2) the quality of available models, and (3) the supply of available information.
Getting a high success rate will be easier when the causal process is regular. Most simply, that requires that $Y_{i,t+N}$ have a well-defined mean or linear trend, to which it regresses over time (the process is stationary, in the sense that $Y_{i,t}=\alpha +\gamma Y_{i,t-1}+\beta X_{i,t}+\varepsilon_{i,t}$, where $0\le \gamma <1$; $E(\varepsilon_{i,t})=0$; $\mathrm{Var}(\varepsilon_{i,t})=\sigma^{2}$; and $X_{i,t}$ is a vector of exogenous variables that may include a time trend). Extrapolation from past to future is then possible. With variables that follow a random walk, the value of past information depreciates rapidly. Prediction is also easier if the respondent has a well-tested causal model (in the authors' phrase, “relevant and durable principles of causality… to guide forecasters” (p. 18)). The more tests the model has undergone, the more confidence one can generally have in estimated parameters. Finally, applying a valid model requires information about the values of causal variables, $X_{i,t}$.
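The contrast between a stationary process and a random walk can be made concrete with a short simulation. This is an illustration under assumed parameters (and with the exogenous vector $X_{i,t}$ omitted for simplicity), not part of the original analysis:

```python
import random

random.seed(1)

def simulate(gamma, alpha=0.0, sigma=1.0, n=10_000):
    """Simulate y_t = alpha + gamma * y_{t-1} + eps_t.

    gamma < 1 gives a stationary (mean-reverting) process;
    gamma = 1 gives a random walk. The X_t term is omitted here.
    """
    y, series = 0.0, []
    for _ in range(n):
        y = alpha + gamma * y + random.gauss(0.0, sigma)
        series.append(y)
    return series

def forecast_error(series, horizon):
    """Mean squared error of the naive extrapolation y_{t+N} ~ y_t."""
    errors = [(series[t + horizon] - series[t]) ** 2
              for t in range(len(series) - horizon)]
    return sum(errors) / len(errors)

stationary = simulate(gamma=0.5)  # mean-reverting
walk = simulate(gamma=1.0)        # random walk

# For the stationary process, the error of extrapolating from y_t
# plateaus as the horizon N grows; for the random walk it keeps
# growing, so past information depreciates rapidly.
for N in (1, 5, 25):
    print(N, forecast_error(stationary, N), forecast_error(walk, N))
```

The same horizons used in the experiment (5 and 25 years) thus penalize extrapolation far more heavily when the process has a unit root than when it reverts to a mean.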
These points also suggest when expertise will make a difference. Experts are those who have studied similar cases to develop causal models. When a causal process is highly irregular, study will not help, so regularity is a condition not just for predictability but also for expertise to matter. However, if processes are too simple, even laymen may decode them, so a certain degree of complexity should boost the expertise premium. Models will be more reliable when there has been plenty of opportunity to test them. So another condition is a rich history of similar cases. Finally, expertise should also increase familiarity with relevant information, so experts will have an advantage when information is accessible, but not too easily available to laypeople.
These expectations about when expert analysis will improve prediction overlap with conditions under which subconscious intuition should be most trustworthy. “When do judgments reflect true expertise?” Kahneman (2011) asks. Answer: When the environment is regular, and when the experts have had opportunity “to learn these regularities through prolonged practice.” His example of reliable intuitive judgment is chess; by contrast, “stock picking and long-term political forecasting” are “zero-validity environments.”2 Psychology also suggests another type of expertise that may improve forecasting—familiarity with common biases and practice in “taming intuitive predictions.”
Does time frame, N, affect predictability? That we know more about tomorrow than about the distant future is a cliché. Often that's right. Whether a woman will win the 2024 US presidential election is easier to guess than the gender of the 2044 winner. But short-run is not always easier. Whether Joe Biden will be president in 2025 is harder to predict than whether he will be in 2030. If change tends to occur in one direction, the odds of transition may cumulate, eventually reducing uncertainty about the outcome (“in the long run, we are all dead”). Measurable structural factors may over time outweigh unmeasurable contingencies. Estimating the impact of economic development on democratization is easier in 20-year than in 1-year periods (Treisman, 2015). In autocracies with income close to $5000, the best prediction of regime type in year t + 30—“autocracy”—would be right 57% of the time. The best prediction in year t + 60—“democracy”—would have an 81% success rate.3
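The cumulation of one-directional transition odds can be sketched with a toy calculation. The 2% annual hazard below is hypothetical, chosen only to illustrate the mechanism; a constant-hazard, no-reversal model will not reproduce the exact 57% and 81% figures from Treisman (2015):

```python
def p_autocracy(hazard, years):
    """Probability that a country starting autocratic is still autocratic
    after `years`, assuming a constant annual democratization hazard and
    no reversals (a deliberately simple one-directional model)."""
    return (1.0 - hazard) ** years

def best_guess(hazard, years):
    """The guess that maximizes the expected success rate, with its accuracy."""
    p = p_autocracy(hazard, years)
    return ("autocracy", p) if p >= 0.5 else ("democracy", 1.0 - p)

# Hypothetical 2% annual chance of democratization:
for n in (1, 30, 60):
    guess, accuracy = best_guess(0.02, n)
    print(f"t+{n}: guess '{guess}', expected accuracy {accuracy:.0%}")
```

The run shows the pattern in the text: near-certainty at short horizons, maximum uncertainty at intermediate ones (where the best guess flips), and rising accuracy again at long horizons as the cumulated odds dominate.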
What about complexity? While this tends to make things harder and increase the expert advantage, the outcomes of complex macro-systems are sometimes easier to predict than microchoices. The level of demand for cars in the US next year may be easier to forecast than which individuals will buy one. Sometimes emergent properties of complex systems are more regular than the individual actions that comprise them.
Some think human actions are less predictable than physical phenomena because of subjectivity. This, too, is not always true. Some human behavior is highly structured, while some physical phenomena are extremely irregular. Among the games humans play, some have single equilibria, making outcomes easy to forecast. Others—often involving asymmetric information and beliefs about the beliefs of others—have multiple equilibria even for the same observable parameters. When many sets of mutually consistent beliefs are possible, it is hard to know which will be “selected.” Expertise might help—for instance, to identify relevant “focal points”—but often it will not. These considerations are summarized in Table 1.
Might these ideas explain the difference in results for nuclear proliferation and border change? Although both causal processes are complex, that for nuclear proliferation involves fewer key players (state governments), and their identities are known, as opposed to secessionist groups that might emerge in future. In both cases, experts will be more familiar with key information than generalists. But whereas nuclear experts tend to know something about all potential nuclear powers, scholars who publish on secession often specialize in limited geographical areas. Few have expertise on both the Kuril Islands and the Ethiopia–Somalia border. Both questions may turn on beliefs about beliefs. But secession—like most processes involving mass mobilization—is particularly prone to multiple equilibria. Few want to join a movement that is too small to be effective, even if many would join one large enough to succeed. Models of such processes go by names like “tipping” or “prairie fires” that evoke the speed with which one equilibrium can replace another (Kuran, 1989; Schelling, 1978).
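The multiple-equilibrium logic of mass mobilization can be sketched with a minimal threshold model in the spirit of Kuran (1989) and Schelling (1978). The threshold distributions below are invented for illustration:

```python
def mobilization(thresholds):
    """Iterate to a fixed point: a person joins the movement once the
    number of current participants meets their personal threshold.
    Returns the equilibrium number of participants."""
    joined = 0
    while True:
        new_joined = sum(1 for t in thresholds if t <= joined)
        if new_joined == joined:
            return joined
        joined = new_joined

# Ten citizens; person k joins once k others are visibly participating.
society_a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # a chain of dominoes
society_b = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # nearly identical, but no one moves first

print(mobilization(society_a))  # prints 10: a full cascade
print(mobilization(society_b))  # prints 0: nothing happens
```

Two societies with almost indistinguishable observable parameters land in opposite equilibria, which is exactly why an observer, expert or not, struggles to predict which secessionist "prairie fire" will catch.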
All these factors suggest why border changes may be harder to predict than nuclear proliferation and less subject to expertise. On the other hand, there are far fewer past instances of the latter from which to learn. Nine states have acquired nuclear weapons, while at least 817 border changes occurred between 1816 and 1996 (Tir et al., 1998). Here, the greater number of cases seems to have been more than offset by the first three considerations. Although prediction in politics is bound to remain difficult, perhaps thinking along these lines may improve our predictions about what can and cannot be predicted.