
Journal of Cultural Analytics: Latest Publications

Divergence and the Complexity of Difference in Text and Culture
Q1 Arts and Humanities Pub Date : 2020-10-07 DOI: 10.22148/001c.17585
Kent K. Chang, S. Dedeo
Measuring how much two documents differ is a basic task in the quantitative analysis of text. Because difference is a complex, interpretive concept, researchers often operationalize difference as distance, a mathematical function that represents documents through a metaphor of physical space. Yet the constraints of that metaphor mean that distance can only capture some of the ways that documents can relate to each other. We show how a more general concept, divergence, can help solve this problem, alerting us to new ways in which documents can relate to each other. In contrast to distance, divergence can capture enclosure relationships, where two documents differ because the patterns found in one are a partial subset of those in the other, and the emergence of shortcuts, where two documents can be brought closer through mediation by a third. We provide an example of this difference measure, Kullback–Leibler Divergence, and apply it to two worked examples: the presentation of scientific arguments in Charles Darwin’s Origin of Species (1859) and the rhetorical structure of philosophical texts by Aristotle, David Hume, and Immanuel Kant. These examples illuminate the complex relationship between time and what we refer to as an archive’s “enclosure architecture”, and show how divergence can be used in the quantitative analysis of historical, literary, and cultural texts to reveal cognitive structures invisible to spatial metaphors.
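As an illustration of the measure the abstract names, the sketch below computes Kullback–Leibler divergence between two smoothed unigram word distributions and prints it in both directions, showing the asymmetry that lets divergence register enclosure (one document's vocabulary being a partial subset of the other's). The example texts, smoothing constant, and preprocessing are illustrative assumptions, not the authors' actual corpus or pipeline.

```python
from collections import Counter
import math

def unigram_dist(text, vocab, alpha=0.01):
    """Smoothed unigram probability distribution over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_w P(w) * log2(P(w) / Q(w)), in bits."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

doc_a = "natural selection acts on variation in populations"
doc_b = "natural selection acts on variation"          # roughly a subset of doc_a
vocab = set(doc_a.lower().split()) | set(doc_b.lower().split())

p = unigram_dist(doc_a, vocab)
q = unigram_dist(doc_b, vocab)

# Asymmetry: reading doc_a through doc_b's expectations costs more than the reverse,
# because doc_a contains patterns absent from doc_b (the enclosure relationship).
print(f"D(A || B) = {kl_divergence(p, q):.3f} bits")
print(f"D(B || A) = {kl_divergence(q, p):.3f} bits")
```

Unlike a distance, the two directions need not agree, which is exactly the extra structure the article exploits.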
Citations: 12
Can GPT-3 Pass a Writer’s Turing Test?
Q1 Arts and Humanities Pub Date : 2020-09-14 DOI: 10.22148/001C.17212
Katherine Elkins, Jon Chun
Until recently the field of natural language generation relied upon formalized grammar systems, small-scale statistical models, and lengthy sets of heuristic rules. This older technology was fairly limited and brittle: it could remix language into word salad poems or chat with humans within narrowly defined topics. Recently, very large-scale statistical language models have dramatically advanced the field, and GPT-3 is just one example. It can internalize the rules of language without explicit programming or rules. Instead, much like a human child, GPT-3 learns language through repeated exposure, albeit on a much larger scale. Without explicit rules, it can sometimes fail at the simplest of linguistic tasks, but it can also excel at more difficult ones like imitating an author or waxing philosophical.
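To make concrete the "older technology" of small-scale statistical models the abstract contrasts with GPT-3, here is a minimal bigram sampler of the kind that can remix language into word salad: it learns only adjacent-word statistics from a tiny corpus and generates by chaining them. The training sentences and parameters are invented for illustration.

```python
import random
from collections import defaultdict

def train_bigrams(sentences):
    """Collect, for each word, the list of words observed to follow it."""
    model = defaultdict(list)
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev].append(nxt)
    return model

def generate(model, max_len=20):
    """Remix the training language by repeatedly sampling a plausible next word."""
    word, output = "<s>", []
    while len(output) < max_len:
        word = random.choice(model[word])
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)

corpus = [
    "the model learns the rules of language",
    "the child learns language through repeated exposure",
    "language models can chat with humans",
]
model = train_bigrams(corpus)
print(generate(model))  # e.g. "the child learns the rules of language models can chat ..."
```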
Citations: 91
Dramatic Structure and Social Status in Shakespeare’s Plays
Q1 Arts and Humanities Pub Date : 2020-04-08 DOI: 10.22148/001c.12556
Heather Froehlich
This article discusses ways that dramatic structure can be analyzed through the use of social titles in Shakespeare’s plays. Freytag’s (1863) pyramid of dramatic structure is based on patterns he found in Shakespearean and Greek tragedy; more recently, computational methods are being employed to model narrative structure at scale. However, there has not yet been a study which discusses whether or not specific lexical items can be indicative of dramatic structure. Using Shakespeare’s plays as an example, this essay fills the gap by observing how social titles can be used to explore the viability of narrative structure.
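As a rough illustration of tracking specific lexical items against dramatic structure, the sketch below counts a handful of social titles per act in a plain-text play. The title inventory, the act-marker regex, and the file name are assumptions for illustration only, not Froehlich's actual method or data.

```python
import re
from collections import Counter

# Hypothetical inventory of social titles; the article's actual list may differ.
TITLES = ["lord", "lady", "sir", "madam", "king", "queen", "duke", "mistress"]

def titles_per_act(play_text):
    """Count occurrences of each social title within each act of a play."""
    # Assumes acts are marked like "ACT I", "ACT II", ... in the source text.
    acts = re.split(r"\bACT\s+[IVX]+\b", play_text)[1:]
    results = []
    for i, act in enumerate(acts, start=1):
        tokens = re.findall(r"[a-z']+", act.lower())
        counts = Counter(t for t in tokens if t in TITLES)
        results.append((f"Act {i}", counts))
    return results

if __name__ == "__main__":
    with open("macbeth.txt", encoding="utf-8") as f:  # hypothetical file
        for act, counts in titles_per_act(f.read()):
            print(act, dict(counts))
```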
Citations: 1
Faces extracted from Time Magazine 1923-2014
Q1 Arts and Humanities Pub Date : 2020-03-18 DOI: 10.22148/001c.12265
Ana Jofre, Vincent J. Berardi, Carl Bennett, M. Reale, Josh Cole
We present metadata of labeled faces extracted from a Time magazine archive that contains 3,389 issues ranging from 1923 to 2012. The data we are publishing consists of three subsets: Dataset 1) the gender labels and image characteristics for each of the 327,322 faces that were automatically-extracted from the entire Time archive, Dataset 2) a subset of 8,789 faces from a sample of 100 issues that were labeled by Amazon Mechanical Turk (AMT) workers according to ten dimensions (including gender) and used as training data to produce Dataset 1, and Dataset 3) the raw data collected from the AMT workers before being processed to produce Dataset 2.
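A sketch of how one might load and summarize metadata of this shape; the file name and the column names (year, gender_label) are assumptions, since the abstract does not specify the published schema.

```python
import pandas as pd

# Hypothetical file and column names; consult the released dataset for the real schema.
faces = pd.read_csv("time_faces_dataset1.csv")

# Proportion of faces labeled female per publication year (Dataset 1 covers 327,322 faces).
by_year = (
    faces.assign(is_female=faces["gender_label"].eq("female"))
         .groupby("year")["is_female"]
         .mean()
)
print(by_year.head())
```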
Citations: 4
On the perceived complexity of literature. A response to Nan Z. Da
Q1 Arts and Humanities Pub Date : 2020-01-25 DOI: 10.22148/001c.11829
Fotis Jannidis
At the center of Nan Z. Da's article is the claim that quantitative methods cannot produce any useful insights with respect to literary texts: "CLS's methodology and premises are similar to those used in professional sectors (if more primitive), but they are missing economic or mathematical justification for their drastic reduction of literary, literary-historical, and linguistic complexity. In these other sectors where we are truly dealing with large data sets, the purposeful reduction of features like nuance, lexical variance, and grammatical complexity is desirable (for that industry's standards and goals). In literary studies, there is no rationale for such reductionism; in fact, the discipline is about reducing
Citations: 8
Are We Breaking the Social Contract?
Q1 Arts and Humanities Pub Date : 2020-01-24 DOI: 10.22148/001c.11828
Giovanni Colavizza
The ambition of scholarship in the humanities is to systematically understand the human condition in all its aspects and times. To this end, humanists are more apt to interpret specific phenomena than generalize to previously unseen observations. When we consider scholarship as a collective effort, this has consequences. I argue that most of the humanities rely on a distinct social contract. This contract states that interpretive arguments are expected to be plausible and the grounds on which they are made, verifiable. This is the scholarly purpose (albeit not the rhetorical one) of most of what goes in our footnotes, especially references. Reference verification is mostly a virtual act, i.e., it all too rarely happens in practice, yet it is in principle always possible. Any individual scholar in any domain in the humanities can, by virtue of this contract, verify the evidence supporting any argument in a non-mediated way. This is essential to, at the very least, distinguish between solid and haphazard arguments.
Citations: 1
Send us your null results
Q1 Arts and Humanities Pub Date : 2020-01-22 DOI: 10.22148/001c.11824
A. Piper
A considerable amount of work has been produced in quantitative fields addressing what has colloquially been called the "replication crisis." By this is meant three related phenomena: 1) the low statistical power of many studies, resulting in an inability to reproduce a similar effect size; 2) a bias towards selecting statistically "significant" results for publication; and 3) a tendency to not make data and code available for others to use. What this means in more straightforward language is that researchers (and the public) overwhelmingly focus on "positive" results; they tend to over-estimate how strong their results are (how large a difference some variable or combination of variables makes); and they bury a considerable number of decisions and judgments in their research process that have an impact on the outcomes. The graph in Figure 1 below represents the first two dimensions of this problem in very succinct form (see Simmons et al. for a discussion of the third).

Why does this matter for Cultural Analytics? After all, much of the work in CA is insulated from problem #1 (low power) because of the often large sample sizes used. Even small effects are mostly going to be reproducible with large enough samples. Many will also rightly point out that a focus on significance testing is not always at the heart of interpretive research. Regardless of the number of texts used, researchers often take a more descriptive or exploratory approach to their documents, where the idea of "null" models makes less sense. And problem #3 is dealt with through a code and data repository that accompanies most articles (at least in CA, and at least in most cases).

But these caveats overlook a larger and more systemic problem that has to do with selection bias towards positive results. Whether you are doing significance testing or just saying you have found something "interesting," the emphasis in publication is almost always on finding something "positive." This is as much a part of the culture of academic publishing as it is of the current moment in the shift towards data-driven approaches for studying culture. There is enormous pressure in the field to report something positive: that a method "worked" or "shows" something. One of the enduring critiques of new computational methods is that they "don't show us anything we didn't already know." While many would disagree (rightly pointing to positive examples of new knowledge) or see this as a classic case of "hindsight bias" (our colleagues' ability to have always, magically, been right), the fact is that in most cases these methods show us nothing at all.
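A small simulation of the first phenomenon the editorial names, low statistical power: given the same small true effect, underpowered studies only occasionally reach significance, while large samples detect it reliably. The sample sizes, effect size, and number of runs are arbitrary illustrations, not figures from the editorial.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect = 0.2          # small standardized difference between two groups
n_runs = 1000

def share_significant(n_per_group):
    """Fraction of simulated studies that reach p < .05 for the same true effect."""
    hits = 0
    for _ in range(n_runs):
        a = rng.normal(0.0, 1.0, n_per_group)
        b = rng.normal(true_effect, 1.0, n_per_group)
        _, p = stats.ttest_ind(a, b)
        hits += p < 0.05
    return hits / n_runs

print("n = 20 per group:", share_significant(20))    # low power: most runs miss the effect
print("n = 500 per group:", share_significant(500))  # high power: the effect replicates
```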
Citations: 0
Annotating Narrative Levels: Review of Guideline No. 7
Q1 Arts and Humanities Pub Date : 2020-01-21 DOI: 10.22148/001c.11775
Gunther Martens
The guideline under review builds on the acquired knowledge of the field of narrative theory. Its main references are to classical structuralist narratology, both in terms of definitions (Todorov, Genette, Dolezel) and by way of its guiding principles, which strive for simplicity, hierarchy, minimal interpretation and a strict focus on the annotation of text-intrinsic, linguistic aspects of narrative. Most recent attempts to do “computational narratology” have been similarly “structuralist” in outlook, albeit with a stronger focus on aspects of story grammar: the basic constituents of the story are to some extent hard-coded into the language of any story, and are thus more easily formalized. The present guideline goes well beyond this restriction to story grammar. In fact, the guideline promises to tackle not only aspects of narrative transmission from the highest level (author) to the lowest (character), but also the demarcation of scenes at the level of plot, as well as focalisation. Thus, the guideline can be said to be very wide in scope.
Citations: 0
Annotating Narrative Levels: Review of Guideline No. 8
Q1 Arts and Humanities Pub Date : 2020-01-15 DOI: 10.22148/16.064
T. McEnaney
“Let me tell you a story.” The proposed guidelines suggest that this phrase serve as the heuristic that readers supply at the beginning of any possible embedded narrative to identify a shift in narrative frames or levels. (The difference between “frame” and “level,” although perhaps confusing in the history of narratology, does not seem like an important distinction at this stage of the project.) This simple phrase, the author suggests, can replace a field of narrative theory they feel would “simply confuse my student annotators.” However simple the phrase might seem, it, in fact, conceals a number of key narratological issues: focalization, temporal indices, diction / register, person, fictional paratexts, duration, and, no doubt, others. The question for the guidelines is whether one can leapfrog the particularity of these issues if students use the above phrase to annotate texts with XML tags and produce operational scripts that identify the nested narratives. As it currently stands, students seem capable of learning the basic idea of nested narratives and tagging changes in narrative frames, but there are no real results to confirm the project’s success, as the author reports they are not yet able to confirm any inter-annotation agreement.
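To make the proposed annotation concrete, here is a sketch of what tagging an embedded narrative in XML and then walking the nesting programmatically could look like. The tag and attribute names are invented for illustration; the guideline under review does not fix a particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: each shift in narrative level opens a nested <narrative> element.
sample = """
<narrative level="1" narrator="frame narrator">
  The old sailor sat down and said: let me tell you a story.
  <narrative level="2" narrator="sailor">
    Years ago our ship was caught in a storm off the cape.
  </narrative>
  The listeners nodded and went back to their drinks.
</narrative>
"""

def list_levels(element, depth=0):
    """Walk the annotation tree and report each narrative level and its narrator."""
    print("  " * depth + f"level {element.get('level')}: {element.get('narrator')}")
    for child in element.findall("narrative"):
        list_levels(child, depth + 1)

list_levels(ET.fromstring(sample))
```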
Citations: 0