Eye Tracking in Linguistics. Salvatore Attardo and Lucy Pickering. London: Bloomsbury Academic, 2023, 304 pp., £28.99 (paperback). ISBN 978-1-3501-1751-8. Reviewed by Caterina Cacioli (Lettere e Filosofia, Università degli Studi di Firenze, Italy; caterina.cacioli@unifi.it; ORCID 0000-0002-9994-5770). Digital Scholarship in the Humanities, fqad081, https://doi.org/10.1093/llc/fqad081. Published: 3 November 2023.
Methodological observations concerning word rankings and z-score refinements. Hartmut Ilsemann. Digital Scholarship in the Humanities, fqad079, https://doi.org/10.1093/llc/fqad079. Published: 3 November 2023.
Abstract: This article evaluates the word rankings proposed by Ary L. Goldberger, Albert C. Yang, and C. Peng as a means of establishing the authorship of texts, in the light of Delta, developed by John Burrows at about the same time. Tests carried out with high-ranking function words, together with results established with the more modern approaches of Rolling Delta, Rolling Classify, and the General Imposters method, give clear evidence that word rankings return only crude and unreliable results that cannot keep up with non-traditional modern methods. Although the stylistic difference between Marlowe's and Shakespeare's plays could be stated, word rankings failed to recognize Shakespearean stylistics in The Jew of Malta, Edward II, and Doctor Faustus. Only through the use of z-scores did a wider vocabulary provide a greater degree of differentiation.
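The z-score method the abstract refers to is Burrows's Delta. As a hedged sketch (the function-word rates below are invented for illustration and are not the article's data), the core computation fits in a few lines:

```python
# Burrows's Delta on a toy corpus: per-1,000-word rates of three function
# words in four reference texts plus one disputed text. All numbers are
# invented; only the procedure is real.
from statistics import mean, stdev

features = ["the", "and", "of"]
corpus = {                        # text -> per-1,000-word rates
    "Marlowe-1":     [62.1, 28.4, 30.2],
    "Marlowe-2":     [60.8, 27.9, 31.0],
    "Shakespeare-1": [55.3, 33.6, 26.1],
    "Shakespeare-2": [54.7, 34.2, 25.5],
}
disputed = [61.5, 28.1, 30.6]

# z-score each feature against the whole reference corpus
mus = [mean(col) for col in zip(*corpus.values())]
sds = [stdev(col) for col in zip(*corpus.values())]

def z(rates):
    return [(r - m) / s for r, m, s in zip(rates, mus, sds)]

# Delta = mean absolute difference between z-score profiles
zd = z(disputed)
deltas = {name: mean(abs(a - b) for a, b in zip(z(rates), zd))
          for name, rates in corpus.items()}
closest = min(deltas, key=deltas.get)   # smallest Delta = closest style
```

With these invented rates the disputed text lands nearest the Marlowe profiles, which is the kind of discrimination the article reports that raw word rankings fail to deliver.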
Research Methods for Digital Discourse Analysis. Camilla Vásquez. London: Bloomsbury Academic Press, 2022, 330 pp., $108.00 (hardback). ISBN 978-1-350-16683-7. Reviewed by Hang Yu (School of Foreign Studies, Northwestern Polytechnical University, China; uibeyuhang@163.com) and Sijia Chang (School of Foreign Studies, Northwestern Polytechnical University, China). Digital Scholarship in the Humanities, fqad082, https://doi.org/10.1093/llc/fqad082. Published: 2 November 2023.
The Victorian anti-vaccination discourse corpus (VicVaDis): construction and exploration. Claire Hardaker, Alice Deignan, Elena Semino, Tara Coltman-Patel, William Dance, Zsófia Demjén, Chris Sanderson, Derek Gatherer. Digital Scholarship in the Humanities, fqad075, https://doi.org/10.1093/llc/fqad075. Published: 26 October 2023.
Abstract: This article introduces and explores the 3.5-million-word Victorian Anti-Vaccination Discourse Corpus (VicVaDis). The corpus is intended to provide a (freely accessible) historical resource for the investigation of the earliest public concerns and arguments against vaccination in England, which revolved around compulsory vaccination against smallpox in the second half of the 19th century. It consists of 133 anti-vaccination pamphlets and publications gathered from 1854 to 1906, a span of 53 years that loosely coincides with the Victorian era (1837–1901). This timeframe was chosen to capture the period between the 1853 Vaccination Act, which made smallpox vaccination for babies compulsory, and the 1907 Act that effectively ended the mandatory nature of vaccination. After an overview of the historical background, this article describes the rationale, design, and construction of the corpus, and then demonstrates how it can be exploited to investigate the main arguments against compulsory vaccination by means of widely accessible corpus linguistic tools. Where appropriate, parallels are drawn between Victorian and 21st-century vaccine-hesitant attitudes and arguments. Overall, this article demonstrates the potential of corpus analysis to add to our understanding of historical concerns about vaccination.
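The "widely accessible corpus linguistic tools" mentioned above typically surface the main arguments by ranking keywords with a keyness statistic. As a hedged illustration (the counts below are invented, not drawn from VicVaDis), Dunning's log-likelihood (G2), the statistic behind many keyword lists, can be computed directly:

```python
# Keyness of a single word via Dunning's log-likelihood (G2).
# Invented counts: occurrences of a word in a 3.5M-word target corpus
# versus a same-sized reference corpus.
import math

a, target_size = 120, 3_500_000   # word count, target corpus size
b, ref_size    = 40,  3_500_000   # word count, reference corpus size

# Expected counts under the null hypothesis of equal rates
e1 = target_size * (a + b) / (target_size + ref_size)
e2 = ref_size * (a + b) / (target_size + ref_size)

# G2 > 3.84 is conventionally taken as significant at p < 0.05 (1 d.f.)
g2 = 2 * (a * math.log(a / e1) + b * math.log(b / e2))
```

Words ranked by G2 against a general reference corpus would foreground the vocabulary characteristic of the anti-vaccination pamphlets.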
The combination of the Song Dynasty patterns and digital technology. Jia Hu. Digital Scholarship in the Humanities, fqad077, https://doi.org/10.1093/llc/fqad077. Published: 26 October 2023.
Abstract: Song-era artifacts carry ideas of naturalness and moderate elegance, to be preserved and passed on to future generations. Emerging digital opportunities and initiatives allow us not only to study pattern samples in detail but also to create digital equivalents that are not subject to aging or destruction. This research aims to obtain specific knowledge and situational experience regarding the digitization of Song Dynasty patterns. At the preliminary research stage, the authors performed an online survey on a popular Chinese platform. The survey confirmed the interest of a wide audience in cultural heritage (CH) objects and their preservation. The respondents reported a preference for digital sources of information and considered the documentary video format an optimal educational design. In the first stage, Song patterns were digitally processed using MATLAB version 7.0 and ACDSee. In the second stage, based on the processed patterns, the authors developed an instructional video. In the third stage, the researchers assessed the knowledge students acquired by watching the video (experimental group) or attending a lecture with a presentation based on raw patterns (control group). The results obtained before and after the intervention showed significant progress in each group, indicating the effectiveness of the digitized images and of this intervention for acquiring new knowledge and raising awareness of the Song Dynasty CH.
Machine learning and data analysis for word segmentation of classical Chinese poems: illustrations with Tang and Song examples. Chao-Lin Liu, Wei-Ting Chang, Chang-Ting Chu, Ti-Yong Zheng. Digital Scholarship in the Humanities, fqad073, https://doi.org/10.1093/llc/fqad073. Published: 20 October 2023.
Abstract: Words are essential for understanding classical Chinese poems. We report a collection of 32,399 classical Chinese poems annotated with word boundaries. Statistics about the annotated poems support several heuristic observations that researchers of Chinese literature discuss in the literature, including the patterns of lines and a practice for parallel structures (對仗). The annotators were affiliated with two universities, so that they could annotate the poems as independently as possible. Results of an inter-rater agreement study indicate that the annotators agree on the identified words 93 per cent of the time and agree perfectly on the segmentation of a whole poem 42 per cent of the time. We applied unsupervised classification methods to annotate the poems in several different settings and evaluated the results against the human annotations. Under favorable conditions, the classifier identified about 88 per cent of the words and segmented poems perfectly 22 per cent of the time.
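The two agreement figures (agreement on individual words versus perfect agreement on a whole poem) can be illustrated with a toy span-based measure. The paper's exact metric is not specified here, so the formulation below is one plausible reading, and the example segmentations are invented:

```python
# Word-level agreement between two segmentations of the same line,
# measured over character spans: a word "matches" if both annotators
# produce it at the same character positions.
def spans(segmentation):
    out, i = set(), 0
    for word in segmentation:
        out.add((i, i + len(word)))
        i += len(word)
    return out

def word_agreement(seg_a, seg_b):
    sa, sb = spans(seg_a), spans(seg_b)
    return len(sa & sb) / max(len(sa), len(sb))

# Invented example: two annotators segment the same five-character line.
ann_a = ["床前", "明月", "光"]
ann_b = ["床前", "明月光"]
partial = word_agreement(ann_a, ann_b)   # agree only on 床前
perfect = word_agreement(ann_a, ann_a)   # identical segmentations
```

Averaging `word_agreement` over many lines gives a per-word figure like the reported 93 per cent, while the share of poems with `perfect == 1.0` throughout corresponds to the 42 per cent whole-poem figure.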
Research on character tone trend clustering of Kunqu Opera based on quantum adaptive genetic algorithm. Rui Tian, Ruheng Yin, Junrong Ban. Digital Scholarship in the Humanities, fqad074, https://doi.org/10.1093/llc/fqad074. Published: 19 October 2023.
Abstract: Kunqu, one of the oldest forms of Chinese opera, features a unique artistic expression arising from the interplay between vocal melody and the tonal quality of its lyrics. Identifying Kunqu's character tone trend (vocal melodies derived from the tonal quality of the lyrics) is critical to understanding and preserving this art form. Traditional research methods, which rely on qualitative descriptions by musicologists, have often been debated due to their subjective nature. In this study, we present a novel approach to analyzing the character tone trend in Kunqu using computer modeling and machine learning techniques. By extracting the character tone trend with computational modeling methods and applying cluster analysis to Kunqu's character tone melodies, our model uncovers musical structural patterns between singing and speech, validating and refining the qualitative findings of musicologists. Furthermore, our model can automatically assess whether a piece adheres to the rhythmic norms of 'the integration of literature and music' in Kunqu, thus contributing to the digitization, creation, and preservation of this important cultural heritage.
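The clustering step alone can be sketched in miniature. The contours below are invented four-point pitch sequences, and plain k-means stands in for the paper's quantum adaptive genetic algorithm, which is a more elaborate optimizer for a similar grouping objective; this is not the authors' method:

```python
# Grouping character tone contours (invented 4-point pitch sequences)
# with a minimal k-means, as a stand-in for the paper's optimizer.
from statistics import mean

contours = [
    [1, 2, 3, 4], [1, 2, 4, 4],   # rising-like contours
    [4, 3, 2, 1], [4, 4, 2, 1],   # falling-like contours
]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # move each centroid to the mean of its group (keep it if empty)
        centroids = [[mean(col) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids, groups

cents, groups = kmeans(contours, [contours[0], contours[2]])
```

Under these toy inputs the rising and falling contours separate into the two clusters, which is the kind of structural pattern the study compares between singing and speech.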
One-third of a century on: the state of the art, pitfalls, and the way ahead relating to digital humanities approaches to translation and interpreting studies. Chonglong Gu. Digital Scholarship in the Humanities, fqad076, https://doi.org/10.1093/llc/fqad076. Published: 18 October 2023.
Abstract: The year 1993 represents a momentous milestone in the not-so-long history of translation and interpreting studies (TIS). Mona Baker's foundational 1993 paper, 'Corpus linguistics and translation studies: Implications and applications', signalled a defining moment in the application of digital humanities (DH) approaches in TIS. Since then, corpus-based TIS, the most visible manifestation of DH in TIS, has come into being and is now gradually maturing. Compared with the previously largely anecdotal, impressionistic, and prescriptivist accounts of translation and interpreting, the incorporation of DH tools (e.g. corpus linguistics) has significantly enriched TIS with new perspectives, making it possible for researchers to explore the various aspects of translation and interpreting in a more objective and systematic way, drawing on real-world data. Now that one-third of a century has passed since the publication of Baker's seminal paper, DH-inspired studies of translation and interpreting are in full swing. As we reach the 30-year mark of this influential publication, it is important to take stock of previous achievements and look to the future both with pride and with a cool head. In this article, we trace the development of DH approaches to TIS and present the state of the art, before discussing some of the limitations and pitfalls and the road ahead.
Who could be behind QAnon? Authorship attribution with supervised machine-learning. Florian Cafiero, Jean-Baptiste Camps. Digital Scholarship in the Humanities, fqad061, https://doi.org/10.1093/llc/fqad061. Published: 18 October 2023.
Abstract: A series of social media posts on 4chan and then 8chan, signed under the pseudonym 'Q', started a movement known as QAnon, which led some of its most radical supporters to violent and illegal actions. To identify the person(s) behind Q, we evaluate the coincidence between the linguistic properties of the texts written by Q and those written by a list of suspects provided by journalistic investigation. Identifying the authors of these posts poses serious challenges. The 'Q drops' are very short texts, written in a way that constitutes a sort of literary genre in itself, with very peculiar features of style. These texts might have been written by different authors, whose other writings are often hard to find. After an online ethnography of the movement, necessary to collect enough material written by the thirteen potential authors, we use supervised machine learning to build stylistic profiles for each of them. We then performed a 'rolling analysis', looking repeatedly through a moving window for parts of Q's writings matching our profiles. We conclude that two different individuals, Paul F. and Ron W., are the closest matches to Q's linguistic signature, and that they could have successively written Q's texts. These potential authors are not high-ranking personalities from the US administration, but rather social media activists.
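The 'rolling analysis' idea can be sketched generically: slide a window over the questioned text and, in each window, score every candidate profile. The profiles, texts, and similarity measure below are invented placeholders; the authors' actual pipeline used supervised machine learning over richer stylistic features:

```python
# Rolling authorship attribution sketch: for each window of the questioned
# text, pick the candidate whose stylistic profile is most similar.
from collections import Counter

def profile(text, top=5):
    """Relative frequencies of the most frequent tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.most_common(top)}

def similarity(p, q):
    """Negative L1 distance between two frequency profiles."""
    keys = set(p) | set(q)
    return -sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

def rolling_attribution(tokens, candidates, window=20, step=10):
    out = []
    for start in range(0, max(1, len(tokens) - window + 1), step):
        chunk = " ".join(tokens[start:start + window])
        best = max(candidates,
                   key=lambda c: similarity(profile(chunk), candidates[c]))
        out.append((start, best))     # window offset -> attributed author
    return out
```

A change of best-matching candidate partway through the sequence of windows is what would suggest, as in the article's conclusion, that different individuals wrote successive stretches of the text.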
A quantitative window on the history of statistics: topic-modelling 120 years of Biometrika. Nicola Bertoldi, Francis Lareau, Charles H Pence, Christophe Malaterre. Digital Scholarship in the Humanities, fqad072, https://doi.org/10.1093/llc/fqad072. Published: 13 October 2023.
Abstract: As one of the oldest continuously publishing journals in statistics (published since 1901), Biometrika provides a unique window onto the history of statistics and its epistemic development throughout the 20th and the beginning of the 21st centuries. While the early history of the discipline, with the works of key figures such as Karl Pearson, Francis Galton, or Ronald Fisher, is relatively well known, the later (and longer) episodes of its intellectual development remain understudied. By applying digital tools to the full-text corpus of the journal articles (N = 5,596), the objective of this study is to provide a novel quantitative exploration of the history of the statistical sciences via an all-encompassing view of 120 years of Biometrika. To this aim, topic-modelling analyses are used and provide insights into the epistemic content of the journal and its evolution. Striking changes in the thematic content of the journal are documented and quantified for the first time, from the decline of Pearsonian and Weldonian biometrical research and the journal's tight connection to biology in the 1930s to the rise of modern statistical methods beginning in the 1960s and 1970s. Newly developed approaches are used to infer author networks from publication topics. The resulting network of authors shows the existence of several communities, well-aligned with topic clusters and their evolution through time. It also highlights the role of specific figures over more than a century of publishing history and provides a first window onto the foundation, development, and diverse applications of the statistical sciences.
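Once a topic model has been fitted, quantifying thematic change of the kind described above reduces to aggregating document-topic weights over time. The weights below are invented; in the study they would come from a topic model (such as LDA) fitted to the 5,596 Biometrika articles:

```python
# Tracking topic prevalence by decade from (year, doc-topic weights) pairs.
# Two invented topics: 0 = early biometrical research, 1 = modern methods.
from collections import defaultdict
from statistics import mean

docs = [
    (1920, [0.9, 0.1]), (1925, [0.8, 0.2]),   # biometry dominates early on
    (1965, [0.3, 0.7]), (1975, [0.2, 0.8]),   # modern methods rise later
]

by_decade = defaultdict(list)
for year, weights in docs:
    by_decade[year // 10 * 10].append(weights)

# decade -> mean weight per topic
trend = {dec: [mean(col) for col in zip(*ws)]
         for dec, ws in sorted(by_decade.items())}
```

Plotting `trend` per topic over decades is how a decline (biometry) and a rise (modern statistical methods) such as those reported would be made visible.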