{"title":"Ole Schützler and Julia Schlüter (eds.). Data and methods in corpus linguistics. Comparative approaches. Cambridge: Cambridge University Press, 2022. 357 pp. ISBN 978-1-10849964-4","authors":"Matthias Eitelmann","doi":"10.2478/icame-2023-0010","DOIUrl":"https://doi.org/10.2478/icame-2023-0010","url":null,"abstract":"","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"32 1","pages":"149 - 152"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73351201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From I am, with sincere regard, your most obedient servant to Yours sincerely: The simplification of leavetaking formulae in 18th-century Scottish and Irish English letters","authors":"C. Elsweiler, P. Ronan","doi":"10.2478/icame-2023-0001","DOIUrl":"https://doi.org/10.2478/icame-2023-0001","url":null,"abstract":"Abstract The study in hand investigates the impact of social status on the use and change of pragmatic formulae in historical varieties of English. The study asks which leavetaking formulae are used between writers of equal social status in varieties of English in the later 18th century. Working on a corpus of letters compiled from two subsets of letters each from 18th-century Scottish and Irish English, the study illustrates pragmatic change on the basis of the investigation of leavetakings involving the servant formula. By doing so, the study also helps to widen the hitherto predominating narrow focus on mainly English English. The study shows that the use of formulae is situationally dependent. It suggests that pragmatic change takes place amongst writers of equal social status in the private domain, which then leads to the use of such formulae in the public domain and to the use between writers of different status groups.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"420 1","pages":"1 - 17"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84917963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compiling a corpus of South Asian online Englishes: A report, some reflections and a pilot study","authors":"Muhammad Shakir, Dagmar Deuber","doi":"10.2478/icame-2023-0007","DOIUrl":"https://doi.org/10.2478/icame-2023-0007","url":null,"abstract":"Abstract In this research article we introduce the South Asian Online Englishes (SAOnE) corpus representing four South Asian countries, i.e. Bangladesh, India, Pakistan, and Sri Lanka, and two native English-speaking countries, i.e. the UK and the USA. We have used semi-automatic and manual methods to collect data from three internet registers, i.e. newspaper comments, web forums and tweets, and a collection of internet sub-registers which we label as blogs and websites. Additionally, we have collected text messages using online freelance hiring platforms from each of the South Asian countries mentioned above. Each register category in the corpus consists of approximately 1 million words per register per country, except text messages, which contains around 500,000 words per country and only includes the four South Asian countries. We have verified the origin of website and blog links, authors of Twitter, and where possible of commenters and web forum users to make sure that only local content of each country is included. The corpus features some indigenous language content, which is tagged. In addition to the description of this dataset, we also present a pilot study analysing three discourse particles, namely na, neh, and yaar. The discourse particles na and yaar are native to Hindi/Urdu, while neh is based on a Sinhala negation marker. Our analysis indicates that na and neh have similarities in terms of their position in the clause/utterance. However, neh is confined to Sri Lanka while the Hindi/Urdu based discourse particles are also used in our Twitter data from Sri Lanka and Bangladesh. The use of these discourse particles in Bangladeshi tweets shows the influence of Indian culture through Bollywood celebrities. Of the Hindi/Urdu discourse particles yaar and na, yaar is preferred in Pakistan while na is preferred in India; additionally, yaar is used at the start of the clause more often in our Pakistani data. Lastly, we discuss the implications of the pilot study, the advantages of the type of data used for the pilot study, and future research directions.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"5 1","pages":"119 - 139"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74576326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TV series as disseminators of emerging vocabulary: Non-codified expressions in the TV Corpus","authors":"Daniela Landert, Tanja Säily, Mika Hämäläinen","doi":"10.2478/icame-2023-0004","DOIUrl":"https://doi.org/10.2478/icame-2023-0004","url":null,"abstract":"Abstract This study presents a method for identifying words that appear in corpus data earlier than their first date of attestation in dictionaries. We demonstrate the application of this method based on a large diachronic corpus, the TV Corpus, and the Oxford English Dictionary (OED). Combining automatic extraction of candidate terms from the TV Corpus with comprehensive manual analysis and verification, the method identifies 32 words that were used in TV series before their first attestation in the OED. We present a detailed discussion of these words, analysing their distribution across decades and genres of the TV Corpus, their origins, semantic domains and word-formation processes. We also present extracts with their first uses in the TV Corpus and analyse how the words were presented to the large and anonymous mass audience. Our study shows that the method we present is suitable for identifying early attestations of words in large corpora, even though in the case of the TV Corpus, a great deal of manual analysis and verification is needed. In addition, we argue that TV series and other types of fictional texts are an important resource for studying the coinage and spread of terms, due to their function and the fact that they address a mass audience.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"32 3 1","pages":"63 - 79"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88830233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative corpus-based investigation of results sections of research articles in Applied Linguistics and Physics","authors":"Muhammed Parviz","doi":"10.2478/icame-2023-0005","DOIUrl":"https://doi.org/10.2478/icame-2023-0005","url":null,"abstract":"Abstract The present study sought to identify the generic structures of the results sections of scientific research articles (RAs) in Applied Linguistics and Physics. Following a manual search approach, a total of 200 RAs in the fields of Applied Linguistics and Physics were randomly selected from different prestigious journals and analyzed. In addition to offering a tentative template for the rhetorical organization of results sections, the findings revealed shared and non-shared rhetorical units as well as obligatory and optional steps in the results sections (RSs) of research articles in the two disciplines. The findings also indicated that RA writers organize the contents of the RSs around certain rhetorical resources (i.e., M1, M2, M3, M4, and M5) to present key experimental and factual analytical results of their studies. The findings further suggested the existence of a common core of rhetorical resources in writing RSs across the disciplines, although a set of certain steps plays an essential part in distinguishing the textual features of each discipline as well as depicting how the RSs of each discipline are developed. The findings generated from the study can offer a number of important pedagogical implications for teaching EAP and ESP courses, especially for Applied Linguistics and Physics teachers and students.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"49 1","pages":"81 - 108"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75205367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender and evaluation in contemporary American English: A corpus study based on pronominal and nominal expressions with male and female reference","authors":"Md Nazmus Saqueb Kathon","doi":"10.2478/icame-2023-0003","DOIUrl":"https://doi.org/10.2478/icame-2023-0003","url":null,"abstract":"Abstract This study of contemporary American English examines how males and females are evaluated in terms of their personality, physical appearance, societal importance, etc. across various registers. In this study, evaluation is defined as an expression of a speaker or writer’s attitude toward, viewpoint on, or feelings about a male or female referent, which generally carries a positive or a negative meaning. The evaluative tokens analyzed in the study include noun phrases (e.g., a real jerk) and adjectival modification (e.g., congenial) co-occurring with gender-specific nominal expressions (e.g., boy, lady) or pronominal expressions (e.g., he, she). The findings imply a distinct gender patterning in the evaluation: whereas males are evaluated in terms of their skills, abilities, acuities and importance in society, females are typically assessed in terms of their looks and appearance. Males occupy considerably more evaluative space than females, particularly in the Newspaper register. The preponderance of the evaluation of males even in twenty-first-century American English is surprising, considering changes in gender role attitudes in U.S. society in recent decades.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"29 1","pages":"39 - 61"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83149273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pascual Pérez-Paredes and Geraldine Mark (eds.). Beyond concordance lines: Corpora in language education. Amsterdam/Philadelphia: John Benjamins Publishing Company, 2021. ix. 255 pp. ISBN: 978-9-02720989-4 (HB)","authors":"Peter Crosthwaite","doi":"10.2478/icame-2023-0009","DOIUrl":"https://doi.org/10.2478/icame-2023-0009","url":null,"abstract":"","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"14 1","pages":"145 - 148"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87792921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Corpus of Contemporary English Legal Decisions, 1950–2021 (CoCELD): A new tool for analysing recent changes in English legal discourse","authors":"Paula Rodríguez-Puente, David Hernández-Coalla","doi":"10.2478/icame-2023-0006","DOIUrl":"https://doi.org/10.2478/icame-2023-0006","url":null,"abstract":"Abstract Legal discourse is widely assumed to be resistant to change, and indeed legislative documents are extremely conservative with fixed and formulaic structures. However, recent research has shown that changes can be observed in the lexico-grammatical features of some legal documents when examined diachronically, particularly since the emergence in the 1970s of the Plain Language Movement, which sought to draw attention to the unnecessary complexity of official language, including legal discourse. Despite the crucial changes in legal language in recent years, research in that direction is scarce to date, particularly in the British English variety, probably due, in part, to the shortage of specialised corpora that allow this kind of study. In order to bridge this gap, we have embarked on the compilation of the Corpus of Contemporary English Legal Decisions, 1950–2021 (CoCELD), a corpus of British judicial decisions produced between 1950 and 2021. In this paper we present the structure and characteristics of CoCELD, as well as the methodology used for its compilation. The new corpus, which was released in February 2022, contains sample texts of roughly 2,500 words for each year from 1950 to 2021, which adds up to more than 730,000 words. The corpus contains files in raw text and with POS-annotation, and is freely available for the research community under signed consent. With CoCELD we hope to contribute a new, useful resource for linguists with an interest in legal language, from both a synchronic and a diachronic perspective.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"22 1","pages":"109 - 117"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85299261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tony McEnery and Vaclav Brezina. Fundamental principles of corpus linguistics. Cambridge: Cambridge University Press, 2022. 313 pp. ISBN 978-1-1071-1062-5","authors":"Magnus Levin","doi":"10.2478/icame-2023-0008","DOIUrl":"https://doi.org/10.2478/icame-2023-0008","url":null,"abstract":"To date, only a few studies have been carried out on how to position the corpus linguistic approach within the study of language and within scientific approaches in general. Among the notable exceptions, McEnery and Brezina’s introduction (pp. 1–2) mentions Leech (1992), Stubbs (2001) and Teubert (2005). There is thus a considerable gap to be filled by the present volume. The focus is not, however, to contrast the authors’ stance with that of previous linguistic studies, but instead to take a renewed look at corpus linguistics through Karl Popper’s work on the philosophy of science, as this “provokes new ways of looking at old problems and practices” (p. 2). Across the chapters, McEnery and Brezina formulate 48 principles of corpus linguistics, some of which are partly modified in the course of the discussion. These principles constitute the theoretical foundations of corpus linguistics – three of the central ones are given below:","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"4 1","pages":"141 - 143"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78639256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meaning differences between English clippings and their source words: A corpus-based study","authors":"M. Hilpert, D. Saavedra, Jennifer Rains","doi":"10.2478/icame-2023-0002","DOIUrl":"https://doi.org/10.2478/icame-2023-0002","url":null,"abstract":"Abstract This paper uses corpus data and methods of distributional semantics in order to study English clippings such as dorm (< dormitory), memo (< memorandum), or quake (< earthquake). We investigate whether systematic meaning differences between clippings and their source words can be detected. The analysis is based on a sample of 50 English clippings. Each of the clippings is represented by a concordance of 100 examples in context that were gathered from the Corpus of Contemporary American English. We compare clippings and their source words both at the aggregate level and in terms of comparisons between individual clippings and their source words. The data show that clippings tend to be used in contexts that represent involved text production, which aligns with the idea that clipped words signal familiarity with their referents. It is further observed that individual clippings and their source words partly diverge in their distributional profiles, reflecting both overlap and differences with regard to their meanings. We interpret these findings against the theoretical background of Construction Grammar and specifically the Principle of No Synonymy.","PeriodicalId":73271,"journal":{"name":"ICAME journal : computers in English linguistics","volume":"7 1","pages":"19 - 37"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84622379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}