Oman Royal Speeches Corpus: Compilation and Analysis

Arab World English Journal (Language & Linguistics, IF 1.5) · Pub Date: 2023-12-15 · DOI: 10.24093/awej/vol14no4.9

Aladdin Al Zahran, R. Jamoussi


Citations: 0

Abstract


For many years, researchers have focused primarily on developing written corpora, so spoken corpora have remained rare by comparison. The laborious transcription and annotation involved make creating and maintaining spoken corpora challenging. This project aims to build a transcribed corpus of Oman Royal Speeches and make it available online through a custom-made concordance tool. The study also aims to test the corpus in basic corpus-based lexical, stylistic, and discourse-analytical applications. Compiling the Oman Royal Speeches Corpus is meant to fill a gap by contributing to the development of Arabic spoken-language corpora and to provide a research tool that can facilitate corpus-based research and applications in various areas of investigation. The corpus was built in five stages: data capture, data processing, concordance tool development, testing and evaluation, and online deployment. With 98,511 tokens, the resulting corpus is a searchable archive of Royal Speeches with a built-in online concordance tool that supports multiple search types and Keyword-in-Context display of query results. The corpus has been tested for various corpus-analytic uses and has been found to yield significant findings in these areas. Thus, it has the potential to serve as a reliable and authentic record and source of information for researchers and specialists in various fields, as well as a research tool for applications and analyses in language-related topics.
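
The Keyword-in-Context (KWIC) display mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' concordance tool: the function name `kwic`, the whitespace tokenization, the case-insensitive exact match, the window size, and the English sample sentence are all assumptions made for illustration only.

```python
def kwic(tokens, keyword, window=3):
    """Return (left context, hit, right context) for every match of keyword.

    Matching is case-insensitive and exact on whole tokens (an assumption;
    a real concordancer would likely offer several search types).
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, not taken from the corpus.
sample = "the nation builds its future and the nation honours its past".split()

# Print each hit with its left context right-aligned, so the keyword
# lines up in a column, as in a typical KWIC display.
for left, hit, right in kwic(sample, "nation", window=2):
    print(f"{left:>20} [{hit}] {right}")
```

Right-aligning the left context is what makes the keyword column line up, which is the defining feature of a KWIC view. Arabic text would additionally need a tokenizer and display logic suited to right-to-left script, which this sketch does not attempt.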


Source journal: Arab World English Journal (Language & Linguistics)

Self-citation rate: 30.00%

Articles published: 187