Computational Linguistics & Natural Language Processing eJournal: Latest Articles
Digital Storytelling: Computer Based Learning Activity to Enhance Young Learner Vocabulary
Pub Date : 2020-11-24 DOI: 10.2139/ssrn.3736914
Endang Sulistianingsih, Nur Aflahatun
Vocabulary is very important in English language teaching but is often ignored in learning activities. It is difficult for EFL learners to learn English with a limited vocabulary. Digital storytelling is an engaging learning model that can be used to enhance EFL learners’ vocabulary. This study aimed to describe the effectiveness of digital storytelling in enhancing young learners’ vocabulary. The research used a one-group pretest-posttest design. There were two evaluations, one before and one after the intervention, to measure effectiveness. The participants were twenty-nine students at a state elementary school in Central Java, Indonesia. Quantitative data, i.e. vocabulary mastery scores, were analyzed with a t-test. The findings revealed that digital storytelling was effective in enhancing EFL learners’ vocabulary and made them joyful, relaxed, well motivated, and enthusiastic while learning English. Digital storytelling is a powerful computer-based learning activity for EFL learners to enhance their vocabulary.
Citations: 0
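A one-group pretest-posttest comparison like the one in the abstract above is commonly analyzed with a paired-samples t-test. The abstract only says "t-test", so the paired form is an assumption here, and the scores below are hypothetical illustration values, not data from the study. A minimal sketch of the statistic:

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on per-learner gains (posttest minus pretest).
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical vocabulary scores for five learners (NOT from the study).
pre_scores = [10, 12, 9, 14, 11]
post_scores = [13, 15, 10, 18, 14]

t_value, dof = paired_t_statistic(pre_scores, post_scores)
print(f"t = {t_value:.3f}, df = {dof}")
```

The resulting t value would then be compared against the t distribution with the reported degrees of freedom to obtain a p-value.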
Corporate ESG News and The Stock Market
Pub Date : 2020-10-30 DOI: 10.2139/ssrn.3723799
Walid Taleb, Théo Le Guenedal, Frédéric Lepetit, Vincent Mortier, Takaya Sekine, Lauren Stagnol
ESG investing's popularity has continually increased in the past five years. ESG data is increasingly integrated into investment processes. However, the information contained in ESG-related news for corporates has not been entirely exploited by institutional and long-only investors. The objective of this paper is to identify the benefits of ESG news information for active and factor-based investors. Indeed, one of the issues with ESG is the low frequency of score updates. For active management, we analyze ESG-sorted portfolios in investment universes filtered by ESG news volume. Metrics of ESG-related news are sourced from Truvalue Labs, a provider of Artificial Intelligence-powered ESG insights and analytics. We find that the approach of a universe focused on ESG news of corporates has been efficient in the early 2010s on the lower ESG-ranked side of the universe, but also on the higher ESG rank. More recently, it has positively contributed to more dynamic approaches of ESG investing. Finally, increasing the sensitivity to the highly visible SDGs significantly improves the return of ESG long-short portfolios.
Citations: 8
Neural Discourse Modelling of Conversations
Pub Date : 2020-07-29 DOI: 10.2139/ssrn.3663042
John M. Pierre
Deep neural networks have shown recent promise in many language-related tasks such as the modelling of conversations. We extend RNN-based sequence to sequence models to capture the long-range discourse across many turns of conversation. We perform a sensitivity analysis on how much additional context affects performance, and provide quantitative and qualitative evidence that these models can capture discourse relationships across multiple utterances. Our results show how adding an additional RNN layer for modelling discourse improves the quality of output utterances and providing more of the previous conversation as input also improves performance. By searching the generated outputs for specific discourse markers, we show how neural discourse models can exhibit increased coherence and cohesion in conversations.
Citations: 0
FinBERT—A Deep Learning Approach to Extracting Textual Information
Pub Date : 2020-07-28 DOI: 10.2139/ssrn.3910214
Allen Huang, Hui Wang, Yi Yang
In this paper, we develop FinBERT, a state-of-the-art deep learning algorithm that incorporates the contextual relations between words in the finance domain. First, using a researcher-labeled analyst report sample, we document that FinBERT significantly outperforms the Loughran and McDonald (LM) dictionary, the naïve Bayes, and Word2Vec in sentiment classification, primarily because of its ability to uncover sentiment in sentences that other algorithms mislabel as neutral. Next, we show that other approaches underestimate the textual informativeness of earnings conference calls by at least 32% compared with FinBERT. Our results also indicate that FinBERT’s greater accuracy is especially relevant when empirical tests may suffer from low power, such as with small samples. Last, textual sentiments summarized by FinBERT can better predict future earnings than the LM dictionary, especially after 2011, consistent with firms’ strategic disclosures reducing the information content of textual sentiments measured with LM dictionary. Our results have implications for academic researchers, investment professionals, and financial market regulators who want to extract insights from financial texts.
Citations: 10
Implementation on Text Classification Using Bag of Words Model
Pub Date : 2019-05-17 DOI: 10.2139/ssrn.3507923
Nisha V M, D. Kumar R
Bag of words provides one way to represent text and apply it to a standard text classification task. The method relies on the Bag-of-Words (BOW) idea, applied to content available from Wikipedia, Kaggle, Gmail and so on. The proposed method is used to create a Vector Space Model, which is then fed into a Support Vector Machine classifier. The aim is to classify and group document records from publicly accessible datasets gathered through social media. The text results compare the raw data with the cleaned data, as viewed in a word cloud.
Citations: 2
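The bag-of-words step in the abstract above can be illustrated with a small sketch that turns documents into count vectors over a shared vocabulary, which is the kind of vector space representation a classifier such as an SVM could consume. The tokenizer, function names, and sample sentences are illustrative assumptions, not the authors' implementation, and the classifier itself is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index (alphabetical order)."""
    terms = sorted({token for doc in documents for token in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def vectorize(text, vocabulary):
    """Return the term-count vector of `text` over `vocabulary`."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for term, count in counts.items():
        if term in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[term]] = count
    return vector


# Hypothetical toy corpus.
docs = ["The cat sat on the mat.", "The dog chased the cat."]
vocabulary = build_vocabulary(docs)
vectors = [vectorize(doc, vocabulary) for doc in docs]

for doc, vector in zip(docs, vectors):
    print(vector, doc)
```

Word order is discarded by design: each document is reduced to how often each vocabulary term occurs, which is what makes the representation a "bag".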
Language Style Similarity and Friendship Networks
Pub Date : 2019-02-27 DOI: 10.2139/ssrn.3131715
Balázs Kovács, Adam M. Kleinbaum
This paper demonstrates that linguistic similarity predicts network tie formation and that friends exhibit linguistic convergence over time. Study 1 analyzes the linguistic styles and the emerging friendship network in a complete cohort of 285 students. Study 2 analyzes a large-scale dataset of online reviews. Across both studies, we collected data in two waves to examine changes in both friendship networks and linguistic styles. Using the LIWC linguistic framework, we analyze the text of students’ essays and of 1.7 million reviews by 159,651 Yelp reviewers. We find that similarity in linguistic style corresponds to higher likelihood of friendship formation and persistence, and that friendship ties, in turn, correspond with a convergence in linguistic style. We discuss the implications of the co-evolution of linguistic styles and social networks, which contribute to the formation of relational echo chambers.
Citations: 3
A Multi-Layer Arabic Text Steganographic Method Based on Letter Shaping
Pub Date : 2019-01-25 DOI: 10.5121/ijnsa.2019.11103
A.F. Al Azzawi
Text documents are widely used; however, text steganography is more difficult than steganography in other media because text contains little redundant information. This paper presents a text steganography method for Arabic Unicode texts that does not use a normal sequential insertion process, in order to overcome the security issues of current approaches, which are sensitive to steganalysis. The Arabic Unicode text is kept as the main unshaped letters, and the proposed method uses a text file as cover text, hiding one bit in each letter by reshaping the letter according to its position (beginning, middle, or end of the word, or standalone). The hiding process is carried out through multiple embedding layers, where each layer contains all words with the same tag as detected by a POS tagger, and the embedding layers are selected randomly using the stego key to improve security. The experimental results show that the proposed method satisfies the hiding capacity requirements, improves security, and achieves better imperceptibility than currently developed approaches.
Citations: 6
Politeness Strategies of Russian School Students: Quantitative Approach to Qualitative Data
Pub Date : 2018-12-05 DOI: 10.2139/ssrn.3296303
M. Grabovskaya, E. Gridneva, A. Vlakhov
This study deals with the politeness strategies of speakers of Russian, focusing on verbal expression of politeness. After running a field survey in schools in mid-2018, we try to analyze specific verbal markers of politeness quantitatively. Four such markers were selected for this study, namely greeting, leave-taking, expressing gratitude and apology. Quantitative analysis shows a clear frequency pattern in the use of these markers, indicating a relatively high degree of sociolinguistic variation. Possible causes of this effect are discussed, including cultural diversity and the multilingual setting of the modern Russian school communicative domain.
Citations: 0
Sentence-Level Dialects Identification in the Greater China Region
Pub Date : 2016-12-30 DOI: 10.5121/IJNLC.2016.5602
Fan Xu, Mingwen Wang, Maoxi Li
Identifying different varieties of the same language is more challenging than identifying unrelated languages. In this paper, we propose an approach to discriminate language varieties or dialects of Mandarin Chinese for Mainland China, Hong Kong, Taiwan, Macao, Malaysia and Singapore, a.k.a. the Greater China Region (GCR). When applied to dialect identification in the GCR, we find that the commonly used character-level or word-level unigram feature is not very effective, since there exist several specific problems such as the ambiguity and context-dependent characteristics of words in the dialects of the GCR. To overcome these challenges, we use not only general features like character-level n-grams, but also many new word-level features, including PMI-based and word alignment-based features. A series of evaluation results on both the news dataset and the open-domain dataset from Wikipedia show the effectiveness of the proposed approach.
Citations: 14
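Two of the feature families named in the abstract above, character-level n-grams and PMI, can be sketched in a few lines. The abstract does not specify how the PMI-based features are defined, so the version below (pointwise mutual information between an n-gram and a variety label, using sentence-level counts) is an assumption for illustration, and the labeled sentences are hypothetical.

```python
import math


def char_ngrams(sentence, n):
    """Return the character n-grams of a sentence, ignoring spaces."""
    chars = sentence.replace(" ", "")
    return [chars[i:i + n] for i in range(len(chars) - n + 1)]


def pmi(feature, label, labeled_sentences, n=2):
    """PMI (base 2) between an n-gram and a label, counted per sentence."""
    total = len(labeled_sentences)
    has_feature = [feature in char_ngrams(s, n) for s, _ in labeled_sentences]
    has_label = [lab == label for _, lab in labeled_sentences]
    feature_count = sum(has_feature)
    label_count = sum(has_label)
    joint_count = sum(f and l for f, l in zip(has_feature, has_label))
    if joint_count == 0:
        return float("-inf")  # the n-gram never co-occurs with this label
    # log2( P(feature, label) / (P(feature) * P(label)) )
    return math.log2(joint_count * total / (feature_count * label_count))


# Hypothetical labeled sentences: two words for "taxi" in two varieties.
samples = [
    ("我坐出租车去机场", "mainland"),
    ("出租车很贵", "mainland"),
    ("我搭計程車去機場", "taiwan"),
    ("計程車很貴", "taiwan"),
]

print(char_ngrams("出租车很贵", 2))
print(pmi("出租", "mainland", samples))
print(pmi("出租", "taiwan", samples))
```

A high PMI marks an n-gram as characteristic of one variety, which is the intuition behind using such scores as discriminative features.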