
Latest publications: ACM Transactions on Asian and Low-Resource Language Information Processing

A DENSE SPATIAL NETWORK MODEL FOR EMOTION RECOGNITION USING LEARNING APPROACHES
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-08-10. DOI: 10.1145/3688000
L. V., Dinesh Kumar Anguraj
Researchers are increasingly eager to develop techniques to extract emotional data from new sources due to the exponential growth of subjective information on Web 2.0. One of the most challenging aspects of textual emotion detection is the collection of data with emotion labels, given the subjectivity involved in labeling emotions. To address this significant issue, our research aims to aid in the development of effective solutions. We propose a Deep Convolutional Belief-based Spatial Network Model (DCB-SNM) as a semi-automated technique to tackle this challenge. This model involves two basic phases of analysis: text and video. In this process, pre-trained annotators identify the dominant emotion. Our work evaluates the impact of this automatic pre-annotation approach on manual emotion annotation from the perspectives of annotation time and agreement. The data on annotation time indicates an increase of roughly 20% when the pre-annotation procedure is utilized, without negatively affecting the annotators' skill. This demonstrates the benefits of pre-annotation approaches. Additionally, pre-annotation proves to be particularly advantageous for contributors with low prediction accuracy, enhancing overall annotation efficiency and reliability.
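The agreement side of this evaluation can be illustrated with a small sketch. The abstract does not name the agreement metric, so Cohen's kappa and the toy emotion labels below are assumptions:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Expected agreement if both annotators labeled at random with their
    # own class frequencies.
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical pre-annotations vs. manual labels for six items.
pre = ["joy", "anger", "joy", "sad", "joy", "anger"]
manual = ["joy", "anger", "sad", "sad", "joy", "joy"]
print(round(cohens_kappa(pre, manual), 3))
```

Comparing kappa with and without pre-annotation is one way to check that the procedure does not bias annotators.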
Citations: 0
Learning and Vision-based approach for Human fall detection and classification in naturally occurring scenes using video data
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-08-10. DOI: 10.1145/3687125
Shashvat Singh, Kumkum Kumari, A. Vaish
The advancement of medicine presents challenges for modern societies, especially with unpredictable falling incidents among the elderly occurring anywhere due to serious health issues. Delayed rescue of at-risk elders can be dangerous. Traditional elder-safety methods such as video surveillance or wearable sensors are inefficient and burdensome, wasting human resources and requiring caregivers' constant fall-detection monitoring. Thus, a more effective and convenient solution is needed to ensure elderly safety. In this paper, a method is presented for detecting human falls in naturally occurring scenes in video, using a traditional Convolutional Neural Network (CNN) model, Inception-v3, VGG-19, and two versions of the You Only Look Once (YOLO) model. The primary focus of this work is human fall detection through deep learning models. Specifically, the YOLO approach is adopted for object detection and tracking in video scenes. By applying YOLO, human subjects are identified and bounding boxes are generated around them. The classification of various human activities, including fall detection, is accomplished through the analysis of deformation features extracted from these bounding boxes. The traditional CNN model achieves an impressive 99.83% accuracy in human fall detection, surpassing other state-of-the-art methods. Its training time is longer than that of YOLO-v2 and YOLO-v3, but significantly shorter than that of Inception-v3, taking only around 10% of the latter's total training time.
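The bounding-box deformation idea can be sketched minimally: a standing person yields a tall box, a fallen person a wide one. The aspect-ratio rule and threshold below are illustrative assumptions, not the paper's actual classifier:

```python
def aspect_ratio(box):
    """box = (x, y, w, h); width-to-height ratio of a detection box."""
    _, _, w, h = box
    return w / h

def detect_fall(track, ratio_threshold=1.0):
    """Flag a fall when a person's box flips from tall to wide.

    `track` is a time-ordered list of bounding boxes for one person,
    a stand-in for the per-frame boxes a YOLO detector would emit."""
    ratios = [aspect_ratio(b) for b in track]
    # Standing: width < height (ratio < 1); lying: width > height.
    return ratios[0] < ratio_threshold < ratios[-1]

standing_then_fallen = [(10, 5, 40, 120), (12, 40, 60, 90), (8, 90, 130, 45)]
print(detect_fall(standing_then_fallen))  # True: box went from tall to wide
```

Real systems would smooth ratios over several frames and add velocity features, but the deformation signal is the same.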
Citations: 0
CNN-Based Models for Emotion and Sentiment Analysis Using Speech Data
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-08-08. DOI: 10.1145/3687303
Anjum Madan, Devender Kumar
The study aims to present an in-depth Sentiment Analysis (SA) grounded in the emotions present in speech signals. Nowadays, all kinds of web-based applications, ranging from social media platforms and video-sharing sites to e-commerce applications, provide support for Human-Computer Interfaces (HCIs). These media applications allow users to share their experiences in all forms, such as text, audio, video, GIFs, etc. The most natural and fundamental form of expressing oneself is through speech. Speech-Based Sentiment Analysis (SBSA) is the task of gaining insights into speech signals; it aims to classify a statement as neutral, negative, or positive. Speech Emotion Recognition (SER), on the other hand, categorizes speech signals into the following emotions: disgust, fear, sadness, anger, happiness, and neutral. It is necessary to recognize the sentiments along with the depth of the emotions in the speech signals. To this end, a methodology is proposed that defines a text-oriented SA model combining CNN and Bi-LSTM techniques with an embedding layer, applied to text obtained from speech signals, achieving an accuracy of 84.49%. The methodology also proposes an Emotion Analysis (EA) model based on the CNN technique that highlights the type of emotion present in the speech signal, with an accuracy of 95.12%. The presented architecture can also be applied to various other domains such as product review systems, video recommendation systems, education, health, security, etc.
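As a rough illustration of the CNN stage of such a text model, the sketch below slides one convolutional filter over toy word embeddings and max-pools the responses; the vocabulary, embedding table, and kernel are all hypothetical stand-ins for trained parameters:

```python
import random

random.seed(0)

EMBED_DIM = 4
VOCAB = {"the": 0, "movie": 1, "was": 2, "great": 3, "awful": 4}
# Toy embedding table standing in for a trained embedding layer.
EMBEDDINGS = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB]

def conv1d_max(tokens, kernel, width=2):
    """One convolutional filter over the token sequence, max-pooled.

    Mirrors the CNN stage of a CNN + Bi-LSTM text classifier: slide a
    window over word embeddings and keep the strongest response."""
    vecs = [EMBEDDINGS[VOCAB[t]] for t in tokens]
    responses = []
    for i in range(len(vecs) - width + 1):
        window = [x for v in vecs[i:i + width] for x in v]
        responses.append(sum(w * x for w, x in zip(kernel, window)))
    return max(responses)

kernel = [random.uniform(-1, 1) for _ in range(EMBED_DIM * 2)]
feature = conv1d_max("the movie was great".split(), kernel)
print(feature)
```

A full model would learn many such filters and feed their pooled outputs, together with Bi-LSTM states, into a classifier head.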
Citations: 0
Adaptive Semantic Information Extraction of Tibetan Opera Mask with Recall Loss
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-07-26. DOI: 10.1145/3666041
Yao Wen, Jie Li, Donghong Cai, Zhicheng Dong, Fangkai Cai, Ping Lan, Quan Zhou
With the development of artificial intelligence, natural language processing enables us to better understand and utilize semantic information. However, traditional object detection algorithms cannot achieve effective performance on Tibetan opera mask datasets, which have the properties of limited samples, symmetrical patterns, and high inter-class distances. To solve this issue, we propose a novel feature representation model with a recall loss function for detecting different masks. In the model, we develop an adaptive feature extraction network with fused layers to extract features. Furthermore, a lightweight, efficient attention mechanism is designed to enhance the significance of key features. Additionally, a recall loss function is proposed to increase the differences among classes. Finally, experimental results on the Tibetan opera mask dataset demonstrate that our proposed model outperforms the compared models.
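The abstract does not spell out the recall loss, so the sketch below uses one common formulation as an assumption: a cross-entropy in which each example is weighted by (1 − recall) of its true class, so poorly recalled classes contribute more to the loss:

```python
import math

def per_class_recall(y_true, y_pred):
    """Recall for each class present in the ground truth."""
    rec = {}
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        support = sum(1 for t in y_true if t == c)
        rec[c] = tp / support
    return rec

def recall_weighted_loss(probs, y_true, recalls):
    """Cross-entropy where each example is weighted by (1 - recall) of
    its true class; well-recalled classes are down-weighted."""
    total = 0.0
    for p, t in zip(probs, y_true):
        total += -(1.0 - recalls[t]) * math.log(p[t])
    return total / len(y_true)

y_true = ["a", "a", "b"]
y_pred = ["a", "b", "b"]
recalls = per_class_recall(y_true, y_pred)   # class "a" recalled poorly
probs = [{"a": 0.8, "b": 0.2}, {"a": 0.4, "b": 0.6}, {"a": 0.3, "b": 0.7}]
print(round(recall_weighted_loss(probs, y_true, recalls), 4))
```

In training, the recalls would be re-estimated per batch or epoch rather than fixed.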
Citations: 0
TRGCN: A Prediction Model for Information Diffusion Based on Transformer and Relational Graph Convolutional Network
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-07-26. DOI: 10.1145/3672074
Jinghua Zhao, Xiting Lyu, Haiying Rong, Jiale Zhao
To capture and integrate the structural and temporal features contained in the social graph and diffusion cascade more effectively, an information diffusion prediction model based on the Transformer and Relational Graph Convolutional Network (TRGCN) is proposed. First, a dynamic heterogeneous graph composed of the social network graph and the diffusion cascade graph is constructed and input into the Relational Graph Convolutional Network (RGCN) to extract the structural features of each node. Second, the time embedding of each node is re-encoded using Bi-directional Long Short-Term Memory (Bi-LSTM), and a time decay function is introduced to give different weights to nodes at different time positions, yielding the temporal features of the nodes. Finally, the structural and temporal features are input into the Transformer and merged, producing the spatial-temporal features used for information diffusion prediction. Experimental results on three real-world datasets (Twitter, Douban, and Memetracker) show that, compared with the best model in the comparison experiments, the TRGCN model achieves an average improvement of 4.16% in the Hits@100 metric and 13.26% in the MAP@100 metric. The validity and rationality of the model are thus demonstrated.
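The time decay idea can be sketched as follows; the exponential form and the decay rate are assumptions, since the abstract states only that nodes at different time positions receive different weights:

```python
import math

def time_decay_weights(timestamps, now, decay_rate=0.1):
    """Weight each node by exp(-decay_rate * age), normalized to sum to 1,
    so recently active nodes count more toward the temporal features."""
    raw = [math.exp(-decay_rate * (now - t)) for t in timestamps]
    total = sum(raw)
    return [w / total for w in raw]

# Four nodes that joined the cascade at times 0, 5, 9, and 10.
weights = time_decay_weights([0, 5, 9, 10], now=10)
print(weights)  # most recent node gets the largest weight
```

These weights would multiply the per-node Bi-LSTM outputs before they are fed to the Transformer.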
Citations: 0
Word Sense Disambiguation Combining Knowledge Graph And Text Hierarchical Structure
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-07-25. DOI: 10.1145/3677524
Yukun Cao, Chengkun Jin, Yijia Tang, Ziyue Wei
Current supervised word sense disambiguation models have achieved high disambiguation accuracy using annotated information for different word senses and pre-trained language models. However, the semantic data for these models take the form of short texts, and much of the corpus information is not rich enough to distinguish semantics across different scenarios. This paper proposes a bi-encoder word sense disambiguation method combining a knowledge graph and text hierarchical structure. It introduces structured knowledge from the knowledge graph to supplement extended semantic information, uses the hierarchy of the contextual input text to describe the meaning of words and phrases, and constructs a BERT-based bi-encoder with a graph attention network that reduces noise in the contextual input text, thereby improving the disambiguation accuracy for target words in phrase form and ultimately the effectiveness of the method. Compared with the latest nine algorithms on five test datasets, the method's disambiguation accuracy mostly outperformed the comparison algorithms and achieved better results.
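The bi-encoder's sense-selection step can be sketched as nearest-gloss matching: encode the context and each candidate sense gloss separately, then pick the closest gloss. The toy vectors below stand in for BERT-based encoder outputs, and the sense labels are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def disambiguate(context_vec, gloss_vecs):
    """Pick the sense whose gloss embedding is closest to the context."""
    return max(gloss_vecs, key=lambda sense: cosine(context_vec, gloss_vecs[sense]))

context = [0.9, 0.1, 0.2]   # hypothetical encoding of "deposit money at the bank"
glosses = {
    "bank/financial": [0.8, 0.2, 0.1],
    "bank/river":     [0.1, 0.9, 0.3],
}
print(disambiguate(context, glosses))
```

In the proposed method the gloss side would additionally carry knowledge-graph features before scoring.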
Citations: 0
Exploring the Correlation between Emojis and Mood Expression in Thai Twitter Discourse
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-07-24. DOI: 10.1145/3680543
Attapol T. Rutherford, Pawitsapak Akarajaradwong
Mood, a long-lasting affective state detached from specific stimuli, plays an important role in behavior. Although sentiment analysis and emotion classification have garnered attention, research on mood classification remains in its early stages. This study adopts a two-dimensional structure of affect, comprising "pleasantness" and "activation," to classify mood patterns. Emojis, graphic symbols representing emotions and concepts, are widely used in computer-mediated communication. Unlike previous studies that consider emojis as direct labels for emotion or sentiment, this work uses a pre-trained large language model which integrates both text and emojis to develop a mood classification model. Our contributions are three-fold. First, we annotate 10,000 Thai tweets with mood to train the models and release the dataset to the public. Second, we show that emojis contribute to determining mood to a lesser extent than text, far from mapping directly to mood. Third, through the application of the trained model, we observe the correlation of moods during the Thai political turmoil of 2019-2020 on Thai Twitter and find a significant correlation. These moods closely reflect the news events and reveal one side of Thai public opinion during the turmoil.
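The two-dimensional affect structure maps naturally onto four quadrants. The quadrant names below are illustrative assumptions; the abstract specifies only the "pleasantness" and "activation" axes:

```python
def mood_quadrant(pleasantness, activation):
    """Map the two affect dimensions to one of four mood quadrants.

    Both inputs are assumed to be scores in [-1, 1]; the quadrant
    labels are placeholders, not the paper's categories."""
    if pleasantness >= 0:
        return "excited" if activation >= 0 else "calm"
    return "tense" if activation >= 0 else "depressed"

print(mood_quadrant(0.7, 0.5))    # pleasant and activated
print(mood_quadrant(-0.6, -0.4))  # unpleasant and deactivated
```

A classifier over this structure can either predict the two scores and derive the quadrant, or predict the quadrant directly.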
Citations: 0
Translation from Tunisian Dialect to Modern Standard Arabic: Exploring Finite-State Transducers and Sequence-to-Sequence Transformer Approaches
IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-07-24. DOI: 10.1145/3681788
Roua Torjmen, K. Haddar
Translation from the mother tongue, including the Tunisian dialect, to Modern Standard Arabic is a highly significant field in natural language processing due to its wide range of applications and associated benefits. Recently, researchers have shown increased interest in the Tunisian dialect, primarily driven by the massive volume of content generated spontaneously by Tunisians on social media following the revolution. This paper presents two distinct translators for converting the Tunisian dialect into Modern Standard Arabic. The first translator utilizes a rule-based approach, employing a collection of finite state transducers and a bilingual dictionary derived from the study corpus. The second translator relies on deep learning models, specifically the sequence-to-sequence transformer model and a parallel corpus. To assess, evaluate, and compare the performance of the two translators, we conducted tests using a parallel corpus comprising 8,599 words. The results achieved by both translators are noteworthy. The translator based on finite state transducers achieved a BLEU score of 56.65, while the transformer model-based translator achieved a higher score of 66.07.
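At its simplest, the transducer-plus-dictionary pipeline rewrites words through a lexicon and passes out-of-vocabulary words through unchanged. The entries below are invented romanized placeholders, not the paper's actual dictionary:

```python
# Toy stand-in for the bilingual dictionary extracted from the corpus;
# keys are (romanized) dialect words, values their MSA counterparts.
LEXICON = {"barsha": "kathiran", "mliha": "jayyid"}

def translate(sentence, lexicon):
    """Word-by-word rewrite, keeping out-of-vocabulary words unchanged,
    as a finite-state transducer composed with a lexicon would do."""
    return " ".join(lexicon.get(w, w) for w in sentence.split())

print(translate("barsha mliha tunis", LEXICON))
```

The real system composes several transducers (morphological, lexical, orthographic) rather than a single word map, but the composition behaves like this lookup with fall-through.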
Citations: 0
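The BLEU scores quoted above (56.65 vs. 66.07) come from comparing system output against a reference corpus. As a minimal sketch of how such a score is computed, the function below implements sentence-level BLEU with up to 4-gram precision and a brevity penalty; the paper almost certainly used a standard corpus-level tool, which this toy version does not replicate.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU (0-100) with brevity penalty; illustrative only."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        # Clipped n-gram precision: overlap of candidate and reference n-grams.
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * geo_mean

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 2))  # → 100.0
```

A perfect match scores 100; any misrecognized content word drops every n-gram precision that covers it, which is why the two translators' scores separate so clearly.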
Analyzing the Effects of Transcription Errors on Summary Generation of Bengali Spoken Documents
IF 1.8 CAS Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-07-17 DOI: 10.1145/3678005
Priyanjana Chowdhury, Nabanika Sarkar, Sanghamitra Nath, Utpal Sharma
Automatic speech recognition (ASR) has become an indispensable part of the AI domain, with various speech technologies reliant on it. The quality of speech recognition depends on the amount of annotated data used to train an ASR system, among other factors. For a low-resourced language, this is a severe constraint, and thus ASR quality is often poor. Humans can read through text containing ASR errors, provided the context of the sentence is preserved. Yet in transcripts generated by ASR systems for low-resource languages, multiple important words are misrecognized and the context is mostly lost; discerning such a text becomes nearly impossible. This paper analyzes the types of transcription errors that occur while generating ASR transcripts of spoken documents in Bengali, an under-resourced language predominantly spoken in India and Bangladesh. The transcripts of the Bengali spoken documents are generated using the ASR of Google Cloud Speech. The paper also explores whether such transcription errors affect the generation of speech summaries of these spoken documents. Summarization is carried out extractively; sentences are selected from the ASR-generated text of the spoken document. Speech summaries are created by aggregating the speech segments from the original speech of the selected sentences. Subjective evaluation shows that the 'readability' of the spoken summaries is not degraded by ASR errors, but the quality is affected due to the reliance on intermediate text summaries containing transcription errors.
Citations: 0
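The extractive pipeline described above — select sentences from the ASR transcript, then stitch together the corresponding speech segments — can be sketched in a few lines. The word-frequency scoring below is a crude stand-in for the paper's unspecified sentence-selection criterion, and the `(text, start, end)` tuples are a toy substitute for real ASR output with timestamps.

```python
from collections import Counter

def extractive_speech_summary(sentences, k=2):
    """Pick the k highest-scoring transcript sentences and return the
    time spans of their speech segments, in original spoken order.
    `sentences` is a list of (text, start_sec, end_sec) tuples."""
    # Score each sentence by summed corpus-wide word frequency.
    freq = Counter(w for text, _, _ in sentences for w in text.lower().split())
    scored = sorted(range(len(sentences)),
                    key=lambda i: -sum(freq[w] for w in sentences[i][0].lower().split()))
    chosen = sorted(scored[:k])  # restore spoken order for the audio summary
    return [(sentences[i][1], sentences[i][2]) for i in chosen]

segments = [
    ("speech recognition quality depends on training data", 0.0, 3.1),
    ("we thank the audience", 3.1, 4.0),
    ("bengali is a low resource language for speech recognition", 4.0, 7.5),
]
print(extractive_speech_summary(segments, k=2))  # → [(0.0, 3.1), (4.0, 7.5)]
```

Because the summary audio is cut from the original recording, ASR errors never reach the listener's ears — which is consistent with the abstract's finding that readability survives even when the intermediate text does not.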
CoMix: Confronting with Noisy Label Learning with Co-training Strategies on Textual Mislabeling
IF 1.8 CAS Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-07-15 DOI: 10.1145/3678175
Shu Zhao, Zhuoer Zhao, Yangyang Xu, Xiao Sun
The existence of noisy labels is inevitable in real-world large-scale corpora. As deep neural networks are notably vulnerable to overfitting on noisy samples, this highlights the importance of a language model's ability to resist noise for efficient training. However, little attention has been paid to alleviating the influence of label noise in natural language processing. To address this problem, we present CoMix, a robust noise-resistant training strategy that takes advantage of co-training to deal with textual annotation errors in text classification tasks. In our proposed framework, the original training set is first split into labeled and unlabeled subsets according to a sample partition criterion, and label refurbishment is then applied to the unlabeled subsets. We implement textual interpolation in hidden space between samples on the updated subsets. Meanwhile, we employ peer diverged networks that simultaneously leverage co-training strategies to avoid the accumulation of confirmation bias. Experimental results on three popular text classification benchmarks demonstrate the effectiveness of CoMix in bolstering the network's resistance to label mislabeling under various noise types and ratios; it also outperforms state-of-the-art methods.
Citations: 0
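"Textual interpolation in hidden space" as described above is typically a mixup-style operation on intermediate representations. The sketch below shows that operation in its simplest form; the Beta prior, the `alpha` value, and applying `max(lam, 1 - lam)` so the mix stays dominated by the first sample are assumptions in the spirit of manifold mixup, not details taken from the CoMix paper.

```python
import random

def hidden_mixup(h1, y1, h2, y2, alpha=0.75, rng=random):
    """Interpolate two hidden representations and their soft labels.
    h1/h2: hidden vectors (lists of floats); y1/y2: label distributions."""
    lam = rng.betavariate(alpha, alpha)   # mixing coefficient in (0, 1)
    lam = max(lam, 1 - lam)               # keep the first sample dominant
    h = [lam * a + (1 - lam) * b for a, b in zip(h1, h2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return h, y, lam

h, y, lam = hidden_mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

Because the mixed label is a convex combination of the two originals, a single mislabeled example can no longer push the network toward a confident wrong prediction, which is the intuition behind using interpolation against label noise.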