
Int. J. Semantic Comput.: Latest Articles

Deep Fusion of a Skewed Redundant Magnetic and Inertial Sensor for Heading State Estimation in a Saturated Indoor Environment
Pub Date : 2021-09-01 DOI: 10.1142/s1793351x21400079
M. Karimi, Edwin Babaians, Martin Oelsch, E. Steinbach
Robust attitude and heading estimation in an indoor environment with respect to a known reference is an essential component of various robotic applications. Affordable Attitude and Heading Reference Systems (AHRS) typically use low-cost solid-state MEMS-based sensors. The precision of heading estimation on such a system is typically degraded by drift in the gyro measurements and by distortions in the sensing of the Earth’s magnetic field. This paper presents a novel approach for robust indoor heading estimation based on skewed redundant inertial and magnetic sensors. Recurrent Neural Network (RNN)-based fusion is used to perform robust heading estimation with the ability to compensate for external magnetic field anomalies. We use our previously described correlation-based filter model to preprocess the data and to strengthen perturbation mitigation. Our experimental results show that the proposed scheme successfully mitigates the anomalies in the saturated indoor environment and achieves a Root-Mean-Square Error of less than [Formula: see text] for long-term use.
Citations: 4
Diffusion-Based Influence Maximization in GOLAP
Pub Date : 2021-09-01 DOI: 10.1142/s1793351x21500069
Mira Kim, Hsiang-Shun Shih, P. Sheu
Influence analysis is one of the most important research topics in social networks. Specifically, more and more researchers and advertisers are interested in the area of influence maximization (IM). The concept of influence among people or organizations has been the core basis for making business decisions as well as performing everyday social activities. In this research, we begin by extending a new influence diffusion model, the information diffusion model (IDM), using various constraints. We incorporate colors and additional node constraints. By adding colors and constraints for different types of nodes in a graph, we are able to answer complex queries on multi-dimensional graphs such as ‘find at most two most important genes that are related to lung disease and heart disease’. More specifically, we discuss the following variations of IM-IDM: Colorblind IM-IDM, Colored IM-IDM and Colored IM-IDM with constraints. We also present our experimental results to demonstrate the effectiveness of our model and algorithms.
Citations: 0
An Experimental Analysis of the Effects of Different Hardware Setups on Stereo Camera Systems
Pub Date : 2021-09-01 DOI: 10.1142/s1793351x21400080
A. J. Golkowski, M. Handte, Peter Roch, P. Marrón
For many application areas such as autonomous navigation, the ability to accurately perceive the environment is essential. For this purpose, a wide variety of well-researched sensor systems are available that can be used to detect obstacles or navigation targets. Stereo cameras have emerged as a very versatile sensing technology in this regard due to their low hardware cost and high fidelity. Consequently, much work has been done to integrate them into mobile robots. However, the existing literature focuses on presenting the concepts and algorithms used to implement the desired robot functions on top of a given camera setup. As a result, the rationale and impact of choosing this camera setup are usually neither discussed nor described. Thus, when designing the stereo camera system for a mobile robot, there is not much general guidance beyond isolated setups that worked for a specific robot. To close the gap, this paper studies the impact of the physical setup of a stereo camera system in indoor environments. To do this, we present the results of an experimental analysis in which we use a given software setup to estimate the distance to an object while systematically changing the camera setup. Thereby, we vary the three main parameters of the physical camera setup, namely the angle and distance between the cameras as well as the field of view and a rather soft parameter, the resolution. Based on the results, we derive several guidelines on how to choose the parameters for an application.
Citations: 0
Guest Editor's Introduction
Pub Date : 2021-09-01 DOI: 10.1142/s1793351x21020025
Chun-Ming Chang
Citations: 0
Neural Twins Talk and Alternative Calculations
Pub Date : 2021-08-05 DOI: 10.1142/S1793351X21500045
Zanyar Zohourianshahzadi, J. Kalita
Inspired by how the human brain employs more neural pathways when increasing the focus on a subject, we introduce a novel twin cascaded attention model that outperforms a state-of-the-art image captioning model that was originally implemented using one channel of attention for the visual grounding task. Visual grounding ensures the existence of words in the caption sentence that are grounded into a particular region in the input image. After a deep learning model is trained on the visual grounding task, the model employs the learned patterns regarding the visual grounding and the order of objects in the caption sentences when generating captions. We report the results of our experiments in three image captioning tasks on the COCO dataset. The results are reported using standard image captioning metrics to show the improvements achieved by our model over the previous image captioning model. The results gathered from our experiments suggest that employing more parallel attention pathways in a deep neural network leads to higher performance. Our implementation of NTT is publicly available at: https://github.com/zanyarz/NeuralTwinsTalk.
Citations: 0
Syntactic Coherence in Word Embedding Spaces
Pub Date : 2021-06-01 DOI: 10.1142/S1793351X21500057
Renjith P. Ravindran, K. N. Murthy
Word embeddings have recently become a vital part of many Natural Language Processing (NLP) systems. Word embeddings are a suite of techniques that represent words in a language as vectors in an n-dimensional real space that has been shown to encode a significant amount of syntactic and semantic information. When used in NLP systems, these representations have resulted in improved performance across a wide range of NLP tasks. However, it is not clear how syntactic properties interact with the more widely studied semantic properties of words, or which factors in the modeling formulation encourage embedding spaces to pick up more of the syntactic behavior, as opposed to the semantic behavior, of words. We investigate several aspects of word embedding spaces and modeling assumptions that maximize syntactic coherence, that is, the degree to which words with similar syntactic properties form distinct neighborhoods in the embedding space. We do so in order to understand which of the existing models maximize syntactic coherence, making them a more reliable source for extracting syntactic category (POS) information. Our analysis shows that the syntactic coherence of S-CODE is superior to that of other more popular and more recent embedding techniques such as Word2vec, fastText, GloVe and LexVec, when measured under compatible parameter settings. Our investigation also gives deeper insights into the geometry of the embedding space with respect to syntactic coherence, and how this is influenced by context size, frequency of words, and dimensionality of the embedding space.
Citations: 0
Using 3D Convolutional Neural Networks for Real-time Detection of Soccer Events
Pub Date : 2021-06-01 DOI: 10.1142/S1793351X2140002X
Olav A. Norgård Rongved, S. Hicks, Vajira Lasantha Thambawita, H. Stensland, E. Zouganeli, Dag Johansen, Cise Midoglu, M. Riegler, P. Halvorsen
Developing systems for the automatic detection of events in video is a task which has gained attention in many areas including sports. More specifically, event detection for soccer videos has been studied widely in the literature. However, there are still a number of shortcomings in the state-of-the-art such as high latency, making it challenging to operate at the live edge. In this paper, we present an algorithm to detect events in soccer videos in real time, using 3D convolutional neural networks. We test our algorithm on three different datasets from SoccerNet, the Swedish Allsvenskan, and the Norwegian Eliteserien. Overall, the results show that we can detect events with high recall, low latency, and accurate time estimation. The trade-off is a slightly lower precision compared to the current state-of-the-art, which has higher latency and performs better when a less accurate time estimation can be accepted. In addition to the presented algorithm, we perform an extensive ablation study on how the different parts of the training pipeline affect the final results.
Citations: 5
Audio Captioning with Composition of Acoustic and Semantic Information
Pub Date : 2021-05-13 DOI: 10.1142/S1793351X21400018
Aysegül Özkaya Eren, M. Sert
Generating audio captions is a new research area that combines audio and natural language processing to create meaningful textual descriptions for audio clips. To address this problem, previous studies mostly use encoder–decoder-based models without considering semantic information. To fill this gap, we present a novel encoder–decoder architecture using bi-directional Gated Recurrent Units (BiGRU) with audio and semantic embeddings. We extract semantic embeddings by obtaining subjects and verbs from the audio clip captions and combine these embeddings with audio embeddings to feed the BiGRU-based encoder–decoder model. To enable semantic embeddings for the test audio clips, we introduce a Multilayer Perceptron classifier to predict the semantic embeddings of those clips. We also present exhaustive experiments to show the efficiency of different features and datasets for our proposed model on the audio captioning task. To extract audio features, we use log Mel energy features, VGGish embeddings, and pretrained audio neural network (PANN) embeddings. Extensive experiments on two audio captioning datasets, Clotho and AudioCaps, show that our proposed model outperforms state-of-the-art audio captioning models across different evaluation metrics and that using the semantic information improves the captioning performance.
Citations: 3
Spatial Data Management in IoT Systems: Solutions and Evaluation
Pub Date : 2021-04-25 DOI: 10.1142/S1793351X21300016
Maria Krommyda, Verena Kantere
As the Internet of Things (IoT) systems gain in popularity, an increasing number of Big Data sources are available. Ranging from small sensor networks designed for household use to large fully auto...
Citations: 0
Investigating Group-Specific Models of Hospital Workers' Well-Being: Implications for Algorithmic Bias
Pub Date : 2020-12-01 DOI: 10.1142/S1793351X20500075
Vinesh Ravuri, Projna Paromita, Karel Mundnich, Amrutha Nadarajan, Brandon M. Booth, Shrikanth S. Narayanan, Theodora Chaspari
Hospital workers often experience burnout due to demanding job responsibilities and long work hours. Data obtained from ambulatory monitoring, combined with machine learning algorithms, can afford us a better understanding of the naturalistic processes that contribute to this burnout. Motivated by the challenges related to the accurate tracking of well-being in real life, prior work has investigated group-specific machine learning (GS-ML) models that are tailored to groups of participants. We examine a novel GS-ML for estimating well-being from real-life multimodal measures collected in situ from hospital workers. In contrast to the majority of prior work, which uses pre-determined clustering criteria, we propose an iterative procedure that refines participant clusters based on the representations learned by the GS-ML models. Motivated by prior work that highlights the differential impact of job demands on well-being, we further explore the participant clusters in terms of demography and job-related attributes. Results indicate that the GS-ML models mostly outperform general models in estimating well-being constructs. The GS-ML models further show different degrees of predictive power for each participant cluster, as distinguished by age, education, occupational role, and number of supervisees. The observed discrepancies with respect to the GS-ML model decisions are discussed in association with algorithmic bias.
Citations: 1