
International Conference on Pattern Recognition Applications and Methods: Latest Publications

Faster RBF Network Learning Utilizing Singular Regions
Pub Date : 2019-02-19 DOI: 10.5220/0007367205010508
Seiya Satoh, R. Nakano
There are two ways to learn radial basis function (RBF) networks: one-stage and two-stage learning. Recently, a powerful one-stage learning method called RBF-SSF was proposed; it stably finds a series of excellent solutions by making good use of singular regions, and it monotonically decreases training error as hidden units are added. RBF-SSF was built by applying the SSF (singularity stairs following) paradigm, originally proposed with success for multilayer perceptrons, to RBF networks. Although RBF-SSF has a strong capability to find excellent solutions, it requires considerable time, mainly because it computes the Hessian. This paper proposes a faster version of RBF-SSF, called RBF-SSF(pH), that introduces partial calculation of the Hessian. Experiments on two datasets showed that RBF-SSF(pH) ran as fast as usual one-stage learning methods while keeping the excellent solution quality.
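As a minimal illustrative sketch (not the authors' implementation), the model being trained here is an RBF network with Gaussian hidden units; singular regions arise in its parameter space, e.g., when two centers coincide or an output weight vanishes. The forward pass can be written as:

```python
import math

def rbf_forward(x, centers, widths, weights, bias):
    """Output of an RBF network with Gaussian hidden units:
    y = bias + sum_j w_j * exp(-||x - c_j||^2 / (2 * s_j^2))."""
    y = bias
    for c, s, w in zip(centers, widths, weights):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        y += w * math.exp(-sq_dist / (2.0 * s * s))
    return y

# Two hidden units in a 2-D input space (toy values).
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
y = rbf_forward([0.0, 0.0], centers, widths, weights, bias=0.5)
```

At x = (0, 0) the first unit fires fully (distance 0) and the second contributes -exp(-1), so y = 0.5 + 2 - exp(-1).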
Citations: 0
AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction
Pub Date : 2019-02-19 DOI: 10.5220/0007690902550265
M. Marras, Pedro A. Marín-Reyes, J. Lorenzo-Navarro, M. C. Santana, G. Fenu
Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves using robots to assist people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often use facial information to establish the identity of a person of interest. However, the face alone may not offer enough relevant information due to variations in pose, illumination, resolution, and recording distance. Other biometric modalities, such as the voice, can improve recognition performance under these conditions. However, existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most are acquired under controlled conditions, are limited in the number of identities or samples per user, are collected with a single recording device, and/or are not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot-assistance scenarios. The collection took place in a three-floor building using eight different cameras with built-in microphones. Face and voice re-identification and verification performance was evaluated on this dataset with deep learning baselines and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.
Citations: 5
Data for Image Recognition Tasks: An Efficient Tool for Fine-Grained Annotations
Pub Date : 2019-02-19 DOI: 10.5220/0007688709000907
Marco Filax, Tim Gonschorek, F. Ortmeier
Using large datasets is essential for machine learning. In practice, training a machine learning algorithm requires hundreds of samples. Multiple off-the-shelf datasets from the scientific domain exist to benchmark new approaches. However, when machine learning algorithms transition to industry, e.g., for a particular image classification problem, hundreds of special-purpose images must be collected and annotated in laborious manual work. In this paper, we present a novel system that decreases the effort of annotating such large image sets. We generate 2D bounding boxes from minimal 3D annotations using the known location and orientation of the camera: a particular object of interest is annotated once in 3D, and these annotations are projected onto every frame of a video stream. The proposed approach is designed to work with off-the-shelf hardware. We demonstrate its applicability with a real-world example, generating a more extensive dataset than available in other works for a particular industrial use case: fine-grained recognition of items within grocery stores. Further, we make our dataset, consisting of over 60,000 images, available to the interested vision community. Some images were taken under ideal conditions for training, while others were taken with the proposed approach in the wild.
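The projection step described above can be sketched as follows, assuming a pinhole camera with known pose (the rotation, translation, and intrinsics below are illustrative values, not the authors' calibration): the eight corners of a 3D box annotation are projected into a frame and collapsed to a 2D bounding box.

```python
from itertools import product

def project_point(X, K, R, t):
    """Project a 3D point with a pinhole camera: camera coords
    Xc = R @ X + t, then u = fx*Xc0/Xc2 + cx, v = fy*Xc1/Xc2 + cy."""
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return fx * Xc[0] / Xc[2] + cx, fy * Xc[1] / Xc[2] + cy

def bbox_from_3d(corners, K, R, t):
    """2D bounding box enclosing the projections of the 3D box corners."""
    pts = [project_point(X, K, R, t) for X in corners]
    us, vs = [p[0] for p in pts], [p[1] for p in pts]
    return min(us), min(vs), max(us), max(vs)

# Unit cube centered at the origin, camera 5 units away (identity rotation,
# hypothetical intrinsics fx = fy = 100, cx = 320, cy = 240).
corners = list(product([-1.0, 1.0], repeat=3))
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]
K = (100.0, 100.0, 320.0, 240.0)
box = bbox_from_3d(corners, K, R, t)
```

The near face of the cube (depth 4) dominates the extent, giving a box of 295..345 horizontally and 215..265 vertically for this pose.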
Citations: 5
Simple Domain Adaptation for CAD based Object Recognition
Pub Date : 2019-02-19 DOI: 10.5220/0007346504290437
Kripasindhu Sarkar, D. Stricker
We present a simple method of domain adaptation between synthetic and real images, based on high-quality rendering of the 3D models and correlation alignment. Using this method, we solve the problem of 3D object recognition in 2D images by fine-tuning existing pretrained CNN models for the object categories on the rendered images. Experimentally, we show that our rendering pipeline, together with correlation alignment, improves by a large margin the recognition accuracy of existing CNN-based recognition trained on images from a canonical renderer. Using the same idea, we present a general image classifier of common objects that is trained only on 3D models from publicly available databases, and we show that a small number of training models suffices to capture the variations within and across classes.
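Full correlation alignment (CORAL) whitens the source covariance and re-colors it with the target covariance. As a hedged illustration only (a diagonal, per-feature simplification, not the paper's implementation), each synthetic feature can be shifted and rescaled so its statistics match the real data:

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def _std(xs):
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def align_feature(source, target):
    """Shift and rescale one source feature so its mean and standard
    deviation match the target's (a diagonal simplification of full
    CORAL covariance alignment)."""
    ms, ss = _mean(source), _std(source)
    mt, st = _mean(target), _std(target)
    if ss == 0.0:
        return [mt] * len(source)
    return [(x - ms) * (st / ss) + mt for x in source]

# Source feature with twice the spread of the target gets compressed onto it.
aligned = align_feature([0.0, 2.0, 4.0], [10.0, 11.0, 12.0])
```

Applied per feature, this removes gross second-order statistics mismatch between the synthetic and real domains before fine-tuning.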
Citations: 1
Fourier Spectral Domain Functional Principal Component Analysis of EEG Signals
Pub Date : 2019-02-19 DOI: 10.1007/978-3-030-40014-9_1
Shengkun Xie, A. Lawniczak
Citations: 2
Forecasting Hotel Room Sales within Online Travel Agencies by Combining Multiple Feature Sets
Pub Date : 2019-02-19 DOI: 10.5220/0007383205650573
Gizem Aras, G. Ayhan, Mehmet Sarıkaya, A. A. Tokuç, C. O. Sakar
Hotel room sales prediction using previous booking data is a prominent research topic in the online travel agency (OTA) sector. Various approaches have been proposed to predict hotel room sales over different prediction horizons, such as yearly demand or the daily number of reservations. An OTA website includes offers from many companies for the same hotel, and the position of a company's offer on the OTA website depends on the bid amount the company pays per click. Therefore, accurately predicting the sales amount for a given bid is a crucial need in revenue and cost management for companies in the sector. In this paper, we forecast the next day's sales amount in order to estimate the daily revenue generated per hotel. An important contribution of our study is an enriched dataset constructed by combining the most informative features proposed in various related studies on hotel sales prediction. Moreover, we enrich this dataset with a set of OTA-specific features that capture the position of the company's offers relative to those of its competitors on a travel metasearch engine website. We provide a real application on the hotel room sales data of a large OTA in Turkey. The comparative results show that enriching the input representation with the OTA-specific additional features increases the generalization ability of the prediction models, and that tree-based boosting algorithms perform best on this task.
Citations: 3
Cascaded Acoustic Group and Individual Feature Selection for Recognition of Food Likability
Pub Date : 2019-02-19 DOI: 10.5220/0007683708810886
Dara Pir
This paper presents the novel Cascaded acoustic Group and Individual Feature Selection (CGI-FS) method for automatic recognition of food likability ratings, addressed in the ICMI 2018 Eating Analysis and Tracking Challenge's Likability Sub-Challenge. Employing the speech and video recordings of the iHEARu-EAT database, the Likability Sub-Challenge attempts to recognize the self-reported binary labels 'Neutral' and 'Like' assigned by subjects to food they consumed while speaking. CGI-FS uses an audio approach and performs a sequence of two feature selection operations, considering the acoustic feature space first in groups and then individually. In CGI-FS, an acoustic group feature is defined as the collection of features generated by applying a single statistical functional to a specified set of audio low-level descriptors. We investigate the performance of CGI-FS using four different classifiers and evaluate the relevance of group features to the task. All four CGI-FS systems outperform the Likability Sub-Challenge baseline on iHEARu-EAT development data, with the best achieving a 9.8% relative improvement in Unweighted Average Recall over it.
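The group-then-individual cascade can be sketched generically as follows; the group ranking criterion (mean score), the toy group names, and the relevance values are assumptions for illustration, not the paper's exact selection procedure.

```python
def cascaded_selection(groups, score, k_groups, k_features):
    """Stage 1: keep the k_groups feature groups with the highest mean
    relevance score. Stage 2: rank the surviving individual features
    and keep the top k_features."""
    group_score = {g: sum(score[f] for f in fs) / len(fs)
                   for g, fs in groups.items()}
    kept_groups = sorted(groups, key=group_score.get, reverse=True)[:k_groups]
    survivors = [f for g in kept_groups for f in groups[g]]
    return sorted(survivors, key=score.get, reverse=True)[:k_features]

# Hypothetical acoustic groups: each key names a statistical functional
# applied to a set of low-level descriptors; scores are toy relevance values.
groups = {"mean_mfcc": ["f1", "f2"],
          "std_pitch": ["f3"],
          "max_energy": ["f4", "f5"]}
score = {"f1": 0.9, "f2": 0.1, "f3": 0.6, "f4": 0.5, "f5": 0.4}
selected = cascaded_selection(groups, score, k_groups=2, k_features=2)
```

Note that "f1" survives even though its group only ranks second: the second stage re-ranks individual features after the coarse group filter.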
Citations: 1
Automatic Perception Enhancement for Simulated Retinal Implants
Pub Date : 2019-02-19 DOI: 10.5220/0007695409080914
Johannes Steffen, Georg Hille, Klaus D. Tönnies
This work addresses the automatic enhancement of the visual percepts of virtual patients with retinal implants. Specifically, we cast the task as an image transformation problem within an artificial neural network. The neurophysiological model of (Nanduri et al., 2012) was implemented as a tensor network to simulate a virtual patient's visual percept and used together with an image transformation network to perform end-to-end learning on an image reconstruction task and a classification task. The image reconstruction task was evaluated on the MNIST data set; the learned transformations yielded plausible results while halving the dissimilarity (mean squared error) between an input image and its simulated visual percept. Furthermore, the classification task was evaluated on the CIFAR-10 data set. Experiments show that classification accuracy increases by approximately 12.9% when a suitable input image transformation is learned.
Citations: 0
Annealing by Increasing Resampling in the Unified View of Simulated Annealing
Pub Date : 2019-02-19 DOI: 10.5220/0007380701730180
Yasunobu Imamura, N. Higuchi, T. Shinohara, K. Hirata, T. Kuboyama
Annealing by Increasing Resampling (AIR) is a stochastic hill-climbing optimization that evaluates the objective function by resampling with increasing sample size. In this paper, we introduce a unified view of conventional Simulated Annealing (SA) and AIR. In this view, we generalize both SA and AIR to stochastic hill-climbing on objective functions with stochastic fluctuations, logit and probit respectively. Since the logit function is approximated by the probit function, we show that AIR can be regarded as an approximation of SA. Experimental results on sparse pivot selection and annealing-based clustering also support that AIR approximates SA. Moreover, when an objective function requires a large number of samples, AIR is much faster than SA without sacrificing the quality of the results.
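A minimal sketch of the AIR idea (the toy objective, noise level, and resampling schedule are assumptions, not the paper's settings): a hill climber scores each candidate by averaging noisy objective evaluations, and the resample size grows over iterations, so the acceptance test sharpens over time much like a cooling schedule does in SA.

```python
import random

def air_minimize(noisy_f, x0, iters=200, step=0.5, seed=0):
    """Stochastic hill climbing with increasing resampling: estimate the
    objective at the current point and at a random neighbor by averaging
    n noisy evaluations, where n grows with the iteration count."""
    rng = random.Random(seed)

    def estimate(x, n):
        return sum(noisy_f(x, rng) for _ in range(n)) / n

    x = x0
    for it in range(1, iters + 1):
        n = 1 + it // 20  # resample size increases as the search proceeds
        cand = x + rng.uniform(-step, step)
        if estimate(cand, n) < estimate(x, n):
            x = cand
    return x

# Toy objective: (x - 3)^2 observed with additive Gaussian noise.
def noisy_f(x, rng):
    return (x - 3.0) ** 2 + rng.gauss(0.0, 0.5)

x_star = air_minimize(noisy_f, x0=0.0)
```

Early on, small resample sizes let the search accept occasionally worse moves (the SA-like exploration phase); later, larger samples make acceptance nearly deterministic, concentrating the search near the optimum.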
Citations: 1
Using Recurrent Neural Networks for Action and Intention Recognition of Car Drivers
Pub Date : 2019-02-19 DOI: 10.5220/0007682502320242
Martin Torstensson, B. Durán, Cristofer Englund
Traffic situations leading up to accidents have been shown to be greatly affected by human error. To reduce these errors, warning systems such as Driver Alert Control, Collision Warning, and Lane Departure Warning have been introduced. However, there is still room for improvement, both in the timing of when a warning should be given and in the time needed to detect a hazardous situation in advance. Two factors that affect when a warning should be given are the environment and the actions of the driver. This study proposes an artificial neural network-based approach consisting of a convolutional neural network and a recurrent neural network with long short-term memory to detect and predict different actions of a driver inside a vehicle. The network achieved an accuracy of 84% when predicting the driver's actions in the next frame, and an accuracy of 58% 20 frames ahead, at a sampling rate of approximately 30 frames per second.
{"title":"Using Recurrent Neural Networks for Action and Intention Recognition of Car Drivers","authors":"Martin Torstensson, B. Durán, Cristofer Englund","doi":"10.5220/0007682502320242","DOIUrl":"https://doi.org/10.5220/0007682502320242","url":null,"abstract":"Traffic situations leading up to accidents have been shown to be greatly affected by human errors. To reduce these errors, warning systems such as Driver Alert Control, Collision Warning and Lane Departure Warning have been introduced. However, there is still room for improvement, both regarding the timing of when a warning should be given as well as the time needed to detect a hazardous situation in advance. Two factors that affect when a warning should be given are the environment and the actions of the driver. This study proposes an artificial neural network-based approach consisting of a convolutional neural network and a recurrent neural network with long short-term memory to detect and predict different actions of a driver inside a vehicle. The network achieved an accuracy of 84% while predicting the actions of the driver in the next frame, and an accuracy of 58% 20 frames ahead with a sampling rate of approximately 30 frames per second.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114486880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 5
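The pipeline the abstract describes — a convolutional feature extractor per frame feeding a long short-term memory network that predicts the driver's next action — can be sketched as follows. This is a shape-level illustration only: the "CNN" is stubbed as a fixed random projection, all sizes and weights are made up, and nothing here reproduces the authors' trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
FRAME, FEAT, HID, ACTIONS = 8 * 8, 16, 32, 5   # toy sizes, not from the paper

W_cnn = rng.standard_normal((FEAT, FRAME)) * 0.1  # stand-in for the CNN
W = rng.standard_normal((4 * HID, FEAT)) * 0.1    # input-to-gate weights
U = rng.standard_normal((4 * HID, HID)) * 0.1     # hidden-to-gate weights
b = np.zeros(4 * HID)
W_out = rng.standard_normal((ACTIONS, HID)) * 0.1  # classifier head

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    # One LSTM step: input, forget, and output gates plus candidate cell.
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:HID]), sigmoid(z[HID:2*HID]), sigmoid(z[2*HID:3*HID])
    g = np.tanh(z[3*HID:])
    c = f * c + i * g
    return o * np.tanh(c), c

def predict_next_action(frames):
    """Run the frame sequence through the stubbed CNN and the LSTM,
    then return a probability distribution over driver actions."""
    h = np.zeros(HID)
    c = np.zeros(HID)
    for frame in frames:
        h, c = lstm_step(W_cnn @ frame.ravel(), h, c)
    logits = W_out @ h
    p = np.exp(logits - logits.max())   # numerically stable softmax
    return p / p.sum()

video = rng.standard_normal((30, 8, 8))   # ~1 second of frames at 30 fps
probs = predict_next_action(video)
print(probs.shape, float(probs.sum()))
```

Predicting 20 frames ahead, as in the paper's second result, would amount to training the same head against labels shifted 20 frames into the future rather than 1.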