
Artificial neural networks, ICANN : international conference ... proceedings. International Conference on Artificial Neural Networks (European Neural Network Society): Latest Publications

Dual Branch Network Towards Accurate Printed Mathematical Expression Recognition
Yuqing Wang, Zhenyu Weng, Zhaokun Zhou, Shuaijian Ji, Zhongjie Ye, Yuesheng Zhu
DOI: 10.1007/978-3-031-15937-4_50 · Published: 2023-12-14 · Pages: 594-606
Citations: 2
PE-YOLO: Pyramid Enhancement Network for Dark Object Detection
Xi Yin, Zhen Yu, Zetao Fei, Wen Lv, Xinchen Gao
Current object detection models have achieved good results on many benchmark datasets, but detecting objects in dark conditions remains a major challenge. To address this issue, we propose a pyramid enhancement network (PENet) and combine it with YOLOv3 to build a dark object detection framework named PE-YOLO. First, PENet decomposes the image into four components of different resolutions using the Laplacian pyramid. Specifically, we propose a detail processing module (DPM) to enhance the detail of images, which consists of a context branch and an edge branch. In addition, we propose a low-frequency enhancement filter (LEF) to capture low-frequency semantics and suppress high-frequency noise. PE-YOLO adopts an end-to-end joint training approach and uses only the normal detection loss, simplifying the training process. We conduct experiments on the low-light object detection dataset ExDark to demonstrate the effectiveness of our method. The results indicate that, compared with other dark detectors and low-light enhancement models, PE-YOLO achieves advanced results, reaching 78.0% mAP at 53.6 FPS, and can adapt to object detection under different low-light conditions. The code is available at https://github.com/XiangchenYin/PE-YOLO.
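The four-component decomposition mentioned above is a standard Laplacian pyramid. A minimal numpy sketch, using 2x2 average pooling and nearest-neighbour upsampling as stand-ins for whatever blur and resampling kernels PENet actually uses:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling (stand-in for Gaussian blur + decimation)
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    # Nearest-neighbour upsampling back to the target resolution
    out = img.repeat(2, axis=0).repeat(2, axis=1)
    return out[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=4):
    # Each level stores the detail lost by downsampling;
    # the final level stores the residual low-pass image.
    pyramid = []
    current = img
    for _ in range(levels - 1):
        low = downsample(current)
        pyramid.append(current - upsample(low, current.shape))
        current = low
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    # Invert the decomposition: upsample and add back each detail level
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current, detail.shape) + detail
    return current

img = np.random.rand(64, 64)
pyr = laplacian_pyramid(img, levels=4)
rec = reconstruct(pyr)
```

With these simple operators the decomposition is exactly invertible, which is what lets the enhancement modules operate per-level without losing information.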
DOI: 10.48550/arXiv.2307.10953 · Published: 2023-07-20 · Pages: 163-174
Citations: 1
Variational Autoencoders for Anomaly Detection in Respiratory Sounds
Michele Cozzatti, Federico Simonetta, S. Ntalampiras
This paper proposes a weakly-supervised machine-learning approach aimed at a tool that alerts patients to possible respiratory diseases. Various types of pathologies may affect the respiratory system, potentially leading to severe illness and, in certain cases, death. In general, effective prevention practices are considered major contributors to improving a patient's health condition. The proposed method strives to provide an easily accessible tool for the automatic diagnosis of respiratory diseases. Specifically, it leverages Variational Autoencoder architectures, permitting training pipelines of limited complexity and relatively small datasets. Importantly, it offers an accuracy of 57%, which is in line with existing strongly-supervised approaches.
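The underlying idea, scoring a recording by how poorly a model trained only on healthy data reconstructs it, can be sketched without a full VAE. Below, a linear autoencoder (PCA) on synthetic feature vectors stands in for the paper's Variational Autoencoder; the data, dimensions, and threshold are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "healthy" feature vectors living near a 2-D subspace of R^8
basis = rng.normal(size=(2, 8))
healthy = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 8))

# Fit a linear autoencoder (PCA) on healthy data only: no pathological
# examples are needed at training time, hence the weak supervision.
mean = healthy.mean(axis=0)
_, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
components = vt[:2]  # shared encoder/decoder weights

def anomaly_score(x):
    # Reconstruction error: large when x leaves the healthy manifold
    z = (x - mean) @ components.T       # encode
    recon = z @ components + mean       # decode
    return np.linalg.norm(x - recon, axis=-1)

# Threshold at the 99th percentile of healthy reconstruction errors
threshold = np.percentile(anomaly_score(healthy), 99)

anomalous = rng.normal(size=(50, 8)) * 3.0  # off-manifold samples
flags = anomaly_score(anomalous) > threshold
```

A VAE replaces the linear projection with a learned probabilistic encoder/decoder, but the scoring logic (reconstruction error vs. a threshold fit on healthy data) is the same.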
DOI: 10.48550/arXiv.2208.03326 · Published: 2022-08-05 · Pages: 333-345
Citations: 3
Deep Feature Learning for Medical Acoustics
Alessandro Poire, Federico Simonetta, S. Ntalampiras
The purpose of this paper is to compare different learnable frontends on medical acoustics tasks. A framework has been implemented to classify human respiratory sounds and heartbeats into two categories, i.e. healthy or affected by pathologies. After obtaining two suitable datasets, we classified the sounds using two state-of-the-art learnable frontends, LEAF and nnAudio, plus a non-learnable baseline frontend, i.e. Mel-filterbanks. The computed features are then fed into two different CNN models, namely VGG16 and EfficientNet. The frontends are carefully benchmarked in terms of the number of parameters, computational resources, and effectiveness. This work demonstrates how integrating learnable frontends into neural audio classification systems may improve performance, especially in the field of medical acoustics. However, such frameworks require even more data; consequently, they are useful when the amount of training data is large enough to support the feature-learning process.
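The non-learnable baseline, Mel-filterbanks, is easy to construct explicitly. A numpy sketch of a standard triangular Mel filterbank (the sample rate, FFT size, and band count below are illustrative, not necessarily the paper's settings):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=40, n_fft=512, sr=16000):
    # Triangular filters spaced evenly on the Mel scale: the fixed,
    # non-learnable frontend that LEAF and nnAudio are compared against.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):            # rising edge of the triangle
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):            # falling edge
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

fb = mel_filterbank()
# Apply to one power-spectrum frame to get 40 Mel-band energies
spectrum = np.abs(np.fft.rfft(np.random.randn(512))) ** 2
mel_energies = fb @ spectrum
```

A learnable frontend such as LEAF effectively replaces this fixed matrix (and the subsequent compression) with parameters trained jointly with the classifier.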
DOI: 10.48550/arXiv.2208.03084 · Published: 2022-08-05 · Pages: 39-50
Citations: 3
Time Series Forecasting Models Copy the Past: How to Mitigate
Chrysoula Kosma, Giannis Nikolentzos, Nancy R. Xu, M. Vazirgiannis
Time series forecasting is at the core of important application domains and poses significant challenges to machine learning algorithms. Recently, neural network architectures have been widely applied to the problem of time series forecasting. Most of these models are trained by minimizing a loss function that measures the predictions' deviation from the real values. Typical loss functions include mean squared error (MSE) and mean absolute error (MAE). In the presence of noise and uncertainty, neural network models tend to replicate the last observed value of the time series, thus limiting their applicability to real-world data. In this paper, we provide a formal definition of this problem and give examples of forecasts where it is observed. We also propose a regularization term that penalizes the replication of previously seen values. We evaluate the proposed regularization term on both synthetic and real-world datasets. Our results indicate that it mitigates the problem to some extent and gives rise to more robust models.
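One plausible shape for such a regularization term (the paper's exact formulation may differ) adds to the MSE a penalty that blows up when the forecast simply sits on the previously observed value:

```python
import numpy as np

def copy_penalized_loss(y_pred, y_true, y_prev, lam=0.1, eps=1e-3):
    # Standard MSE plus a term that grows as the forecast approaches a
    # pure replication of the last observed value y_prev.
    mse = np.mean((y_pred - y_true) ** 2)
    copy_penalty = np.mean(1.0 / (eps + np.abs(y_pred - y_prev)))
    return mse + lam * copy_penalty

y_true = np.array([1.0, 2.0, 3.0])
y_prev = np.array([0.9, 1.1, 2.1])  # last observed values

copying = copy_penalized_loss(y_prev, y_true, y_prev)  # forecast = copy of the past
honest = copy_penalized_loss(y_true, y_true, y_prev)   # accurate forecast
```

Under this loss the copy-the-past forecast is heavily penalized even when its MSE is modest, which is the behavior the regularizer is meant to discourage.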
DOI: 10.48550/arXiv.2207.13441 · Published: 2022-07-27 · Pages: 366-378
Citations: 2
Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling
Yuhei Noda, Shota Saito, Shinichi Shirakawa
Neural architecture search (NAS) aims to automate architecture design processes and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures with low computational resources. Although ordinary NAS methods incur tremendous computational costs owing to repeated model training, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search process, has been reported to have a lower search cost. This study focuses on architecture-complexity-aware one-shot NAS, which optimizes an objective function composed of the weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search process must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims at reducing the search cost associated with finding multiple architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using the samples obtained from all distributions via importance sampling. This allows us to obtain multiple architectures with different complexities in a single architecture search, thereby reducing the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, it finds multiple architectures with varying complexities while requiring less computational effort.
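The core trick, reusing one batch of samples to serve several search distributions via importance weights w(x) = p(x)/q(x), can be illustrated on a toy categorical search space (the scores and probabilities below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy search space: 5 candidate "architectures" with a quality score each
quality = np.array([0.2, 0.5, 0.9, 0.6, 0.3])

# Two search distributions, biased towards small / large architectures
p_small = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
p_large = np.array([0.05, 0.1, 0.15, 0.3, 0.4])
mixture = 0.5 * (p_small + p_large)

# Draw ONE shared batch from the mixture and reuse it for both
# distributions via importance weights w = p(x) / mixture(x).
idx = rng.choice(5, size=20000, p=mixture)
w_small = p_small[idx] / mixture[idx]
w_large = p_large[idx] / mixture[idx]

# Importance-weighted estimates of expected quality under each distribution;
# in the actual method, such estimates drive each distribution's update.
est_small = np.mean(w_small * quality[idx])
est_large = np.mean(w_large * quality[idx])
```

Both estimates come from a single sampling pass, which is the mechanism that lets one search produce architectures at several complexity levels.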
DOI: 10.48550/arXiv.2207.10334 · Published: 2022-07-21 · Pages: 607-619
Citations: 1
FedNet2Net: Saving Communication and Computations in Federated Learning with Model Growing
A. Kundu, J. JáJá
Federated learning (FL) is a recently developed area of machine learning in which the private data of a large number of distributed clients is used to develop a global model under the coordination of a central server, without explicitly exposing the data. The standard FL strategy has a number of significant bottlenecks, including large communication requirements and a high impact on the clients' resources. Several strategies described in the literature try to address these issues. In this paper, a novel scheme based on the notion of "model growing" is proposed. Initially, the server deploys a small model of low complexity, which is trained to capture the data complexity during the initial set of rounds. When the performance of such a model saturates, the server switches to a larger model with the help of function-preserving transformations. The model complexity increases as more data is processed by the clients, and the overall process continues until the desired performance is achieved. Therefore, the most complex model is broadcast only at the final stage of our approach, resulting in a substantial reduction in communication cost and client computational requirements. The proposed approach is tested extensively on three standard benchmarks and is shown to achieve substantial reductions in communication and client computation while achieving accuracy comparable to the current most effective strategies.
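The function-preserving switch to a larger model can be illustrated with a Net2Net-style widening step: duplicate a hidden unit and halve its outgoing weights, and the network computes exactly the same function with one extra unit. A numpy sketch (the paper's actual transformations may differ in detail):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def widen(w1, b1, w2, unit):
    # Net2WiderNet-style function-preserving widening: duplicate hidden
    # unit `unit`, then split its outgoing weights 50/50 between the
    # original and the copy so the output is unchanged.
    w1_new = np.vstack([w1, w1[unit:unit + 1]])
    b1_new = np.append(b1, b1[unit])
    w2_new = np.hstack([w2, w2[:, unit:unit + 1]])
    w2_new[:, unit] *= 0.5
    w2_new[:, -1] *= 0.5
    return w1_new, b1_new, w2_new

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)  # 3 -> 4 hidden units
w2 = rng.normal(size=(2, 4))                          # 4 -> 2 outputs
x = rng.normal(size=(5, 3))

before = relu(x @ w1.T + b1) @ w2.T
w1w, b1w, w2w = widen(w1, b1, w2, unit=2)
after = relu(x @ w1w.T + b1w) @ w2w.T
```

Because `before` and `after` are identical, the grown model can resume federated training from the small model's state without any loss of accumulated progress.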
DOI: 10.48550/arXiv.2207.09568 · Published: 2022-07-19 · Pages: 236-247
Citations: 1
Boosting Video Super Resolution with Patch-Based Temporal Redundancy Optimization
Yuhao Huang, Hang Dong, Jin-shan Pan, Chao Zhu, Yu Guo, Ding Liu, L. Fu, Fei Wang
The success of existing video super-resolution (VSR) algorithms stems mainly from exploiting the temporal information of neighboring frames. However, none of these methods discuss the influence of temporal redundancy in patches containing stationary objects and background, and they usually use all the information in adjacent frames without any discrimination. In this paper, we observe that temporal redundancy brings adverse effects to information propagation, which limits the performance of most existing VSR methods. Motivated by this observation, we aim to improve existing VSR algorithms by handling temporally redundant patches in an optimized manner. We develop two simple yet effective plug-and-play methods to improve the performance of existing local and non-local propagation-based VSR algorithms on widely used public videos. To evaluate the robustness and performance of existing VSR algorithms more comprehensively, we also collect a new dataset containing a variety of public videos as a testing set. Extensive evaluations show that the proposed methods can significantly improve the performance of existing VSR methods on videos collected from wild scenarios while maintaining their performance on existing commonly used datasets. The code is available at https://github.com/HYHsimon/Boosted-VSR.
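A minimal way to locate temporally redundant patches is to threshold the per-patch change between consecutive frames. The sketch below (patch size and threshold are arbitrary, and the paper's actual criterion may be more elaborate) marks near-static patches:

```python
import numpy as np

def redundant_patch_mask(prev, curr, patch=8, thresh=0.01):
    # Mark patches that barely change between frames (stationary objects
    # and background), so propagation can treat them separately from
    # genuinely moving content.
    h, w = curr.shape
    diff = np.abs(curr - prev)
    ph, pw = h // patch, w // patch
    blocks = diff[:ph * patch, :pw * patch].reshape(ph, patch, pw, patch)
    return blocks.mean(axis=(1, 3)) < thresh  # (ph, pw) boolean mask

prev = np.zeros((32, 32))
curr = np.zeros((32, 32))
curr[:8, :8] += 0.5  # only the top-left patch changes
mask = redundant_patch_mask(prev, curr)
```

Here every patch except the top-left one is flagged as redundant; a VSR pipeline could skip or down-weight those patches during temporal propagation.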
DOI: 10.48550/arXiv.2207.08674 · Published: 2022-07-18 · Pages: 362-375
Citations: 1
Learning Flexible Translation between Robot Actions and Language Descriptions
Ozan Özdemir, Matthias Kerzel, C. Weber, Jae Hee Lee, S. Wermter
DOI: 10.1007/978-3-031-15931-2_21 · Published: 2022-07-15 · Pages: 246-257
Citations: 2
Topic-Grained Text Representation-based Model for Document Retrieval
Mengxue Du, Shasha Li, Jie Yu, Jun Ma, Bing Ji, Huijun Liu, Wuhang Lin, Zibo Yi
Document retrieval enables users to find their required documents accurately and quickly. To satisfy the requirement of retrieval efficiency, prevalent deep neural methods adopt a representation-based matching paradigm, which saves online matching time by pre-storing document representations offline. However, this paradigm consumes vast local storage space, especially when storing the document as word-grained representations. To tackle this, we present TGTR, a Topic-Grained Text Representation-based Model for document retrieval. Following the representation-based matching paradigm, TGTR stores the document representations offline to ensure retrieval efficiency, while significantly reducing storage requirements by using novel topic-grained representations rather than traditional word-grained ones. Experimental results demonstrate that, compared to word-grained baselines, TGTR is consistently competitive on TREC CAR and MS MARCO in terms of retrieval accuracy while requiring less than 1/10 of their storage space. Moreover, TGTR overwhelmingly surpasses global-grained baselines in terms of retrieval accuracy.
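How topic-grained storage saves space can be sketched with a toy stand-in: cluster a document's word vectors into a few topic vectors and store only those. The k-means below is purely illustrative and is not TGTR's actual topic-assignment mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

# A document as 200 word-grained vectors of dimension 64
words = rng.normal(size=(200, 64))
n_topics = 8

# Tiny k-means as an illustrative stand-in for deriving topic-grained
# representations from the word vectors.
centroids = words[rng.choice(len(words), n_topics, replace=False)]
for _ in range(10):
    d = np.linalg.norm(words[:, None, :] - centroids[None], axis=-1)
    assign = d.argmin(axis=1)
    for t in range(n_topics):
        if np.any(assign == t):
            centroids[t] = words[assign == t].mean(axis=0)

# Storage comparison: floats kept per document under each scheme
word_storage = words.size       # word-grained: 200 * 64
topic_storage = centroids.size  # topic-grained: 8 * 64
ratio = topic_storage / word_storage
```

Storing 8 topic vectors instead of 200 word vectors cuts per-document storage to 4% in this toy setting, which mirrors the "less than 1/10 of the storage space" result reported above.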
{"title":"Topic-Grained Text Representation-based Model for Document Retrieval","authors":"Mengxue Du, Shasha Li, Jie Yu, Jun Ma, Bing Ji, Huijun Liu, Wuhang Lin, Zibo Yi","doi":"10.48550/arXiv.2207.04656","DOIUrl":"https://doi.org/10.48550/arXiv.2207.04656","url":null,"abstract":"Document retrieval enables users to find their required documents accurately and quickly. To satisfy the requirement of retrieval efficiency, prevalent deep neural methods adopt a representation-based matching paradigm, which saves online matching time by pre-storing document representations offline. However, the above paradigm consumes vast local storage space, especially when storing the document as word-grained representations. To tackle this, we present TGTR, a Topic-Grained Text Representation-based Model for document retrieval. Following the representation-based matching paradigm, TGTR stores the document representations offline to ensure retrieval efficiency, whereas it significantly reduces the storage requirements by using novel topicgrained representations rather than traditional word-grained. Experimental results demonstrate that compared to word-grained baselines, TGTR is consistently competitive with them on TREC CAR and MS MARCO in terms of retrieval accuracy, but it requires less than 1/10 of the storage space required by them. Moreover, TGTR overwhelmingly surpasses global-grained baselines in terms of retrieval accuracy.","PeriodicalId":93416,"journal":{"name":"Artificial neural networks, ICANN : international conference ... proceedings. International Conference on Artificial Neural Networks (European Neural Network Society)","volume":"81 1","pages":"776-788"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79194569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2