
PeerJ preprints: latest articles

Improving the quality of low SNR images using high SNR images
Pub Date : 2019-06-14 DOI: 10.7287/PEERJ.PREPRINTS.27800V1
Yaohua Xie
It is important to obtain data with a signal-to-noise ratio (SNR) as high as possible. Compared with other techniques, filtering methods are fast, but they do not make full use of the characteristics of the sample structure that are reflected in relevant high-SNR images. In this study, we propose a technique termed “TransFiltering”. It transplants the characteristics of a high-SNR image to the frequency spectrum of a low-SNR image by filtering. The high-SNR and low-SNR images should usually have a similar structural pattern; for example, they may come from the same image sequence. In the proposed method, the Fourier transform is first applied to both images. Then, the frequency spectrum of the low-SNR image is filtered according to that of the high-SNR image. Finally, the inverse Fourier transform is applied to obtain an image with improved SNR. Experimental results show that the proposed method is both effective and efficient.
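The three steps in this abstract (transform both inputs, filter the low-SNR spectrum using the high-SNR spectrum, transform back) can be sketched roughly as follows. This is only an illustration under stated assumptions: it works on a 1-D signal rather than an image, it uses a naive DFT from the standard library, and the masking rule (`keep_fraction`) is a hypothetical choice, since the abstract does not say how the filter is derived.

```python
import cmath
import math
import random


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a complex sequence."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def transfilter(low_snr, high_snr, keep_fraction=0.1):
    """Filter the spectrum of low_snr with a mask derived from high_snr.

    ASSUMPTION: the mask keeps only the frequency bins whose magnitude in
    the high-SNR spectrum is at least keep_fraction of that spectrum's peak.
    The abstract does not specify the actual filter.
    """
    low_spec = dft(low_snr)
    high_spec = dft(high_snr)
    threshold = keep_fraction * max(abs(v) for v in high_spec)
    filtered = [
        lo if abs(hi) >= threshold else 0.0
        for lo, hi in zip(low_spec, high_spec)
    ]
    # The inputs are real, so the imaginary parts are numerical noise only.
    return [v.real for v in idft(filtered)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Demo: a clean sinusoid stands in for the high-SNR data,
# and the same sinusoid plus Gaussian noise for the low-SNR data.
rng = random.Random(0)
n = 64
clean = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
noisy = [c + rng.gauss(0.0, 0.5) for c in clean]
restored = transfilter(noisy, clean)
print("MSE before:", mse(noisy, clean), "MSE after:", mse(restored, clean))
```

Because the mask here is a hard 0/1 selection of frequency bins, the filtering step is an orthogonal projection, so the error against the clean signal cannot grow. A real implementation would more likely use a 2-D FFT and possibly a soft weighting.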
Citations: 1
Streaming stochastic variational Bayes; An improved approach for Bayesian inference with data streams
Pub Date : 2019-06-10 DOI: 10.7287/peerj.preprints.27790v1
Nadheesh Jihan, Malith Jayasinghe, S. Perera
Online learning is an essential tool for predictive analysis based on continuous, endless data streams. Adopting Bayesian inference in online settings allows hierarchical modeling while representing the uncertainty of model parameters. Existing online inference techniques are motivated by either traditional Bayesian updating or stochastic optimization. However, traditional Bayesian updating suffers from overconfident posteriors, where the posterior variance becomes too small to adapt to new changes in the posterior. On the other hand, stochastic optimization of the variational objective demands exhaustive additional analysis to optimize a hyperparameter that controls the posterior variance. In this paper, we present "Streaming Stochastic Variational Bayes" (SSVB), a novel online approximate inference framework for data streams that addresses the aforementioned shortcomings of the current state of the art. SSVB adjusts its posterior variance appropriately without any user-specified hyperparameters while efficiently accommodating drifting patterns in the posteriors. Moreover, SSVB can be easily adopted by practitioners for a wide range of models (from simple regression models to complex hierarchical models) with little additional analysis. We appraised the performance of SSVB against Population Variational Inference (PVI), Stochastic Variational Inference (SVI) and Black-box Streaming Variational Bayes (BB-SVB) using two non-conjugate probabilistic models: multinomial logistic regression and a linear mixed-effects model. Furthermore, we also discuss the significant accuracy gain of SSVB-based inference over conventional online learning models for each task.
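The overconfidence problem the abstract attributes to traditional Bayesian updating can be seen in a minimal conjugate example. The sketch below is not SSVB; it is plain sequential updating of a Gaussian mean with known noise variance, and the stream, prior and function names are assumptions chosen for illustration.

```python
def gaussian_update(mean, var, x, noise_var=1.0):
    """One conjugate Bayesian update of a Gaussian belief about a mean,
    given one observation x with known observation noise variance."""
    precision = 1.0 / var + 1.0 / noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + x / noise_var)
    return new_mean, new_var


def run_stream(stream, mean=0.0, var=10.0):
    """Apply gaussian_update to every observation; yesterday's posterior
    becomes today's prior."""
    for x in stream:
        mean, var = gaussian_update(mean, var, x)
    return mean, var


# Hypothetical drifting stream: 500 observations at 0, then the true
# value jumps to 5 for the next 20 observations.
stream = [0.0] * 500 + [5.0] * 20
mean, var = run_stream(stream)
print(mean, var)  # mean is about 0.19 and var about 0.0019
```

After the drift, the posterior mean has barely moved towards 5 and the posterior variance is tiny, because every past observation keeps shrinking the variance. This is the behaviour that methods controlling the posterior variance aim to avoid.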
Citations: 1
Machine learning approach for automated defense against network intrusions
Pub Date : 2019-06-03 DOI: 10.7287/PEERJ.PREPRINTS.27777V1
Farhaan Noor Hamdani, Farheen Siddiqui
With the advent of the internet, there is major concern about the growing number of attacks, in which an attacker can target any computing or network resource remotely. The exponential shift towards smart-end devices also raises various security concerns, including the detection of anomalous data traffic on the internet. Separating legitimate traffic from malicious traffic is itself a complex task. Many attacks affect system resources, thereby degrading their computing performance. In this paper we propose a framework of supervised models, implemented using machine learning algorithms, which can enhance or aid existing intrusion detection systems in detecting a variety of attacks. The KDD (knowledge data and discovery) dataset is used as a benchmark. With regard to their detection abilities, we also analyze the algorithms' performance, accuracy, and alert logs, and compute their overall detection rate. These machine learning algorithms are validated and tested in terms of accuracy, precision, and true and false positives and negatives. Experimental results show that these methods are effective, generate few false positives, and can be used to build a line of defense against network intrusions. Further, we compare these algorithms in terms of various functional parameters.
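The evaluation quantities named in this abstract (accuracy, precision, detection rate, false positives) all follow from a confusion matrix. A minimal sketch, assuming binary labels where 1 means attack and 0 means normal traffic; the function name and the toy labels are illustrative and not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Confusion-matrix metrics for a binary intrusion detector.

    Labels: 1 = attack, 0 = normal traffic.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, passed
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,  # recall
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Toy example with 4 attacks and 4 normal connections.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

On heavily imbalanced traffic, accuracy alone can look good while the detection rate is poor, which is why the rates are reported separately.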
Citations: 1
A new algorithm for band detection and pattern extraction on pulsed-field gel electrophoresis images
Pub Date : 2019-06-02 DOI: 10.7287/PEERJ.PREPRINTS.27771V1
Mohammad Rezaei, Naser Zohorian, Nemat Soltani, P. Mohajeri
This paper presents a new approach to band detection and pattern recognition for molecule types. Although a few studies have examined band detection, there is still no automatic method that performs well under high noise. The band detection algorithm was designed in two parts: band location and lane pattern recognition. To improve band detection and remove undesirable bands, the shape and light intensity of the bands were used as features. One hundred lane images were randomly selected for the training stage and 350 lane images for the testing stage to evaluate the proposed algorithm. All the images were prepared using PFGE BIORAD at the Microbiology Laboratory of Kermanshah University of Medical Sciences. An adaptive median filter with a filter size of 5x5 was selected as the optimal filter for removing noise. The results showed that the proposed algorithm has an accuracy of 98.45% and produces fewer errors than other methods. The proposed algorithm has good accuracy for band detection in pulsed-field gel electrophoresis images. By considering the shape of the peaks caused by the bands in the vertical projection profile of the signal, this method can reduce band detection errors. To improve accuracy, we recommend that the designed algorithm be examined for other types of molecules as well.
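The idea of locating bands from peaks in a projection profile after median filtering can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes a lane is a 2-D list of intensities with bands as bright horizontal stripes, it replaces the 5x5 adaptive median filter with a plain 1-D median of window 5 on the profile, and `min_height` is a hypothetical threshold.

```python
from statistics import median


def row_profile(lane):
    """Mean intensity of each row, i.e. a profile along the lane's length."""
    return [sum(row) / len(row) for row in lane]


def median_smooth(values, window=5):
    """Plain 1-D median filter (a simplification of the adaptive 5x5 filter)."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def find_bands(profile, min_height):
    """Indices where the profile rises to a local maximum above min_height.
    A flat-topped peak is reported once, at its first row."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] >= min_height
        and profile[i] > profile[i - 1]
        and profile[i] >= profile[i + 1]
    ]


# Synthetic lane, 30 rows by 4 columns: two bands (rows 5-7 and 20-22)
# on a dark background, plus a one-row noise spike at row 13.
rows = [10] * 30
rows[5:8] = [200] * 3
rows[20:23] = [150] * 3
rows[13] = 255
lane = [[v] * 4 for v in rows]

raw = row_profile(lane)
print(find_bands(raw, min_height=100))                 # [5, 13, 20]
print(find_bands(median_smooth(raw), min_height=100))  # [5, 20]
```

The one-row spike is detected as a false band in the raw profile and disappears after median smoothing, while the three-row bands survive.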
Citations: 0
A research institution framework for publishing open code to enable reproducible science
Pub Date : 2019-05-28 DOI: 10.17608/K6.AUCKLAND.9789461.V1
T. Etherington, B. Jolly, Jan Zörner, Nicholas K. Spencer
Reproducible science is greatly aided by open publishing of scientific computer code. Encouraging the publication of scientific code has many institutional benefits, but there are also institutional considerations around intellectual property and risk. We discuss questions around scientific code publishing from the perspective of a research organisation asking: who will be involved, how should code be licensed, where should code be published, how to get credit, what standards, and what costs? In reviewing advice and evidence relevant to these questions, we propose a research institution framework for publishing open scientific code to enable reproducible science.
Citations: 1
A Sparse-Modeling based approach for Class-Specific feature selection
Pub Date : 2019-05-17 DOI: 10.7287/peerj.preprints.27740v1
Davide Nardone, A. Ciaramella, A. Staiano
In this work, we propose a novel feature selection framework, called Sparse-Modeling Based Approach for Class-Specific Feature Selection (SMBA-CSFS), that simultaneously exploits the ideas of sparse modeling and class-specific feature selection. Feature selection plays a key role in several fields (e.g., computational biology): it makes it possible to work with models that have fewer variables, which in turn are easier to explain because they provide valuable insights into the importance of each variable, and it might speed up experimental validation. Unfortunately, as also corroborated by the no free lunch theorems, no approach in the literature is the most apt to detect the optimal feature subset for building a final model, so this remains a challenge. The proposed feature selection procedure follows a two-step approach: (a) a sparse modeling-based learning technique is first used to find the best subset of features for each class of a training set; (b) the discovered feature subsets are then fed to a class-specific feature selection scheme in order to assess the effectiveness of the selected features in classification tasks. To this end, an ensemble of classifiers is built, where each classifier is trained on its own feature subset discovered in the previous phase, and a proper decision rule is adopted to compute the ensemble responses. To evaluate the performance of the proposed method, extensive experiments have been performed on publicly available datasets, in particular datasets from the computational biology field, where feature selection is indispensable: acute lymphoblastic leukemia and acute myeloid leukemia, human carcinomas, human lung carcinomas, diffuse large B-cell lymphoma, and malignant glioma. SMBA-CSFS is able to identify and retrieve the most representative features that maximize the classification accuracy. With the top 20 and top 80 features, SMBA-CSFS exhibits promising performance compared with its competitors from the literature on all considered datasets, especially those with a higher number of features. Experiments show that the proposed approach might outperform state-of-the-art methods when the number of features is high. For this reason, the introduced approach is suited to the selection and classification of data with a large number of features and classes.
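Step (b) of this abstract, an ensemble in which each classifier sees only its own class-specific feature subset, can be illustrated with a toy sketch. Everything specific here is an assumption: the feature subsets are given by hand rather than found by sparse modeling, each per-class classifier is a simple one-vs-rest nearest-centroid scorer, and the decision rule picks the class with the largest margin. None of these choices is taken from the paper.

```python
import math


def centroid(rows, features):
    """Mean of the selected feature columns over the given rows."""
    return [sum(r[f] for r in rows) / len(rows) for f in features]


def fit_class_specific(X, y, subsets):
    """Train one one-vs-rest scorer per class on that class's feature subset.

    subsets maps each class label to a list of feature indices
    (hypothetical here; in the abstract they come from step (a)).
    """
    model = {}
    for label, feats in subsets.items():
        pos = [x for x, t in zip(X, y) if t == label]
        neg = [x for x, t in zip(X, y) if t != label]
        model[label] = (feats, centroid(pos, feats), centroid(neg, feats))
    return model


def predict(model, x):
    """Assumed decision rule: the class whose scorer reports the largest
    margin (distance to 'rest' centroid minus distance to own centroid)."""
    def score(label):
        feats, c_pos, c_neg = model[label]
        v = [x[f] for f in feats]
        return math.dist(v, c_neg) - math.dist(v, c_pos)
    return max(model, key=score)


# Toy data: class "a" is marked by feature 0, "b" by feature 1, "c" by feature 2.
X = [[5, 0, 0], [6, 1, 0], [0, 5, 1], [1, 6, 0], [0, 1, 5], [1, 0, 6]]
y = ["a", "a", "b", "b", "c", "c"]
subsets = {"a": [0], "b": [1], "c": [2]}

model = fit_class_specific(X, y, subsets)
print(predict(model, [5.5, 0.5, 0.2]))  # a
print(predict(model, [0.2, 0.4, 5.8]))  # c
```

The point of the structure is that each class's scorer ignores features that are irrelevant to that class, so the ensemble can use different evidence for different classes.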
Citations: 4
Sales forecasting using multivariate long short term memory network models
Pub Date : 2019-05-08 DOI: 10.7287/peerj.preprints.27712v1
Suleka Helmini, Nadheesh Jihan, Malith Jayasinghe, S. Perera
In the retail domain, estimating sales before the actual figures become known plays a key role in maintaining a successful business, because most crucial decisions are bound to be based on these forecasts. Statistical sales forecasting models such as ARIMA (Auto-Regressive Integrated Moving Average) are among the most traditional and commonly used forecasting methodologies. Although these models can produce satisfactory forecasts for linear time series data, they are not suitable for analyzing non-linear data. Therefore, machine learning models (such as Random Forest Regression and XGBoost) have been employed frequently, as they achieve better results on non-linear data. Recent research shows that deep learning models (e.g., recurrent neural networks) can provide higher prediction accuracy than machine learning models because of their ability to persist information and identify temporal relationships. In this paper, we adopt a special variant of the Long Short Term Memory (LSTM) network, the LSTM model with peephole connections, for sales prediction. We first build our model using historical features for sales forecasting. We compare the results of this initial LSTM model with multiple machine learning models, namely the Extreme Gradient Boosting model (XGB) and the Random Forest Regressor model (RFR). We further improve the prediction accuracy of the initial model by incorporating features that describe the future and are already known at the current moment, an approach that has not been explored in previous state-of-the-art LSTM-based forecasting models. The initial LSTM model we develop outperforms the machine learning models by 12% - 14%, whereas the improved LSTM model achieves an 11% - 13% improvement over the improved machine learning models. Furthermore, we also show that our improved LSTM model obtains a 20% - 21% improvement over the initial LSTM model.
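For readers unfamiliar with the LSTM variant named in this abstract, the sketch below shows one time step of an LSTM cell with peephole connections in its common textbook form, where the gates also read the cell state. It uses scalar weights for readability; the parameter names in the dictionary are illustrative, and the paper's actual architecture and features are not reproduced here.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def peephole_lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell with peephole connections.

    p is a dict of scalar weights (names are illustrative):
      w_* input weights, u_* recurrent weights, b_* biases,
      p_i / p_f / p_o peephole weights from the cell state to the gates.
    """
    # Input and forget gates peek at the previous cell state.
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["p_i"] * c_prev + p["b_i"])
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["p_f"] * c_prev + p["b_f"])
    # Candidate cell update (no peephole here).
    g = math.tanh(p["w_c"] * x + p["u_c"] * h_prev + p["b_c"])
    c = f * c_prev + i * g
    # Output gate peeks at the updated cell state.
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["p_o"] * c + p["b_o"])
    h = o * math.tanh(c)
    return h, c


# With all weights at zero every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
names = ["w_i", "u_i", "p_i", "b_i", "w_f", "u_f", "p_f", "b_f",
         "w_c", "u_c", "b_c", "w_o", "u_o", "p_o", "b_o"]
zero_params = {name: 0.0 for name in names}
h, c = peephole_lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
print(h, c)
```

Setting the three peephole weights to zero recovers the standard LSTM step, which makes the difference between the two variants easy to see.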
Citations: 18
GIS-based seismic hazard prediction system for urban earthquake disaster prevention planning
Pub Date : 2019-05-07 DOI: 10.7287/peerj.preprints.3165v1
Y. Zhai, Shenglong Chen, Qianwen Ouyang
Seismic hazard prediction is of great significance for mitigating the damage caused by earthquakes in urban areas. In this study, a geographic information system (GIS)-based seismic hazard prediction system for urban earthquake disaster prevention planning is developed, incorporating structural vulnerability analysis, program development, and GIS. The system integrates proven building vulnerability analysis models with data search, spatial analysis, and plotting functions. It enables batch processing and automation of seismic hazard prediction and interactive visualization of the predicted results. Finally, the system is applied to a test area and the results are compared with those of previous studies; precision improved because the construction time of the buildings was taken into consideration. Moreover, the system is highly intelligent and requires minimal manual intervention. It meets the operating requirements of non-professionals and provides a feasible technique and operating procedure for large-scale urban seismic hazard prediction. Above all, the system can provide data support and aid decision-making for the establishment and implementation of urban earthquake disaster prevention planning.
Citations: 16
An architecture for context-aware reactive systems based on run-time semantic models
Pub Date : 2019-05-04 DOI: 10.7287/peerj.preprints.27702v1
Ester Giallonardo, Francesco Poggi, D. Rossi, E. Zimeo
In recent years, new classes of highly dynamic, complex systems have been gaining momentum. These systems are characterized by the need to express behaviors driven by external and/or internal changes, i.e., they are reactive and context-aware. These classes include, but are not limited to, IoT, smart cities, cyber-physical systems, and sensor networks. An important design feature of these systems should be the ability to adapt their behavior to environment changes. This requires handling a runtime representation of the context enriched with variation points that relate different behaviors to possible changes of the representation. In this paper, we present a reference architecture for reactive, context-aware systems able to handle contextual knowledge (that defines what the system perceives) by means of virtual sensors and able to react to environment changes by means of virtual actuators, both represented in a declarative manner through semantic web technologies. To improve the ability to react with a proper behavior to context changes (e.g., faults) that may influence the ability of the system to observe the environment, we allow the definition of logical sensors and actuators through an extension of the SSN ontology (a W3C standard). In our reference architecture, a knowledge base of sensors and actuators (hosted by an RDF triple store) is bound to the real world by grounding semantic elements to physical devices via REST APIs. The proposed architecture, together with the defined ontology, tries to address the main problems of dynamically reconfigurable systems by exploiting a declarative, queryable approach to enable runtime reconfiguration with the help of (a) semantics to support discovery in heterogeneous environments, (b) composition logic to define alternative behaviors for variation points, and (c) a bi-causal connection life-cycle to avoid dangling links with the external environment. The proposal is validated in a case study aimed at designing an edge node for smart buildings dedicated to cultural heritage preservation.
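To make the idea of a logical sensor with alternative behaviors more concrete, here is a minimal Python sketch. It is not the authors' architecture or ontology: it replaces the RDF/SSN representation with plain Python objects, and every name in it (`VirtualSensor`, `LogicalSensor`, `SensorFault`, the sensor names and the reading) is hypothetical. It only illustrates how a logical sensor might fall back to another virtual sensor when one of them faults.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class SensorFault(Exception):
    """Hypothetical error: a virtual sensor cannot reach its physical device."""


@dataclass
class VirtualSensor:
    name: str
    read_fn: Callable[[], float]  # stands in for a REST call to the device

    def read(self) -> float:
        return self.read_fn()


@dataclass
class LogicalSensor:
    """Composes alternative virtual sensors for one observed property."""
    name: str
    alternatives: List[VirtualSensor] = field(default_factory=list)

    def read(self) -> Optional[float]:
        # Try each alternative in order; skip the ones that fault.
        for sensor in self.alternatives:
            try:
                return sensor.read()
            except SensorFault:
                continue
        return None  # no alternative could observe the property


def broken_probe() -> float:
    raise SensorFault("device unreachable")


# Assumed example wiring: the first sensor is faulty, the second one answers.
room_temperature = LogicalSensor(
    "room_temperature",
    [
        VirtualSensor("wall_probe", broken_probe),
        VirtualSensor("hvac_inlet", lambda: 21.5),
    ],
)
```

Calling `room_temperature.read()` skips the faulty `wall_probe` and returns the reading supplied by `hvac_inlet`. In the architecture described above, the list of alternatives would presumably be obtained by querying the knowledge base rather than being hard-coded.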
Citations: 1
GIS analysis of geological surfaces orientations: the qgSurf plugin for QGIS
Pub Date : 2019-04-30 DOI: 10.7287/peerj.preprints.27694v1
M. Alberti
GIS techniques enable the quantitative analysis of geological structures. In particular, topographic traces of geological lineaments can be compared with the theoretical traces of geological planes to determine the best-fitting theoretical planes. qgSurf, a Python plugin for QGIS, implements this kind of processing, as well as the determination of the best-fit plane to a set of topographic points, the calculation of the distances between topographic traces and geological planes, and basic stereonet plotting. By applying these tools to a case study of a Cenozoic thrust lineament in the Southern Apennines (Calabria, Southern Italy), we deduce the approximate orientations of the lineament in different fault-delimited sectors and calculate the misfits between the theoretical orientations and the actual topographic traces.
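As an illustration of what fitting a plane to a set of topographic points involves, the following is a minimal pure-Python sketch. It is not qgSurf's code, and the abstract does not say which fitting method the plugin uses: this version assumes an ordinary least-squares fit of z = a·x + b·y + c, with x pointing east and y pointing north. The function names `fit_plane` and `plane_orientation` are hypothetical.

```python
import math


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points."""
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are required")
    # Centre the coordinates to keep the sums numerically well behaved.
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    mz = sum(p[2] for p in points) / n
    sxx = sxy = syy = sxz = syz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - mx, y - my, z - mz
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        sxz += dx * dz
        syz += dy * dz
    # Solve the 2x2 normal equations for the two slopes.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("points are collinear in map view")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    c = mz - a * mx - b * my
    return a, b, c


def plane_orientation(a, b):
    """Return (dip direction, dip angle) in degrees for slopes a (east), b (north)."""
    dip_angle = math.degrees(math.atan(math.hypot(a, b)))
    # The plane descends fastest along (-a, -b); convert to an azimuth from north.
    # For a horizontal plane (a == b == 0) the dip direction is not meaningful.
    dip_direction = math.degrees(math.atan2(-a, -b)) % 360.0
    return dip_direction, dip_angle
```

For example, four points lying on a plane that drops one metre for every metre travelled east, such as `(0, 0, 100)`, `(10, 0, 90)`, `(0, 10, 100)` and `(10, 10, 90)`, yield a dip direction of about 90° and a dip angle of about 45°. A plain least-squares fit in z becomes unreliable for near-vertical planes, which is one reason a production tool might choose a different formulation.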
Citations: 2