
Sixth International Conference on Machine Learning and Applications (ICMLA 2007): Latest Papers

Predicting Binding Sites in the Mouse Genome
Yi Sun, M. Robinson, R. Adams, N. Davey, A. Rust
The identification of cis-regulatory binding sites in DNA in multicellular eukaryotes is a particularly difficult problem in computational biology. To obtain a full understanding of the complex machinery embodied in genetic regulatory networks, it is necessary to know both the identity of the regulatory transcription factors and the location of their binding sites in the genome. We show that using an SVM together with data sampling to integrate the results of individual algorithms specialised for the prediction of binding site locations can produce significant improvements upon the original algorithms when applied to the mouse genome. These results make more tractable the expensive experimental procedure of actually verifying the predictions.
DOI: https://doi.org/10.1109/ICMLA.2007.28 | Published: 2007-12-13
Citations: 3
Recognition of ultrasonic multi-echo sequences for autonomous symbolic indoor tracking
André Stuhlsatz
This paper presents an autonomous symbolic indoor tracking system for ubiquitous computing applications. The proposed approach is based upon the assumption that topologically discriminable information can be assigned explicitly to different spaces of a given indoor environment. On that assumption, continuous time-of-flight (ToF) measurements of echo-bursts obtained from four orthogonally and coplanarly mounted ultrasonic transducers are used to learn a stochastic room model. While the individual acoustic representation of space is captured using Gaussian mixture densities, the stochastic variabilities in the moving direction of a person are modeled by hidden Markov models (HMMs). Experiments within a six-room environment resulted in a room recognition rate of 92.21% and a room sequence recognition rate of 66.00% without any pre-fixed devices.
DOI: https://doi.org/10.1109/ICMLA.2007.30 | Published: 2007-12-13
Citations: 11
Web based machine learning for language identification and translation
Ş. Sağiroğlu, U. Yavanoglu, Esra Nergis Guven
Language identification is an important task for Web information retrieval services. This paper presents the implementation of a platform for language identification in multi-lingual documents on the Web. The platform consists of five modules that carry out the tasks automatically. Furthermore, artificial neural networks (ANNs) were used for the identification of languages in multi-lingual documents. Results for six languages, including Turkish, French, Italian, Danish and German, are presented. The major benefit of the approach is that the ANN-based language identification system could meet the expectations for real-time language identification accuracy with the help of the developed system. Experiments have shown that the system achieves the tasks with high accuracy in discriminating different languages and converting them to other languages on Web pages.
DOI: https://doi.org/10.1109/ICMLA.2007.27 | Published: 2007-12-13
Citations: 17
An Itemset-Driven Cluster-Oriented Approach to Extract Compact and Meaningful Sets of Association Rules
C. H. Yamamoto, Maria Cristina Ferreira de Oliveira, M. L. Fujimoto, S. O. Rezende
Extracting association rules from large datasets typically results in a huge number of rules. An approach to tackle this problem is to filter the resulting rule set, which reduces the number of rules at the cost of also eliminating potentially interesting ones. In exploring a new dataset in search of relevant associations, it may be more useful for miners to have an overview of the space of rules obtainable from the dataset, rather than getting an arbitrary set satisfying high values for given interest measures. We describe a rule extraction approach that favors rule diversity, allowing miners to gain an overview of the rule space while reducing semantic redundancy within the rule set. This approach adopts an itemset-driven rule generation coupled with a cluster-based filtering process. The set of rules so obtained provides a starting point for user-driven exploration of the rule space.
DOI: https://doi.org/10.1109/ICMLA.2007.45 | Published: 2007-12-13
Citations: 3
Predicting building contamination using machine learning
Shawn Martin, S. McKenna
Potential events involving biological or chemical contamination of buildings are of major concern in the area of homeland security. Tools are needed to provide rapid, on-site predictions of contaminant levels given only approximate measurements in limited locations throughout a building. In principle, such tools could use calculations based on physical process models to provide accurate predictions. In practice, however, physical process models are too complex and computationally costly to be used in a real-time scenario. In this paper, we investigate the feasibility of using machine learning to provide easily computed but approximate models that would be applicable in the field. We develop a machine learning method based on support vector machine regression and classification. We apply our method to problems of estimating contamination levels and contaminant source location.
DOI: https://doi.org/10.1109/ICMLA.2007.12 | Published: 2007-12-13
Citations: 1
Maximum Likelihood Quantization of Genomic Features Using Dynamic Programming
Mingzhou Song, R. Haralick, S. Boissinot
Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. The quantization, through dynamic programming, finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood for the observed data. This algorithm is highly applicable to the study of genomic features such as the recombination rate across the chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are themselves grouped by length using quantization. The exact and density-preserving quantization approach provides an alternative superior to the inexact and distance-based k-means clustering algorithm for discretization of a single variable.
DOI: https://doi.org/10.1109/ICMLA.2007.36 | Published: 2007-12-13
Citations: 1
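The maximum-likelihood quantization idea in the Song, Haralick and Boissinot abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it assumes a piecewise-constant (histogram) density, a fixed number of bins `k`, and candidate bin edges at the midpoints between neighbouring distinct sample values. The function name `ml_quantize` and all of these modelling choices are assumptions made for the example.

```python
import math
from collections import Counter


def ml_quantize(samples, k):
    """Split 1-D samples into k bins maximizing a histogram log-likelihood.

    Hypothetical sketch: returns (bin_edges, log_likelihood).
    """
    counts = Counter(samples)
    values = sorted(counts)
    n = len(values)
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need >= 2 distinct values and 1 <= k <= distinct values")
    total = len(samples)

    # Assumed candidate edges: data range plus midpoints between neighbours.
    edges = [values[0]]
    edges += [(a + b) / 2 for a, b in zip(values, values[1:])]
    edges.append(values[-1])

    # prefix[j] = number of samples among the first j distinct values.
    prefix = [0]
    for v in values:
        prefix.append(prefix[-1] + counts[v])

    def loglik(i, j):
        # Log-likelihood of one bin holding distinct values i..j-1,
        # with constant density count / (total * width).
        c = prefix[j] - prefix[i]
        width = edges[j] - edges[i]
        return c * math.log(c / (total * width))

    neg_inf = float("-inf")
    # best[m][j]: best score for the first j distinct values using m bins.
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                if best[m - 1][i] == neg_inf:
                    continue
                score = best[m - 1][i] + loglik(i, j)
                if score > best[m][j]:
                    best[m][j] = score
                    back[m][j] = i

    # Walk the back-pointers to recover the chosen cut positions.
    cuts = [n]
    for m in range(k, 0, -1):
        cuts.append(back[m][cuts[-1]])
    cuts.reverse()
    return [edges[c] for c in cuts], best[k][n]
```

Calling `ml_quantize(data, 3)` returns four bin edges and the attained log-likelihood. Unlike k-means, the objective here is a likelihood rather than a sum of squared distances, so it can favour narrow bins around tightly packed values; the cost of this sketch is O(k n^2) in the number of distinct values n.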
Feature extraction using random matrix theory approach
V. Rojkova, M. Kantardzic
Feature extraction involves simplifying the amount of resources required to describe a large set of data accurately. In this paper, we propose to broaden the feature extraction algorithms with Random Matrix Theory methodology. Testing the cross-correlation matrix of variables against the null hypothesis of random correlations, we can derive characteristic parameters of the system, such as boundaries of eigenvalue spectra of random correlations, distribution of eigenvalues and eigenvectors of random correlations, inverse participation ratio and stability of eigenvectors of non-random correlations. We demonstrate the usefulness of these parameters for network traffic applications, in particular for network congestion control and for detection of any changes in the stable traffic dynamics.
DOI: https://doi.org/10.1109/ICMLA.2007.95 | Published: 2007-12-13
Citations: 15
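Two of the quantities named in the Rojkova and Kantardzic abstract above, the eigenvalue bounds for random correlations and the inverse participation ratio, can be illustrated with a short sketch. It assumes the null model is the Marchenko-Pastur law for uncorrelated, unit-variance variables (a common choice in random matrix theory, but not confirmed by the abstract), and it takes eigenvalues as given rather than computing them. All function names are hypothetical.

```python
import math


def marchenko_pastur_bounds(n_vars, n_obs):
    """Eigenvalue range expected for a correlation matrix of pure noise.

    Assumes n_vars uncorrelated unit-variance series of length n_obs,
    in the limit where both are large.
    """
    if n_vars < 1 or n_obs < n_vars:
        raise ValueError("need 1 <= n_vars <= n_obs")
    ratio = n_vars / n_obs
    lower = (1 - math.sqrt(ratio)) ** 2
    upper = (1 + math.sqrt(ratio)) ** 2
    return lower, upper


def inverse_participation_ratio(vector):
    """Sum of fourth powers of the normalized components.

    About 1/len(vector) when all components contribute equally,
    close to 1 when a single component dominates.
    """
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0:
        raise ValueError("zero vector has no participation ratio")
    return sum(x ** 4 for x in vector) / norm_sq ** 2


def non_random_eigenvalues(eigenvalues, n_vars, n_obs):
    """Keep eigenvalues above the noise band, i.e. candidate real structure."""
    _, upper = marchenko_pastur_bounds(n_vars, n_obs)
    return [lam for lam in eigenvalues if lam > upper]
```

For 100 variables observed over 400 time steps the noise band is roughly 0.25 to 2.25, so an eigenvalue of 5.0 would be flagged as non-random while 2.0 would not.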
An agent based system for California electricity market: a perspective of myopic machine learning
T. Sueyoshi, G. R. Tadiparthi
In recent years, agent-based systems have been widely adopted to model deregulated electricity markets. [1] and [2] have developed a multi-agent intelligent simulator (MAIS) to model the structure of the US wholesale market. The methodological practicality was confirmed with a simulation study and a real data set from the PJM electricity market. In our proposed artificial wholesale market, the agents are equipped with limited reinforcement learning capabilities. We validate the agent-based model with the help of six data sets from the California electricity market. The performance of the MAIS is compared with other well-known methods, using a real data set on power trading related to the California electricity (2000-2001).
DOI: https://doi.org/10.1109/ICMLA.2007.83 | Published: 2007-12-13
Citations: 2
Generalized Sequence Signatures through Symbolic Clustering
D. Dorr, A. Denton
Traditionally, sequence motifs and domains, also called signatures, are defined such that insertions, deletions and mismatched regions are small compared with matched regions. We introduce an algorithm for the identification of generalized sequence signatures that can be composed of windows distributed throughout the sequence. We use an approach that is based on clustering analysis of recurring subsequences of a predefined length, which we refer to as symbols. Symbols are not required to be located in close proximity to each other. The clustering algorithm groups sequences so as to maximize the number of shared symbols among sequences. We evaluate our signatures in comparison to those obtained from the InterPro database, and show that our approach has benefits for deriving sequence annotations compared with InterPro's signatures.
DOI: https://doi.org/10.1109/ICMLA.2007.41 | Published: 2007-12-13
Citations: 3
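The notion of "symbols" in the Dorr and Denton abstract above (subsequences of a predefined length, shared between sequences regardless of position) can be illustrated with a small sketch. It covers only symbol extraction and a shared-symbol count that a clustering step could try to maximize; the clustering itself and the filter for recurring symbols are not reproduced here. The function names are hypothetical.

```python
def symbols(sequence, length):
    """Set of all contiguous subsequences of the given length."""
    if length < 1:
        raise ValueError("length must be positive")
    return {
        sequence[i:i + length]
        for i in range(len(sequence) - length + 1)
    }


def shared_symbol_count(seq_a, seq_b, length):
    """Number of distinct symbols two sequences have in common.

    Positions are ignored, so shared windows may lie far apart.
    """
    return len(symbols(seq_a, length) & symbols(seq_b, length))
```

For example, `shared_symbol_count("ABCDE", "XBCDY", 3)` counts the single common window `"BCD"`.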
A new time series prediction algorithm based on moving average of nth-order difference
Yang Lan, D. Neagu
As a typical research topic, time series analysis and prediction face a continuously rising interest and have been widely applied in various domains. Current approaches focus on large data collections, using mathematics, statistics and artificial intelligence methods to process them and make a prediction of the next most probable value. This paper proposes a new algorithm using the moving average of the nth-order difference to predict the next term of pseudo-periodical time series. We use artificial neural networks (ANNs) and range evaluation for error in a hybrid model to extend our prediction method further. The algorithm's performance is reported in case studies on a monthly average sunspot number data set and an earthquake data set.
DOI: https://doi.org/10.1109/ICMLA.2007.7 | Published: 2007-12-13
Citations: 6
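One plausible reading of "moving average of nth-order difference" in the Lan and Neagu abstract above is sketched below. The abstract does not give the exact procedure, so this is an assumption: difference the series n times, average the last few nth-order differences, and add that estimate back up through the lower-order differences to obtain the next term. The names `difference` and `predict_next` and the `window` parameter are hypothetical, and the ANN-based hybrid step is not included.

```python
def difference(series):
    """First-order difference: each element minus its predecessor."""
    return [b - a for a, b in zip(series, series[1:])]


def predict_next(series, order=1, window=3):
    """Predict the next term from a moving average of nth-order differences."""
    if order < 1 or window < 1:
        raise ValueError("order and window must be positive")
    if len(series) < order + window:
        raise ValueError("series too short for this order and window")

    # levels[0] is the series, levels[i] its i-th order difference.
    levels = [list(series)]
    for _ in range(order):
        levels.append(difference(levels[-1]))

    # Estimate the next nth-order difference as a moving average.
    tail = levels[-1][-window:]
    estimate = sum(tail) / len(tail)

    # Undo the differencing one level at a time.
    for level in reversed(levels[:-1]):
        estimate = level[-1] + estimate
    return estimate
```

With `order=1` a linear series such as 1, 3, 5, 7, 9 is continued to 11; with `order=2` the squares 0, 1, 4, 9, 16, 25 are continued to 36, because the second-order differences are constant.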