2010 Ninth International Conference on Machine Learning and Applications: Latest Papers

Improved Fine-Grained Component-Conditional Class Labeling with Active Learning
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.8
David J. Miller, Chu-Fang Lin, G. Kesidis, Christopher M. Collins
We have recently introduced new generative semi-supervised mixtures with more fine-grained class label generation mechanisms than previous methods. Our models combine the advantages of semi-supervised mixtures, which achieve label extrapolation over a component, and nearest-neighbor (NN)/nearest-prototype (NP) classification, which achieves accurate classification in the vicinity of labeled samples. Our models are advantageous when within-component class proportions are not constant over the feature space region "owned by" a component. In this paper, we develop an active learning extension of our fine-grained labeling methods. We propose two new uncertainty sampling methods and compare them with traditional entropy-based uncertainty sampling. Our experiments on a number of UC Irvine data sets show that the proposed active learning methods improve classification accuracy more than standard entropy-based active learning. The proposed methods are particularly advantageous when the labeled percentage is small. We also extend our semi-supervised method to allow variable weighting on labeled and unlabeled data likelihood terms. This approach is shown to outperform previous weighting schemes.
Citations: 0
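Illustrative sketch (not from the paper): the abstract above uses traditional entropy-based uncertainty sampling as its baseline. A minimal version of that baseline, assuming a classifier has already produced class posteriors for each unlabeled sample, might look like the following. The function names and the posterior values are hypothetical, and the two new sampling methods proposed by the authors are not reproduced here.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete class-posterior vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(posteriors):
    """Index of the unlabeled sample whose class posterior has maximum entropy."""
    return max(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))


# Hypothetical class posteriors for three unlabeled samples (illustrative only).
pool = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.80, 0.20],
]

# The sample at this index would be sent to the labeling oracle next.
query_index = most_uncertain(pool)
```

The sample whose posterior is closest to uniform is queried first, which is why this criterion tends to pick points near the current decision boundary.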
Hardware Implementation of a Real-Time Neural Network Controller Set for Reactive Power Compensation Systems
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.107
R. Bayindir, Alper Gorgun
This paper introduces a hardware implementation of a real-time neural network controller set for reactive power compensation (RPC) systems with a synchronous motor. In this study, the parameters required by such systems, such as current, phase difference, frequency, and power, are measured with high accuracy by a PIC 18F452 microcontroller and then controlled via artificial neural networks (ANNs). Performance tests based on the obtained data are implemented using computer code written in Visual Basic .NET. Different ANN controller structures are verified by simulating them on a computer. The evaluation indicates that the developed set can be easily adapted to real-time applications.
Citations: 5
Centroid-based Classification Enhanced with Wikipedia
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.17
Abdullah Bawakid, M. Oussalah
Most traditional text classification methods employ Bag of Words (BOW) approaches that rely on the word frequencies found in the training corpus and the testing documents. Recently, studies have examined using external knowledge to enrich the text representation of documents. Some have focused on WordNet, which suffers from several limitations, including the available number of words, synsets, and coverage. Other studies have used different aspects of Wikipedia instead. Depending on the features being selected and evaluated and the external knowledge being used, a balance has to be struck between recall, precision, noise reduction, and information loss. In this paper, we propose a new centroid-based classification approach that relies on Wikipedia to enrich the representation of documents through the use of Wikipedia's concepts, category structure, links, and article text. We extract candidate concepts for each class with the help of Wikipedia and merge them with important features derived directly from the text documents. Different variations of the system were evaluated, and the results show improvements in the performance of the system.
Citations: 3
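Illustrative sketch (not from the paper): the abstract above builds on centroid-based classification over a bag-of-words representation. A minimal version of that base technique, assuming plain term-frequency vectors and cosine similarity, could be written as follows. The Wikipedia-derived concepts, categories, and links that the authors add are not modeled, and the toy documents and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Bag-of-words term-frequency vector for a whitespace-tokenized document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train_centroids(labeled_docs):
    """Average the term-frequency vectors of each class into one centroid."""
    sums = defaultdict(Counter)
    counts = Counter()
    for text, label in labeled_docs:
        sums[label].update(tf_vector(text))
        counts[label] += 1
    return {
        label: {t: w / counts[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def classify(text, centroids):
    """Assign the class whose centroid is most similar to the document."""
    vec = tf_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy corpus (illustrative only).
train_docs = [
    ("the striker scored a goal", "sports"),
    ("the team won the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stocks fell as the market closed", "finance"),
]
centroids = train_centroids(train_docs)
label = classify("the goal won the match", centroids)
```

Enriching the representation, as the abstract describes, would amount to adding further weighted entries to each document vector and centroid before the similarity is computed.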
Robust Learning for Adaptive Programs by Leveraging Program Structure
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.150
Jervis Pinto, Alan Fern, Tim Bauer, Martin Erwig
We study how to effectively integrate reinforcement learning (RL) and programming languages via adaptation-based programming, where programs can include non-deterministic structures that can be automatically optimized via RL. Prior work has optimized adaptive programs by defining an induced sequential decision process to which standard RL is applied. Here we show that the success of this approach is highly sensitive to the specific program structure, where even seemingly minor program transformations can lead to failure. This sensitivity makes it extremely difficult for a non-RL-expert to write effective adaptive programs. In this paper, we study a more robust learning approach, where the key idea is to leverage information about program structure in order to define a more informative decision process and to improve the SARSA(lambda) RL algorithm. Our empirical results show significant benefits for this approach.
Citations: 7
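Illustrative sketch (not from the paper): the abstract above refers to the SARSA(lambda) algorithm. For orientation, one update step of the standard tabular SARSA(lambda) with accumulating eligibility traces can be sketched as below. This is the textbook form, not the authors' improved variant, and the state names, action names, and hyperparameter values are hypothetical.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8, terminal=False):
    """Apply one tabular SARSA(lambda) update in place and return the TD error.

    Q maps (state, action) to a value estimate; E maps (state, action) to an
    eligibility trace. Both are expected to be defaultdict(float).
    """
    # On a terminal transition there is no bootstrapped next value.
    target = r if terminal else r + gamma * Q[(s_next, a_next)]
    delta = target - Q[(s, a)]
    E[(s, a)] += 1.0  # accumulating trace for the visited pair
    for key in list(E):
        Q[key] += alpha * delta * E[key]  # credit all recently visited pairs
        E[key] *= gamma * lam             # decay every trace
    return delta


# Hypothetical two-step episode: 0 -> 1 -> terminal, reward 1 at the end.
Q = defaultdict(float)
E = defaultdict(float)
sarsa_lambda_step(Q, E, 0, "right", 0.0, 1, "right")
sarsa_lambda_step(Q, E, 1, "right", 1.0, None, None, terminal=True)
```

After the second step, the first state-action pair also receives part of the reward through its decayed trace, which is the mechanism that distinguishes SARSA(lambda) from one-step SARSA.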
A Comparison of Techniques for Handling Incomplete Input Data with a Focus on Attribute Relevance Influence
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.126
M. Millán-Giraldo, J. S. Sánchez, V. Traver
This work presents a new approach based on support vector regression to deal with incomplete input (unseen) data and compares it to other existing techniques. The empirical analysis was carried out over 18 real data sets using five different classifiers, with the aim of foreseeing which technique is most suitable for each classifier. This study also tries to determine how the relevance of the missing attribute affects the performance of each pair (handling algorithm, classifier). Experimental results demonstrate that no technique is absolutely better than the others for all classifiers. However, combining the proposed strategy with the nearest neighbor classifier appears to be the best choice for handling missing attribute values in the input data.
Citations: 3
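Illustrative sketch (not from the paper): the abstract above handles incomplete input data with a regression-based approach. The general idea of regression imputation, predicting a missing attribute from an observed one using a model fitted on complete training rows, can be sketched as below. This is only an assumption about the overall shape of such a method: ordinary least squares with a single predictor stands in for support vector regression, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(row, missing_idx, predictor_idx, train_rows):
    """Fill row[missing_idx] using a regression on row[predictor_idx]."""
    xs = [r[predictor_idx] for r in train_rows]
    ys = [r[missing_idx] for r in train_rows]
    slope, intercept = fit_line(xs, ys)
    filled = list(row)
    filled[missing_idx] = slope * row[predictor_idx] + intercept
    return filled


# Hypothetical complete training rows and one incomplete test row.
train_rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
completed = impute((4.0, None), missing_idx=1, predictor_idx=0,
                   train_rows=train_rows)
```

The completed row can then be passed to any classifier trained on complete data, which is the setting in which the abstract compares handling techniques.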
A Novel Noise Filtering Algorithm for Imbalanced Data
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.9
J. V. Hulse, T. Khoshgoftaar, Amri Napolitano
Noise filtering is a commonly-used methodology to improve the performance of learners built using low-quality data. A common type of noise filtering is a data preprocessing technique called classification filtering. In classification filtering, a classifier is built and evaluated on the training dataset (typically using cross-validation) and any misclassified instances are considered noisy. The strategies employed with classification filters are not ideal, particularly when learning from class-imbalanced data. To address this deficiency, we propose an alternative method for classification filtering called the threshold-adjusted classification filter. This methodology is compared with the standard classification filter, and the results clearly demonstrate the efficacy of our technique.
Citations: 17
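Illustrative sketch (not from the paper): the abstract above describes the standard classification filter, in which a classifier is evaluated on the training data by cross-validation and misclassified instances are treated as noisy. A minimal version, assuming binary 0/1 labels and a small k-nearest-neighbor classifier, is shown below. The `threshold` parameter is a hypothetical knob that only hints at how a decision threshold could be exposed; the abstract does not specify how the authors' threshold-adjusted filter sets it.

```python
def knn_positive_score(train, x, k=3):
    """Fraction of the k nearest training points (squared Euclidean) labeled 1."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )[:k]
    return sum(label for _, label in nearest) / len(nearest)


def classification_filter(data, n_folds=3, k=3, threshold=0.5):
    """Return indices of instances misclassified under n-fold cross-validation."""
    noisy = []
    for fold in range(n_folds):
        # Instance i belongs to fold i % n_folds; train on the other folds.
        train = [d for i, d in enumerate(data) if i % n_folds != fold]
        for i, (x, y) in enumerate(data):
            if i % n_folds != fold:
                continue
            predicted = 1 if knn_positive_score(train, x, k) >= threshold else 0
            if predicted != y:
                noisy.append(i)
    return sorted(noisy)


# Hypothetical 1-D data: class 0 near 0, class 1 near 10, and one instance
# (index 4) that sits in the class-0 region but carries label 1.
data = [
    ((0.0,), 0), ((0.5,), 0), ((1.0,), 0), ((1.5,), 0), ((0.8,), 1),
    ((9.0,), 1), ((9.5,), 1), ((10.0,), 1), ((10.5,), 1),
]
flagged = classification_filter(data)
```

With the default threshold of 0.5 this reduces to the standard filter; with imbalanced classes, a fixed 0.5 cut-off can flag many genuine minority instances, which is the kind of deficiency the abstract sets out to address.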
How Dependencies Affect the Capability of Several Feature Selection Approaches to Extract the Key Features
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.26
Qin Yang, R. Gras
The goal of this research is to determine how dependencies affect the capability of several feature selection approaches to extract the relevant features for classification. The hypothesis is that more dependencies and higher-level dependencies make the task more complex. Experiments are designed to expose limitations of several feature selection approaches by altering the degree of dependency in the test datasets. A new method is proposed that uses a pair of pre-designed Bayesian networks to generate test datasets whose level of complexity is easy to tune for feature selection tests. Relief, CFS, NB-GA, NB-BOA, SVM-GA, SVM-BOA and SVM-mBOA are the filter or wrapper feature selection approaches used and evaluated in the experiments. For these approaches, a higher level of dependency among the relevant features greatly affects the capability to find the relevant features for classification. For Relief, SVM-BOA and SVM-mBOA, performance also changes when the dependencies among the irrelevant features are altered. Moreover, a multi-objective optimization method is used to maintain population diversity in each generation of the BOA search algorithm, improving the overall quality of solutions in our experiments.
Citations: 3
Computational Analysis of Muscular Dystrophy Sub-types Using a Novel Integrative Scheme
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.49
Chen Wang, S. S. Ha, Y. Wang, J. Xuan, E. Hoffman
To construct biologically interpretable features and facilitate muscular dystrophy (MD) sub-type classification, we propose a novel integrative scheme utilizing a protein–protein interaction (PPI) network, functional gene set information, and mRNA profiling. The workflow of the proposed scheme includes three major steps. First, by combining protein–protein interaction network structure and gene co-expression relationships into a new distance metric, we apply affinity propagation clustering to build gene sub-networks. Second, we incorporate functional gene set knowledge to complement the physical interaction information. Finally, based on the constructed sub-network and gene set features, we apply a multi-class support vector machine (MSVM) for MD sub-type classification and highlight the biomarkers contributing to the sub-type prediction. The experimental results show that our scheme can construct sub-networks that are more relevant to MD than those constructed by the conventional approach. Furthermore, our integrative strategy substantially improved the prediction accuracy, especially for hard-to-classify sub-types.
Citations: 5
Boosting Multi-Task Weak Learners with Applications to Textual and Social Data
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.61
J. Faddoul, Boris Chidlovskii, Fabien Torre, Rémi Gilleron
Learning multiple related tasks from data simultaneously can improve predictive performance relative to learning these tasks independently. In this paper we propose a novel multi-task learning algorithm called MT-Adaboost. It extends the Adaboost algorithm [Freund1999Short] to the multi-task setting and uses a multi-task decision stump as the multi-task weak classifier. This makes it possible to learn different dependencies between tasks for different regions of the learning space. Thus, we relax the conventional hypothesis that tasks behave similarly in the whole learning space. Moreover, MT-Adaboost can learn multiple tasks without imposing the constraint of sharing the same label set and/or examples between tasks. A theoretical analysis is derived from the analysis of the original Adaboost. Experiments for multiple tasks over large-scale textual data sets with social context (Enron and Tobacco) give very promising results.
Citations: 15
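Illustrative sketch (not from the paper): the abstract above extends Adaboost with decision stumps to the multi-task setting. As background, the original single-task Adaboost with threshold stumps on labels in {-1, +1} can be sketched as follows. This is not MT-Adaboost: there is only one task and no multi-task stump, and the toy data and function names are hypothetical.

```python
import math


def stump_predict(stump, x):
    """A stump predicts `polarity` when x[feature] <= threshold, else -polarity."""
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity


def best_stump(X, y, w):
    """Return (weighted error, stump) for the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if best is None or err < best[0]:
                    best = (err, stump)
    return best


def adaboost(X, y, rounds=3):
    """Train a weighted ensemble of stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical 1-D data where the positive class is an interval, so no single
# stump can separate it.
X = [[1], [2], [3], [4], [5], [6]]
y = [-1, -1, 1, 1, -1, -1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, x) for x in X]
```

On this toy set the best single stump misclassifies two of the six points, while the three-stump ensemble fits all of them, which illustrates how boosting combines weak classifiers.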
Plant Species Classification Using a 3D LIDAR Sensor and Machine Learning
Pub Date : 2010-12-12 DOI: 10.1109/ICMLA.2010.57
Ulrich Weiss, P. Biber, Stefan Laible, K. Bohlmann, A. Zell
In the domain of agricultural robotics, one major application is crop scouting, e.g., for the task of weed control. A key enabler for this task is robust detection and classification of plants and species. Automatically distinguishing between plant species is a challenging task, because some species look very similar. It is also difficult to translate the symbolic high-level descriptions that humans use for the appearances of plants and the differences between them into a formal, computer-understandable form. Nor is it possible to reliably detect structures such as leaves and branches in the 3D data provided by our sensor. One approach to solving this problem is to learn how to classify the species by using a set of example plants and machine learning methods. In this paper we introduce a method for distinguishing plant species using a 3D LIDAR sensor and supervised learning. To this end, we developed a set of size- and rotation-invariant features and evaluated experimentally which are the most descriptive. Besides these features, we also compared different learning methods using the Weka toolbox. It turned out that the best methods for our application are simple logistic regression functions, support vector machines, and neural networks. In our experiments we used six different plant species, typically available at common nurseries, and about 20 examples of each species. In the laboratory we were able to identify over 98% of these plants correctly.
在农业机器人领域,一个主要应用是作物侦察,例如,用于杂草控制的任务。对于这项任务,一个关键的促成因素是对植物和物种的强大检测和分类。自动区分植物物种是一项具有挑战性的任务,因为有些物种看起来非常相似。将人类使用的植物的外观和差异的象征性高级描述转化为正式的、计算机可理解的形式也很困难。此外,在我们的传感器提供的3D数据中,也不可能可靠地检测到树叶和树枝等结构。解决这个问题的一种方法是通过使用一组示例植物和机器学习方法来学习如何对物种进行分类。在本文中,我们介绍了一种使用3D激光雷达传感器和监督学习来区分植物物种的方法。为此,我们开发了一套大小和旋转不变量特征,并通过实验评估了最具描述性的特征。除了这些特性,我们还比较了使用工具箱Weka的不同学习方法。结果表明,对于我们的应用来说,最好的方法是简单的逻辑回归函数、支持向量机和神经网络。在我们的实验中,我们使用了六种不同的植物物种,这些植物通常可以在普通的苗圃里找到,每种植物大约有20个样本。在实验室里,我们能够正确识别98%以上的这些植物。
Citations: 51
Journal: 2010 Ninth International Conference on Machine Learning and Applications