Two-Level Hierarchical Hybrid SVM-RVM Classification Model
Catarina Silva and B. Ribeiro. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.52
Support vector machines (SVMs) and relevance vector machines (RVMs) are two state-of-the-art learning machines at the focus of current research. SVMs have the advantage in accuracy and training complexity, but RVMs are superior when probabilistic outputs or kernel selection are at issue. We propose a two-level hierarchical hybrid SVM-RVM model that combines the best of both learning machines. The first level uses an RVM to identify the examples classified with least confidence, and the second level then uses an SVM to learn and classify those harder examples. We show the benefits of the hierarchical approach on a text classification task, where the two-level model outperforms both individual learning machines.
Automatic Intravital Video Mining of Rolling and Adhering Leukocytes
Xin C. Anders, Chengcui Zhang, and Hong Yuan. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.18
In this paper, we present an automatic spatio-temporal mining system for rolling and adherent leukocytes in intravital videos. The magnitude of leukocyte adhesion and the decrease in rolling velocity are of common interest in inflammation-response studies, yet no existing system fully serves these purposes. Our approach starts by locating moving leukocytes through probabilistic learning of temporal features. It then removes noise through median and location-based filtering, and finally performs motion correspondence with centroid trackers. By extracting the information about moving leukocytes first, we can extract adherent leukocytes more robustly with an adaptive threshold method. Experimental results demonstrate the effectiveness and efficiency of the proposed method.
Introducing Emergent Loose Modules into the Learning Process of a Linear Genetic Programming System
Xin Li, Chi Zhou, Weimin Xiao, and P. Nelson. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.31
Modularity and building blocks have long drawn attention from the genetic programming (GP) community. The findings are usually twofold: hierarchical evolution with adequate building-block reuse can accelerate the learning process, but rigidly defined and excessively employed modules may counteract the expected advantages by confining the reachable search space. In this work, we introduce the concept of emergent loose modules, based on a new linear GP system, prefix gene expression programming (P-GEP), in an attempt to balance stochastic exploration against hierarchical construction of optimal solutions. Emergent loose modules are produced dynamically during evolution and are reusable as sub-functions in later generations. The proposed technique is illustrated in full on a simple symbolic regression problem. Initial experimental results suggest that it is a flexible approach to identifying evolved regularities and that the emergent loose modules are critical in composing the best solutions.
Naive Bayes Classification Given Probability Estimation Trees
Zengchang Qin. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.36
Tree induction is one of the most effective and widely used models in classification. Unfortunately, decision trees such as C4.5 have been found to provide poor probability estimates. In empirical studies, Provost and Domingos found that probability estimation trees (PETs) give fairly good probability estimates. However, unlike ordinary decision trees, PETs lose performance when pruned, so good probability estimation usually requires large trees, which sacrifice model transparency. In this paper, two hybrid models combining the naive Bayes classifier with PETs are proposed in order to build a model with good performance that does not lose too much transparency. The first model uses naive Bayes estimation given a PET, and the second uses a group of small PETs as naive Bayes estimators. Empirical studies show that the first model outperforms the PET model at shallow depths and that the second performs comparably to naive Bayes and PETs.
On L1-Norm Multi-class Support Vector Machines
Lifeng Wang, Xiaotong Shen, and Yuan F. Zheng. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.38
Binary support vector machines (SVMs) have proven effective in classification, but problems remain with respect to feature selection in multi-class classification. This article proposes a novel multi-class SVM that performs classification and feature selection simultaneously via L1-norm penalized sparse representations. The proposed methodology, together with our developed regularization solution path, permits feature selection within the framework of classification. The operational characteristics of the methodology are examined on both simulated and benchmark examples and compared with competitors in terms of prediction accuracy and feature selection. The numerical results suggest that the proposed methodology is highly competitive.
Semi-supervised Data Organization for Interactive Anomaly Analysis
J. Aslam, S. Bratus, and Virgil Pavlu. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.47
We consider the problem of interactive, iterative analysis of datasets that consist of a large number of records represented as feature vectors. The record set is known to contain a number of anomalous records that the analyst wants to locate and describe in a short, comprehensive manner. The nature of the anomaly is not known in advance (in particular, it is not known which features or feature values identify the anomalous records and which are irrelevant to the search) and becomes clear only in the process of analysis, as the description of the target subset is gradually refined. This situation is common in computer intrusion analysis, where a forensic analyst browses logs to locate traces of an intrusion of unknown nature and origin, and it extends to other tasks and data sets. To facilitate such "browsing for anomalies", we propose an unsupervised data organization technique for the initial summarization and representation of data sets, and a semi-supervised learning technique for iterative modification of that representation. Our approach is based on information content and the Jensen-Shannon divergence and is related to information bottleneck methods. We have implemented it as part of the Kerf log analysis toolkit.
Modeling Hesitation and Conflict: A Belief-Based Approach for Multi-class Problems
Thomas Burger, O. Aran, and A. Caplier. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.35
The support vector machine (SVM) is a powerful tool for binary classification, and numerous methods are known for fusing several binary SVMs into multi-class (MC) classifiers. These methods are efficient, but careful study of the misclassified items reveals two sources of error: (1) the response of each classifier does not use all of the information from the SVM, and (2) the decision method does not use all of the information from the classifier responses. In this paper, we present a method that partially prevents these two losses of information by applying belief theories (BTs) to SVM fusion, while retaining the efficiency of the classical methods.
Incremental Learning By Decomposition
A. Bouchachia. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.28
Adaptivity in neural networks aims at equipping learning algorithms with the ability to self-update as new training data become available. In many applications, data arrive over long periods of time, so the traditional one-shot training phase cannot be applied; the most appropriate training methodology in such circumstances is incremental learning (IL). The present paper introduces a new IL algorithm dedicated to classification problems. The basic idea is to incrementally generate prototype categories, which are then linked to their corresponding classes. Numerical simulations show the performance of the proposed algorithm.
Formal Concept Analysis for Digital Ecosystem
Huaiguo Fu. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.24
Formal concept analysis (FCA) is an effective tool for data analysis and knowledge discovery. The concept lattice, which derives from mathematical order theory and lattice theory, is the core of FCA. Research across many areas shows that the concept lattice structure is an effective platform for data mining, machine learning, information retrieval, software engineering, etc. This paper offers a brief overview of FCA, proposes applying FCA as a tool for the analysis and visualization of data in a digital ecosystem, and discusses applications of data mining to digital ecosystems.
A Fast Feature Selection Model for Online Handwriting Symbol Recognition
B. Huang and Mohand Tahar Kechadi. 2006 5th International Conference on Machine Learning and Applications (ICMLA'06). DOI: 10.1109/ICMLA.2006.6
Many feature selection models have been proposed for online handwriting recognition. However, most of them incur heavy computational overhead or settle on an improper feature set, leading to unacceptable recognition rates. This paper presents a new, efficient feature selection model for handwriting symbol recognition that uses an improved sequential floating search method coupled with a hybrid classifier combining hidden Markov models with a multilayer feedforward network. The effectiveness of the proposed method is verified by comprehensive experiments on the UNIPEN database.