In this paper we present two algorithms for shape recognition. Both algorithms map the contour of the shape to be recognized into a string of symbols. The first algorithm is based on supervised learning using string kernels, as commonly used for text categorization and classification. The second algorithm is very weakly supervised and is based on Procrustes analysis and on the edit distance for computing the similarity between strings of symbols. The second algorithm correctly recognizes 98.29% of the shapes from the MPEG-7 database, better than any previously reported algorithm, and is also able to retrieve similar shapes from a database.
{"title":"Shape Recognition and Retrieval Using String of Symbols","authors":"M. Daliri, V. Torre","doi":"10.1109/ICMLA.2006.48","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.48","url":null,"abstract":"In this paper we present two algorithms for shape recognition. Both algorithms map the contour of the shape to be recognized into a string of symbols. The first algorithm is based on supervised learning using string kernels as often used for text categorization and classification. The second algorithm is very weakly supervised and is based on the procrustes analysis and on the edit distance used for computing the similarity between strings of symbols. The second algorithm correctly recognizes 98.29% of shapes from the MPEG-7 database, i.e. better than any previous algorithms. The second algorithm is able also to retrieve similar shapes from a database","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126841543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Designing antisense oligonucleotides with high efficacy is of great interest, both for its usefulness in the study of gene regulation and for its potential therapeutic effects. The high cost associated with experimental approaches has motivated the development of computational methods to assist in their design. Essentially, these computational methods rely on various sequential and structural features to differentiate high-efficacy antisense oligonucleotides from low-efficacy ones. So far, however, most of the features used are local motifs present in either primary sequences or secondary structures. We propose a novel approach to profiling antisense oligonucleotides and the target RNA so as to reflect some of the global structural features, such as hairpin structures. Such profiles are then used for classification and prediction of high-efficacy oligonucleotides with support vector machines. The method was tested on a set of 348 antisense oligonucleotides for 19 RNA targets with known activity, and the performance was evaluated by cross-validation and ROC scores. The results show that the prediction accuracy is significantly enhanced.
{"title":"Prediction of Antisense Oligonucleotide Efficacy Using Local and Global Structure Information with Support Vector Machines","authors":"R. Craig, Li Liao","doi":"10.1109/ICMLA.2006.39","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.39","url":null,"abstract":"Designing antisense oligonucleotides with high efficacy is of great interest both for its usefulness to the study of gene regulation and for its potential therapeutic effects. The high cost associated with experimental approaches has motivated the development of computational methods to assist in their design. Essentially, these computational methods rely on various sequential and structural features to differentiate the high efficacy antisense oligonucleotides from the low efficacy. By far, however, most of the features used are either local motifs present in primary sequences or in secondary structures. We proposed a novel approach to profiling antisense oligonucleotides and the target RNA to reflect some of the global structural features such as hairpin structures. Such profiles are then utilized for classification and prediction of high efficacy oligonucleotides using support vector machines. The method was tested on a set of 348 antisense oligonucleotides of 19 RNA targets with known activity. The performance was evaluated by cross validation and ROC scores. It was shown that the prediction accuracy was significantly enhanced","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116485268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We introduce regression databases (REDB) to formalize and automate probabilistic querying using sparse learning sets. The REDB data model involves observation data, learning set data, view definitions, and a regression model instance. The observation data is a collection of relational tuples over a set of attributes; the learning data set involves a subset of observation tuples, augmented with learned attributes, which are modeled as random variables; the views are expressed as linear combinations of observation and learned attributes; and the regression model involves functions that map observation tuples to probability distributions of the random variables, which are learned dynamically from the learning data set. The REDB query language extends relational algebra project-select queries with conditions on probabilities of first-order logical expressions, which in turn involve linear combinations of learned attributes and views, and arithmetic comparison operators. This capability relies on the underlying regression model for the learned attributes. We show that REDB queries are computable by developing conceptual evaluation algorithms and by proving their correctness and termination.
{"title":"Regression Databases: Probabilistic Querying Using Sparse Learning Sets","authors":"A. Brodsky, C. Domeniconi, David Etter","doi":"10.1109/ICMLA.2006.44","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.44","url":null,"abstract":"We introduce regression databases (REDB) to formalize and automate probabilistic querying using sparse learning sets. The REDB data model involves observation data, learning set data, views definitions, and a regression model instance. The observation data is a collection of relational tuples over a set of attributes; the learning data set involves a subset of observation tuples, augmented with learned attributes, which are modeled as random variables; the views are expressed as linear combinations of observation and learned attributes; and the regression model involves functions that map observation tuples to probability distributions of the random variables, which are learned dynamically from the learning data set. The REDB query language extends relational algebra project-select queries with conditions on probabilities of first-order logical expressions, which in turn involve linear combinations of learned attributes and views, and arithmetic comparison operators. Such capability relies on the underlying regression model for the learned attributes. We show that REDB queries are computable by developing conceptual evaluation algorithms and by proving their correctness and termination","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"187 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116144918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We introduce a novel application of support vector machines (SVMs) to the problem of identifying potential supernovae using photometric and geometric features computed from astronomical imagery. The challenges of this supervised learning application are significant: 1) noisy and corrupt imagery resulting in high levels of feature uncertainty, 2) features with heavy-tailed, peaked distributions, 3) extremely imbalanced and overlapping positive and negative data sets, and 4) the need to reach high positive classification rates, i.e., to find all potential supernovae, while reducing the burdensome workload of manually examining false positives. High accuracy is achieved via a sign-preserving, shifted log transform applied to features with peaked, heavy-tailed distributions. The imbalanced data problem is handled by oversampling positive examples, selectively sampling misclassified negative examples, and iteratively training multiple SVMs for improved supernova recognition on unseen test data. We present cross-validation results and demonstrate the impact on a large-scale supernova survey that currently uses the SVM decision value to rank-order 600,000 potential supernovae each night.
{"title":"Supernova Recognition Using Support Vector Machines","authors":"R. Romano, C. Aragon, C. Ding","doi":"10.1109/ICMLA.2006.49","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.49","url":null,"abstract":"We introduce a novel application of support vector machines (SVMs) to the problem of identifying potential supernovae using photometric and geometric features computed from astronomical imagery. The challenges of this supervised learning application are significant: 1) noisy and corrupt imagery resulting in high levels of feature uncertainty, 2) features with heavy-tailed, peaked distributions, 3) extremely imbalanced and overlapping positive and negative data sets, and 4) the need to reach high positive classification rates, i.e. to find all potential supernovae, while reducing the burdensome workload of manually examining false positives. High accuracy is achieved via a sign-preserving, shifted log transform applied to features with peaked, heavy-tailed distributions. The imbalanced data problem is handled by oversampling positive examples, selectively sampling misclassified negative examples, and iteratively training multiple SVMs for improved supernova recognition on unseen test data. We present cross-validation results and demonstrate the impact on a large-scale supernova survey that currently uses the SVM decision value to rank-order 600,000 potential supernovae each night","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"169 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128615481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The accuracy of the rules produced by a concept learning system can be hindered by the presence of errors in the data, such as "ill-defined" attributes that are too general or too specific for the concept to be learned. In this paper, we devise a method that uses the Boolean differences computed by a program called Newton to identify multiple ill-defined attributes in a dataset in a single pass. The method is based on a compound heuristic that assigns a real-valued rank to each possible hypothesis based on its key characteristics. We show by extensive empirical testing on randomly generated classifiers that the hypothesis with the highest rank is the correct one with an observed probability quickly converging to 100%. Moreover, the monotonicity of the ranking function enables us to use it as a rough estimator of its own likelihood.
{"title":"An Efficient Heuristic for Discovering Multiple Ill-Defined Attributes in Datasets","authors":"Sylvain Hallé","doi":"10.1109/ICMLA.2006.14","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.14","url":null,"abstract":"The accuracy of the rules produced by a concept learning system can be hindered by the presence of errors in the data, such as \"ill-defined\" attributes that are too general or too specific for the concept to learn. In this paper, we devise a method that uses the Boolean differences computed by a program called Newton to identify multiple ill-defined attributes in a dataset in a single pass. The method is based on a compound heuristic that assigns a real-valued rank to each possible hypothesis based on its key characteristics. We show by extensive empirical testing on randomly generated classifiers that the hypothesis with the highest rank is the correct one with an observed probability quickly converging to 100%. Moreover, the monotonicity of the function enables us to use it as a rough estimator of its own likelihood","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114667052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic signature verification is an active area of research with numerous applications such as bank check verification and ATM access. In this research, a kernel principal component self-regression (KPCSR) model is proposed for offline signature verification and recognition. Developed from kernel principal component regression (KPCR), the self-regression model selects a subset of the principal components from the kernel space for the input variables to accurately characterize each user's signature, thus offering good verification and recognition performance. In preliminary experiments the model works directly on bitmap images, showing satisfactory performance. A modular scheme with a subject-specific KPCSR structure proves very efficient, in which each user is assigned an independent KPCSR model for coding the corresponding visual information. Experimental results obtained on public benchmarking signature databases demonstrate the superiority of the proposed method.
{"title":"Off-Line Signature Recognition and Verification by Kernel Principal Component Self-Regression","authors":"Bai-ling Zhang","doi":"10.1109/ICMLA.2006.37","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.37","url":null,"abstract":"Automatic signature verification is an active area of research with numerous applications such as bank check verification, ATM access, etc. In this research, a kernel principal component self-regression (KPCSR) model is proposed for offline signature verification and recognition problems. Developed from the kernel principal component regression (KPCR), the self-regression model selects a subset of the principal components from the kernel space for the input variables to accurately characterize each user's signature, thus offering good verification and recognition performance. The model directly works on bitmap images in the preliminary experiments, showing satisfactory performance. A modular scheme with subject-specific KPCSR structure proves very efficient, from which each user is assigned an independent KPCSR model for coding the corresponding visual information. Experimental results obtained on public benchmarking signature databases demonstrate the superiority of the proposed method","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133225878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent devices with smart clutter management capabilities can enhance a user's situational awareness under adverse conditions. Two approaches to assist a user with target detection and clutter analysis are presented, and suggestions on how these tools could be integrated with an electronic chart system are detailed. The first tool, which can assist a user in finding a target partially obscured by display clutter, is a multiple-view generalization of AdaBoost. The second technique determines a meaningful measure of clutter in electronic displays by clustering features in both geospatial and color space. The clutter metric correlates with preliminary subjective clutter ratings, so the user can be warned if display clutter is a potential hazard to performance. Synthetic and real data sets are used to evaluate the performance of the proposed technique against recent classifier fusion strategies.
{"title":"Intelligent Electronic Navigational Aids: A New Approach","authors":"C. Barbu, M. Lohrenz, G. Layne","doi":"10.1109/ICMLA.2006.30","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.30","url":null,"abstract":"Intelligent devices, with smart clutter management capabilities, can enhance a user's situational awareness under adverse conditions. Two approaches to assist a user with target detection and clutter analysis are presented, and suggestions on how these tools could be integrated with an electronic chart system are further detailed. The first tool, which can assist a user in finding a target partially obscured by display clutter, is a multiple-view generalization of AdaBoost. The second technique determines a meaningful measure of clutter in electronic displays by clustering features in both geospatial and color space. The clutter metric correlates with preliminary, subjective, clutter ratings. The user can be warned if display clutter is a potential hazard to performance. Synthetic and real data sets are used for performance evaluation of the proposed technique compared with recent classifier fusion strategies","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"25 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131924931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The computational cost of nearest neighbor classification often prevents the method from being applied in practice when dealing with high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two different dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are investigated for this purpose and compared with respect to the performance of the resulting nearest neighbor classifier on five image data sets and five microarray data sets. The experimental results demonstrate that PCA outperforms RP for all data sets used in this study. However, the experiments also show that PCA is more sensitive to the choice of the number of reduced dimensions: after reaching a peak, the accuracy of PCA degrades as the number of dimensions grows, while the accuracy of RP increases with the number of dimensions. The experiments also show that using PCA or RP may even outperform using the non-reduced feature set (in 9 and 6 cases out of 10, respectively), hence resulting in not only more efficient but also more effective nearest neighbor classification.
{"title":"Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification","authors":"Sampath Deegalla, Henrik Boström","doi":"10.1109/ICMLA.2006.43","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.43","url":null,"abstract":"The computational cost of using nearest neighbor classification often prevents the method from being applied in practice when dealing with high-dimensional data, such as images and micro arrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without loosing predictive performance. Two different dimensionality reduction methods, principle component analysis (PCA) and random projection (RP), are investigated for this purpose and compared w.r.t. the performance of the resulting nearest neighbor classifier on five image data sets and five micro array data sets. The experiment results demonstrate that PCA outperforms RP for all data sets used in this study. However, the experiments also show that PCA is more sensitive to the choice of the number of reduced dimensions. After reaching a peak, the accuracy degrades with the number of dimensions for PCA, while the accuracy for RP increases with the number of dimensions. The experiments also show that the use of PCA and RP may even outperform using the non-reduced feature set (in 9 respectively 6 cases out of 10), hence not only resulting in more efficient, but also more effective, nearest neighbor classification","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132115579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models that are impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques that directly produce transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. To reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data, i.e., test set instances together with the corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm using training data and standard decision tree algorithms. In addition, the same rules were also significantly more compact, thus providing better comprehensibility. The overall implication is that rules extracted in this fashion explain the predictions made on novel data better than rules extracted in the standard way, i.e., using training data only.
{"title":"Rule Extraction from Opaque Models-- A Slightly Different Perspective","authors":"U. Johansson, Tuwe Löfström, Rikard König, Cecilia Sönströd, L. Niklasson","doi":"10.1109/ICMLA.2006.46","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.46","url":null,"abstract":"When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques directly producing transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. In order to reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data; i.e. test set instances, together with corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm, using training data, and standard decision tree algorithms. In addition, the same rules were also significantly more compact; thus providing better comprehensibility. The overall implication is that rules extracted in this fashion explain the predictions made on novel data better than rules extracted in the standard way; i.e. using training data only","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128392051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Increasingly powerful computing technology makes it possible to investigate the information hidden in huge collections of documents. In this report, we are especially interested in documents with a relative time order, which we call document streams. Examples include TV news, forums, emails of company projects, and call center telephone logs. To gain insight into these document streams, we first need to detect the events they contain. We use a time-sensitive Dirichlet process mixture model to find the events in the document streams. A time-sensitive Dirichlet process mixture model is a generative model that allows a potentially infinite number of mixture components and uses a Dirichlet compound multinomial model for the distribution of words in documents. In this report, we consider three different time-sensitive Dirichlet process mixture models: an exponential decay kernel model, a polynomial decay kernel model, and a sliding window kernel model. Experiments on the TDT2 dataset show that the time-sensitive models perform 18-20% better in terms of accuracy than the standard Dirichlet process mixture model, and the sliding window kernel and the polynomial kernel are the more promising for detecting events. We use ThemeRiver to provide a visualization of the events along the time axis; with its help, people can easily get an overall picture of how different events evolve. Besides ThemeRiver, we investigate using top words as a high-level summary of each event. Experimental results on the TDT2 dataset suggest that the sliding window kernel is the better choice, both in capturing the trend of the events and in expressibility.
{"title":"Trend Analysis for Large Document Streams","authors":"Chengliang Zhang, Shenghuo Zhu, Yihong Gong","doi":"10.1109/ICMLA.2006.51","DOIUrl":"https://doi.org/10.1109/ICMLA.2006.51","url":null,"abstract":"More and more powerful computer technology inspires people to investigate information hidden under huge amounts of documents. In this report, we are especially interested in documents with relative time order, which we also call document streams. Examples include TV news, forums, emails of company projects, call center telephone logs, etc. To get an insight into these document streams, first we need to detect the events among the document streams. We use a time-sensitive Dirichlet process mixture model to find the events in the document streams. A time sensitive Dirichlet process mixture model is a generative model, which allows a potentially infinite number of mixture components and uses a Dirichlet compound multinomial model to model the distribution of words in documents. In this report, we consider three different time sensitive Dirichlet process mixture models: an exponential decay kernel model, a polynomial decay function kernel Dirichlet process model and a sliding window kernel model. Experiments on the TDT2 dataset have shown that the time sensitive models perform 18-20% better in terms of accuracy than the Dirichlet process mixture model. The sliding windows kernel and the polynomial kernel are more promising in detecting events. We use ThemeRiver to provide a visualization of the events along the time axis. With the help of ThemeRiver, people can easily get an overall picture of how different events evolve. Besides ThemeRiver, we investigate using top words as a high-level summarization of each event. Experiment results on TDT2 dataset suggests that the sliding window kernel is a better choice both in terms of capturing the trend of the events and expressibility","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123325848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}