Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093338
R. Reyhani, A. Eftekhari-Moghadam
Time series are among the most attractive and mysterious mathematical subjects. Weather temperature, rainfall, and river flow volume are well-known and predictable time series in meteorology; peak load, electricity price, and similar quantities are important time series in electrical engineering. Time series forecasting also receives great attention in economics. Stock prices, currency exchange rates in markets such as Forex, and the world prices of petroleum, sugar, gas, gold, and other key commodities are among the best-known time series. The discovery of chaos in economic time series such as stock exchange data has attracted considerable interest from economists. In recent years, chaos has been demonstrated in many economic time series, such as stock price changes, and it has been shown that exploiting this chaotic structure helps intelligent algorithms forecast such series better than before. In this paper, we propose a new heuristic method, inspired by the chaotic characteristics of economic time series, that forecasts these series by means of artificial neural networks. In the proposed method, the output of a chaotic function is used to support time series prediction.
Title: A heuristic method for forecasting chaotic time series based on economic variables. Published in: 2011 Sixth International Conference on Digital Information Management.
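The abstract does not detail the network architecture or the chaotic function used, but the underlying idea of exploiting chaotic structure for prediction can be sketched. Below, a logistic map (a hypothetical stand-in for an economic series) is delay-embedded, and a quadratic least-squares predictor replaces the paper's neural network to keep the sketch dependency-free; all names and parameters are illustrative.

```python
import numpy as np

# Logistic map: a standard chaotic function, standing in for an
# economic time series (the paper's actual data are not specified).
def logistic_map(n, r=3.9, x0=0.4):
    xs = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1 - x)
        xs[i] = x
    return xs

def delay_embed(series, dim):
    # Row t is [x_t, ..., x_{t+dim-1}]; the target is x_{t+dim}.
    n = len(series) - dim
    X = np.column_stack([series[i:i + n] for i in range(dim)])
    return X, series[dim:]

series = logistic_map(500)
X, y = delay_embed(series, dim=3)

# Quadratic least-squares predictor in place of the paper's neural
# network; because the series is deterministic chaos, short-term
# prediction from the embedding is highly accurate.
F = np.column_stack([X, X ** 2, np.ones(len(X))])
w, *_ = np.linalg.lstsq(F, y, rcond=None)
rmse = np.sqrt(np.mean((F @ w - y) ** 2))
```

The near-zero in-sample error here reflects the toy setup (the logistic map is exactly quadratic in its last coordinate); real economic series add noise and make the prediction horizon short.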
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093350
Fariba Khademolghorani
Association rule mining is one of the most widely applied techniques in data mining and consists of two stages: first, finding the frequent itemsets; second, using them to generate association rules. Many algorithms have been introduced for discovering such rules, but most mine rules based purely on occurrence counts, which are neither interesting nor readable for users. In this paper, we propose a new, efficient algorithm for discovering high-quality association rules based on an improved imperialist competitive algorithm. The proposed method mines interesting and understandable association rules without relying on minimum support and minimum confidence thresholds, in only a single run. The algorithm is evaluated in several experiments, and the results demonstrate its efficiency.
Title: An effective algorithm for mining association rules based on imperialist competitive algorithm
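The abstract does not reproduce the imperialist competitive algorithm itself, but any rule miner, including this one, ultimately scores candidate rules with quality measures such as support and confidence. A minimal sketch of those two measures on a toy transaction set (all data illustrative):

```python
from itertools import combinations

# Toy market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

def support(itemset, transactions):
    # Fraction of transactions containing every item in the itemset.
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # P(consequent | antecedent) estimated from the transactions.
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

# Enumerate simple 1 -> 1 rules with their quality measures.
items = sorted(set().union(*transactions))
rules = [
    (a, c, support({a, c}, transactions), confidence({a}, {c}, transactions))
    for a, c in combinations(items, 2)
]
```

A threshold-free method like the paper's would rank such rules by a combined quality objective instead of cutting on fixed minimum support/confidence.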
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093364
Jun Wu, Xiaodong Li, Xin Wang, Baoping Yan
The principal goal of DNS usage mining is the discovery and analysis of patterns in the query behavior of DNS users. In this paper, we develop a unified framework for DNS usage mining based on clustering analysis of co-occurrence data derived from DNS server query logs. By transforming the raw query data into a co-occurrence matrix, user transaction clustering can be applied to discover groups of users with similar query behaviors. Using the aggregate usage profile that represents a user cluster, together with a suitable similarity measure, we present a specific approach to a domain name recommendation engine. To identify the latent purpose of a domain name, Probabilistic Latent Semantic Analysis (PLSA) is used, which automatically discovers hidden semantic relationships between users and domain names. We demonstrate the effectiveness of our approaches through experiments on real-world data sets.
Title: DNS usage mining and its two applications
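As a rough sketch of the co-occurrence step described above (the toy query log, user names, and domain names are invented; the paper's clustering and PLSA stages are not reproduced here), one can build a user-by-domain count matrix and compare users by cosine similarity:

```python
import numpy as np

# Toy query log: (user, domain) pairs standing in for DNS server data.
log = [
    ("u1", "a.example"), ("u1", "b.example"), ("u1", "a.example"),
    ("u2", "a.example"), ("u2", "b.example"),
    ("u3", "c.example"), ("u3", "d.example"),
]

users = sorted({u for u, _ in log})
domains = sorted({d for _, d in log})

# Co-occurrence (count) matrix: rows are users, columns are domains.
M = np.zeros((len(users), len(domains)))
for u, d in log:
    M[users.index(u), domains.index(d)] += 1

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# u1 and u2 share query behavior and would cluster together; u3 would not.
sim_12 = cosine(M[0], M[1])
sim_13 = cosine(M[0], M[2])
```

A clustering algorithm applied to these row vectors then yields the user groups whose aggregate profiles drive the recommendation engine.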
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093342
L. Alexandre, Jorge Coelho
In this paper we describe a tool for imposing constraints on the content of webpages. The tool can be used in one of two ways: by imposing constraints on content prior to its dissemination on a website, or by imposing constraints on content being presented to users inside a local area network. It allows constraints to be composed either visually or manually, for removing, replacing, or even blocking content from publication and presentation. The tool relies on a highly declarative and flexible approach that enables agile implementation of constraints.
Title: Filtering XML content for publication and presentation on the web
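The paper's tool is built on a declarative constraint language that the abstract does not specify. As a loose illustration of the remove/replace operations it describes, here is a sketch using Python's standard ElementTree API on a toy page (element names and content are invented):

```python
import xml.etree.ElementTree as ET

page = ET.fromstring(
    "<page><title>News</title><ad>buy now</ad>"
    "<body>article text <phone>555-0100</phone></body></page>"
)

# Constraint 1: remove advertising elements before publication.
# Materialize the iterator first so removals don't disturb traversal.
for parent in list(page.iter()):
    for child in list(parent):
        if child.tag == "ad":
            parent.remove(child)

# Constraint 2: replace sensitive content instead of removing it.
for phone in page.iter("phone"):
    phone.text = "[redacted]"

result = ET.tostring(page, encoding="unicode")
```

A declarative approach like the paper's would express both constraints as rules over the document structure rather than as imperative traversal code.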
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093330
Ning Li, Zhanhuai Li, Yanming Nie, Xiling Sun, Xia Li
Defect number prediction is essential for making the key decision of when to stop testing. For more applicable and accurate prediction, we propose an ensemble prediction model based on stacked generalization (PMoSG) and use it to predict the number of defects detected by third-party black-box testing. Taking into account the characteristics of black-box defects and the causal relationships among factors that influence defect detection, Bayesian networks and other numeric prediction models are employed in our ensemble. Experimental results show that our PMoSG model achieves a significant improvement in defect prediction accuracy over any individual model, with the best accuracy obtained when Locally Weighted Learning (LWL) is used as the level-1 model.
Title: Predicting software black-box defects using stacked generalization
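Stacked generalization itself can be sketched briefly: level-0 models are trained, their predictions on held-out data become features, and a level-1 model learns to combine them. The sketch below uses two simple regressors and a least-squares combiner on synthetic data; the paper instead combines Bayesian networks and other predictors, with LWL as the level-1 model.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
y = 3 * X[:, 0] + np.sin(6 * X[:, 1])  # synthetic target, illustrative

# Level-0 models: two deliberately different weak regressors.
def model_linear(Xtr, ytr, Xte):
    A = np.column_stack([Xtr, np.ones(len(Xtr))])
    w, *_ = np.linalg.lstsq(A, ytr, rcond=None)
    return np.column_stack([Xte, np.ones(len(Xte))]) @ w

def model_knn(Xtr, ytr, Xte, k=5):
    preds = np.empty(len(Xte))
    for i, xq in enumerate(Xte):
        d = np.linalg.norm(Xtr - xq, axis=1)
        preds[i] = ytr[np.argsort(d)[:k]].mean()
    return preds

# Level-0 predictions on held-out data become level-1 features.
half = len(X) // 2
Z = np.column_stack([
    model_linear(X[:half], y[:half], X[half:]),
    model_knn(X[:half], y[:half], X[half:]),
])

# Level-1 combiner: plain least squares for brevity.
A = np.column_stack([Z, np.ones(len(Z))])
w, *_ = np.linalg.lstsq(A, y[half:], rcond=None)
stacked = A @ w

rmse = np.sqrt(np.mean((stacked - y[half:]) ** 2))
rmse_lin = np.sqrt(np.mean((Z[:, 0] - y[half:]) ** 2))
rmse_knn = np.sqrt(np.mean((Z[:, 1] - y[half:]) ** 2))
```

The combiner can never do worse in-sample than either base model alone, which is the basic appeal of stacking.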
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093347
A. Chua, Radhika Shenoy Balkunje, D. Goh
Even though knowledge management has been applied to study disaster relief efforts, the extent to which knowledge management has been incorporated into disaster management portals has yet to attract scholarly attention. The purpose of this paper is therefore two-fold. One, it seeks to develop a knowledge management framework for disaster management portals. Two, using this framework, it seeks to construct a checklist and demonstrate its utility by evaluating 60 disaster management portals from the North American and Asian regions. Findings revealed statistically significant differences in knowledge management implementation levels between the two regions. The findings also helped to identify the essential knowledge management features that these portals should support. This research not only highlights the usefulness of a knowledge management perspective in managing digital information but also serves as a template for designing disaster management portals.
Title: Evaluation of disaster management portals: Applying knowledge management to digital information
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093322
M. Nečaský, Jakub Klímek, Jakub Malý
Modern information systems usually exploit numerous XML formats for communication with other systems. There are, however, many hidden potential problems, including the degree of readability, integrability, and adaptability of the XML formats. In the first part of this paper we demonstrate these problems on a real-world application, the National Register of Public Procurement in the Czech Republic. In the second part we show how the readability, integrability, and adaptability of this system's XML formats can be improved with a conceptual model for XML that we developed in our previous work. Finally, we generalize the experience gained into a methodology that can be applied in any other problem domain.
Title: When theory meets practice: A case report on conceptual modeling for XML
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093315
A. Zaman, P. Matsakis, C. G. Brown
The goal of this research is to evaluate the use of English stop word lists in Latent Semantic Indexing (LSI)-based Information Retrieval (IR) systems with large text datasets. The literature claims that the use of such lists improves retrieval performance. Here, three different lists are compared: two compiled by IR groups at the University of Glasgow and the University of Tennessee, and one of our own, developed at the University of Northern British Columbia. We also examine the case where stop words are not removed from the input dataset. Our research finds that using tailored stop word lists improves retrieval performance, whereas using arbitrary (non-tailored) lists, or no list at all, reduces the retrieval performance of LSI-based IR systems on large text datasets.
Title: Evaluation of stop word lists in text retrieval using Latent Semantic Indexing
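The LSI pipeline under evaluation can be sketched with a toy corpus: remove stop words, build a term-document matrix, and take a truncated SVD (the corpus and the tiny stop word list below are invented; the study's actual lists and datasets are far larger):

```python
import numpy as np

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stock prices rose on the news",
]
stop_words = {"the", "on"}  # a tiny "tailored" list for this toy corpus

# Term-document count matrix with stop words removed.
vocab = sorted({w for d in docs for w in d.split() if w not in stop_words})
A = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

# LSI: rank-k truncated SVD of the term-document matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs_k = (np.diag(s[:k]) @ Vt[:k]).T  # documents in k-dim latent space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Docs 0 and 1 share vocabulary ("sat"); doc 2 is about finance.
sim_01 = cosine(docs_k[0], docs_k[1])
sim_02 = cosine(docs_k[0], docs_k[2])
```

With stop words left in, "the" and "on" would dominate the counts and inflate the similarity between unrelated documents, which is the kind of degradation the study measures.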
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093358
Sven Groppe, Jinghua Groppe, Stefan Werner, Matthias Samsel, F. Kalis, Kristina Fell, Peter Kliesch, Markus Nakhlah
Data streams are becoming an important concept and are used in more and more applications. Processing data streams requires a streaming engine, which can begin query processing as soon as initial data is available. This capability is especially important for real-time computation and for long-range transmission of data streams. In this work, we demonstrate a monitoring system for eBay auctions, based on our RDF stream engine, that can analyze eBay auctions in a flexible way. Using our monitoring system, users can easily monitor the eBay auction information of interest, analyze the behavior of buyers and sellers, predict the tendency of auctions, and make more favorable decisions. Furthermore, each step of RDF stream processing can be visualized, allowing a better and easier understanding of the internal processes.
Title: Monitoring eBay auctions by querying RDF streams
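The abstract does not show the engine's SPARQL-style queries, but the incremental answer-as-data-arrives behavior it describes can be sketched with a toy triple stream (auction IDs, predicate names, and bid values are invented):

```python
# Toy RDF-like stream of (subject, predicate, object) triples describing
# auction events; the actual system evaluates queries over RDF streams.
stream = [
    ("auction1", "hasBid", 10),
    ("auction2", "hasBid", 5),
    ("auction1", "hasBid", 12),
    ("auction1", "hasBid", 11),   # lower than current best: no update
    ("auction2", "hasBid", 9),
]

def monitor_highest_bids(triples):
    """Incrementally track the highest bid per auction as triples arrive."""
    best = {}
    for s, p, o in triples:
        if p == "hasBid" and o > best.get(s, float("-inf")):
            best[s] = o
            yield (s, o)  # emit an update as soon as the answer changes

updates = list(monitor_highest_bids(stream))
```

The generator starts producing answers from the first triple, which is the property that makes stream engines suitable for real-time monitoring.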
Pub Date: 2011-12-01. DOI: 10.1109/ICDIM.2011.6093326
M. Jalili
In this paper we study how meso-scale and micro-scale electroencephalography (EEG) synchronization measures can be used to discriminate patients suffering from Alzheimer's disease (AD) from normal control subjects. To this end, two synchronization measures, power spectral density and multivariate phase synchronization, are considered, and the topography of the changes in patients vs. controls is shown. The AD patients showed increased power spectral density in the frontal area in the theta band and a widespread decrease in the higher frequency bands. They were also characterized by decreased multivariate phase synchronization in the left fronto-temporal and medial regions, which was consistent across all frequency bands. A region of interest was selected based on these maps, and the average power spectral density and phase synchrony were computed in these regions. These two quantities were then used as features for classifying subjects into patient and control groups. Our analysis showed that the theta band can serve as a marker for discriminating AD patients from normal controls, with a simple linear discriminant achieving 83% classification precision.
Title: Discriminating early stage AD patients from healthy controls using synchronization analysis of EEG
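As a rough sketch of the band-power feature and the simple linear discriminant described above (the signals below are synthetic, with "patients" given artificially elevated 6 Hz theta activity; nothing here reproduces the study's EEG data or results):

```python
import numpy as np

fs = 128                       # sampling rate (Hz), illustrative
t = np.arange(0, 4, 1 / fs)    # 4-second epochs
rng = np.random.default_rng(1)

def theta_power(signal, fs):
    # Periodogram power summed over the theta band (4-8 Hz).
    spec = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    band = (freqs >= 4) & (freqs <= 8)
    return spec[band].sum()

# Synthetic subjects: "patients" get stronger 6 Hz (theta) activity,
# mimicking the increased frontal theta power reported above.
patients = [2.0 * np.sin(2 * np.pi * 6 * t) + rng.normal(0, 1, t.size)
            for _ in range(10)]
controls = [0.5 * np.sin(2 * np.pi * 6 * t) + rng.normal(0, 1, t.size)
            for _ in range(10)]

pp = np.array([theta_power(s, fs) for s in patients])
cp = np.array([theta_power(s, fs) for s in controls])

# One-feature linear discriminant: threshold midway between class means.
threshold = (pp.mean() + cp.mean()) / 2
accuracy = (np.sum(pp > threshold) + np.sum(cp <= threshold)) / 20
```

On this clean synthetic data the threshold separates the groups easily; real EEG overlaps far more, which is why the study reports 83% rather than perfect precision.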