The enormous amount of data distributed across the World Wide Web can be very useful if users are able to reach it in an easy and appropriate way; search engines help users find what they need in this mass of data. Metasearch is the application of data fusion to document retrieval. A metasearch engine takes as input the N ranked lists output by each of N search engines in response to a given query. As output, it computes a single ranked list, which is hopefully an improvement over any input list as measured by standard information retrieval performance metrics such as mean average precision (MAP). Our goal in this paper is to answer the following question: what are the factors affecting the performance of data fusion algorithms? The motivation for presenting these factors is the absence of a single source in the literature that covers all of them in an organized and complete manner. This work is needed to integrate the research findings on data fusion performance. The paper contributes to the data fusion literature in two ways. First, it presents all the factors affecting the performance of data fusion algorithms in an organized and complete manner. Second, it offers recommendations on how and when to deal with those factors.
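As a rough illustration of the metasearch setting described in this abstract, the following Python sketch fuses N ranked lists into one and scores a ranking by average precision. The abstract does not name a specific fusion algorithm, so the Borda count used here is only an assumed example; the function names `borda_fuse` and `average_precision` and the toy document ids are hypothetical. MAP is the mean of the average precision over a set of queries.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Fuse ranked lists with a Borda count (an assumed example method).

    A document at 0-based rank r in a list of length n earns n - r points;
    documents missing from a list earn nothing from that list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for rank, doc in enumerate(ranking):
            scores[doc] += n - rank
    # Highest total first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant docs."""
    hits, total = 0, 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position  # precision at this hit
    return total / len(relevant) if relevant else 0.0


# Toy input: three engines, each returning a ranked list for one query.
lists = [["a", "b", "c"], ["b", "a", "d"], ["b", "c", "a"]]
fused = borda_fuse(lists)
print(fused)
print(average_precision(fused, {"b", "c"}))
```

Whether the fused list actually improves on every input list depends on the inputs, which is the kind of performance factor the paper sets out to catalogue.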
{"title":"The Factors Affecting the Performance of Data Fusion Algorithms","authors":"M. Nassar, G. Kanaan","doi":"10.1109/ICIME.2009.44","DOIUrl":"https://doi.org/10.1109/ICIME.2009.44","url":null,"abstract":"The enormous amount of data which is distributed on the World Wide Web can be very useful if the users became able to get these data in an easy and appropriate method, search engines help the users to find what they need from this enormous amount of data. Meta-search is the application of data fusion to document retrieval, Metasearch engine takes as an input the N ranked lists output by each of N search engines in response to a given query, As output, it computes a single ranked list, which is hopefully an improvement over any input list as measured by standard information retrieval performance metrics such as the mean average precision (MAP). Our goal in this paper is to answer the following question, what are the factors affecting the performance of Data fusion algorithms? The reason behind introducing those factors is the absence of a single source in the literature able to present all those factors in an organized and complete manner. This work is needed to integrate all data fusion performance research findings. This paper contributes to the data fusion literature by two things, firstly; it will deliver all factors affecting the performance of data fusion algorithms in an organized and complete manner. Secondly; it will deliver recommendations which are related to how and when to deal with the factors that affect the performance.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116575284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In many kinds of manufacturing, predicting the influence of process parameters on machine performance is a necessity, as these parameters may have a serious impact on product quality as well as on the probability of machine failure. To address this issue, this paper presents a novel knowledge-based algorithm embedded with artificial intelligence for evaluating the overall suitability of adopting the predicted control parameters suggested by domain experts. The originality of this research is that the proposed knowledge-based system is equipped with a fuzzy-guided genetic algorithm, enabling the identification of the best set of process parameters. A simulation using the RIE machine is provided to validate the practicability of the proposed approach.
{"title":"Joint Optimization for Knowledge Mining: Evaluating Parameters of Manufacturing Processes","authors":"C.X.H. Tang, H. Lau","doi":"10.1109/ICIME.2009.119","DOIUrl":"https://doi.org/10.1109/ICIME.2009.119","url":null,"abstract":"In various kinds of manufacturing production, predicting the influence of process parameters in terms of machine performance is a necessity as they may have a serious impact on product quality as well as on the probability of machine failure. To address this issue, this paper presents a novel knowledge-based algorithm embedded with Artificial Intelligence for evaluating the overall suitability of adopting the predicted control parameters suggested by domain experts. The originality of this research is that the proposed knowledge-based system is equipped with fuzzy-guided genetic algorithm, enabling the identification of the best set of process parameters. Simulation using the RIE machine is provided to validate the practicability of the proposed approach.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125205143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study analyzes the mortgage loans of five Taiwanese commercial banks to identify the key factors that influence prepayments and defaults. Using 16,215 mortgage loan records from these banks for 2002 through 2007, the study first applies logistic regression to analyze prepayment and default behavior. As far as overall predictability is concerned, the paper finds that the logistic regression model is able to provide simplified results in the measurement of model variables concerning defaults and prepayments.
{"title":"Evaluation of Prepayment and Default Behaviour of Mortgage Customers: With a Case Study of the Banking Industry in Taiwan","authors":"Shuo-fen Hsu, Po-Sheng Ko, Cheng-Chung Wu","doi":"10.1109/ICIME.2009.116","DOIUrl":"https://doi.org/10.1109/ICIME.2009.116","url":null,"abstract":"This study analyzes the mortgage loans of five Taiwanese commerce banks to identify the key factors that influence prepayments and defaults. Using data from a total of 16,215 data entries of mortgage loans of five Taiwanese commerce banks in 2002 through 2007, this study first conducts Logistic regression to analyze the behavior of prepayments and default. As far the overall predictability is concerned, this paper finds that the logistic regression model is able to provide simplified results in the measurement of model variables concerning defaults and prepayments.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127116458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MOLAP (multidimensional OLAP) systems store data as cubes in multidimensional arrays. Data cubes can be sparse, which degrades MOLAP performance and wastes storage. Many compression algorithms have been introduced to deal with the sparsity of MOLAP data cubes. In this paper we present a new compression algorithm based on the bitmap compression technique. Instead of the linear structure used by the classical bitmap, we use a balanced tree structure to store the compressed data in order to reduce the search time. We demonstrate that our algorithm searches the compressed structure in logarithmic time, improving on the linear time needed by classical bitmap compression methods. Finally, we show empirical results in which the proposed algorithm was tested on multiple datasets and compared with the classical bitmap algorithm.
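The gap between linear and logarithmic lookup in a compressed sparse array can be sketched in Python as follows. This is not the paper's algorithm: a sorted list of occupied positions with binary search stands in for the balanced tree, and the class names `BitmapSparseArray` and `IndexedSparseArray` and the sample data are hypothetical.

```python
from bisect import bisect_left


class BitmapSparseArray:
    """Classical bitmap compression: one flag per cell plus packed values."""

    def __init__(self, dense, empty=0):
        self.empty = empty
        self.bitmap = [v != empty for v in dense]
        self.values = [v for v in dense if v != empty]

    def get(self, i):
        if not self.bitmap[i]:
            return self.empty
        # Linear scan: count occupied cells before i to locate the value slot.
        return self.values[sum(self.bitmap[:i])]


class IndexedSparseArray:
    """Stand-in for a tree-based layout: sorted positions + binary search."""

    def __init__(self, dense, empty=0):
        self.empty = empty
        self.positions = [i for i, v in enumerate(dense) if v != empty]
        self.values = [dense[i] for i in self.positions]

    def get(self, i):
        # Binary search needs O(log k) comparisons for k occupied cells.
        k = bisect_left(self.positions, i)
        if k < len(self.positions) and self.positions[k] == i:
            return self.values[k]
        return self.empty


dense = [0, 0, 5, 0, 7, 0, 0, 0, 3]
linear = BitmapSparseArray(dense)
indexed = IndexedSparseArray(dense)
print([indexed.get(i) for i in range(len(dense))])
```

A sorted list keeps lookups logarithmic but makes insertions linear; a balanced tree, as the abstract describes, would keep updates logarithmic as well.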
{"title":"Speed up the Search in Bitmap Based Compressed Sparse Arrays","authors":"J. Zalaket","doi":"10.1109/ICIME.2009.43","DOIUrl":"https://doi.org/10.1109/ICIME.2009.43","url":null,"abstract":"MOLAP (multidimensional OLAP) systems are storing data as cubes in multidimensional arrays. Data cubes can be sparse, which slows down the performance of MOLAPs and requests useless additional data storage. Many compression algorithms have been introduced to deal with the sparsity of MOLAP data cubes. In this paper we present a new compression algorithm based on the bitmap compression technique. Instead of the linear structure used by the classical bitmap, we use a balanced tree structure to store the compressed data in order to reduce the search time. We demonstrate in this paper that our algorithm performs a search in the compressed structure in a logarithmic time which overcomes the linear time needed by classical bitmap compression methods. We finally show some empirical results in which our proposed algorithm has been tested over multiple datasets and compared to the classical bitmap algorithm.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122926630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In current implementations of SOA using web services, where different services can provide the same functionality for consumers, the QoS of these services is essential in determining the most suitable services for consumers. Although QoS for web services has attracted wide attention in the literature over the past few years, most current efforts do not consider the multi-stakeholder nature of web services. Unlike applications in traditional software paradigms, SOA applications are owned, developed, and/or used by different stakeholders: provider, consumer, developer, and broker. As a result, non-functional requirements differ from one stakeholder to another. This paper presents a quality model that classifies non-functional characteristics based on the different stakeholders' requirements. In discussing the stakeholders we focus on developer, provider, and consumer. The model also presents some metrics for these characteristics. We argue that this model gives a better view of SOA and web services quality requirements from different perspectives.
{"title":"Quality Model for Web Services from Multi-stakeholders' Perspective","authors":"Zain Balfagih, M. Hassan","doi":"10.1109/ICIME.2009.11","DOIUrl":"https://doi.org/10.1109/ICIME.2009.11","url":null,"abstract":"In the current implementation of SOA using web services, where different services can provide the same functionality for consumers, QoSs of these services are essential in determining the most suitable services for consumers. Although QoS for web services has attracted wide attention in the literature for the past few years, most of the current efforts did not consider the multi-stakeholders nature of web services. Unlike traditional software paradigms, SOA applications owned, developed, and/or used by different stakeholders. Those stakeholders are provider, consumer,developer, and broker. As a result, non-functional requirements are different from one stakeholder to another.This paper presents a quality model that classifies nonfunctional characteristics based on the different stakeholders' requirements. In the discussion of the stakeholders we focus on developer, provider, and consumer. The model presents also some metrics for these characteristics. We argue that this model gives better view for SOA and web services quality requirements from different perspectives.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123296737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mining biclusters that exhibit both consistent trends and trends with similar degrees of fluctuation is vital to bioinformatics research. However, existing biclustering methods are not very efficient or effective at mining such biclusters. Most biclustering models, including those used in subspace clustering, define similarity among different objects by distances over either all or only a subset of dimensions in gene expression data. However, distance functions are not always adequate for capturing correlations among the objects. In fact, strong correlations may still exist among a set of objects even if they are far apart from each other as measured by the distance function. Under the CPB (Coherent Pattern Biclustering) model that we propose, two objects are similar if they exhibit a coherent pattern on a subset of dimensions. For instance, in DNA microarray analysis, the expression levels of two genes may rise or fall synchronously in response to a set of environmental stimuli. Though the magnitudes of their expression levels may not be close, the patterns they exhibit can be very similar. Our proposed model aims to find such coherent patterns in biclusters of genes, in line with the general understanding that many genes participate in multiple different biological processes.
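The idea that two genes can share a pattern while being far apart in magnitude can be sketched in Python as follows. This is a deliberately simplified stand-in, not the CPB model's actual similarity definition, which the abstract does not spell out: it only compares the direction of change between consecutive selected conditions. The function names `trend` and `coherent` and the expression values are hypothetical.

```python
def trend(profile, conditions):
    """Direction of change (+1, 0, -1) between consecutive selected conditions."""
    values = [profile[c] for c in conditions]
    return [(b > a) - (b < a) for a, b in zip(values, values[1:])]


def coherent(gene_a, gene_b, conditions):
    """True if both genes rise and fall together over the chosen conditions."""
    return trend(gene_a, conditions) == trend(gene_b, conditions)


# Made-up expression levels over four conditions.
g1 = [10, 20, 15, 40]
g2 = [100, 180, 120, 400]   # same shape as g1, ten times the magnitude
g3 = [5, 3, 8, 1]           # opposite shape over all four conditions

print(coherent(g1, g2, [0, 1, 2, 3]))  # far apart in value, same trend
print(coherent(g1, g3, [0, 1, 2, 3]))  # trends disagree on the full set
print(coherent(g1, g3, [0, 2]))        # but agree on a subset of conditions
```

The last call illustrates why biclustering searches over subsets of dimensions: genes that look unrelated across all conditions may still move together on some of them.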
{"title":"CPB: A Model for Biclustering","authors":"Debahuti Mishra, A. Rath","doi":"10.1109/ICIME.2009.48","DOIUrl":"https://doi.org/10.1109/ICIME.2009.48","url":null,"abstract":"Mining biclusters that exhibit both consistent trends and trends with similar degrees of fluctuations is vital to bioinformatics research. However, existing biclustering methods are not very efficient and effective at mining such biclusters. Most biclustering models, including those used in subspace clustering, define similarity among different objects by distances over either all or only a subset of dimensions in gene expression data. However, distance functions are not always adequate in capturing co-relations among the objects. In fact, strong co-relations may still exist among a set of objects even if they are far apart from each other as measured by the distance function.Under the CPB (Coherent Pattern Biclustering) model, we proposed, two objects are similar if they exhibit coherent pattern on a subset of dimensions. For instances, in DNA microarray analysis, the expression levels of two genes may rise or fall synchronously in response to a set of environmental stimuli. Though the magnitude of their expression levels may not be close, but the pattern they exhibit can be very much similar. Our proposed model is interested in finding such coherent patterns of biclusters of genes and with a general understanding of biological processes that many genes participate in multiple different processes.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123306877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keramat Hassani, R. Roustaei, H. Zafari, E. Zohrevandi, M. Shiri, O. M. Talab
The problem of providing an explanation for a query answer is referred to as lineage tracing. This problem has been studied extensively in data warehouse systems, but for mediator-based systems it is still identified as a research problem. In such a system, the mediator does not store data. This means that for query processing as well as for tracing, the mediator has to communicate with the data sources, and this communication could be expensive or impossible. To resolve this, we clearly define forward lineage tracing and show its properties. We propose a tracing method that computes data lineage without storing any data and effectively supports aggregation and variable-granularity lineage. We also illustrate that our method is more efficient than methods that compute the lineage by executing the reverse query.
{"title":"An Approach to Tracking Data Lineage in Mediator Based Information Integration Systems","authors":"Keramat Hassani, R. Roustaei, H. Zafari, E. Zohrevandi, M. Shiri, O. M. Talab","doi":"10.1109/ICIME.2009.75","DOIUrl":"https://doi.org/10.1109/ICIME.2009.75","url":null,"abstract":"The problem of providing explanation for a query answer is referred to as lineage tracing. This problem has been studied extensively in data warehouse systems, but for mediator-based systems, this is identified as a research problem. In such a system, the mediator does not store data. This means for query processing as well as for tracing, the mediator has to communicate with the data sources. which this communication could be expensive or impossible. so To resolve this, we clearly define forward lineage tracing and show its properties. We propose a tracing method computes data lineage without storing any data and effectively supports aggregation and variable granularity lineage. And we illustrate that our method is more efficient than methods that compute the lineage by executing the reverse query.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131376574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most modern business applications are developed using object technology such as Java to build the application software and a relational or multivalued database to store data. Nevertheless, an impedance mismatch exists between objects and data storage mechanisms. Hence, object persistence has become a necessary practice for mapping database records into objects for further in-memory processing. However, most research focuses on object-to-relational database mapping, and very little focuses on object-to-multivalued database (O/M) mapping. Moreover, the few existing O/M mapping mechanisms are either hard to extend or difficult to manage. This paper presents the design of an O/M mapping mechanism called PersistOM, built with design patterns. A set of design patterns has been applied to make PersistOM easy to extend and reuse. The Layers architectural pattern was applied to structure the whole mapping mechanism so that each mapping sub-layer sits at a particular level of abstraction. Simulation results show that PersistOM not only shortens the overall development period but is also comparatively easy to modify and extend.
{"title":"PersistOM: An Objects-to-Multivalued Database Mapping Mechanism","authors":"Fuguo Wei, S. Lee","doi":"10.1109/ICIME.2009.10","DOIUrl":"https://doi.org/10.1109/ICIME.2009.10","url":null,"abstract":"Most modern business applications today are developed by using object technology such as Java to build application software and using a relational or multivalued database to store data. Nevertheless, impedance mismatch exists between objects and data store mechanisms. Hence, object persistence has become a necessary practice to map database records into objects for further in-memory processing. However, most research works focus on objects to relational database mapping and very few works focus on objects to multivalued database (O/M) mapping. Nonetheless, these few existing O/M mapping mechanisms are either hard to be extended or difficult to be managed. This paper presents the design of an O/M mapping mechanism called PersistOM with design patterns. A set of design patterns has been applied to make the PersistOM easy to be extended and reused. Layers architectural pattern was applied to structure the whole mapping mechanism to ensure each mapping sub-layer is at a particular level of abstraction. Simulation results show that PersistOM not only shortens the overall development period, but also is comparatively easy to be modified and extended.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"304 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123666137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper proposes a prediction scheme for sunspot series using a BiLinear Recurrent Neural Network (BLRNN). Because the BLRNN is based on the bilinear polynomial, it has been successfully used in modeling highly nonlinear systems with time-series characteristics, which makes it a natural choice for predicting sunspot series. The performance of the proposed BLRNN-based predictor is evaluated and compared with that of a conventional MultiLayer Perceptron Type Neural Network (MLPNN)-based predictor. Experiments are conducted on the Wolf sunspot number series data. The results show that the proposed BLRNN-based predictor outperforms the MLPNN-based one in terms of the Normalized Mean Squared Error (NMSE).
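The evaluation metric named in this abstract can be sketched in Python as follows. The abstract does not state which normalization the authors use, so this assumes one common definition, in which the squared prediction error is divided by the squared deviation of the actual series from its mean. The function name `nmse` and the sample numbers are hypothetical, not sunspot data.

```python
def nmse(actual, predicted):
    """Normalized mean squared error under an assumed, commonly used definition:

        sum((y - y_hat)^2) / sum((y - mean(y))^2)

    A value of 0 is a perfect fit; 1 matches always predicting the mean.
    """
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    mean = sum(actual) / len(actual)
    error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    spread = sum((a - mean) ** 2 for a in actual)
    if spread == 0:
        raise ValueError("actual series is constant; NMSE is undefined")
    return error / spread


series = [1.0, 2.0, 3.0, 4.0]
print(nmse(series, series))        # perfect predictor
print(nmse(series, [2.5] * 4))     # predictor that always outputs the mean
```

Under this definition, comparing two predictors on the same series reduces to comparing their squared errors, since the denominator is shared.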
{"title":"Prediction of Sunspot Series Using BiLinear Recurrent Neural Network","authors":"Dong-Chul Park, Dong-Min Woo","doi":"10.1109/ICIME.2009.90","DOIUrl":"https://doi.org/10.1109/ICIME.2009.90","url":null,"abstract":"A prediction scheme of sunspot series using a BiLinear Recurrent Neural Network (BLRNN) is proposed in this paper. Since the BLRNN is based on the bilinear polynomial, it has been successfully used in modeling highly nonlinear systems with time-series characteristics and the BLRNN can be a natural choice in predicting sunspot series. The performance of the proposed BLRNN-based predictor is evaluated and compared with the conventional MultiLayer Perceptron Type Neural Network (MLPNN)-based predictor. Experiments are conducted on the Wolf sunspot series number data. The results show that the proposed BLRNN based predictor outperforms the MLPNN-based one interms of the Normalized Mean Squared Error (NMSE).","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114408974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile agents have been advocated to support electronic commerce over the Internet. Although this is a promising paradigm, many intricate problems such as security and fault tolerance need to be solved to make this vision a reality. In this paper we propose BestDeal, a fault-tolerant comparison Internet shopping system. We assume that both the mobile agent and the host responsible for executing it are trustworthy, and that the mobile agent is not tampered with, kidnapped, or robbed on its way. The Hierarchical Fault Tolerance Protocol (HFTP) is used to make this application fault tolerant, i.e., the user who launches the mobile agent receives it back with the correct result within the time limit in spite of hardware and software faults such as link failure, host failure, or a crash of the mobile agent or the mobile agent system. The proposed protocol has been modeled using CPN Tools and analyzed using simulations and data-gathering tools.
{"title":"A Fault Tolerant Comparison Internet Shopping System: BestDeal by Using Mobile Agent","authors":"H. Pathak, Nipur, K. Garg","doi":"10.1109/ICIME.2009.77","DOIUrl":"https://doi.org/10.1109/ICIME.2009.77","url":null,"abstract":"Mobile agents have been advocated to support electronic commerce over the Internet. While being a promising paradigm, many intricate problems such as security and fault tolerance need to be solved to make this vision reality. In this paper we have proposed a fault tolerant comparison internet shopping system BestDeal. We assume that both the mobile agent and the host responsible to execute Mobile Agent are test worthy and mobile agent does not get tampered, kidnapped or robbed on its way. Hierarchical Fault Tolerance Protocol (HFTP) has been used to make this application fault tolerant i.e. user, who launches the mobile agent receives it back with correct result within time limit in spite of hardware and software faults such as link failure, host failure, or crash of mobile agent or mobile agent system. Proposed protocol has been modeled by using CPN tools and been analyzed by using simulations and data gathering tools.","PeriodicalId":445284,"journal":{"name":"2009 International Conference on Information Management and Engineering","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123699131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}