Clustering Web search results is a promising way to help alleviate information overload for Web users. In this paper, we focus on clustering snippets returned by Google Scholar. We propose a novel similarity function based on mining domain knowledge and an outlier-conscious clustering algorithm. Experimental results show the improved effectiveness of the proposed approach compared with existing methods.
{"title":"Effective Snippet Clustering with Domain Knowledge","authors":"S. Patro, Wei Wang","doi":"10.1109/DBKDA.2009.8","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.8","url":null,"abstract":"Clustering Web search result is a promising way to help alleviate the information overload for Web users. In this paper, we focus on clustering snippets returned by Google Scholar. We propose a novel similarity function based on mining domain knowledge and an outlier-conscious clustering algorithm. Experimental results showed improved effectiveness of the proposed approach compared with existing methods.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124923709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The decision to outsource databases is strategic in many organizations due to the increasing costs of internally managing large volumes of information. The sensitive nature of this information raises the need for powerful mechanisms to protect it against unauthorized disclosure. Centralized encryption with access control at the data-owner level has been proposed as one way of handling this issue; however, its prohibitive cost renders it impractical and inflexible. A distributed cryptographic approach has been suggested as a promising alternative, where keys are distributed to users on the basis of their assigned privileges. In this case, however, key management becomes problematic in the face of frequent database updates and remains an open issue. In this paper, we present a novel approach based on Binary Tries. By exploiting the intrinsic properties of these data structures, key management complexity, and thus its cost, is significantly reduced, and changes to the Binary Trie structure remain limited under frequent updates. Preliminary experimental analysis demonstrates the validity and effectiveness of our approach.
{"title":"Distributed Key Management in Dynamic Outsourced Databases: A Trie-Based Approach","authors":"Vanessa El-Khoury, N. Bennani, A. Ouksel","doi":"10.1109/DBKDA.2009.31","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.31","url":null,"abstract":"The decision to outsource databases is strategic in many organizations due to the increasing costs of internally managing large volumes of information. The sensitive nature of this information raises the need for powerful mechanisms to protect it against unauthorized disclosure. Centralized encryption to access control at the data owner level has been proposed as one way of handling this issue. However, its prohibitive costs renders it impractical and inflexible. A distributed cryptographic approach has been suggested as a promising alternative, where keys are distributed to users on the basis of their assigned privileges. But in this case, key management becomes problematic in the face of frequent database updates and remains an open issue. In this paper, we present a novel approach based on Binary Tries. By exploiting the intrinsic properties of these data structures, key management complexity, and thus its cost, is significantly reduced. Changes to the Binary Trie structure remain limited in the face of frequent updates. Preliminary experimental analysis demonstrates the validity and the effectiveness of our approach.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128270772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transaction processing is of growing importance for mobile computing. Booking tickets, flight reservations, banking, ePayment, and booking holiday arrangements are just a few examples of mobile transactions. Because devices may be temporarily disconnected, synchronisation and consistent transaction processing are key issues. Serializability is too strong a correctness criterion when the semantics of a transaction is known. We introduce a transaction model that allows higher concurrency for a certain class of transactions defined by their semantics. The transaction results are "escrow serializable" and the synchronisation mechanism is non-blocking. An experimental implementation showed higher concurrency, higher transaction throughput, and lower resource usage than common locking or optimistic protocols.
{"title":"Transaction Processing in Mobile Computing Using Semantic Properties","authors":"F. Laux, Tim Lessner","doi":"10.1109/DBKDA.2009.29","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.29","url":null,"abstract":"Transaction processing is of growing importance for mobile computing. Booking tickets, flight reservation, banking,ePayment, and booking holiday arrangements are just a few examples for mobile transactions. Due to temporarily disconnected situations the synchronisation and consistent transaction processing are key issues. Serializability is a too strong criteria for correctness when the semantics of a transaction is known. We introduce a transaction model that allows higher concurrency for a certain class of transactions defined by its semantic. The transaction results are”escrow serializable” and the synchronisation mechanism is non-blocking. Experimental implementation showed higher concurrency, transaction throughput, and less resources used than common locking or optimistic protocols.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134068862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a Data Quality Manager (DQM) prototype that provides information about the elements of derived non-atomic data values. Users can make effective decisions by trusting data according to the description of the conflict resolution function that was used for fusing the data, along with the quality properties of its ancestors. The assessment and ranking of non-atomic data are made possible by letting users at any level of experience specify quality properties and priorities.
{"title":"Assessing Quality of Derived Non Atomic Data by Considering Conflict Resolution Function","authors":"M. Angeles, F. Garcia-Ugalde","doi":"10.1109/DBKDA.2009.10","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.10","url":null,"abstract":"We present a Data Quality Manager (DQM) prototype providing information regarding the elements of derived non-atomic data values. Users are able to make effective decisions by trusting data according to the description of the conflict resolution function that was utilized for fusing data along with the quality properties of data ancestor. The assessment and ranking of non-atomic data is possible by the specification of quality properties and priorities from users at any level of experience.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124972951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enterprise information infrastructures are generally characterized by a multitude of information systems that support decision makers in fulfilling their duties. The object of information security management is the protection of these systems, whereas security information and event management (SIEM) addresses those information management tasks that focus on the short-term handling of events as well as on the long-term improvement of the entire information security architecture. This is carried out on the basis of data that can be logged and collected within the enterprise information security infrastructure. An especially interesting type of log data is that created by anti-malware software. In the context of a project case study, this paper demonstrates that data mining (DM) is a well-suited approach to detecting hidden patterns in malware data and thus to supporting SIEM.
{"title":"Analyzing Malware Log Data to Support Security Information and Event Management: Some Research Results","authors":"Roland Gabriel, Tobias Hoppe, Alexander Pastwa, Sebastian Sowa","doi":"10.1109/DBKDA.2009.26","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.26","url":null,"abstract":"Enterprise information infrastructures are generally characterized by a multitude of information systems which support decision makers in fulfilling their duties. The object of information security management is the protection of these systems, whereas security information and event management (SIEM) addresses those information management tasks which focus on the short term handling of events, as well as on the long term improvement of the entire information security architectures. This is carried out based on those data which can be logged and collected within the enterprise information security infrastructure. An especially interesting type of log data is data created by anti-malware software. This paper demonstrates in the context of a project case study that data mining (DM) is a well suited approach to detect hidden patterns in malware data and thus to support SIEM.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126361849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software systems in health care, such as disease and medical-record management, and financial applications, such as customer relationship and portfolio management, very often have a temporal nature. Information is specified in the form of rules as a preliminary step for monitoring the changes of interest in the application. Managing such applications requires replay support for the managed information at any specified review period. This paper presents a replay support method for information formalized as rules. The method is based on an XML-based model and a replay query language for the rule-based information. We introduce the model and the language using a clinical case study, and evaluate the storage efficiency of the model and the performance of the queries.
{"title":"Replay the Execution History of Rule-Based Information","authors":"Essam Mansour, Hagen Höpfner","doi":"10.1109/DBKDA.2009.17","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.17","url":null,"abstract":"Software systems in health care, such as disease and medical-record management, or financial applications, such as customer relationship and portfolio management, have very often a temporal nature. Information is specified in form of rules as a pre-step for monitoring the changes of interest in the application. Managing such applications requires to provide replay support for managed information at any specified review period. This paper presents a replay support method for information formalized as rules. This method is based on an XML-based model and a replay query language for the rule-based information. We introduce the model and the language using a clinical case study, and evaluates the storage efficiency of the model and the performance of the query.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130503389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we define a new notion of containment for two queries, called strong containment, which implies classical containment. A necessary and sufficient condition for the strong containment relation between two queries is given. The time complexity of deciding whether two queries are in the strong containment relation is a linear function of the number of containment mappings corresponding to the queries.
{"title":"A Strong Containment Problem for Queries in Conjunctive Form with Negation","authors":"V. Felea","doi":"10.1109/DBKDA.2009.23","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.23","url":null,"abstract":"In this paper we define a new notion of containment for two queries, called strong containment. The strong containment implies the classical containment. A necessary and sufficient condition for the strong containment relation between two queries is given. The time complexity of the decision problem for two queries to be in strong containment relation is a linear function of the containment mappings number corresponding to the queries","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131338773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the fast development of mobile and communication technology, the need for moving object databases increases. Moving object databases are a major ingredient of many location-sensitive applications; traffic control, fleet management, m-commerce, and E-911 are among these emerging location-based services (LBSs). Along with the vast increase in location information, and the need to efficiently manage and mine this information, spatio-temporal data warehouses come into the picture, requiring new algorithms, new measures, and efficient querying techniques to exploit the historical information obtained from moving objects. In this paper we present a spatio-temporal data warehouse (STDW) that enables querying location information. Experimental results on our query performance are presented.
{"title":"Querying Trajectory Data Warehouses","authors":"Hoda M. O. Mokhtar, Gihan Mahmoud","doi":"10.1109/DBKDA.2009.20","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.20","url":null,"abstract":"With the fast development in mobile and communication technology, the need for moving object databases increases. Moving object databases constitute a major ingredient in many applications that are location sensitive. Traffic control, fleet management, m-commerce, and E-911 are among those emerging location based services (LBSs). Along with the vast increase of location information, and the need to efficiently manage and mine those information, spatio-temporal data warehouses arise into the picture requiring new algorithms, new measures, and efficient querying techniques to efficiently utilize the historical information obtained from moving objects. In this paper we present a spatio-temporal data warehouse (STDW) that enables querying location information. Experimental results on our query performance are presented.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131219369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conceptual modeling for database design needs simple, easy-to-use techniques for gathering and presenting information. Furthermore, these approaches must serve as a means of communication between different kinds of stakeholders. In this paper a combination of such techniques is presented. It is proposed how a glossary-based representation, a graphical representation, and a verbalization of concepts can be used together for communication with the end user.
{"title":"Towards a Combination of Three Representation Techniques for Conceptual Data Modeling","authors":"C. Kop","doi":"10.1109/DBKDA.2009.12","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.12","url":null,"abstract":"Conceptual modeling for database design needs simple and easy to use techniques for gathering and presenting information. Furthermore these approaches must be a means for communication between different kinds of stakeholders. In this paper a combination of such techniques are presented. It will be proposed how a glossary based representation, a graphical representation as well as a verbalization of concepts can be used together for communication with the end user.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"64 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131219965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing, maintaining and accessing complex information repositories are some of the main concerns of today's information society. In huge distributed systems, guaranteeing good performance and data integrity is a real challenge when using standard relational databases. This paper presents a database concept and methodology that addresses these concerns in the form of an "RDBMS-independent" database designed in an object fashion, while allowing standard SQL access for those applications that require it. After presenting the general concept and model and showing that it can be used and implemented on any database system, we describe it in detail and focus on examples of current uses in both commercial and academic applications. The paper highlights the benefits of using this model in one particular, highly complex distributed database environment by describing its application to the central authoritative information repository of the EGEE and WLCG worldwide computing Grids.
{"title":"A Pseudo Object Database Model and Its Applications on a Highly Complex Distributed Architecture","authors":"P. Colclough, Gilles Mathieu","doi":"10.1109/DBKDA.2009.14","DOIUrl":"https://doi.org/10.1109/DBKDA.2009.14","url":null,"abstract":"Developing, maintaining and accessing complex information repositories are some of the main concerns of today’s information society. In huge distributed systems, guaranteeing good performances and data integrity is a real challenge when using standard relational databases. This paper presents a database concept methodology that addresses these concerns under the form of an “RDBMS independent” database designed in an object fashion, while allowing for standard SQL access for those applications that require it. After presenting the general concept and model and showing that it can be used and implemented on any database system, we will describe it in detail and focus on examples of current uses both for commercial and academic applications. The paper will highlight the benefits of using this model in one particular highly complex, distributed database environment in describing its application to the central authoritative information repository of the EGEE and WLCG worldwide computing Grids.","PeriodicalId":231150,"journal":{"name":"2009 First International Confernce on Advances in Databases, Knowledge, and Data Applications","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117303557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}