Pub Date: 2022-05-01, DOI: 10.1007/s10619-023-07428-y
M. Jibril, Alexander Baumstark, K. Sattler
Title: Adaptive update handling for graph HTAP. Journal: Distributed and Parallel Databases, volume/issue 1 1, pages 1-27. Impact factor: 1.2.
The sensor-based recognition of Activities of Daily Living (ADLs) in smart-home environments enables several important applications, including the continuous monitoring of fragile subjects in their homes for healthcare systems. Most approaches in the literature assume that only one resident lives in the home. Multi-inhabitant ADL recognition is significantly more challenging, and the research community has devoted only limited effort to this setting. One of the major open problems is data association: correctly associating each environmental sensor event (e.g., the opening of a fridge door) with the inhabitant who actually triggered it. Moreover, existing multi-inhabitant approaches rely on supervised learning and therefore assume that labeled data is widely available. However, collecting a comprehensive training set of ADLs (especially in multi-resident settings) is prohibitive. In this work, we propose MICAR: a novel multi-inhabitant ADL recognition approach that combines semi-supervised learning and knowledge-based reasoning. Data association is performed by semantic reasoning, combining high-level context information (e.g., residents' postures and semantic locations) with triggered sensor events. The personalized stream of sensor events is processed by an incremental classifier that is initialized with a limited amount of labeled ADLs. A novel cache-based active learning strategy is adopted to continuously improve the classifier. Our results on a dataset where up to 4 subjects perform ADLs at the same time show that MICAR reliably recognizes individual and joint activities while triggering only a small number of active learning queries.
Title: MICAR: multi-inhabitant context-aware activity recognition in home environments. Authors: Luca Arrotta, Claudio Bettini, Gabriele Civitarese. Pub Date: 2022-04-05, DOI: 10.1007/s10619-022-07403-z. Journal: Distributed and Parallel Databases, volume/issue 1 1, pages 1-32. Impact factor: 1.5. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8980210/pdf/
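The data-association step described in the MICAR abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `ResidentContext` type, the `SENSOR_KNOWLEDGE` table, and the `associate` function are hypothetical and are not taken from the paper, whose semantic reasoning is likely richer than a table lookup. The sketch only shows the general idea of matching a sensor event against each resident's semantic location and posture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidentContext:
    """High-level context for one resident (hypothetical structure)."""
    name: str
    location: str  # semantic location, e.g. "kitchen"
    posture: str   # e.g. "standing", "sitting"


# Hypothetical background knowledge: for each sensor, the semantic location
# where it is installed and the postures compatible with triggering it.
SENSOR_KNOWLEDGE = {
    "fridge_door": ("kitchen", {"standing"}),
    "sofa_pressure": ("living_room", {"sitting", "lying"}),
}


def associate(sensor: str, residents: list[ResidentContext]) -> list[str]:
    """Return the names of residents whose context is consistent with the event.

    An empty list means no resident matches; more than one name means the
    event stays ambiguous under this simple rule.
    """
    knowledge = SENSOR_KNOWLEDGE.get(sensor)
    if knowledge is None:
        return []
    location, postures = knowledge
    return [
        r.name
        for r in residents
        if r.location == location and r.posture in postures
    ]
```

With one resident standing in the kitchen and another sitting in the living room, a `fridge_door` event would be attributed to the first resident only.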
Pub Date: 2022-03-31, DOI: 10.1007/s10619-022-07407-9
Doaa Abdelfattah, Hesham A. Hassan, Fatma A. Omara
Collaboration among different organizations is considered one of the main benefits of moving applications and services to a cloud computing environment. Unfortunately, this collaboration raises many challenges, such as access to sensitive resources by unauthorized people. Usually, the Role-Based Access-Control (RBAC) model is deployed in large organizations. This paper addresses the scalability problem of the online stored rules. This problem degrades the performance of the access control system as the number of shared resources and/or the number of collaborating organizations in the same cloud environment increases. Therefore, this paper proposes replacing the cross-domain RBAC rules with Role-To-Role (RTR) mapping rules among all organizations. The RTR mapping rules are generated using a newly proposed Role-Mapping algorithm. A comparative study is performed to evaluate the proposed algorithm’s performance with respect to the Rule-Store size and the authorization response time. The results show that the proposed algorithm reduces the number of stored rules, which shrinks the Rule-Store and reduces the authorization response time. Additionally, this paper proposes applying a concurrent approach to the RTR mapping model using the proposed Role-Mapping algorithm to achieve further savings in the authorization response time, making it suitable for highly collaborative cloud environments.
Title: A novel role-mapping algorithm for enhancing highly collaborative access control system. Journal: Distributed and Parallel Databases, volume/issue 71 6. Impact factor: 1.2.
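To make the rule-count argument in the abstract above concrete, here is a minimal sketch of why role-to-role mapping can need fewer stored rules than explicit cross-domain rules. It is not the paper's Role-Mapping algorithm: the role names, `LOCAL_PERMISSIONS`, `ROLE_MAP`, and both functions are hypothetical, and the mapping is written by hand rather than generated.

```python
# Local RBAC rules of the resource-owning organization (hypothetical data):
# local role -> set of permitted (resource, action) pairs.
LOCAL_PERMISSIONS = {
    "orgB:auditor": {
        ("ledger", "read"),
        ("invoices", "read"),
        ("reports", "read"),
    },
}

# Role-to-role (RTR) mapping rules (hypothetical data):
# role in the requesting organization -> equivalent local role.
ROLE_MAP = {"orgA:accountant": "orgB:auditor"}


def is_authorized(foreign_role: str, resource: str, action: str) -> bool:
    """Authorize a cross-domain request by mapping the role, then applying local RBAC."""
    local_role = ROLE_MAP.get(foreign_role)
    if local_role is None:
        return False
    return (resource, action) in LOCAL_PERMISSIONS.get(local_role, set())


def expand_to_cross_domain_rules(
    role_map: dict[str, str],
    local_permissions: dict[str, set[tuple[str, str]]],
) -> set[tuple[str, str, str]]:
    """List the explicit (foreign_role, resource, action) rules that the
    mapping stands in for, to compare the number of stored rules."""
    return {
        (foreign_role, resource, action)
        for foreign_role, local_role in role_map.items()
        for resource, action in local_permissions.get(local_role, set())
    }
```

In this toy setup a single mapping rule replaces three explicit cross-domain rules, and the gap widens as the mapped local role gains permissions.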
Pub Date: 2022-01-01, Epub Date: 2022-07-16, DOI: 10.1007/s10619-022-07414-w
Ana Claudia Sima, Tarcisio Mendes de Farias, Maria Anisimova, Christophe Dessimoz, Marc Robinson-Rechavi, Erich Zbinden, Kurt Stockinger
The problem of natural language processing over structured data has become a growing research field, within both the relational database and the Semantic Web communities, with significant efforts involved in question answering over knowledge graphs (KGQA). However, many of these approaches are either specifically targeted at open-domain question answering using DBpedia, or require large training datasets to translate a natural language question to SPARQL in order to query the knowledge graph. Hence, these approaches often cannot be applied directly to complex scientific datasets where no prior training data is available. In this paper, we focus on the challenges of natural language processing over knowledge graphs of scientific datasets. In particular, we introduce Bio-SODA, a natural language processing engine that does not require training data in the form of question-answer pairs for generating SPARQL queries. Bio-SODA uses a generic graph-based approach for translating user questions to a ranked list of SPARQL candidate queries. Furthermore, Bio-SODA uses a novel ranking algorithm that includes node centrality as a measure of relevance for selecting the best SPARQL candidate query. Our experiments with real-world datasets across several scientific domains, including the official bioinformatics Question Answering over Linked Data (QALD) challenge, as well as the CORDIS dataset of European projects, show that Bio-SODA outperforms publicly available KGQA systems by at least 20% in F1-score and by an even higher factor on more complex bioinformatics datasets. Finally, we introduce Bio-SODA UX, a graphical user interface designed to assist users in the exploration of large knowledge graphs and in dynamically disambiguating natural language questions that target the data available in these graphs.
Title: Bio-SODA UX: enabling natural language question answering over knowledge graphs with user disambiguation. Journal: Distributed and Parallel Databases, volume/issue 40 2-3, pages 409-440. Impact factor: 1.2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9458692/pdf/
Pub Date: 2022-01-01, Epub Date: 2021-08-07, DOI: 10.1007/s10619-021-07358-7
R Sitharthan, M Rajesh
Title: RETRACTED ARTICLE: Application of machine learning (ML) and internet of things (IoT) in healthcare to predict and tackle pandemic situation. Journal: Distributed and Parallel Databases, volume/issue 40 4, page 887. Impact factor: 1.2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8349240/pdf/
Pub Date: 2021-10-28, DOI: 10.1007/s10619-021-07375-6
Xite Wang, Chaojin Wang, Mei Bai, Qian Ma, Guanyu Li
As one of the most popular parallel data processing models, the MapReduce data analysis system has been widely used in many fields. Task scheduling is the core module of a MapReduce system, and the quality of the scheduling algorithm directly affects the processing capacity of the system. Because new nodes are continuously added to a cluster to increase its processing capacity, the cluster inevitably becomes heterogeneous. Heterogeneous environments are common in practical application scenarios, but there has been little research on task scheduling in such environments. For this reason, this paper presents an in-depth study of task scheduling in heterogeneous environments and proposes a new task scheduling algorithm, HTD. First, we give a formal definition of the throughput-driven task scheduling problem in a heterogeneous environment. Second, we design the scheduling algorithm HTD, which quickly obtains the completion sequence of a job set and optimizes the task scheduling details in a heterogeneous environment. Finally, a series of experiments show the efficiency and effectiveness of the algorithm.
Title: HTD: heterogeneous throughput-driven task scheduling algorithm in MapReduce. Journal: Distributed and Parallel Databases, volume/issue 71 S102. Impact factor: 1.2.
Pub Date: 2021-09-20, DOI: 10.1007/s10619-021-07366-7
P. Revanth Rathan, P. Krishna Reddy, Anirban Mondal
While the problems of finding the shortest path and k-shortest paths have been extensively researched, the research community has been shifting its focus towards discovering and identifying paths based on user preferences. Since users naturally follow some of the paths more than other paths, the popularity of a given path often reflects such user preferences. Given a set of user traversals in a road network and a set of paths between a given source and destination pair, we address the problem of performing top-k ranking of the paths in that set based on path popularity. In this paper, we introduce a new model for computing the popularity scores of paths. Our main contributions are threefold. First, we propose a framework for modeling user traversals in a road network as transactions. Second, we present an approach for efficiently computing the popularity score of any path based on the itemsets extracted from the transactions using pattern mining techniques. Third, we conducted an extensive performance evaluation with two real datasets to demonstrate the effectiveness of the proposed scheme.
Title: A framework for discovering popular paths using transactional modeling and pattern mining. Journal: Distributed and Parallel Databases, volume/issue 71 11. Impact factor: 1.2.
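The transactional modeling idea in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's model: each traversal becomes a transaction of edge items, and the popularity score used here (the mean support of a path's edges, i.e. of 1-itemsets only) is a stand-in chosen for brevity. The function names and the scoring rule are hypothetical.

```python
from collections import Counter

Edge = tuple[str, str]


def to_transaction(traversal: list[str]) -> frozenset[Edge]:
    """Model one user traversal (a node sequence) as a set of edge items."""
    return frozenset(zip(traversal, traversal[1:]))


def edge_support(transactions: list[frozenset[Edge]]) -> dict[Edge, float]:
    """Fraction of transactions that contain each edge (1-itemset support)."""
    if not transactions:
        return {}
    counts = Counter(edge for t in transactions for edge in t)
    total = len(transactions)
    return {edge: count / total for edge, count in counts.items()}


def popularity(path: list[str], support: dict[Edge, float]) -> float:
    """Assumed score: mean support of the path's edges (0.0 for unseen edges)."""
    edges = list(zip(path, path[1:]))
    if not edges:
        return 0.0
    return sum(support.get(edge, 0.0) for edge in edges) / len(edges)


def top_k(
    paths: list[list[str]], support: dict[Edge, float], k: int
) -> list[list[str]]:
    """Rank candidate paths between one source and destination by popularity."""
    return sorted(paths, key=lambda p: popularity(p, support), reverse=True)[:k]
```

A fuller treatment would mine larger frequent itemsets (sets of edges that co-occur in traversals) and score a path from those, which is closer to what the abstract describes; the single-edge support here keeps the example short.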
Pub Date: 2021-09-03, DOI: 10.1007/s10619-021-07359-6
N. Panneerselvam, S. Krithiga
Title: Mutual-contained access delegation scheme for the Internet of Things user services. Journal: Distributed and Parallel Databases, volume/issue 40 1, pages 835-860. Impact factor: 1.2.