ACM Journal of Data and Information Quality: Latest Articles

Incentive Mechanism Design for Responsible Data Governance: A Large-scale Field Experiment
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-04-19 DOI: 10.1145/3592617
Christina Timko, Malte Niederstadt, Naman Goel, Boi Faltings
A crucial building block of responsible artificial intelligence is responsible data governance, including data collection. Its importance is also underlined in the latest EU regulations. The data should be of high quality, foremost correct and representative, and individuals providing the data should have autonomy over what data is collected. In this article, we consider the setting of collecting personally measured fitness data (physical activity measurements), in which some individuals may not have an incentive to measure and report accurate data. This can significantly degrade the quality of the collected data. On the other hand, high-quality collective data of this nature could be used for reliable scientific insights or to build trustworthy artificial intelligence applications. We conduct a framed field experiment (N = 691) to examine the effect of offering fixed and quality-dependent monetary incentives on the quality of the collected data. We use a peer-based incentive-compatible mechanism for the quality-dependent incentives without spot-checking or surveilling individuals. We find that the incentive-compatible mechanism can elicit good-quality data while providing a good user experience and compensating fairly, although, in the specific study context, the data quality does not necessarily differ under the two incentive schemes. We contribute new design insights from the experiment and discuss directions that future field experiments and applications on explainable and transparent data collection may focus on.
Pages: 1-18
Citations: 0
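To make the idea of a peer-based, quality-dependent payment in the abstract above more concrete, here is a minimal sketch of an output-agreement-style rule: a participant earns a bonus when their report is close to that of a randomly chosen peer. It is an illustration only, not the mechanism used in the article, and this toy rule is not claimed to be incentive compatible. The function name `peer_agreement_bonus`, the relative `tolerance`, and the `bonus` amount are all hypothetical.

```python
import random


def peer_agreement_bonus(reports, participant, tolerance=0.1, bonus=1.0, rng=None):
    """Return `bonus` if the participant's report is within a relative
    `tolerance` of a randomly chosen peer's report, else 0.0.

    `reports` maps a participant id to a numeric report (for example a
    daily step count). All names and defaults here are hypothetical.
    """
    rng = rng or random.Random()
    peers = [p for p in reports if p != participant]
    if not peers:
        return 0.0  # nobody to compare against, so no quality bonus
    peer = rng.choice(peers)
    own, other = reports[participant], reports[peer]
    scale = max(abs(own), abs(other), 1e-9)  # avoid division by zero
    return bonus if abs(own - other) / scale <= tolerance else 0.0


# With a single peer the outcome is deterministic.
print(peer_agreement_bonus({"a": 8000, "b": 8300}, "a"))  # prints 1.0
```

No spot-checking is involved: the score depends only on other participants' reports, which is the property the abstract highlights.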
Theory and Practice of Relational-to-RDF Temporal Data Exchange and Query Answering
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-04-08 DOI: 10.1145/3591359
J. Ao, Zehui Cheng, Rada Y. Chirkova, Phokion G. Kolaitis
We consider the problem of answering temporal queries on RDF stores, in presence of atemporal RDFS domain ontologies, of relational data sources that include temporal information, and of rules that map the domain information in the source schemas into the target ontology. Our proposed practice-oriented solution consists of two rule-based domain-independent algorithms. The first algorithm materializes target RDF data via a version of data exchange that enriches both the data and the ontology with temporal information from the relational sources. The second algorithm accepts as inputs temporal queries expressed in terms of the domain ontology using a lightweight temporal extension of SPARQL, and ensures successful evaluation of the queries on the materialized temporally-enriched RDF data. To study the quality of the information generated by the algorithms, we develop a general framework that formalizes the relational-to-RDF temporal data-exchange problem. The framework includes a chase formalism and a formal solution for the problem of answering temporal queries in the context of relational-to-RDF temporal data exchange. In this article, we present the algorithms and the formal framework that proves correctness of the information output by the algorithms, and also report on the algorithm implementation and experimental results for two application domains.
Pages: 1-27
Citations: 0
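The two steps described in the abstract above (materializing temporally enriched RDF data from relational sources, then answering temporal queries over it) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' algorithms: it uses a single hypothetical mapping rule, a made-up `Employment(emp, dept, start, end)` relation, made-up `ex:` identifiers, half-open year intervals, and a plain Python filter in place of the temporal SPARQL extension and the chase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalTriple:
    subject: str
    predicate: str
    obj: str
    start: int  # first year the triple holds (inclusive)
    end: int    # first year it no longer holds (exclusive)


def exchange(rows):
    """Apply one hypothetical mapping rule:
    Employment(emp, dept, start, end) -> (emp ex:worksIn dept) @ [start, end).
    """
    return [
        TemporalTriple(f"ex:emp/{emp}", "ex:worksIn", f"ex:dept/{dept}", start, end)
        for emp, dept, start, end in rows
    ]


def holds_at(triples, predicate, year):
    """Return sorted (subject, object) pairs whose interval contains `year`."""
    return sorted(
        (t.subject, t.obj)
        for t in triples
        if t.predicate == predicate and t.start <= year < t.end
    )


rows = [
    (1, "sales", 2015, 2019),
    (1, "research", 2019, 2023),
    (2, "sales", 2018, 2021),
]
triples = exchange(rows)
print(holds_at(triples, "ex:worksIn", 2019))
```

Attaching the interval to each triple is only one possible encoding; the article's framework enriches both the data and the ontology with temporal information.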
To Link or Synthesize? An Approach to Data Quality Comparison
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-02-21 DOI: 10.1145/3580487
Duncan Smith, M. Elliot, J. Sakshaug
Linking administrative data to produce more informative data for subsequent analysis has become an increasingly common practice. However, there might be concomitant risks of disclosing sensitive information about individuals. One practice that reduces these risks is data synthesis. In data synthesis the data are used to fit a model from which synthetic data are then generated. The synthetic data are then released to end users. There are some scenarios where an end user might have the option of using linked data or accepting synthesized data. However, linkage and synthesis are susceptible to errors that could limit their usefulness. Here, we investigate the problem of comparing the quality of linked data to synthesized data and demonstrate through simulations how the problem might be approached. These comparisons are important when considering how an end user can be supplied with the highest-quality data and in situations where one must consider risk/utility tradeoffs.
Pages: 1-20
Citations: 1
Introduction to the Special Issue on Truth and Trust Online
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-02-03 DOI: 10.1145/3578242
Dustin Wright, Paolo Papotti, Isabelle Augenstein
This editorial summarizes the content of the Special Issue on Truth and Trust Online of the Journal of Data and Information Quality. We thank the authors for their exceptional contributions to this special issue.
Pages: 1-3
Citations: 0
Experimental Evaluation of Covariates Effects on Periocular Biometrics: A Robust Security Assessment Framework
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-30 DOI: 10.1145/3579029
Gautam Kumar, Sambit Bakshi, A. K. Sangaiah, Pankaj Kumar Sa
The growing integration of technology into our lives has resulted in unprecedented amounts of data that are being exchanged among devices in an Internet of Things (IoT) environment. Authentication, identification, and device heterogeneities are major security and privacy concerns in IoT. One of the most effective solutions to avoid unauthorized access to sensitive information is biometrics. Deep learning-based biometric systems have been proven to outperform traditional image processing and machine learning techniques. However, the image quality covariates associated with blur, resolution, illumination, and noise predominantly affect recognition performance. Therefore, assessing the robustness of the developed solution is another important concern that still needs to be investigated. This article proposes a periocular region-based biometric system and explores the effect of image quality covariates (artifacts) on the performance of periocular recognition. To simulate the real-time scenarios and understand the consequences of blur, resolution, and bit-depth of images on the recognition accuracy of periocular biometrics, we modeled out-of-focus blur, camera shake blur, low-resolution, and low bit-depth image acquisition using Gaussian function, linear motion, interpolation, and bit-plane slicing, respectively. All the images of the UBIRIS.v1 database are degraded by varying strength of image quality covariates to obtain degraded versions of the database. Afterward, deep models are trained with each degraded version of the database. The performance of the model is evaluated by measuring statistical parameters calculated from a confusion matrix. Experimental results show that among all types of covariates, camera shake blur has less effect on the recognition performance, while out-of-focus blur significantly impacts it. Irrespective of image quality, the convolutional neural network produces excellent results, which proves the robustness of the developed model.
Pages: 1-25
Citations: 0
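The four degradations named in the abstract above (Gaussian blur, linear motion blur, resolution reduction, and bit-depth reduction) can each be sketched on a grayscale image stored as a list of rows of 8-bit integers. This is a simplified illustration rather than the authors' pipeline: it assumes nearest-neighbour sampling for the resolution step, blurs along rows only, and uses hypothetical function names and parameters throughout.

```python
import math


def keep_top_bit_planes(image, bits):
    """Zero all but the `bits` most significant bit planes (1 <= bits <= 8)."""
    mask = (0xFF << (8 - bits)) & 0xFF
    return [[pixel & mask for pixel in row] for row in image]


def downsample_nearest(image, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in image[::factor]]


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian weights, a stand-in for out-of-focus blur."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def motion_kernel(length):
    """Uniform 1-D kernel, a stand-in for horizontal camera-shake blur."""
    return [1.0 / length] * length


def blur_rows(image, kernel):
    """Convolve each row with an odd-length 1-D kernel, clamping at borders."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        n = len(row)
        blurred.append([
            sum(k * row[min(max(i + j - radius, 0), n - 1)]
                for j, k in enumerate(kernel))
            for i in range(n)
        ])
    return blurred


image = [[0, 64, 128, 255], [255, 128, 64, 0], [10, 20, 30, 40], [40, 30, 20, 10]]
low_depth = keep_top_bit_planes(image, 2)
low_res = downsample_nearest(image, 2)
defocused = blur_rows(image, gaussian_kernel(1, 1.0))
shaken = blur_rows(image, motion_kernel(3))
```

Varying `bits`, `factor`, `sigma`, and `length` gives the "varying strength" of each covariate; a full two-dimensional Gaussian blur would apply the same kernel along columns as well.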
A Survey on Soft Computing Techniques for Federated Learning- Applications, Challenges and Future Directions
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-30 DOI: 10.1145/3575810
Y. Supriya, T. Gadekallu
Federated Learning is a distributed, privacy-preserving machine learning model that is gaining more attention these days. Federated Learning has a vast number of applications in different fields. While being more popular, it also suffers some drawbacks like high communication costs, privacy concerns, and data management issues. In this survey, we define federated learning systems and analyse the system to ensure a smooth flow and to guide future research with the help of soft computing techniques. We undertake a complete review of aggregating federated learning systems with soft computing techniques. We also investigate the impacts of collaborating various nature-inspired techniques with federated learning to alleviate its flaws. Finally, this paper discusses the possible future developments of integrating federated learning and soft computing techniques.
Pages: 1-28
Citations: 6
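For readers unfamiliar with how a federated learning system combines client contributions without pooling raw data, the sketch below shows example-weighted parameter averaging in the style of federated averaging. The abstract above does not name a specific aggregation rule, so this choice is an assumption, and the function name `federated_average` and the flat-list weight format are hypothetical.

```python
def federated_average(client_updates):
    """Combine client model weights into one global weight vector.

    `client_updates` is a list of (weights, num_examples) pairs, where
    `weights` is a flat list of floats of equal length for every client.
    Each client's weights count in proportion to its number of examples.
    """
    total = sum(num_examples for _, num_examples in client_updates)
    if total == 0:
        raise ValueError("at least one client must contribute examples")
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * num_examples for weights, num_examples in client_updates) / total
        for i in range(dim)
    ]


# Two clients: the second holds three times as much data as the first.
print(federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))  # prints [2.5, 3.5]
```

Only the weight vectors and example counts leave each client, which is where the communication costs mentioned in the abstract arise.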
A Survey on Edge Intelligence and Lightweight Machine Learning Support for Future Applications and Services
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-25 DOI: 10.1145/3581759
Kyle Hoffpauir, Jacob Simmons, Nikolas Schmidt, Rachitha Pittala, Isaac Briggs, Shanmukha Makani, Y. Jararweh
As the number of devices connected to the Internet has grown larger, so too has the intensity of the tasks that these devices need to perform. Modern networks are more frequently working to perform computationally intensive tasks on low-power devices and low-end hardware. Current architectures and platforms tend towards centralized and resource-rich cloud computing approaches to address these deficits. However, edge computing presents a much more viable and flexible alternative. Edge computing refers to a distributed and decentralized network architecture in which demanding tasks such as image recognition, smart city services, and high-intensity data processing tasks can be distributed over a number of integrated network devices. In this article, we provide a comprehensive survey for emerging edge intelligence applications, lightweight machine learning algorithms, and their support for future applications and services. We start by analyzing the rise of cloud computing, discuss its weak points, and identify situations in which edge computing provides advantages over traditional cloud computing architectures. We then divulge details of the survey: the first section identifies opportunities and domains for edge computing growth, the second identifies algorithms and approaches that can be used to enhance edge intelligence implementations, and the third specifically analyzes situations in which edge intelligence can be enhanced using any of the aforementioned algorithms or approaches. In this third section, lightweight machine learning approaches are detailed. A more in-depth analysis and discussion of future developments follows. The primary discourse of this article is in service of an effort to ensure that appropriate approaches are applied adequately to artificial intelligence implementations in edge systems, mainly, the lightweight machine learning approaches.
Pages: 1-30
Citations: 0
The Choice of Textual Knowledge Base in Automated Claim Checking
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-25 DOI: 10.1145/3561389
Dominik Stammbach, Boya Zhang, Elliott Ash
Automated claim checking is the task of determining the veracity of a claim given evidence retrieved from a textual knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach—taking the pipeline as given, we explore the choice of the knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a “universally best” knowledge base—higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline’s confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.
Pages: 1-22
Citations: 1
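The last finding in the abstract above, that evidence-selection confidence can indicate how well a knowledge base suits a new set of claims without ground-truth labels, suggests a simple selection procedure, sketched here. Averaging the per-claim confidence is an assumption made for illustration and may differ from how the article aggregates scores; the function name and the example knowledge-base names and values are hypothetical.

```python
from statistics import mean


def rank_knowledge_bases(confidences_by_kb):
    """Rank knowledge bases by mean evidence-selection confidence.

    `confidences_by_kb` maps a knowledge-base name to the confidence
    scores a fixed claim-checking pipeline produced, one per claim,
    when retrieving evidence from that knowledge base.
    """
    average = {kb: mean(values) for kb, values in confidences_by_kb.items()}
    return sorted(average, key=average.get, reverse=True)


# Made-up scores for three hypothetical knowledge bases.
scores = {
    "encyclopedia": [0.9, 0.8],
    "news": [0.4, 0.5],
    "science": [0.7, 0.6],
}
print(rank_knowledge_bases(scores))  # prints ['encyclopedia', 'science', 'news']
```

Because no labels are needed, a ranking like this could be computed before any annotation effort is spent on the new claim set.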
A Multifactor Ring Signature based Authentication Scheme for Quality Assessment of IoMT Environment in COVID-19 Scenario
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-20 DOI: 10.1145/3575811
Kakali Chatterjee, Ashutosh Kumar Singh, Neha, K. Yu
The quality of the healthcare environment has become an essential factor for healthcare users to access quality services. Smart healthcare systems use the Internet of Medical Things (IoMT) devices to capture patients’ health data for treatment or diagnostic purposes. This sensitive collected patient data is shared between the different stakeholders across the network to provide quality services. Due to this, healthcare systems are vulnerable to confidentiality, integrity and privacy threats. In the COVID-19 scenario, when collaborative medical consultation is required, the quality assessment of the framework is essential to protect the privacy of doctors and patients. In this paper, a ring signature-based anonymous authentication and quality assessment scheme is designed for collaborative medical consultation environments for quality assessment and protection of the privacy of doctors and patients. This scheme also uses a new KMOV Cryptosystem to ensure the quality of the network and protect the system from different attacks that hamper data confidentiality.
Citations: 1
Uniqueness Constraints for Object Stores
IF 2.1 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-01-19 DOI: 10.1145/3581758
Philipp Skavantzos, Uwe Leck, Kaiqi Zhao, S. Link
Object stores offer an increasingly popular choice for data management and analytics. As with every data model, managing the integrity of objects is fundamental for data quality but also important for the efficiency of update and query operations. In response to shortcomings of unique and existence constraints in object stores, we propose a new principled class of constraints that separates uniqueness from existence dimensions of data quality, and fully supports multiple labels and composite properties. We illustrate benefits of the constraints on real-world examples of property graphs where node integrity is enforced for better update and query performance. The benefits are quantified experimentally in terms of perfectly scaling the access to data through indices that result from the constraints. We establish axiomatic and algorithmic characterizations for the underlying implication problem. In addition, we fully characterize which non-redundant families of constraints attain maximum cardinality for any given finite sets of labels and properties. We exemplify further use cases of the constraints: elicitation of business rules, identification of data quality problems, and design for data quality. Finally, we propose extensions to managing the integrity of objects in object stores such as graph databases.
Citations: 1