Pub Date: 2023-09-20 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1210559
Antonino Ferraro, Antonio Galli, Valerio La Gatta, Marco Postiglione
Introduction: Speech to text (STT) technology has seen increased usage in recent years for automating transcription of spoken language. To choose the most suitable tool for a given task, it is essential to evaluate the performance and quality of both open source and paid STT services.
Methods: In this paper, we conduct a benchmarking study of open source and paid STT services, with a specific focus on assessing their performance across varied input. We utilize six datasets obtained from diverse sources, including interviews, lectures, and speeches, as input for the STT tools. The tools are evaluated using the Word Error Rate (WER), a standard metric for STT evaluation.
Results: Our analysis of the results demonstrates significant variations in the performance of the STT tools based on the input text. Certain tools exhibit superior performance on specific types of audio samples compared to others. Our study provides insights into STT tool performance when handling substantial data volumes, as well as the challenges and opportunities posed by the multimedia nature of the data.
Discussion: Although paid services generally demonstrate better accuracy and speed compared to open source alternatives, their performance remains dependent on the input text. The study highlights the need for considering specific requirements and characteristics of the audio samples when selecting an appropriate STT tool.
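As a concrete illustration of the Word Error Rate used in this benchmark, here is a minimal, self-contained sketch of the standard metric (an illustration only, not the authors' evaluation code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, a six-word reference transcribed with two words dropped yields a WER of 2/6.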
Title: Benchmarking open source and paid services for speech to text: an analysis of quality and input variety. (Frontiers in Big Data, vol. 6, article 1210559; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10548127/pdf/)
Pub Date: 2023-09-18 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1205766
Daniela D'Auria, Raffaele Russo, Alfonso Fedele, Federica Addabbo, Diego Calvanese
The COVID-19 emergency underscored the importance of resolving crucial issues in territorial health monitoring, such as overloaded phone lines, doctors exposed to infection, and chronically ill patients unable to access hospitals. People would often call doctors or hospitals out of sheer anxiety, not realizing that they were clogging up communications and causing problems for those who needed them most; such callers, often elderly, felt lonely and abandoned by the health care system because of poor telemedicine. In addition, doctors were unable to follow up on the most serious cases or ensure that others did not worsen. Thus, during the first pandemic wave, we had the idea to design a system that could help people alleviate their fears while being constantly monitored by doctors both in hospitals and at home; consequently, we developed reCOVeryaID, a telemonitoring application for coronavirus patients. It is an autonomous application supported by a knowledge base that can react promptly and inform medical doctors if dangerous trends in the patient's short- and long-term vital signs are detected. In this paper, we also validate the knowledge-base rules in real-world settings by testing them on data from real patients infected with COVID-19.
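The abstract does not publish the reCOVeryaID rule set; as a purely hypothetical sketch of how a knowledge-base rule over short-term vital-sign trends might look (the thresholds, function name, and alert strings are invented for illustration):

```python
def check_vitals(spo2_readings, hr_readings):
    """Return alerts for dangerous vital-sign values or trends.

    Illustrative thresholds only -- NOT the reCOVeryaID knowledge base.
    """
    alerts = []
    # Rule 1: any oxygen saturation below 92% warrants attention.
    if spo2_readings and min(spo2_readings) < 92:
        alerts.append("low SpO2: notify physician")
    # Rule 2: a short-term trend of strictly falling SpO2 across 3+ readings.
    falling = all(b < a for a, b in zip(spo2_readings, spo2_readings[1:]))
    if len(spo2_readings) >= 3 and falling:
        alerts.append("falling SpO2 trend")
    # Rule 3: sustained heart rate above 120 bpm.
    if hr_readings and max(hr_readings) > 120:
        alerts.append("tachycardia")
    return alerts
```

A rule engine like this can run autonomously on each new reading and only escalate to a doctor when a rule fires.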
Title: An intelligent telemonitoring application for coronavirus patients: reCOVeryaID. (Frontiers in Big Data, vol. 6, article 1205766; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10543687/pdf/)
Pub Date: 2023-08-31 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1200390
William Villegas-Ch, Joselin García-Ortiz
Perimeter security in data centers helps protect systems and the data they store by preventing unauthorized access and shielding critical resources from potential threats. According to a report by the information security company SonicWall, ransomware attacks increased by 66% in 2021, and the total number of cyber threats detected in 2021 rose by 24% compared to 2019. Among these attacks, data center infrastructure was compromised; for this reason, organizations add physical elements such as security cameras, motion detection systems, and authentication systems as additional measures that contribute to perimeter security. This work proposes using artificial intelligence in the perimeter security of data centers. It allows the automation and optimization of security processes, which translates into greater efficiency and reliability in operations that prevent intrusions through authentication, permit verification, and monitoring of critical areas. It is crucial to ensure that AI-based perimeter security systems are designed to protect and respect user privacy. In addition, it is essential to regularly monitor the effectiveness and integrity of these systems to ensure that they function correctly and meet security standards.
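The authenticate-then-verify-permission pattern described here can be made concrete with a toy sketch (the HMAC badge scheme, key, and area names are invented for the example, not taken from the paper):

```python
import hashlib
import hmac

def verify_badge(badge_id, presented_code, secret_key, permissions, area):
    """Two-step access check: authenticate the badge, then verify area permission.

    Hypothetical scheme for illustration: each badge carries an HMAC of its ID
    under a site secret; `permissions` maps badge IDs to allowed critical areas.
    """
    expected = hmac.new(secret_key, badge_id.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking information via timing.
    if not hmac.compare_digest(expected, presented_code):
        return "denied: authentication failed"
    if area not in permissions.get(badge_id, set()):
        return "denied: no permission for area"
    return "granted"
```

An AI layer would sit on top of checks like this, e.g. flagging anomalous sequences of denied attempts for monitoring.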
Title: Authentication, access, and monitoring system for critical areas with the use of artificial intelligence integrated into perimeter security in a data center. (Frontiers in Big Data, vol. 6, article 1200390; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10500307/pdf/)
Pub Date: 2023-08-24 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1197471
Sudhir K Benara, Saurabh Sharma, Atul Juneja, Saritha Nair, B K Gulati, Kh Jitenkumar Singh, Lucky Singh, Ved Prakash Yadav, Chalapati Rao, M Vishnu Vardhana Rao
Background: Physician-coded verbal autopsy (PCVA) is the most widely used method to determine causes of death (COD) in countries where medical certification of death is low. Computer-coded verbal autopsy (CCVA), an alternative to PCVA for assigning the COD, is considered efficient and cost-effective. However, the performance of CCVA relative to PCVA is yet to be established in the Indian context.
Methods: We evaluated the performance of PCVA and three CCVA methods (InterVA 5, InSilicoVA, and Tariff 2.0) on verbal autopsies conducted using the WHO 2016 VA tool on 2,120 reference standard cases developed from five tertiary care hospitals in Delhi. The PCVA methodology involved dual independent review with adjudication where required. Performance metrics were Cause Specific Mortality Fraction (CSMF), sensitivity, positive predictive value (PPV), CSMF Accuracy, and the Kappa statistic.
Results: In terms of overall performance of the COD assignment methods, PCVA achieved the highest CSMF Accuracy score of 0.79, followed by 0.67 for Tariff 2.0, 0.66 for InterVA 5, and 0.62 for InSilicoVA. PCVA also achieved the highest agreement (57%) and Kappa score (0.54), and showed the highest sensitivity for 15 out of 20 causes of death.
Conclusion: Our study found that the PCVA method had the best performance out of all the four COD assignment methods that were tested in our study sample. In order to improve the performance of CCVA methods, multicentric studies with larger sample sizes need to be conducted using the WHO VA tool.
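CSMF Accuracy, the headline metric above, compares predicted and true cause-specific mortality fractions: one minus the total absolute CSMF error, normalized by its maximum possible value. A minimal sketch of the standard formula (an illustration, not the study's analysis code):

```python
def csmf_accuracy(true_csmf: dict, pred_csmf: dict) -> float:
    """CSMF Accuracy = 1 - sum|pred - true| / (2 * (1 - min(true))).

    `true_csmf` and `pred_csmf` map cause names to mortality fractions
    (each should sum to 1). A score of 1.0 means a perfect match.
    """
    causes = set(true_csmf) | set(pred_csmf)
    total_abs_err = sum(abs(true_csmf.get(c, 0.0) - pred_csmf.get(c, 0.0))
                        for c in causes)
    return 1.0 - total_abs_err / (2.0 * (1.0 - min(true_csmf.values())))
```

The denominator rescales the error so that the worst possible prediction scores 0 regardless of how many causes there are.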
Title: Evaluation of methods for assigning causes of death from verbal autopsies in India. (Frontiers in Big Data, vol. 6, article 1197471; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483407/pdf/)
Pub Date: 2023-08-24 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1221744
Lynnette Hui Xian Ng, Kathleen M Carley
Introduction: France has seen two key protests within the term of President Emmanuel Macron: one in 2020 against Islamophobia, and another in 2023 against the pension reform. During these protests, there is much chatter on online social media platforms like Twitter.
Methods: In this study, we aim to analyze the differences between the online chatter of the two years through a network-centric view, and in particular the synchrony of users. We begin by identifying groups of accounts that work together through two methods: temporal synchronicity and narrative similarity. We also apply a bot detection algorithm to identify bots within these networks and analyze the extent of inorganic synchronization within the discourse of these events.
Results: Overall, our findings suggest that user synchrony on Twitter was much higher in 2020 than in 2023, and that there was more bot activity in 2020 than in 2023.
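Temporal synchronicity can be made concrete with a toy sketch: treat each user's posting times as a set of discrete time bins and flag pairs of users whose active bins overlap strongly. The bin size, Jaccard measure, and threshold below are assumptions for illustration, not the authors' method:

```python
from itertools import combinations

def time_bins(timestamps, bin_size=300):
    """Map posting timestamps (seconds) to coarse 5-minute activity bins."""
    return {int(t // bin_size) for t in timestamps}

def synchrony(user_posts, threshold=0.5, bin_size=300):
    """Return pairs of users whose activity bins overlap above a Jaccard threshold.

    `user_posts` maps a user ID to a list of posting timestamps in seconds.
    """
    bins = {u: time_bins(ts, bin_size) for u, ts in user_posts.items()}
    pairs = []
    for u, v in combinations(sorted(user_posts), 2):
        inter = bins[u] & bins[v]
        union = bins[u] | bins[v]
        if union and len(inter) / len(union) >= threshold:
            pairs.append((u, v))
    return pairs
```

Accounts that repeatedly land in the same pairs across many windows are candidates for coordinated (possibly inorganic) behavior.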
Title: Do you hear the people sing? Comparison of synchronized URL and narrative themes in 2020 and 2023 French protests. (Frontiers in Big Data, vol. 6, article 1221744; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483998/pdf/)
Pub Date: 2023-08-23 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1224976
Casey Watters, Michal K Lemanski
ChatGPT, a new language model developed by OpenAI, has garnered significant attention in various fields since its release. This literature review provides an overview of early ChatGPT literature across multiple disciplines, exploring its applications, limitations, and ethical considerations. The review encompasses Scopus-indexed publications from November 2022 to April 2023 and includes 156 articles related to ChatGPT. The findings reveal a predominance of negative sentiment across disciplines, though subject-specific attitudes must be considered. The review highlights the implications of ChatGPT in many fields including healthcare, raising concerns about employment opportunities and ethical considerations. While ChatGPT holds promise for improved communication, further research is needed to address its capabilities and limitations. This literature review provides insights into early research on ChatGPT, informing future investigations and practical applications of chatbot technology, as well as development and usage of generative AI.
Title: Universal skepticism of ChatGPT: a review of early literature on chat generative pre-trained transformer. (Frontiers in Big Data, vol. 6, article 1224976; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10482048/pdf/)
Pub Date: 2023-06-14 | DOI: 10.3389/fdata.2023.1081639
Suraj Singh Nagvanshi, Inderjeet Kaur, Charu Agarwal, Ashish Sharma
The Coronavirus (COVID-19) outbreak swept the world, infected millions of people, and caused many deaths. Multiple COVID-19 variants have been discovered since the initial case in December 2019, indicating that COVID-19 is highly mutable; the "XE" variant, identified in January 2022, is the most recent of these. It is vital to detect the virus transmission rate and forecast instances of infection in order to be prepared for all scenarios, ready healthcare services, and avoid deaths. Time-series forecasting helps predict future infected cases and determine the virus transmission rate so that timely decisions can be made. This paper presents a forecasting model for nonstationary time series. The model comprises an optimized EigenValue Decomposition of Hankel Matrix (EVDHM) and an optimized AutoRegressive Integrated Moving Average (ARIMA) model. The Phillips-Perron Test (PPT) has been used to determine whether a time series is nonstationary. A time series is decomposed into components using EVDHM, each component is forecasted using ARIMA, and the final forecast is formed by combining the predicted values of each component. A Genetic Algorithm (GA) has been used to discover the best ARIMA parameters, selecting those that yield the lowest Akaike Information Criterion (AIC). Another genetic algorithm optimizes the EVDHM decomposition, ensuring minimal nonstationarity and maximal utilization of eigenvalues for each decomposed component.
Title: Nonstationary time series forecasting using optimized-EVDHM-ARIMA for COVID-19. (Frontiers in Big Data, vol. 6, article 1081639; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10303915/pdf/)
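Two building blocks named in the abstract above can be sketched compactly: the Hankel (trajectory) matrix on whose eigenstructure EVDHM operates, and the AIC that serves as the genetic algorithm's fitness function. This is a simplified illustration under those definitions, not the authors' implementation:

```python
def hankel_matrix(series, window):
    """Trajectory (Hankel) matrix: row i is the window of `series` starting at i.

    Anti-diagonals are constant; EVDHM decomposes a series by analyzing the
    eigenvalues of a matrix built this way.
    """
    return [series[i:i + window] for i in range(len(series) - window + 1)]

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better.

    A GA searching over ARIMA orders (p, d, q) can use this as its fitness,
    penalizing extra parameters k against goodness of fit.
    """
    return 2 * n_params - 2 * log_likelihood
```

In the paper's pipeline, each candidate (p, d, q) would be fitted to a decomposed component and scored by `aic`; the GA keeps the lowest-scoring candidates.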
Pub Date: 2023-05-25 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1124526
Massimiliano Luca, Gian Maria Campedelli, Simone Centellegher, Michele Tizzoni, Bruno Lepri
Urban agglomerations are constantly and rapidly evolving ecosystems, with globalization and increasing urbanization posing new challenges to sustainable urban development, well summarized in the United Nations' Sustainable Development Goals (SDGs). The digital age, through modern alternative data sources, provides new tools to tackle these challenges at spatio-temporal scales that were previously unavailable with census statistics. In this review, we present how new digital data sources are employed to provide data-driven insights to study and track (i) urban crime and public safety; (ii) socioeconomic inequalities and segregation; and (iii) public health, with a particular focus on the city scale.
Title: Crime, inequality and public health: a survey of emerging trends in urban data science. (Frontiers in Big Data, vol. 6, article 1124526; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10248183/pdf/)
Pub Date: 2023-05-05 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1151893
Bing He, Linhui Xie, Pradeep Varathan, Kwangsik Nho, Shannon L Risacher, Andrew J Saykin, Jingwen Yan
Introduction: Brain imaging genetics aims to explore the genetic architecture underlying brain structure and functions. Recent studies showed that the incorporation of prior knowledge, such as subject diagnosis information and brain regional correlation, can help identify significantly stronger imaging genetic associations. However, sometimes such information may be incomplete or even unavailable.
Methods: In this study, we explore a new data-driven prior that captures subject-level similarity by fusing multi-modal similarity networks. It was incorporated into the sparse canonical correlation analysis (SCCA) model, which aims to identify a small set of brain imaging and genetic markers that explain the similarity matrix supported by both modalities. The model was applied separately to amyloid and tau imaging data from the ADNI cohort.
Results: The similarity matrix fused across imaging and genetic data was found to improve association performance as well as or better than diagnosis information, and would therefore be a potential substitute prior when diagnosis information is not available (e.g., studies focused on healthy controls).
Discussion: Our results confirmed the value of all types of prior knowledge in improving association identification. In addition, the fused network representing the subject relationship supported by multi-modal data showed consistently the best or equally best performance compared to the diagnosis network and the co-expression network.
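As a deliberately simplified stand-in for the fusion step: real similarity network fusion iteratively diffuses information across the networks, but its essence can be illustrated by combining two subject-by-subject similarity matrices into one consensus matrix (here reduced to an element-wise average; this is not the paper's method):

```python
def fuse(sim_a, sim_b):
    """Combine two n-by-n subject similarity matrices (e.g., one per modality)
    into a single consensus matrix by element-wise averaging.

    A minimal illustration: entries near 1 mean both modalities agree the
    subjects are similar; disagreement is averaged out.
    """
    n = len(sim_a)
    return [[(sim_a[i][j] + sim_b[i][j]) / 2.0 for j in range(n)]
            for i in range(n)]
```

The fused matrix can then serve as the prior constraining which subject pairs the SCCA model should treat as similar.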
{"title":"Fused multi-modal similarity network as prior in guiding brain imaging genetic association.","authors":"Bing He, Linhui Xie, Pradeep Varathan, Kwangsik Nho, Shannon L Risacher, Andrew J Saykin, Jingwen Yan","doi":"10.3389/fdata.2023.1151893","journal":"Frontiers in Big Data","volume":"6","pages":"1151893","publicationDate":"2023-05-05","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10196480/pdf/"}
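The record above names the sparse canonical correlation analysis (SCCA) model, which seeks a small set of imaging and genetic markers whose projections are maximally correlated. As a rough illustrative sketch only (this is a generic penalized-matrix-decomposition-style SCCA with alternating soft-thresholding, not the authors' implementation, and the fused-similarity prior is omitted; the function and parameter names `scca`, `lam_u`, `lam_v` are made up for this example):

```python
import numpy as np

def soft_threshold(a, lam):
    """Elementwise soft-thresholding: shrinks entries toward zero to induce sparsity."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def scca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Sparse CCA sketch: alternate sparse power iterations on the
    cross-covariance matrix C = X'Y, returning unit-norm sparse
    weight vectors u (for X's features) and v (for Y's features)."""
    C = X.T @ Y                                   # cross-covariance between the two modalities
    v = np.ones(Y.shape[1]) / np.sqrt(Y.shape[1])  # uniform starting direction
    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u = soft_threshold(C @ v, lam_u)          # update imaging weights, then renormalize
        nu = np.linalg.norm(u)
        if nu > 0:
            u /= nu
        v = soft_threshold(C.T @ u, lam_v)        # update genetic weights, then renormalize
        nv = np.linalg.norm(v)
        if nv > 0:
            v /= nv
    return u, v

# Toy usage on random standardized data (40 subjects, 10 and 8 features).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 10))
Y = rng.standard_normal((40, 8))
u, v = scca(X, Y)
```

Larger `lam_u`/`lam_v` values zero out more weights, trading correlation strength for a smaller marker set; with both at zero the iteration reduces to a plain power method on the leading singular pair of C.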
Pub Date: 2023-04-06 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1099182
Hanjia Lyu, Arsal Imtiaz, Yufei Zhao, Jiebo Luo
Since the World Health Organization (WHO) characterized COVID-19 as a pandemic in March 2020, there have been over 600 million confirmed cases of COVID-19 and more than six million deaths as of October 2022. The relationship between the COVID-19 pandemic and human behavior is complicated. On one hand, human behavior is found to shape the spread of the disease. On the other hand, the pandemic has impacted and even changed human behavior in almost every aspect. To provide a holistic understanding of the complex interplay between human behavior and the COVID-19 pandemic, researchers have been employing big data techniques such as natural language processing, computer vision, audio signal processing, frequent pattern mining, and machine learning. In this study, we present an overview of the existing studies on using big data techniques to study human behavior in the time of the COVID-19 pandemic. In particular, we categorize these studies into three groups: using big data to measure, model, and leverage human behavior, respectively. The related tasks, data, and methods are summarized accordingly. To provide more insights into how to fight the COVID-19 pandemic and future global catastrophes, we further discuss challenges and potential opportunities.
{"title":"Human behavior in the time of COVID-19: Learning from big data.","authors":"Hanjia Lyu, Arsal Imtiaz, Yufei Zhao, Jiebo Luo","doi":"10.3389/fdata.2023.1099182","journal":"Frontiers in Big Data","volume":"6","pages":"1099182","publicationDate":"2023-04-06","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10118015/pdf/"}