Pub Date: 2026-03-19 | DOI: 10.1177/2167647X261430664
Jawad Khan, Muhammad Hameed Siddiqi, Tariq Rahim, Shah Khalid
Cross-Lingual Speech-to-Text Systems with Low-Latency Neural Networks for Real-Time Applications. (Big Data, Journal Article)
Pub Date: 2026-03-18 | DOI: 10.1177/2167647X261428016
Tao Zhang, Yu Zhu
Automated program repair (APR) has been studied extensively in recent years. Existing approaches mainly generate single-position patches that fail to address multilocation faults effectively. While existing multistep repair approaches can iteratively generate patches for each fault position sequentially, their data augmentation methodologies lack rationality and deviate from real-world scenarios. Furthermore, they overlook the interdependencies between faulty statements, leading to patches learned from erroneous contextual patterns. In this article, we propose MuTemAPR, an APR approach that iteratively generates multilocation patches. MuTemAPR combines repair templates with neural machine translation. Specifically, our method introduces three key innovations. First, we design a template-based data augmentation framework that transforms single-line faulty code into multilocation faulty code through 35 mutation templates. It simulates a real-world environment by establishing variable-type mapping tables for more accurate repair augmentation. Second, we propose a reinforced faulty context training method that employs progressive annotation to incrementally learn repair processes from top to bottom in multifault code. Third, we implement a semantic constraint mechanism during training that enforces syntactic and semantic rules through differential analysis between templates, input code, and generated patches. We evaluate MuTemAPR on the widely used Defects4J benchmark. Experimental results demonstrate that our approach can effectively repair multilocation faults, successfully fixing five additional bugs compared with state-of-the-art methods on Defects4J v1.2 and v2.0.
MuTemAPR: Enhance Multilocation Patches with Template-Based Neural Program Repair.
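The template-based augmentation step above can be sketched minimally in Python: a type-keyed mutation template rewrites each statement whose left-hand variable has a type mapping, so a single correct snippet becomes a multilocation faulty variant. The template set and the `augment` helper are invented for illustration; MuTemAPR's actual 35 templates and variable-type mapping tables are more elaborate.

```python
# Hypothetical sketch of template-based fault augmentation; the template
# set and helper names are invented, not MuTemAPR's actual templates.
TEMPLATES = {
    "int": lambda stmt: stmt.replace("+ 1", "- 1"),   # off-by-one fault
    "bool": lambda stmt: stmt.replace("&&", "||"),    # operator fault
}

def augment(lines, var_types):
    """Apply a type-keyed mutation template to each assignment whose
    left-hand variable has an entry in the variable-type mapping."""
    mutated = []
    for line in lines:
        var = line.split("=")[0].strip() if "=" in line else None
        kind = var_types.get(var)
        mutated.append(TEMPLATES[kind](line) if kind in TEMPLATES else line)
    return mutated

# A single-fault snippet becomes a two-location faulty variant.
print(augment(["i = i + 1;", "ok = a && b;", "s = name;"],
              {"i": "int", "ok": "bool"}))
```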
Pub Date: 2026-03-07 | DOI: 10.1177/2167647X251406211
Victor Chang, Péter Kacsuk, Gary Wills, Reinhold Behringer
Editorial Summary of Selected Articles.
Pub Date: 2026-02-28 | DOI: 10.1177/2167647X251399606
Suhas Alalasandra Ramakrishnaiah, Yasir Abdullah Rabi, Ananth John Patrick, Mohammad Shabaz, Surbhi B Khan, Rijwan Khan, Ahlam Almusharraf
Engineering teams need timely signals about evolving requirements and release risk, yet multilingual fan discourse around live sports is noisy, code-switched, and saturated with sarcasm and event-driven drift. We present Hybrid DeepSentX, an AI-driven framework that converts crowd commentary into actionable requirements insight and sprint-level risk scores. The pipeline couples multilingual transformer encoders with an inductive GraphSAGE conversation graph to inject relational context across posts, and adds a reinforcement learner whose reward is shaped to prioritize correct decisions on sarcasm-heavy items and rapidly shifting events. We assembled more than a million posts from X, Reddit, and sports forums and evaluated the framework against strong baselines, including BERT, long short-term memory, support-vector machines, and recent hybrid models, with significance tests, calibration analysis, ablations, and efficiency profiling. DeepSentX achieved higher macro-averaged accuracy and F1 on code-switched and sarcastic subsets, reduced missed risk flags, and produced developer-facing artefacts that directly support backlog grooming and defect triage. Relative to prior hybrids that combine transformers with either graph reasoning or reinforcement alone, our contributions are fourfold: (i) a unified multilingual design that integrates transformer, graph, and reinforcement components for sarcasm and drift robustness, (ii) an annotated multi-platform corpus with explicit code-switching and sarcasm labels and per-platform language balance, (iii) a rigorous comparative study reporting accuracy, calibration, latency, memory, and parameter count, and (iv) deployment artefacts that turn model outputs into requirement clusters and sprint risk scores suitable for continuous planning.
Hybrid DeepSentX Framework for AI-Driven Requirements Insight and Risk Prediction in Multilingual Sports Using Natural Language Processing.
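Since the evaluation reports macro-averaged F1, a small self-contained Python sketch of that metric clarifies why it suits imbalanced subsets such as sarcasm: every class contributes equally to the average. The labels and predictions below are toy data, not from the paper's corpus.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so minority classes (e.g., sarcasm) count as much as majority ones."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels, not from the paper's corpus.
print(round(macro_f1(["pos", "neg", "neg", "sarcastic"],
                     ["pos", "neg", "pos", "sarcastic"]), 4))
```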
Pub Date: 2026-02-28 | DOI: 10.1177/2167647X261423127
Pedro Herrero-Vidal, You-Lin Chen, Cris Liu, Bin Xu, Prithviraj Sen, Lichao Wang
We introduce VARM, a variant relationship matcher strategy, to identify pairs of variant products in e-commerce catalogs. Traditional definitions of entity resolution are concerned with whether product mentions refer to the same underlying product. However, this fails to capture product relationships that are critical for e-commerce applications, such as listing similar, but not identical, products on the same webpage or sharing reviews between them. Here, we formulate a new type of entity resolution in variant product relationships to capture these similar e-commerce product links. In contrast with the traditional definition, the new definition requires both identifying whether two products are variant matches of each other and which attributes vary between them. To satisfy these two requirements, we developed a strategy that leverages the strengths of both encoding and generative AI models. First, we construct a dataset that captures webpage product links, and therefore variant product relationships, to train an encoding large language model (LLM) to predict variant matches for any given pair of products. Second, we use retrieval-augmented generation-prompted generative LLMs to extract varying and common attributes among groups of variant products. To validate our strategy, we evaluated model performance using real data from one of the world's leading e-commerce retailers. The results showed that our strategy outperforms alternative solutions and paves the way to exploiting these new types of product relationships.
Unified AI Approach Using Encoding and Generative Large Language Models for Variant Product Matching in e-Commerce.
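The two requirements above — deciding whether a pair is a variant match and surfacing the varying attributes — can be illustrated with a toy Python sketch. The bag-of-words cosine similarity and the `variant_match` threshold stand in for the encoder-LLM scores the paper actually uses; all names and thresholds are illustrative.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def variant_match(title_a, title_b, threshold=0.6):
    """Return (is_variant_match, varying_tokens). Tokens present in only
    one title approximate the attributes that vary between the pair."""
    a = Counter(title_a.lower().split())
    b = Counter(title_b.lower().split())
    varying = sorted(set(a) ^ set(b))
    return cosine(a, b) >= threshold, varying

match, attrs = variant_match("acme running shoe red size 9",
                             "acme running shoe blue size 9")
print(match, attrs)
```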
Pub Date: 2026-02-09 | DOI: 10.1177/2167647X251411174
Qurat Ul Ain, Hammad Afzal, Fazli Subhan, Mazliham Mohd Suud, Younhyun Jung
Dysarthria, a motor speech disorder characterized by slurred and often unintelligible speech, presents substantial challenges for effective communication. Conventional automatic speech recognition systems frequently underperform on dysarthric speech, particularly in severe cases. To address this gap, we introduce low-latency acoustic transcription and textual encoding (LATTE), an advanced framework designed for real-time dysarthric speech recognition. LATTE integrates preprocessing, acoustic processing, and transcription mapping into a unified pipeline, with its core powered by a hybrid architecture that combines convolutional layers for acoustic feature extraction with bidirectional temporal layers for modeling temporal dependencies. Evaluated on the UA-Speech dataset, LATTE achieves a word error rate of 12.5%, phoneme error rate of 8.3%, and a character error rate of 1%. By enabling accurate, low-latency transcription of impaired speech, LATTE provides a robust foundation for enhancing communication and accessibility in both digital applications and real-time interactive environments.
Advancing Dysarthric Speech-to-Text Recognition with LATTE: A Low-Latency Acoustic Modeling Approach for Real-Time Communication.
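Word error rate, the headline metric above, is the word-level edit distance divided by the reference length. A minimal Python implementation (the clinical phrases are invented examples, not UA-Speech data):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i          # deletions
    for j in range(len(h) + 1):
        d[0][j] = j          # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution/match
    return d[len(r)][len(h)] / len(r)

# Invented example: one deleted word, one substitution -> 2 / 5 = 0.4
print(wer("please call the nurse now", "please call nurse no"))
```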
Pub Date: 2026-02-06 | DOI: 10.1177/2167647X251409135
Pir Noman Ahmad, Muhammad Shahid Anwar, Saleha Masood, Atta Ur Rehman, Muhammad Zubair
Named entity recognition (NER) is a core task in natural language processing that identifies and classifies entities, such as people, organizations, and locations, within text. It has traditionally been applied in areas such as text summarization, machine translation, and question answering. In recent years, NER has gained growing importance in health care, where electronic clinical records and online platforms generate large amounts of unstructured medical data. However, applying NER in clinical contexts introduces unique challenges due to the complexity of medical terminology and the need for high accuracy. In this study, we focus on the development of a real-time, low-latency NER system designed for cross-lingual speech-to-text applications, with a particular emphasis on cancer therapy-related clinical records and traditional Chinese medicine (TCM). We explore the integration of deep learning (DL) architectures optimized for low-latency neural processing to extract structured information from multilingual spoken content in medical settings, particularly in multimodal environments. We evaluate DL-based methods and propose a semi-supervised approach that combines TCM-specific corpora with biomedical resources to improve recognition accuracy. The findings provide both a systematic review of current methods and practical insights for building real-time clinical applications that support decision-making and information management in health care.
Real-Time Named Entity Recognition from Textual Electronic Clinical Records in Cancer Therapy Using Low-Latency Neural Networks.
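As a toy illustration of the structured output such a system extracts (not the paper's neural, semi-supervised method), a gazetteer lookup over an invented TCM/oncology lexicon shows the shape of clinical NER results:

```python
# Toy gazetteer with invented entries -- the paper's approach is neural
# and semi-supervised; this only illustrates the shape of NER output.
LEXICON = {
    "lung cancer": "DISEASE",
    "cisplatin": "DRUG",
    "ginseng": "TCM_HERB",
}

def tag(text):
    """Return (surface form, label, character offset) triples, sorted by
    position of the first occurrence of each lexicon term."""
    lower = text.lower()
    hits = []
    for term, label in LEXICON.items():
        pos = lower.find(term)
        if pos != -1:
            hits.append((term, label, pos))
    return sorted(hits, key=lambda h: h[2])

print(tag("Patient with lung cancer received cisplatin and ginseng."))
```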
Pub Date: 2026-02-01 | Epub Date: 2026-02-09 | DOI: 10.1177/2167647X261423109
Xianfeng Gong, Mingyang Mao
This study aims to identify the critical factors that shape college students' adoption of AI-generated news, with a specific focus on integrating Big Data methodologies into the Technology Acceptance Model (TAM) framework. Building on TAM, the research incorporates "trust" as a core variable to develop a dual-path theoretical model that combines technological cognition (e.g., perceived usefulness, perceived ease of use) and psychological emotions. Unlike traditional TAM-based studies relying solely on questionnaire data, this research enriches its data sources by leveraging Big Data techniques, including the collection and analysis of college students' real-time behavioral data (e.g., AI news reading duration, sharing frequency, source verification clicks) and unstructured text data (e.g., sentiment orientation in comment sections), to complement the survey data from 300 college students. Through the questionnaire survey and data analysis using structural equation modeling, the study found that trust has the strongest direct positive impact on willingness to use (β = 0.49, p < 0.001), and its influence is significantly greater than that of perceived usefulness (β = 0.35, p < 0.001). Meanwhile, although perceived ease of use does not directly affect willingness to use, it has significant indirect effects by enhancing trust and perceived usefulness. The results show that in the high-risk-perception context of AI news, trust is a more crucial psychological mechanism than traditional technological cognitive factors. These findings expand the explanatory boundaries of the TAM model in new technology fields and provide empirical evidence and practical inspiration for AI developers optimizing system credibility and for educators conducting algorithmic literacy training.
Perceived Usefulness, Trust, and Behavioral Intention: A Study on College Student User Adoption Behaviors of Artificial Intelligence Generated News Based on Technology Acceptance Model. (pp. 56-61)
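The indirect effects reported above follow the standard SEM rule: an indirect effect is the product of the path coefficients along the mediated route, summed over mediators. A sketch in Python; only the 0.49 (trust → intention) and 0.35 (usefulness → intention) coefficients come from the study, while the two ease-of-use paths are illustrative placeholders.

```python
# Only TRUST_TO_INTENT and PU_TO_INTENT are values reported in the study;
# the ease-of-use paths below are illustrative assumptions.
TRUST_TO_INTENT = 0.49        # reported, p < 0.001
PU_TO_INTENT = 0.35           # reported, p < 0.001
EASE_TO_TRUST = 0.40          # illustrative assumption
EASE_TO_PU = 0.45             # illustrative assumption

# Indirect effect = product of coefficients along each mediated path;
# total indirect effect of ease of use = sum over both mediators.
indirect_via_trust = EASE_TO_TRUST * TRUST_TO_INTENT
indirect_via_pu = EASE_TO_PU * PU_TO_INTENT
total_indirect = indirect_via_trust + indirect_via_pu
print(round(total_indirect, 4))
```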
Pub Date: 2026-02-01 | Epub Date: 2026-03-23 | DOI: 10.1177/2167647X261429851
Zhijun Gao, Yishuai Yang, Jinhuan Wang, Xin Yue
Optical coherence tomography (OCT) is a key noninvasive imaging modality for retinal diseases such as diabetic macular edema (DME): its noncontact operation, high resolution, and real-time imaging make it particularly suitable for acquiring human retinal images, offering high-resolution visualization of retinal layers and fluid accumulations and playing a crucial role in diagnosis and monitoring. However, retinal fluid segmentation faces several challenges, including variations in fluid size, location, and shape, as well as complex, irregular boundaries. To address these issues, we propose TL-TransUNet, a novel lightweight segmentation model based on TransUNet. The model incorporates a hybrid self-attention mechanism that combines linear self-attention with residual filtered multilayer perceptron modules, reducing both parameter count and computational complexity while capturing global relationships and local details to improve segmentation of small lesions. Furthermore, the decoder employs wavelet convolution, using the wavelet transform to extract multi-scale features from low- to high-frequency components and enhancing the model's multi-scale learning capability. Experimental results on a public DME dataset demonstrate that our method outperforms several mainstream segmentation approaches.
TL-TransUNet: An Improved Lightweight Semantic Segmentation Model of Macular Edema Lesions in Retinal OCT Images. (Big Data, vol. 14, no. 1, pp. 29-41)
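Segmentation quality in studies like this is typically scored with the Dice coefficient over binary masks; the abstract does not name its exact metric, so the Python version below is a generic sketch with toy 1-D masks.

```python
def dice(pred, target):
    """Dice similarity for binary masks given as flat 0/1 lists:
    2 * |intersection| / (|pred| + |target|)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 2 * inter / total if total else 1.0

# Two toy masks overlapping in two of three foreground pixels each.
print(round(dice([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0]), 4))
```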
Pub Date: 2026-02-01 | Epub Date: 2025-12-20 | DOI: 10.1177/2167647X251398729
Saifullah Jan, Iftikhar Alam, Inayat Khan
This study presents a real-time, context-adaptive advertisement (ad in short) recommendation framework that dynamically updates user context and utilizes a multistage ranking and filtering pipeline to deliver highly relevant and personalized ads. Contextual ads contribute to better conversion rates and play a significant role in e-commerce. In contrast, non-contextual ads engender frustration among advertisers and users: commercialization efforts frequently prove ineffective due to poor user engagement, as evidenced by high ad-skipping rates. The current practices in digital advertising involve non-contextual and irrelevant ads, which result in poor conversion rates. To address this problem, this article explores semantically enriched and context-aware recommender systems, aiming to align ads with user interests. The proposed framework investigates various components, including a user context extractor (UCE), recommender system, ads database, ads ranker, and ads filter. This study also explores how high-quality and relevant content, along with clickable advertising, contributes to improving customer relationships and reducing ad avoidance. During contextual augmentation, ads that become relevant and engaging are projected to have increased click-through rates in a real-world application. Customer engagement and satisfaction would also increase due to a reduction in ad fatigue and the delivery of relevant content. Furthermore, it can curb ad avoidance because users will gladly respond to ads that suit their interests. Businesses make higher conversions because the more relevant recommendation means greater user interaction. The proposed framework combines a UCE, an ad database, a ranking mechanism, and a filtering module to deliver real-time, personalized recommendations. 
Evaluated with a k-nearest neighbor-based model, the system improved precision (from 0.8275 to 0.9283), recall (from 0.4628 to 0.5201), and normalized discounted cumulative gain (from 0.9906 to 0.9915). These gains demonstrate that integrating fine-grained, dynamic user context substantially enhances recommendation quality and user engagement, offering a scalable foundation for intelligent, adaptive advertising systems. This research contributes to the future development of AI-enabled advertising strategies that pair dynamic ad targeting with personalization to improve conversion rates.
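Normalized discounted cumulative gain (NDCG), one of the metrics reported above, rewards placing highly relevant ads near the top of the ranked list. The following is a generic textbook sketch of the metric, not the authors' evaluation code:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2 of rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal_dcg = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy example: relevance grades of the ads actually shown, in ranked order.
print(round(ndcg([3, 2, 3, 0, 1]), 4))
```

An NDCG of 1.0 means the shown ordering was already ideal; the paper's reported values near 0.99 indicate rankings very close to ideal both before and after contextual augmentation.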
{"title":"Does Context Matter? The Role of Fine-Tuned Contextual Augmentation in Online Ad Delivery on Social Media.","authors":"Saifullah Jan, Iftikhar Alam, Inayat Khan","doi":"10.1177/2167647X251398729","DOIUrl":"10.1177/2167647X251398729","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"13-28"},"PeriodicalIF":2.6,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145859048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}