
Big Data Research: Latest Articles

A Cross-Chain Mechanism for Agricultural Engineering Document Management Blockchain in the Context of Big Data
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-25; DOI: 10.1016/j.bdr.2024.100459
Lei Shi, Yimin Zhou, Wei Wang, Juan Wang, Yang Bai, Chengzong Peng, Ding Chen, Zuli Wang

Cross-chain mechanisms are a typical approach to information exchange between different blockchains, addressing the problem of information silos in the big data era. Most existing cross-chain mechanisms target virtual currency blockchains in the financial sector. As modern smart farming generates more and more engineering documents, the need for engineering document management and for cross-chain exchange between blockchains has become increasingly urgent. This paper proposes a novel, practical cross-chain mechanism for agricultural engineering document management blockchains that takes into account the unique structure and operating principles of the domain. The method combines the characteristics of agricultural engineering document management with a notary scheme built on highly credible government supervision nodes. Authentication technology and cryptographic algorithms are integrated into the mechanism: the former solves the authentication problem of cross-chain document transfer and the latter protects the cross-chain information, ensuring the integrity and security of both the file attribute information and the file body data during the cross-chain process. Security proofs and experiments show that the mechanism is feasible, that it guarantees the authenticity of the cross-chain parties and the integrity and reliability of the document information, and that it therefore meets the cross-chain performance requirements of blockchains in agricultural engineering document management.
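The abstract does not give the concrete algorithms, so the following is only a minimal sketch of the general idea of a notary vouching for a cross-chain document transfer. It assumes a SHA-256 digest for integrity and uses an HMAC with a shared key as a stand-in for whatever authentication the paper actually uses (a real notary would more likely use digital signatures). The key, chain names, and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical notary secret, for illustration only.
NOTARY_KEY = b"demo-notary-key"


def document_digest(content: bytes) -> str:
    """SHA-256 digest used to detect any change to the document."""
    return hashlib.sha256(content).hexdigest()


def notary_attest(digest: str, source_chain: str, target_chain: str) -> str:
    """The notary tags one specific transfer (source, target, digest)."""
    message = f"{source_chain}->{target_chain}:{digest}".encode()
    return hmac.new(NOTARY_KEY, message, hashlib.sha256).hexdigest()


def verify_transfer(content: bytes, attestation: str,
                    source_chain: str, target_chain: str) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    expected = notary_attest(document_digest(content), source_chain, target_chain)
    return hmac.compare_digest(expected, attestation)
```

In this sketch a transfer verifies only if the document bytes, the source chain, and the target chain all match what the notary attested, which mirrors the abstract's two goals of authenticating the parties and protecting document integrity.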

Citations: 0
Tree parameter extraction method based on new remote sensing technology and terrestrial laser scanning technology
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-23; DOI: 10.1016/j.bdr.2024.100460
Aiguo Wang, Jun Wang, Haiming Li, Jian Hu, Haiyuan Zhou, Xinyu Zhang, Xuan Liu, Wanying Wang, Wenjin Zhang, Siting Wu, Ningyang Jiao, Yihao Wang

Ground LiDAR is a terrestrial LiDAR system often used for terrain and geomorphic mapping. Because it collects local, short-range data, it is well suited to mapping small areas with high precision. To extract tree parameters rapidly for the national public welfare forest survey, ground-based LiDAR was used to acquire point clouds of trees; the point cloud data were then registered, denoised, normalized, and sliced, and the parameters of individual trees in the forest were extracted. The Bland-Altman consistency test was used to check whether tree parameters extracted from point clouds agree with those obtained by traditional measurement. The experimental results show that tree parameters can be extracted quickly, conveniently, and accurately from ground-based LiDAR point clouds, and that the results are consistent with the traditional method. Compared with traditional measurement, the approach has the further advantages of producing point cloud and image records and of being traceable, which gives it a unique advantage for building a tree database. The authors suggest that LiDAR be used for forest surveys in the future.
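As a rough illustration of the Bland-Altman check mentioned in the abstract, the sketch below computes the mean difference (bias) between two measurement methods and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The sample diameters are made up for illustration and are not data from the paper.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit, share of differences inside the limits)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs)
    return bias, lower, upper, inside / len(diffs)


# Hypothetical trunk diameters in cm: point-cloud estimate vs. manual measurement.
lidar = [20.1, 25.3, 30.2, 18.7, 22.4]
manual = [20.0, 25.5, 30.0, 18.9, 22.3]
bias, lower, upper, share = bland_altman(lidar, manual)
print(f"bias={bias:.3f} cm, limits=({lower:.3f}, {upper:.3f}), inside={share:.0%}")
```

Two methods are usually judged consistent when the bias is close to zero and the limits of agreement are narrow enough for the application; that threshold is a domain decision, not something the code decides.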

Citations: 0
A multiscale electricity theft detection model based on feature engineering
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-23; DOI: 10.1016/j.bdr.2024.100457
Wei Zhang, Yu Dai

With the widespread adoption of smart meters and the growing availability of data mining and machine learning algorithms, there is a pressing demand for methods that identify electricity theft patterns among end-users both accurately and explainably. To address this need, this study proposes a multi-scale anomaly detection model based on feature engineering. Specifically, tsfresh is used in feature engineering to extract electricity consumption features from the raw data, and XGBoost is used to select features that are highly correlated with anomalous behavior and have clear physical interpretations. Multi-scale convolutional neural networks then analyze and process the data at different temporal and frequency scales. Attention mechanisms assign weights to the different feature channels, and all of the extracted information is fused for anomaly detection. Combining feature engineering with multi-scale convolutional neural networks both enhances the interpretability of the model and improves its performance: the experimental results show that the proposed method outperforms traditional anomaly detection approaches across multiple evaluation metrics.
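To make the "multi-scale convolution plus channel attention" idea concrete, here is a small untrained sketch in plain Python. It is not the paper's network: the kernels are fixed moving averages of different widths rather than learned filters, and the attention weights are a softmax over each channel's mean absolute activation rather than a learned module. The kernel widths and the consumption values are assumptions for illustration.

```python
import math


def conv1d_valid(signal, kernel):
    """Plain 1-D convolution without padding (output is shorter than input)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def multi_scale_features(signal, widths=(3, 7)):
    """One channel per kernel width; wider kernels see slower trends."""
    return [conv1d_valid(signal, [1.0 / w] * w) for w in widths]


def channel_attention(channels):
    """Softmax over each channel's mean absolute activation."""
    scores = [sum(abs(v) for v in ch) / len(ch) for ch in channels]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical daily consumption (kWh) for two weeks, with a sudden drop.
usage = [12.0, 11.5, 12.3, 11.9, 12.1, 12.4, 11.8, 4.0, 3.8, 4.1, 3.9, 4.2, 4.0, 3.7]
channels = multi_scale_features(usage)
weights = channel_attention(channels)
```

In a trained model the weights would be learned so that channels carrying theft-related patterns receive more emphasis before the channels are fused for classification.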

Citations: 0
Quantitative analysis of big data for land resource classification and zoning at the township level in Northern Shaanxi
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-23; DOI: 10.1016/j.bdr.2024.100458
Hongkun Xie, Minghua Huang, Wentao Lei, Yang Wang, Lu Ou

This study analyzes and evaluates the condition and distribution of rural land resources in northern Shaanxi. The experiment extracts two terrain feature values that are highly correlated with land resources: slope and undulation. The extraction results for all 302 township-level administrative regions in northern Shaanxi are then processed, the resulting scores of all township-level units are sorted, and the ranking is optimized and adjusted to form a classification. The results are as follows. First-level townships have the scarcest land resources and lie mainly in the central and western parts of northern Shaanxi, with 53 in Yan'an and 7 in Yulin. Second-level townships have relatively scarce land resources and lie mainly along the Yellow River in the central and southern parts of northern Shaanxi, with 40 in Yan'an and 53 in Yulin. Third-level townships have relatively abundant land resources and are generally distributed along the Great Wall, in the transitional zone between the windblown sand and grassland area and the hilly and gully area; one is in Yan'an and the other 22 are in Yulin. Fourth-level townships have abundant land resources and lie in the loess plateau landform area in the southern part of northern Shaanxi; these 17 townships belong to Luochuan, Yan'an, and three surrounding counties. Fifth- and sixth-level townships have flat terrain and the most abundant land resources; they lie in the sandy grassland north of the Great Wall, with 56 townships across 7 county-level administrative regions of Yulin City. The results lay a foundation for research on optimizing the spatial pattern of rural life in northern Shaanxi, and can support classified guidance and precise policy implementation for rural revitalization, agricultural industry policy, human settlement construction, and ecological protection.
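The abstract does not state how slope and undulation are turned into a score, so the sketch below only illustrates one plausible pipeline: compute undulation as the elevation range of a unit, min-max normalize both features across units, average them with equal weights, and sort. The equal weighting, the normalization, and all names and values are assumptions, not the authors' method.

```python
def relief(elevations):
    """Undulation of a unit: highest minus lowest elevation (metres)."""
    return max(elevations) - min(elevations)


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def rank_townships(units):
    """units maps name -> (mean slope in degrees, relief in metres).

    Returns (names sorted from roughest to flattest terrain, score per name).
    """
    names = list(units)
    slope_n = min_max([units[n][0] for n in names])
    relief_n = min_max([units[n][1] for n in names])
    score = {n: 0.5 * s + 0.5 * r for n, s, r in zip(names, slope_n, relief_n)}
    return sorted(names, key=score.get, reverse=True), score


# Hypothetical township-level units.
units = {"A": (25.0, 400.0), "B": (5.0, 50.0), "C": (15.0, 200.0)}
order, score = rank_townships(units)
```

Cutting the sorted list into levels (six in the abstract) would be a further step; the abstract says the final classes were obtained after "optimization and adjustment", which this sketch does not attempt to reproduce.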

Citations: 0
Big Data in organizations: Exploring the adoption of Big Data applications and their impact on organizations in China and the Netherlands
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-16; DOI: 10.1016/j.bdr.2024.100454
Jörg Raab, Yuting Pang, Joan Baaijens, Honggeng Zhou

Digital technology has been rapidly transforming how organizations operate. However, the management literature has only just started to problematize the fundamental interrelation of digital technology and organizing, and we lack sound data about the actual breadth and depth of these changes. This study therefore explores the state of implementation of Big Data applications in a wide range of organizations in China and the Netherlands and their impact on organizational structures and processes. Our findings show that most organizations are still in an experimental phase at best. We can therefore observe an evolutionary model of technology adoption

Citations: 0
Machine Learning for Tsunami Waves Forecasting Using Regression Trees
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-16; DOI: 10.1016/j.bdr.2024.100452
Eugenio Cesario, Salvatore Giampá, Enrico Baglione, Louise Cordrie, Jacopo Selva, Domenico Talia

After a seismic event, tsunami early warning systems (TEWSs) try to accurately forecast the maximum height of incident waves at specific target points in front of the coast, so that early warnings can be issued for locations where the impact of tsunami waves may be destructive and aid can be delivered there during immediate post-event management. The uncertainty of the forecast can be quantified with ensembles of alternative scenarios. Similarly, probabilistic tsunami hazard analysis (PTHA) requires a large number of simulations to cover the natural variability of the source process at each location. To improve the accuracy and computational efficiency of tsunami forecasting, scientists have recently started to apply machine learning techniques to pre-computed simulation data. However, the approaches proposed in the literature, mainly based on neural networks, suffer from long training times and limited model explainability. To overcome these issues, this paper describes a machine learning approach based on regression trees to model and forecast tsunami evolution. The algorithm takes as input an ensemble of simulations describing the potential regional impact of tsunami source scenarios in a given source area, and it produces predictive models that forecast the tsunami waves for other potential tsunami sources in the same area. The experimental evaluation, performed on simulation data for the 2003 M6.8 Zemmouri-Boumerdes earthquake and tsunami, shows that regression trees achieve high forecasting accuracy. Moreover, they give domain experts fully explainable and interpretable models, which are valuable to environmental scientists because they expose the rules and patterns behind the models and allow their functioning to be inspected explicitly. This can enable a full and trustworthy exploration of source uncertainty in tsunami early-warning and urgent computing scenarios, using large ensembles of computationally light tsunami simulations.
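For readers unfamiliar with regression trees, the sketch below builds a tiny one from scratch: at each node it tries every feature and threshold, keeps the split that minimizes the summed squared error of the two sides, and predicts the mean target of the leaf a sample falls into. It is a generic textbook construction, not the authors' implementation, and the features (magnitude, distance in km) and wave heights are invented for illustration.

```python
def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, depth=0, max_depth=2, min_leaf=1):
    """rows: list of feature lists; targets: list of floats."""
    best = None
    if depth < max_depth and len(rows) >= 2 * min_leaf:
        for f in range(len(rows[0])):
            for t in sorted({r[f] for r in rows}):
                left = [i for i, r in enumerate(rows) if r[f] <= t]
                right = [i for i, r in enumerate(rows) if r[f] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = (sse([targets[i] for i in left])
                        + sse([targets[i] for i in right]))
                if best is None or cost < best[0]:
                    best = (cost, f, t, left, right)
    if best is None:  # no valid split: make a leaf holding the mean target
        return {"value": sum(targets) / len(targets)}
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            depth + 1, max_depth, min_leaf),
    }


def predict(tree, row):
    """Follow the threshold rules down to a leaf."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Hypothetical scenarios: [magnitude, distance to target in km] -> max wave height (m).
scenarios = [[6.0, 50.0], [6.2, 60.0], [6.8, 40.0], [7.0, 30.0]]
heights = [0.2, 0.3, 1.0, 1.4]
tree = build_tree(scenarios, heights, max_depth=1)
```

The interpretability the abstract highlights comes from the structure itself: every prediction can be traced back to a short chain of "feature <= threshold" rules stored in the nested dictionaries.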

Citations: 0
Scheduling critical periodic jobs with selective partial computations along with gang jobs
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-04-04; DOI: 10.1016/j.bdr.2024.100453
Helen Karatza

One of the main issues with distributed systems such as clouds is scheduling complex workloads made up of various job types with distinct features. Gang jobs are one kind of parallel application that these systems support. This paper examines the scheduling of workloads that comprise gangs and critical periodic jobs, where the critical jobs can be computed partially when necessary so that gang jobs can execute. The simulation results shed important light on how gang performance is affected by partial computations of critical jobs. They also reveal that, under the proposed scheduling scheme, partial computations that take the gangs' degree of parallelism into account may lower the average response time of gang jobs while keeping the average result precision of the critical jobs at an acceptable level. Additionally, as the deviation from the average partial computation increases, the performance improvement due to partial computations increases, with the aforementioned tradeoff remaining significant.

Citations: 0
Explanation-Guided Adversarial Example Attacks
IF 3.3; Zone 3 (Computer Science); Q1 Business, Management and Accounting; Pub Date: 2024-03-26; DOI: 10.1016/j.bdr.2024.100451
Anli Yan, Xiaozhang Liu, Wanman Li, Hongwei Ye, Lang Li

Neural network-based classifiers are vulnerable to adversarial example attacks even in a black-box setting. Existing adversarial example generation techniques mainly rely on optimization-based attacks, which optimize an objective function by iteratively perturbing the input. Although they can craft adversarial examples, these techniques require large query budgets. Recent transfer-based attacks need only a limited number of queries but suffer from low attack success rates. In this paper, we propose an adversarial example attack method called MEAttack that uses model-agnostic explanation technology to generate adversarial examples more efficiently in the black-box setting with limited queries. The core idea is to design a novel model-agnostic explanation method for target models and to generate adversarial examples based on the model explanations. We experimentally demonstrate that MEAttack outperforms the state-of-the-art attack AutoZOOM: its success rate is 4.54%-47.42% higher, and its query cost is 2.6-4.2 times lower. The experimental results show that MEAttack performs well in terms of both attack success rate and query efficiency.
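The abstract gives no algorithmic detail on MEAttack, so the following is only a toy sketch of the general idea of letting an explanation steer a black-box attack. It assumes a simple finite-difference sensitivity as the "explanation", a hypothetical logistic model standing in for the black-box target, and a single-feature perturbation; none of this is the authors' method. A counter wraps the model so the query cost is visible.

```python
import math


class QueryCounter:
    """Wraps a model so every black-box call is counted."""

    def __init__(self, model):
        self.model, self.count = model, 0

    def __call__(self, x):
        self.count += 1
        return self.model(x)


def toy_model(x):
    """Hypothetical target: returns P(class 1). Weights are unknown to the attacker."""
    weights = [2.0, -0.5, 0.1, 1.5]
    z = sum(w * v for w, v in zip(weights, x)) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def explain(model, x, delta=0.1):
    """Model-agnostic sensitivity: change in score per unit change of each feature."""
    base = model(x)
    sens = []
    for i in range(len(x)):
        probe = list(x)
        probe[i] += delta
        sens.append((model(probe) - base) / delta)
    return base, sens


def explanation_guided_attack(model, x, step=0.25, max_steps=20):
    """Perturb only the most influential feature until the label flips."""
    base, sens = explain(model, x)
    original_label = base >= 0.5
    target = max(range(len(x)), key=lambda i: abs(sens[i]))
    # Pick the direction that pushes the score across the 0.5 boundary.
    direction = -1.0 if (sens[target] > 0) == original_label else 1.0
    adv = list(x)
    for _ in range(max_steps):
        adv[target] += direction * step
        if (model(adv) >= 0.5) != original_label:
            return adv
    return None  # no adversarial example found within the step budget


counter = QueryCounter(toy_model)
adversarial = explanation_guided_attack(counter, [1.0, 0.5, 0.2, 0.3])
```

The design point this illustrates is that the explanation costs a fixed number of queries up front (one per feature plus one baseline), after which the search is confined to the features that matter most instead of perturbing the whole input blindly.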

Citations: 0
Correcting inconsistencies in knowledge graphs with correlated knowledge
IF 3.3 Zone 3 Computer Science Q1 Business, Management and Accounting Pub Date: 2024-03-21 DOI: 10.1016/j.bdr.2024.100450
Shichen Zhai, Xiaoping Lu, Chao Wang, Zhiyu Hong, Jing Shan, Zongmin Ma

Knowledge graphs (KGs) have been widely applied for semantic representation and intelligent decision-making. The usefulness and usability of KGs are often limited by their quality. One common issue is the presence of inconsistent assertions. Such inconsistencies are often caused by the diverse data used to construct large-scale KGs automatically. To improve the quality of KGs, in this paper we investigate how to detect and correct inconsistent triples. We first identify entity-related, relation-related, and type-related inconsistencies. On this basis, we propose a framework for correcting the identified inconsistencies that combines candidate generation, link prediction, and constraint validation. We evaluate the proposed correction framework on the real-world dataset FB15k (from Freebase). The promising results confirm the capability of our framework to correct inconsistencies in knowledge graphs.
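
The three stages named above (candidate generation, link prediction, constraint validation) can be sketched on a toy graph. This is only an illustration under stated assumptions, not the authors' framework: the schema in `ENTITY_TYPES` and `RELATION_SIGNATURE` is hypothetical, only type-related inconsistencies are handled, and `link_score` is a simple co-occurrence count standing in for a learned link-prediction model.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical toy schema; a real KG such as FB15k is far larger.
ENTITY_TYPES: Dict[str, str] = {
    "Paris": "City",
    "Lyon": "City",
    "Berlin": "City",
    "France": "Country",
    "Germany": "Country",
}
# relation -> (expected head type, expected tail type)
RELATION_SIGNATURE: Dict[str, Tuple[str, str]] = {
    "capital_of": ("City", "Country"),
    "located_in": ("City", "Country"),
}


def satisfies_constraints(triple: Triple) -> bool:
    """Constraint validation: head and tail types must match the relation."""
    head, relation, tail = triple
    head_type, tail_type = RELATION_SIGNATURE[relation]
    return ENTITY_TYPES[head] == head_type and ENTITY_TYPES[tail] == tail_type


def generate_candidates(triple: Triple) -> List[Triple]:
    """Candidate generation: replace the tail with every other known entity."""
    head, relation, tail = triple
    return [(head, relation, e) for e in sorted(ENTITY_TYPES) if e != tail]


def link_score(candidate: Triple, kg: List[Triple]) -> int:
    """Stand-in for link prediction: count triples linking the same head and tail."""
    head, _, tail = candidate
    return sum(1 for h, _, t in kg if h == head and t == tail)


def correct_triple(triple: Triple, kg: List[Triple]) -> Optional[Triple]:
    """Return the best-scoring valid candidate, or None if no candidate is valid."""
    valid = [c for c in generate_candidates(triple) if satisfies_constraints(c)]
    if not valid:
        return None
    return max(valid, key=lambda c: link_score(c, kg))


def correct_kg(kg: List[Triple]) -> List[Triple]:
    """Keep consistent triples; replace inconsistent ones when a fix exists."""
    repaired = []
    for triple in kg:
        if satisfies_constraints(triple):
            repaired.append(triple)
        else:
            fixed = correct_triple(triple, kg)
            if fixed is not None:
                repaired.append(fixed)
    return repaired


kg = [
    ("Paris", "located_in", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Berlin", "located_in", "Germany"),
    ("Paris", "capital_of", "Lyon"),  # type-related inconsistency: tail is a City
]
repaired = correct_kg(kg)
print(repaired)
```

Here the correlated triple `("Paris", "located_in", "France")` is what lets the stand-in scorer prefer France over Germany as the corrected tail.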

Citations: 0
Remote sensing-enhanced transfer learning approach for agricultural damage and change detection: A deep learning perspective
IF 3.3 Zone 3 Computer Science Q1 Business, Management and Accounting Pub Date: 2024-03-20 DOI: 10.1016/j.bdr.2024.100449
Zehua Liu, Jiuhao Li, Mahmood Ashraf, M.S. Syam, Muhammad Asif, Emad Mahrous Awwad, Muna Al-Razgan, Uzair Aslam Bhatti

With the continuous advancement of science and technology, awareness of safety has grown among people worldwide. Natural disasters such as wildfires, earthquakes, and floods pose persistent threats to lives and property on our planet, our fundamental habitat. While it is impossible to prevent or entirely avert these calamities, rapid identification of affected areas and prompt post-disaster damage assessment can significantly aid the formulation of effective rescue strategies, ultimately saving more lives. This article examines the application of transfer learning, a methodology that transfers previously acquired knowledge to enhance a model's adaptability to new tasks, to satellite image damage assessment. Given the limited availability of datasets for satellite image analysis, transfer learning proves to be an effective approach. Specifically, the study proposes a transfer learning method based on YOLOv5 for satellite image damage assessment. Initially, a general convolutional neural network model is trained on a substantial dataset of natural images. Subsequently, the early layers of this model are frozen, while the later layers are trained to adapt to satellite image data. Fine-tuning is then employed to further enhance overall model performance. The results demonstrate that this approach yields a high accuracy rate in satellite image damage assessment. Moreover, compared to conventional deep learning methods, the proposed method effectively leverages the knowledge of pre-trained models, thereby reducing data dependency. Additionally, it displays robust generalization capabilities across diverse tasks and datasets, underscoring its potential for facilitating transfer learning across various domains.
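
The freeze-then-fine-tune schedule described above can be shown with a deliberately tiny sketch. It is not YOLOv5 and not the authors' training code: the two-layer scalar model, the `Layer` class, the learning rates, and the toy data are all assumptions, chosen so that the example runs with the Python standard library alone.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    """A scalar 'layer' y = weight * x with a flag marking it frozen or trainable."""
    weight: float
    trainable: bool = True


def forward(layers: List[Layer], x: float) -> float:
    for layer in layers:
        x = layer.weight * x
    return x


def mse(layers: List[Layer], data: List[Tuple[float, float]]) -> float:
    return sum((forward(layers, x) - y) ** 2 for x, y in data) / len(data)


def train(layers: List[Layer], data: List[Tuple[float, float]],
          lr: float, epochs: int) -> None:
    """Plain SGD on squared error; frozen layers receive no weight updates."""
    for _ in range(epochs):
        for x, y in data:
            # Forward pass, remembering the input seen by each layer.
            inputs = []
            out = x
            for layer in layers:
                inputs.append(out)
                out = layer.weight * out
            grad_out = 2.0 * (out - y)
            # Backward pass from the last layer to the first.
            for layer, layer_input in zip(reversed(layers), reversed(inputs)):
                grad_weight = grad_out * layer_input
                grad_out = grad_out * layer.weight  # gradient w.r.t. the layer input
                if layer.trainable:
                    layer.weight -= lr * grad_weight


# Assumed starting point: the early layer is 'pre-trained', the late layer is new.
model = [Layer(weight=0.5), Layer(weight=1.0)]
target_data = [(1.0, 3.0), (2.0, 6.0)]  # toy target task: y = 3x

# Stage 1: freeze the early layer and train only the later layer.
model[0].trainable = False
train(model, target_data, lr=0.05, epochs=20)
early_weight_after_stage1 = model[0].weight

# Stage 2: unfreeze everything and fine-tune with a smaller learning rate.
model[0].trainable = True
train(model, target_data, lr=0.001, epochs=200)
final_loss = mse(model, target_data)
print(early_weight_after_stage1, model[0].weight, final_loss)
```

The smaller learning rate in the second stage reflects a common design choice: once all layers are trainable, large steps can overwrite the pre-trained weights, so fine-tuning usually proceeds more cautiously.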

Citations: 0