Pub Date: 2023-11-23; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1301812
Muhammad Tukur, Jens Schneider, Mowafa Househ, Ahmed Haruna Dokoro, Usman Idris Ismail, Muhammad Dawaki, Marco Agus
The concept of the "metaverse" has garnered significant attention recently, positioned as the "next frontier" of the internet. This emerging digital realm carries substantial economic and financial implications for both IT and non-IT industries. However, the integration and evolution of these virtual universes raise many intricate issues that demand resolution. In this study, we examined and appraised the challenges, privacy concerns, and security issues that have come to light during the development of metaverse virtual environments in the wake of the COVID-19 pandemic. Through a review and analysis of literature published between January 2020 and December 2022, we identified and scrutinized 29 distinct challenges, along with 12 policy, privacy, and security matters associated with the metaverse. Among the challenges, the foremost were the costs of hardware and software, implementation complexities, digital disparities, and the ethical and moral quandaries surrounding socio-control, cited by 43%, 40%, and 33% of the surveyed articles, respectively. Among policy, privacy, and security issues, the top three concerns were the formulation of metaverse rules and principles, privacy threats within the metaverse, and challenges concerning data management, mentioned in 43%, 40%, and 33% of the examined literature, respectively. In sum, the development of virtual environments within the metaverse is a multifaceted and rapidly evolving domain, offering both opportunities and hurdles for researchers and practitioners alike. We hope that the insights, challenges, and recommendations in this report will catalyze extensive dialogue among industry stakeholders, governmental bodies, and other interested parties about the metaverse's future and the world they aim to build or leave to future generations.
Title: The metaverse digital environments: a scoping review of the challenges, privacy and security issues.
Journal: Frontiers in Big Data, vol. 6, article 1301812
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10702132/pdf/
Pub Date: 2023-11-21; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1301903
Elisa Omodei, Dohyung Kim, Manuel Garcia-Herranz, Vedran Sekara
Title: Editorial: Are machine learning, AI, and big data tools ready to be used for sustainable development? Challenges, and limitations of current approaches.
Journal: Frontiers in Big Data, vol. 6, article 1301903
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10703458/pdf/
Pub Date: 2023-11-20; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1282352
Ruipeng Tang, Narendra Kumar Aridas, Mohamad Sofian Abu Talip
With the popularization of big data technology, agricultural data processing systems have become more intelligent. In this study, a data processing method for farmland environmental monitoring based on improved Spark components is designed. It introduces the FAST-Join (Join critical filtering sampling partition optimization) algorithm in the Spark component for equivalence association query optimization to improve the operating efficiency of the Spark component and cluster. The experimental results show that the amount of data written and read in Shuffle by Spark optimized by the FAST-join algorithm only accounts for 0.958 and 1.384% of the original data volume on average, and the calculation speed is 202.11% faster than the original. The average data processing time and occupied memory size of the Spark cluster are reduced by 128.22 and 76.75% compared with the originals. It also compared the cluster performance of the FAST-join and Equi-join algorithms. The Spark cluster optimized by the FAST-join algorithm reduced the processing time and occupied memory size by an average of 68.74 and 37.80% compared with the Equi-join algorithm, which shows that the FAST-join algorithm can effectively improve the efficiency of inter-data table querying and cluster computing.
Title: Design of a data processing method for the farmland environmental monitoring based on improved Spark components.
Journal: Frontiers in Big Data, vol. 6, article 1282352
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10694358/pdf/
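The abstract above describes FAST-join as combining filtering, sampling, and partition optimization to cut the data moved during the shuffle phase of an equi-join. The sketch below illustrates only the filtering idea, in plain Python rather than Spark: rows whose join key cannot match are dropped before the join, so fewer rows would need to be shuffled. It is not the authors' algorithm; the function name `filtered_equi_join` and the sample sensor data are hypothetical.

```python
from collections import defaultdict


def filtered_equi_join(left, right, key):
    """Equi-join two lists of dicts, discarding non-matching rows up front.

    Illustrative only: a plain-Python stand-in for the "filtering" idea,
    not the FAST-join algorithm from the paper.
    """
    # Keys present on both sides; rows with any other key cannot join.
    shared_keys = {row[key] for row in left} & {row[key] for row in right}

    # Only rows that survive the filter would need to be shuffled.
    left_kept = [row for row in left if row[key] in shared_keys]
    right_kept = [row for row in right if row[key] in shared_keys]

    # Standard hash join on the reduced inputs.
    index = defaultdict(list)
    for row in right_kept:
        index[row[key]].append(row)

    joined = [
        {**l_row, **r_row}
        for l_row in left_kept
        for r_row in index[l_row[key]]
    ]
    return joined, len(left_kept), len(right_kept)


# Hypothetical farmland sensor readings and plot metadata.
readings = [
    {"plot_id": 1, "soil_moisture": 0.31},
    {"plot_id": 2, "soil_moisture": 0.27},
    {"plot_id": 9, "soil_moisture": 0.40},  # no matching plot
]
plots = [
    {"plot_id": 1, "crop": "rice"},
    {"plot_id": 2, "crop": "maize"},
    {"plot_id": 5, "crop": "wheat"},  # no matching reading
]

joined_rows, kept_left, kept_right = filtered_equi_join(readings, plots, "plot_id")
```

In this toy case one row is filtered out of each input before the join. In a distributed engine the key set would typically be approximated (for example with a sampled or probabilistic filter) rather than computed exactly, which is an assumption about practice and not a claim about the paper.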
Pub Date: 2023-11-20; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1270756
Thong Vu, Tyler Petty, Kemal Yakut, Muhammad Usman, Wei Xue, Francis M Haas, Robert A Hirsh, Xinghui Zhao
Cardiovascular diseases, such as heart attack and congestive heart failure, are the leading cause of death both in the United States and worldwide. The current medical practice for diagnosing cardiovascular diseases is not suitable for long-term, out-of-hospital use. A key to long-term monitoring is the ability to detect abnormal cardiac rhythms, i.e., arrhythmia, in real time. Most existing studies focus only on the accuracy of arrhythmia classification rather than on the runtime performance of the workflow. In this paper, we present our work on supporting real-time arrhythmia detection using convolutional neural networks, which take images of electrocardiogram (ECG) segments as input and classify the arrhythmia conditions. To support real-time processing, we carried out extensive experiments and evaluated the computational cost of each step of the classification workflow. Our results show that it is feasible to achieve real-time arrhythmia detection using convolutional neural networks. To further demonstrate the generalizability of this approach, we applied the trained model to processed data collected by a customized wearable sensor in a lab setting, and the results showed that our approach is highly accurate and efficient. This research has the potential to enable in-home real-time heart monitoring based on 2D image data, which opens up opportunities for integrating machine learning with traditional diagnostic approaches.
Title: Real-time arrhythmia detection using convolutional neural networks.
Journal: Frontiers in Big Data, vol. 6, article 1270756
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10696646/pdf/
Pub Date: 2023-11-17; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1274135
Lei Zhang, Zhiqian Chen, Chang-Tien Lu, Liang Zhao
Numerous networks in the real world change with time, producing dynamic graphs such as human mobility networks and brain networks. Typically, the "dynamics on graphs" (e.g., changing node attribute values) are visible, and they may be connected to and suggestive of the "dynamics of graphs" (e.g., evolution of the graph topology). Due to two fundamental obstacles, modeling and mapping between them have not been thoroughly explored: (1) the difficulty of developing a highly adaptable model without solid hypotheses and (2) the ineffectiveness and slowness of processing data with varying granularity. To solve these issues, we offer a novel scalable deep echo-state graph dynamics encoder for networks with significant temporal duration and dimensions. A novel neural architecture search (NAS) technique is then proposed and tailored for the deep echo-state encoder to ensure strong learnability. Extensive experiments on synthetic and actual application data illustrate the proposed method's exceptional effectiveness and efficiency.
Title: Fast and adaptive dynamics-on-graphs to dynamics-of-graphs translation.
Journal: Frontiers in Big Data, vol. 6, article 1274135
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10691542/pdf/
Pub Date: 2023-11-17; eCollection Date: 2023-01-01; DOI: 10.3389/fdata.2023.1243559
Kellen Donahue, John S Kimball, Jinyang Du, Fredrick Bunt, Andreas Colliander, Mahta Moghaddam, Jesse Johnson, Youngwook Kim, Michael A Rawlins
Satellite microwave sensors are well suited for monitoring landscape freeze-thaw (FT) transitions owing to the strong brightness temperature (TB) or backscatter response to changes in liquid water abundance between predominantly frozen and thawed conditions. The FT retrieval is also a sensitive climate indicator with strong biophysical importance. However, retrieval algorithms can have difficulty distinguishing the FT status of soils from that of overlying features such as snow and vegetation, while variable land conditions can also degrade performance. Here, we applied a deep learning model using a multilayer convolutional neural network driven by AMSR2 and SMAP TB records, and trained on surface (~0-5 cm depth) soil temperature FT observations. Soil FT states were classified for the local morning (6 a.m.) and evening (6 p.m.) conditions corresponding to SMAP descending and ascending orbital overpasses, mapped to a 9 km polar grid spanning a five-year (2016-2020) record and Northern Hemisphere domain. Continuous variable estimates of the probability of frozen or thawed conditions were derived using a model cost function optimized against FT observational training data. Model results derived using combined multi-frequency (1.4, 18.7, 36.5 GHz) TBs produced the highest soil FT accuracy over other models derived using only single sensor or single frequency TB inputs. Moreover, SMAP L-band (1.4 GHz) TBs provided enhanced soil FT information and performance gain over model results derived using only AMSR2 TB inputs. The resulting soil FT classification showed favorable and consistent performance against soil FT observations from ERA5 reanalysis (mean percent accuracy, MPA: 92.7%) and in situ weather stations (MPA: 91.0%). The soil FT accuracy was generally consistent between morning and afternoon predictions and across different land covers and seasons. The model also showed better FT accuracy than ERA5 against regional weather station measurements (91.0% vs. 86.1% MPA). However, model confidence was lower in complex terrain where FT spatial heterogeneity was likely beneath the effective model grain size. Our results provide a high level of precision in mapping soil FT dynamics to improve understanding of complex seasonal transitions and their influence on ecological processes and climate feedbacks, with the potential to inform Earth system model predictions.
Title: Deep learning estimation of northern hemisphere soil freeze-thaw dynamics using satellite multi-frequency microwave brightness temperature observations.
Journal: Frontiers in Big Data, vol. 6, article 1243559
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10690831/pdf/
Pub Date: 2023-11-14; DOI: 10.3389/fdata.2023.1274831
Yajing Liu, Turgay Caglar, Christopher Peterson, Michael Kirby
This paper investigates the integration of multiple geometries present within a ReLU-based neural network. A ReLU neural network determines a piecewise affine linear continuous map, M, from an input space ℝ^m to an output space ℝ^n. The piecewise behavior corresponds to a polyhedral decomposition of ℝ^m. Each polyhedron in the decomposition can be labeled with a binary vector (whose length equals the number of ReLU nodes in the network) and with an affine linear function (which agrees with M when restricted to points in the polyhedron). We develop a toolbox that calculates the binary vector for a polyhedron containing a given data point with respect to a given ReLU feedforward neural network (FFNN). We use this binary vector to derive bounding facets for the corresponding polyhedron, extract “active” bits within the binary vector, enumerate neighboring binary vectors, and visualize the polyhedral decomposition (Python code is available at https://github.com/cglrtrgy/GoL_Toolbox). Polyhedra in the polyhedral decomposition of ℝ^m are neighbors if they share a facet. Binary vectors for neighboring polyhedra differ in exactly 1 bit. Using the toolbox, we analyze the Hamming distance between the binary vectors for polyhedra containing points from adversarial/nonadversarial datasets, revealing distinct geometric properties. A bisection method is employed to identify sample points with a Hamming distance of 1 along the shortest Euclidean distance path, facilitating the analysis of local geometric interplay between Euclidean geometry and the polyhedral decomposition along the path. Additionally, we study the distribution of Chebyshev centers and related radii across different polyhedra, shedding light on polyhedral shape, size, and clustering, and aiding in the understanding of decision boundaries.
Title: Integrating geometries of ReLU feedforward neural networks
Journal: Frontiers in Big Data
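The ReLU abstract above labels each polyhedron with a binary vector, one bit per ReLU node, and compares polyhedra by the Hamming distance between their vectors. A minimal sketch of that labeling follows, written in plain Python and independent of the authors' toolbox. The tiny network weights are hypothetical, and treating a pre-activation of exactly zero as "off" is an assumed convention.

```python
def activation_pattern(x, layers):
    """Return the binary vector of ReLU on/off states for input x.

    `layers` is a list of (weights, biases) pairs for the hidden layers,
    where `weights` holds one row per neuron. Illustrative sketch only.
    """
    bits = []
    activations = list(x)
    for weights, biases in layers:
        # Pre-activation of every neuron in this layer.
        pre = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        # Assumed convention: a neuron is "on" only when strictly positive.
        bits.extend(1 if z > 0 else 0 for z in pre)
        activations = [max(z, 0.0) for z in pre]
    return bits


def hamming_distance(u, v):
    """Count the positions where two equal-length binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


# Hypothetical network: 2 inputs, two hidden ReLU layers of 2 neurons each.
layers = [
    ([[1.0, -1.0], [0.5, 1.0]], [0.0, -1.0]),
    ([[1.0, 1.0], [-1.0, 2.0]], [-0.5, 0.0]),
]

p = activation_pattern([2.0, 1.0], layers)
q = activation_pattern([-1.0, 1.0], layers)
r = activation_pattern([0.5, 1.0], layers)
distance_pq = hamming_distance(p, q)
```

Two inputs with the same vector lie in the same polyhedron, so the network acts on both through the same affine function. A Hamming distance of 1 is the condition the abstract gives for neighboring polyhedra.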
Pub Date: 2023-11-09; DOI: 10.3389/fdata.2023.1240660
Damian Eke, Paschal Ochang, Bernd Carsten Stahl
Introduction: The study of the brain continues to generate substantial volumes of data, commonly referred to as “big brain data,” which serves various purposes such as the treatment of brain-related diseases, the development of neurotechnological devices, and the training of algorithms. This big brain data, generated in different jurisdictions, is subject to distinct ethical and legal principles, giving rise to various ethical and legal concerns during collaborative efforts. Understanding these ethical and legal principles and concerns is crucial, as it catalyzes the development of a global governance framework, which is currently lacking in this field. While prior research has advocated for a contextual examination of brain data governance, such studies have been limited. Additionally, numerous challenges, issues, and concerns surround the development of a contextually informed brain data governance framework. Therefore, this study aims to bridge these gaps by exploring the ethical foundations that underlie contextual stakeholder discussions on brain data governance. Method: We conducted a secondary analysis, informed by ethical theories, of interviews with 21 neuroscientists drawn from the International Brain Initiative (IBI), LATBrain Initiative, and the Society of Neuroscientists of Africa (SONA) who are involved in various brain projects globally. Ethical theories provide the philosophical frameworks and principles that inform the development and implementation of data governance policies and practices. Results: The study revealed various contextual ethical positions that underscore the ethical perspectives of neuroscientists engaged in brain data research globally. Discussion: This research highlights the multitude of challenges and deliberations inherent in the pursuit of a globally informed framework for governing brain data. Furthermore, it sheds light on several critical considerations that require thorough examination in advancing global brain data governance.
Title: Towards an understanding of global brain data governance: ethical positions that underpin global brain data governance discourse
Journal: Frontiers in Big Data
Pub Date: 2023-11-06 DOI: 10.3389/fdata.2023.1086212
Joe Germino, Annalisa Szymanski, Ronald Metoyer, Nitesh V. Chawla
Introduction: Maintaining an affordable and nutritious diet can be challenging, especially for those living in poverty. To maintain a healthy diet, consumers must make difficult decisions within a complicated food landscape. These decisions must factor in health and budget constraints, the food supply and pricing options at local grocery stores, and the nutrition and portion guidelines provided by government services. Information to support food choices is often inconsistent and hard to find, making it difficult for consumers to make informed, optimal decisions. This is especially true for low-income and Supplemental Nutrition Assistance Program (SNAP) households, which face additional time and cost constraints that affect their food purchases and ultimately leave them more susceptible to malnutrition and obesity. The goal of this paper is to demonstrate how integrating data from local grocery stores and federal government databases can help specific communities meet their unique health and budget challenges.

Methods: We discuss many of the challenges of integrating multiple data sources, such as inconsistent data availability and misleading nutrition labels. We conduct a case study using linear programming to identify a healthy meal plan that stays within a limited SNAP budget and adheres to the Dietary Guidelines for Americans. Finally, we explore the main drivers of the cost of local food products, with emphasis on the nutrients the USDA has designated as areas of focus: added sugars, saturated fat, and sodium.

Results and discussion: Our case study results suggest that such an optimization model can be used to facilitate food purchasing decisions within a given community. By focusing on the community level, our results will inform future work on navigating the complex networks of food information to build global recommendation systems.
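The kind of linear program described in the Methods can be illustrated with a toy diet problem. The sketch below is not the authors' model: the two foods, their prices and nutrient values, and the daily targets are all hypothetical numbers chosen for illustration, and the helper name solve_diet_lp is invented here. Rather than calling an LP solver, it enumerates the vertices of the feasible region by brute force, which is only practical for a two-variable problem but returns the same optimum a solver would find.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical data: servings per day of beans (b) and rice (r).
# Each constraint is (coef_b, coef_r, bound) meaning coef_b*b + coef_r*r <= bound.
# "At least" requirements are written with negated coefficients.
CONSTRAINTS = [
    (-8, -3, -40),    # protein: 8b + 3r >= 40 g
    (-3, -5, -30),    # calories: 120b + 200r >= 1200 kcal, divided by 40
    (300, 5, 2300),   # sodium: 300b + 5r <= 2300 mg
    (50, 30, 600),    # budget: 50b + 30r <= 600 cents
    (-1, 0, 0),       # b >= 0
    (0, -1, 0),       # r >= 0
]
COST = (50, 30)       # cents per serving of beans and rice (hypothetical)


def solve_diet_lp(constraints, cost):
    """Return (min_cost, b, r) by checking every vertex of the feasible region.

    A bounded, feasible LP attains its optimum at a vertex, and in two
    variables every vertex is the intersection of two constraint boundaries.
    Returns None if no feasible vertex exists.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect in a single point
        # Cramer's rule, kept exact with Fractions to avoid rounding issues.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            total = cost[0] * x + cost[1] * y
            if best is None or total < best[0]:
                best = (total, x, y)
    return best


if __name__ == "__main__":
    total, beans, rice = solve_diet_lp(CONSTRAINTS, COST)
    print(f"cost: {float(total):.1f} cents, beans: {float(beans):.2f}, rice: {float(rice):.2f}")
```

With these made-up numbers the cheapest plan lies where the protein and calorie requirements are both exactly met. A realistic model with hundreds of grocery items and many nutrient constraints would need a dedicated LP solver instead of vertex enumeration.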
A community focused approach toward making healthy and affordable daily diet recommendations. Joe Germino, Annalisa Szymanski, Ronald Metoyer, Nitesh V. Chawla. Frontiers in Big Data. Pub Date: 2023-11-06. DOI: 10.3389/fdata.2023.1086212 (https://doi.org/10.3389/fdata.2023.1086212)
Text reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, for example, the subsequent reception of influential texts or the evolving publication practices of historical media. This research is often supported by interactive visualizations that highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, to our knowledge the first interface that integrates text reuse data with other forms of semantic enrichment to enable versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, and language and document type detection, as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data, and we present the prototype interface together with the results of a user evaluation.
impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers. Marten Düring, Matteo Romanello, Maud Ehrmann, Kaspar Beelen, Daniele Guido, Brecht Deseure, Estelle Bunout, Jana Keck, Petros Apostolopoulos. Frontiers in Big Data. Pub Date: 2023-11-03. DOI: 10.3389/fdata.2023.1249469 (https://doi.org/10.3389/fdata.2023.1249469)