Pub Date: 2023-11-17 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1243559
Deep learning estimation of northern hemisphere soil freeze-thaw dynamics using satellite multi-frequency microwave brightness temperature observations
Kellen Donahue, John S Kimball, Jinyang Du, Fredrick Bunt, Andreas Colliander, Mahta Moghaddam, Jesse Johnson, Youngwook Kim, Michael A Rawlins
Satellite microwave sensors are well suited for monitoring landscape freeze-thaw (FT) transitions owing to the strong brightness temperature (TB) or backscatter response to changes in liquid water abundance between predominantly frozen and thawed conditions. The FT retrieval is also a sensitive climate indicator with strong biophysical importance. However, retrieval algorithms can have difficulty distinguishing the FT status of soils from that of overlying features such as snow and vegetation, and variable land conditions can also degrade performance. Here, we applied a deep learning model using a multilayer convolutional neural network driven by AMSR2 and SMAP TB records and trained on surface (~0-5 cm depth) soil temperature FT observations. Soil FT states were classified for local morning (6 a.m.) and evening (6 p.m.) conditions corresponding to SMAP descending and ascending orbital overpasses, mapped to a 9 km polar grid spanning a five-year (2016-2020) record and a Northern Hemisphere domain. Continuous estimates of the probability of frozen or thawed conditions were derived using a model cost function optimized against FT observational training data. Model results derived using combined multi-frequency (1.4, 18.7, 36.5 GHz) TBs produced higher soil FT accuracy than models derived using only single-sensor or single-frequency TB inputs. Moreover, SMAP L-band (1.4 GHz) TBs provided enhanced soil FT information and a performance gain over model results derived using only AMSR2 TB inputs. The resulting soil FT classification showed favorable and consistent performance against soil FT observations from ERA5 reanalysis (mean percent accuracy, MPA: 92.7%) and in situ weather stations (MPA: 91.0%). Soil FT accuracy was generally consistent between morning and evening predictions and across different land covers and seasons. The model also showed better FT accuracy than ERA5 against regional weather station measurements (91.0% vs. 86.1% MPA). However, model confidence was lower in complex terrain where FT spatial heterogeneity was likely beneath the effective model grain size. Our results provide a high level of precision in mapping soil FT dynamics to improve understanding of complex seasonal transitions and their influence on ecological processes and climate feedbacks, with the potential to inform Earth system model predictions.
Pub Date: 2023-11-14 | DOI: 10.3389/fdata.2023.1274831
Integrating geometries of ReLU feedforward neural networks
Yajing Liu, Turgay Caglar, Christopher Peterson, Michael Kirby
This paper investigates the integration of multiple geometries present within a ReLU-based neural network. A ReLU neural network determines a piecewise affine linear continuous map, M, from an input space ℝ^m to an output space ℝ^n. The piecewise behavior corresponds to a polyhedral decomposition of ℝ^m. Each polyhedron in the decomposition can be labeled with a binary vector (whose length equals the number of ReLU nodes in the network) and with an affine linear function (which agrees with M when restricted to points in the polyhedron). We develop a toolbox that calculates the binary vector of the polyhedron containing a given data point with respect to a given ReLU FFNN. We use this binary vector to derive bounding facets for the corresponding polyhedron, extract "active" bits within the binary vector, enumerate neighboring binary vectors, and visualize the polyhedral decomposition (Python code is available at https://github.com/cglrtrgy/GoL_Toolbox). Polyhedra in the polyhedral decomposition of ℝ^m are neighbors if they share a facet, and binary vectors for neighboring polyhedra differ in exactly one bit. Using the toolbox, we analyze the Hamming distance between the binary vectors for polyhedra containing points from adversarial/non-adversarial datasets, revealing distinct geometric properties. A bisection method is employed to identify sample points with a Hamming distance of 1 along the shortest Euclidean-distance path, facilitating analysis of the local interplay between Euclidean geometry and the polyhedral decomposition along the path. Additionally, we study the distribution of Chebyshev centers and related radii across different polyhedra, shedding light on polyhedral shape, size, and clustering, and aiding the understanding of decision boundaries.
Pub Date: 2023-11-09 | DOI: 10.3389/fdata.2023.1240660
Towards an understanding of global brain data governance: ethical positions that underpin global brain data governance discourse
Damian Eke, Paschal Ochang, Bernd Carsten Stahl
Introduction: The study of the brain continues to generate substantial volumes of data, commonly referred to as "big brain data," which serve various purposes such as the treatment of brain-related diseases, the development of neurotechnological devices, and the training of algorithms. Big brain data generated in different jurisdictions are subject to distinct ethical and legal principles, giving rise to various ethical and legal concerns during collaborative efforts. Understanding these principles and concerns is crucial, as it catalyzes the development of a global governance framework that is currently lacking in this field. While prior research has advocated a contextual examination of brain data governance, such studies have been limited, and numerous challenges, issues, and concerns surround the development of a contextually informed brain data governance framework. This study therefore aims to bridge these gaps by exploring the ethical foundations that underlie contextual stakeholder discussions on brain data governance.
Method: We conducted a secondary analysis of interviews with 21 neuroscientists drawn from the International Brain Initiative (IBI), the LATBrain Initiative, and the Society of Neuroscientists of Africa (SONA) who are involved in various brain projects globally, analyzing them through the lens of ethical theories. Ethical theories provide the philosophical frameworks and principles that inform the development and implementation of data governance policies and practices.
Results: The study revealed various contextual ethical positions that underscore the ethical perspectives of neuroscientists engaged in brain data research globally.
Discussion: This research highlights the multitude of challenges and deliberations inherent in the pursuit of a globally informed framework for governing brain data, and sheds light on several critical considerations that require thorough examination to advance global brain data governance.
Pub Date: 2023-11-06 | DOI: 10.3389/fdata.2023.1086212
A community focused approach toward making healthy and affordable daily diet recommendations
Joe Germino, Annalisa Szymanski, Ronald Metoyer, Nitesh V. Chawla
Introduction: Maintaining an affordable and nutritious diet can be challenging, especially for those living in poverty. To fulfill a healthy diet, consumers must make difficult decisions within a complicated food landscape. Decisions must factor in information on health and budget constraints, the food supply and pricing options at local grocery stores, and nutrition and portion guidelines provided by government services. Information to support food choice decisions is often inconsistent and hard to find, making it difficult for consumers to make informed, optimal decisions. This is especially true for low-income and Supplemental Nutrition Assistance Program (SNAP) households, which face additional time and cost constraints that impact their food purchases and ultimately leave them more susceptible to malnutrition and obesity. The goal of this paper is to demonstrate how the integration of data from local grocery stores and federal government databases can be used to assist specific communities in meeting their unique health and budget challenges.
Methods: We discuss many of the challenges of integrating multiple data sources, such as inconsistent data availability and misleading nutrition labels. We conduct a case study using linear programming to identify a healthy meal plan that stays within a limited SNAP budget while adhering to the Dietary Guidelines for Americans. Finally, we explore the main drivers of cost of local food products, with emphasis on the nutrients the USDA has designated as areas of focus: added sugars, saturated fat, and sodium.
Results and discussion: Our case study results suggest that such an optimization model can be used to facilitate food purchasing decisions within a given community. By focusing on the community level, our results will inform future work navigating the complex networks of food information to build global recommendation systems.
Pub Date: 2023-11-03 | DOI: 10.3389/fdata.2023.1249469
impresso Text Reuse at Scale. An interface for the exploration of text reuse data in semantically enriched historical newspapers
Marten Düring, Matteo Romanello, Maud Ehrmann, Kaspar Beelen, Daniele Guido, Brecht Deseure, Estelle Bunout, Jana Keck, Petros Apostolopoulos
Text reuse reveals meaningful reiterations of text in large corpora. Humanities researchers use text reuse to study, for example, the posterior reception of influential texts or to reveal evolving publication practices of historical media. This research is often supported by interactive visualizations which highlight relations and differences between text segments. In this paper, we build on earlier work in this domain. We present impresso Text Reuse at Scale, to our knowledge the first interface to integrate text reuse data with other forms of semantic enrichment and thereby enable a versatile and scalable exploration of intertextual relations in historical newspaper corpora. The Text Reuse at Scale interface was developed as part of the impresso project and combines powerful search and filter operations with close and distant reading perspectives. We integrate text reuse data with enrichments derived from topic modeling, named entity recognition and classification, and language and document type detection, as well as a rich set of newspaper metadata. We report on historical research objectives and common user tasks for the analysis of historical text reuse data, and present the prototype interface together with the results of a user evaluation.
Pub Date: 2023-11-03 | DOI: 10.3389/fdata.2023.1291329
Non-invasive detection of anemia using lip mucosa images transfer learning convolutional neural networks
Mohammed Mansour, Turker Berk Donmez, Mustafa Kutlu, Shekhar Mahmud
Anemia is defined as a drop in the number of erythrocytes or in hemoglobin concentration below normal levels in healthy people. The resulting paleness of the skin varies with skin color, and there is currently no quantifiable measurement of it. The pallor is most visible in locations where the cuticle is thin, such as the interior of the mouth, the lips, or the conjunctiva. This work focuses on anemia-related pallor and its relationship to blood count values and artificial intelligence. In this study, a deep learning approach using transfer learning and convolutional neural networks (CNNs) was implemented, in which VGG16, Xception, MobileNet, and ResNet50 architectures were pre-trained to predict anemia from lip mucosa images. A total of 138 volunteers (100 women and 38 men) participated in developing the dataset, which contains two image classes: healthy and anemic. Image processing was first performed on a single frame with only the mouth area visible, data augmentation was performed, and then the CNN models were applied to classify the lip images. Statistical metrics were employed to compare the performance of the models in terms of Accuracy, Precision, Recall, and F1 Score. Among the CNN algorithms used, Xception categorized the lip images with 99.28% accuracy, providing the best results. The other CNN architectures had accuracies of 96.38% for MobileNet, 95.65% for ResNet50, and 92.39% for VGG16. Our findings show that anemia may be diagnosed using deep learning approaches from a single lip image. The dataset will be enhanced in the future to allow for real-time classification.
Pub Date: 2023-11-02 | DOI: 10.3389/fdata.2023.1236397
No longer hype, not yet mainstream? Recalibrating city digital twins' expectations and reality: a case study perspective
Stefano Calzati
While the concept of the digital twin has already consolidated in industry, its spinoff in the urban environment—in the form of a City Digital Twin (CDT)—is more recent. A CDT is a dynamic digital model of the physical city whereby the physical and the digital are integrated in both directions, mutually affecting each other in real time. Replicating the path of smart cities, the literature remarks that agendas and discourses around CDTs remain (1) tech-centered, that is, focused on overcoming technical limitations and lacking a proper sociotechnical contextualization of digital twin technologies; and (2) practice-first, entailing hands-on applications without a long-term strategic governance for the management of these same technologies. Building on that, the goal of this article is to move beyond high-level conceptualizations of the CDT to (a) gain a cognizant understanding of what a CDT can do, how, and for whom; and (b) map the current state of development and implementation of CDTs in Europe. This is done by looking at three case studies—Dublin, Helsinki, and Rotterdam—often considered successful examples of CDTs in Europe. Through existing literature and official documents, as well as primary interviews with tech experts and local officials, the article explores the maturity of these CDTs along Gartner's hype-mainstream curve of technological innovations. Findings show that, while all three municipalities have long-term plans to deliver an integrated, cyber-physical, real-time modeling of the city, their CDTs are currently still at an early stage of development. The focus remains on technical barriers—e.g., the integration of different data sources—overlooking the societal dimension, such as the systematic involvement of citizens. As for governance, all cases embrace a multistakeholder approach; yet CDTs are still not used for policymaking, and it remains to be seen how power will be distributed across stakeholders in terms of access to, control of, and decisions about CDTs.
Pub Date: 2023-10-31 | DOI: 10.3389/fdata.2023.1292923
A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services
Lesia Mochurad, Andrii Sydor, Oleh Ratinskiy
Introduction: Streaming services are highly popular today; millions of people watch live streams or videos and listen to music.
Methods: One of the most popular streaming platforms is Twitch, and data from this type of service is a good example for applying the parallel DBSCAN algorithm proposed in this paper. Unlike the classical approach to neighbor search, the proposed one avoids redundancy, i.e., repetition of the same calculations. The algorithm is based on the classical DBSCAN method with a full search for all neighbors, parallelized by subtasks using the OpenMP parallel computing technology.
Results: Without reducing accuracy, we managed to speed up the DBSCAN-based solution when analyzing medium-sized data. As a result, the acceleration rate tends toward the number of cores of a multicore computer system and the efficiency toward one.
Discussion: Before conducting numerical experiments, theoretical estimates of speed-up and efficiency were obtained, and they aligned with the experimental results, confirming their validity. The quality of the clustering was verified using the silhouette value. All experiments were conducted using different percentages of medium-sized datasets. The proposed algorithm has prospective applications in fields such as advertising, marketing, cybersecurity, and sociology. Notably, datasets of this kind are often used for detecting fraud on the Internet, making an algorithm capable of considering all neighbors a useful tool for such research.
Pub Date: 2023-10-30 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1281614
An overview of video recommender systems: state-of-the-art and research issues
Sebastian Lubos, Alexander Felfernig, Markus Tautschnig
Video platforms have become indispensable components within a diverse range of applications, serving various purposes in entertainment, e-learning, corporate training, online documentation, and news provision. As the volume and complexity of video content continue to grow, personalized access features become an inevitable requirement for efficient content consumption. To address this need, recommender systems have emerged as helpful tools providing personalized video access. By leveraging past user-specific video consumption data and the preferences of similar users, these systems excel at recommending videos that are highly relevant to individual users. This article presents a comprehensive overview of the current state of video recommender systems (VRS), exploring the algorithms used, their applications, and related aspects. In addition to an in-depth analysis of existing approaches, this review also addresses unresolved research challenges within this domain. These unexplored areas offer exciting opportunities for advancements and innovations aimed at enhancing the accuracy and effectiveness of personalized video recommendations. Overall, this article serves as a valuable resource for researchers, practitioners, and stakeholders in the video domain, offering insights into cutting-edge algorithms, successful applications, and areas that merit further exploration to advance the field of video recommendation.
Pub Date: 2023-10-30 | eCollection Date: 2023-01-01 | DOI: 10.3389/fdata.2023.1284511
Recommender systems for sustainability: overview and research issues
Alexander Felfernig, Manfred Wundara, Thi Ngoc Trang Tran, Seda Polat-Erdeniz, Sebastian Lubos, Merfat El Mansi, Damian Garber, Viet-Man Le
Sustainable Development Goals (SDGs) are regarded as a universal call to action with the overall objectives of protecting the planet, ending poverty, and ensuring peace and prosperity for all people. Different AI technologies play a major role in achieving these objectives. Specifically, recommender systems can support organizations and individuals in achieving the defined goals. Recommender systems integrate AI technologies such as machine learning, explainable AI (XAI), case-based reasoning, and constraint solving to find and explain user-relevant alternatives from a potentially large set of options. In this article, we summarize the state of the art in applying recommender systems to support the achievement of the SDGs, and in this context we discuss open issues for future research.