Pub Date: 2023-10-06 DOI: 10.1007/s41060-023-00432-6
Laura Pollacci, Letizia Milli, Tuba Bircan, Giulio Rossetti
Abstract Understanding the careers and movements of highly skilled people plays an ever-increasing role in today’s global knowledge-based economy. Researchers and academics are sources of innovation and development for governments and institutions. Our study uses science-related data to track career evolution and researchers’ movements over time. To this end, we define the Yearly Degree of Collaborations Index, which measures the annual tendency of researchers to collaborate intra-nationally, and two scores that measure mobility into and out of countries, as well as the balance between them.
{"title":"Academic mobility from a big data perspective","authors":"Laura Pollacci, Letizia Milli, Tuba Bircan, Giulio Rossetti","doi":"10.1007/s41060-023-00432-6","DOIUrl":"https://doi.org/10.1007/s41060-023-00432-6","url":null,"abstract":"Abstract Understanding the careers and movements of highly skilled people plays an ever-increasing role in today’s global knowledge-based economy. Researchers and academics are sources of innovation and development for governments and institutions. Our study uses scientific-related data to track careers evolution and Researchers’ movements over time. To this end, we define the Yearly Degree of Collaborations Index, which measures the annual tendency of researchers to collaborate intra-nationally, and two scores to measure the mobility in and out of countries, as well as their balance.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135350523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-05 DOI: 10.1007/s41060-023-00461-1
Mukesh Soni, Mohammad Shabaz, Renato R. Maaliw, Ismail Keshta, Rasool Altaee, Sanju Das
{"title":"Cloud-based non-invasive cognitive breath monitoring system for patients in health-care system","authors":"Mukesh Soni, Mohammad Shabaz, Renato R. Maaliw, Ismail Keshta, Rasool Altaee, Sanju Das","doi":"10.1007/s41060-023-00461-1","DOIUrl":"https://doi.org/10.1007/s41060-023-00461-1","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134975613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-04 DOI: 10.1007/s41060-023-00455-z
Regina Bispo, Francisca G. Vieira, Clara Yokochi, Filipe J. Marques, Pedro Espadinha-Cruz, Alexandre Penha, António Grilo
Abstract Fire stations (FS) are typically non-uniformly distributed across space, and their service areas are, in general, defined based on administrative boundaries. Since the location of FS may considerably influence the readiness and effectiveness of the services provided, national and regional governments need research-based information to adequately plan where to establish firefighting facilities. In this study, we propose a method to reconfigure the fire station layout using spatial point process models, clustering and space partitioning. First, modelling the variation of fire intensity across space through a point process model enables the process to be replicated independently by simulation. Subsequently, for each simulation, the k-means algorithm is used to define a siting location, minimizing the total within-cluster distance between the fire occurrences and the new position. This yields a set of locations from which the respective distribution is inferred. Assuming a bivariate normal spatial distribution, we further define confidence siting regions. Ultimately, new FS service areas are defined by Voronoi tessellation. To exemplify the method, we apply it to reconfigure the fire station layout in Aveiro, Portugal.
{"title":"Using spatial point process models, clustering and space partitioning to reconfigure fire stations layout","authors":"Regina Bispo, Francisca G. Vieira, Clara Yokochi, Filipe J. Marques, Pedro Espadinha-Cruz, Alexandre Penha, António Grilo","doi":"10.1007/s41060-023-00455-z","DOIUrl":"https://doi.org/10.1007/s41060-023-00455-z","url":null,"abstract":"Abstract Fire stations (FS) are typically non-uniformly distributed across space, and their service area is, in general, defined based on administrative boundaries. Since the location of FS may considerably influence the readiness and the effectiveness of the provided services, national and regional governments need research-based information to adequately plan where to establish firefighting facilities. In this study, we propose a method to reconfigure the fire stations layout using spatial point process models, clustering and space partitioning. First, modelling fire intensity variation across space through a point process model enables to replicate the process independently by simulation. Subsequently, for each simulation, the k-means algorithm is used to define a siting location, minimizing the total within distance between the fire occurrences and the new position. This method allows to obtain a set of locations from which the respective distribution is inferred. Assuming a bivariate normal spatial distribution, we further define confidence siting regions. Ultimately, new FS service areas are defined by Voronoi tessellation. To exemplify the application of the method, we apply it to reconfigure the fire station layout at Aveiro, Portugal.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135591365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-01 DOI: 10.1007/s41060-023-00465-x
Carson K. Leung, Gabriella Pasi, Li Wang
Big data have become a core technology for providing innovative solutions in numerical applications and services in many fields. Embedded in these big data are valuable information and knowledge. This calls for data science and analytics, which has emerged as an important paradigm for driving the new economy and domains (e.g., Internet of Things, social and mobile networks, cloud computing), reforming classic disciplines (e.g., telecommunications, biology, health and social science), and upgrading core business and economic activity. In this article, we focus on both theoretical and practical data science and analytics. We summarize and highlight some of its challenges and solutions, which are covered in the eight articles in the current Special Issue on "theoretical and practical data science and analytics."
{"title":"Theoretical and practical data science and analytics: challenges and solutions","authors":"Carson K. Leung, Gabriella Pasi, Li Wang","doi":"10.1007/s41060-023-00465-x","DOIUrl":"https://doi.org/10.1007/s41060-023-00465-x","url":null,"abstract":"Big data have become a core technology for providing innovative solutions in numerical applications and services in many fields. Embedded in these big data is valuable information and knowledge. This calls for data science and analytics, which has emerged as an important paradigm for driving the new economy and domains (e.g., Internet of Things, social and mobile networks, cloud computing), reforming classic disciplines (e.g., telecommunications, biology, health and social science), as well as upgrading core business and economic activity. In this article, we focus on both theoretical and practical data science and analytics. We summarize and highlight some of its challenges and solutions, which are covered in the eight articles in the current Special Issue on \"theoretical and practical data science and analytics.\"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135568940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-09-30 DOI: 10.1007/s41060-023-00454-0
Hassan Abedi Firouzjaei
Abstract In recent years, online question–answer (Q&A) platforms, such as Stack Exchange (SE), have become increasingly popular for information and knowledge sharing. Despite the vast amount of information available on these platforms, many questions remain unresolved. In this work, we aim to address this issue by proposing a novel approach to identify unresolved questions in SE Q&A communities. Our approach utilises the graph structure of communication formed around a question by users to model the communication network surrounding it. We employ a property graph model and graph neural networks (GNNs), which can effectively capture both the structure of communication and the content of messages exchanged among users. By leveraging the power of graph representation and GNNs, our approach can effectively identify unresolved questions in SE communities. Experimental results on the complete historical data from three distinct Q&A communities demonstrate the superiority of our proposed approach over baseline methods that only consider the content of questions. Finally, our work represents a first but important step towards better understanding the factors that can affect questions becoming and remaining unresolved in SE communities.
{"title":"A deep learning-based approach for identifying unresolved questions on Stack Exchange Q&A communities through graph-based communication modelling","authors":"Hassan Abedi Firouzjaei","doi":"10.1007/s41060-023-00454-0","DOIUrl":"https://doi.org/10.1007/s41060-023-00454-0","url":null,"abstract":"Abstract In recent years, online question–answer (Q&A) platforms, such as Stack Exchange (SE), have become increasingly popular for information and knowledge sharing. Despite the vast amount of information available on these platforms, many questions remain unresolved. In this work, we aim to address this issue by proposing a novel approach to identify unresolved questions in SE Q&A communities. Our approach utilises the graph structure of communication formed around a question by users to model the communication network surrounding it. We employ a property graph model and graph neural networks (GNNs), which can effectively capture both the structure of communication and the content of messages exchanged among users. By leveraging the power of graph representation and GNNs, our approach can effectively identify unresolved questions in SE communities. Experimental results on the complete historical data from three distinct Q&A communities demonstrate the superiority of our proposed approach over baseline methods that only consider the content of questions. Finally, our work represents a first but important step towards better understanding the factors that can affect questions becoming and remaining unresolved in SE communities.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136341742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Duo satellite-based remotely sensed land surface temperature prediction by various methods of machine learning","authors":"Shivam Chauhan, Ajay Singh Jethoo, Ajay Mishra, Vaibhav Varshney","doi":"10.1007/s41060-023-00459-9","DOIUrl":"https://doi.org/10.1007/s41060-023-00459-9","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-09-25 DOI: 10.1007/s41060-023-00453-1
Muneeb Ahmad Wani, Peer Bilal Ahmad, Bilal Ahmad Para, Na Elah
{"title":"A new regression model for count data with applications to health care data","authors":"Muneeb Ahmad Wani, Peer Bilal Ahmad, Bilal Ahmad Para, Na Elah","doi":"10.1007/s41060-023-00453-1","DOIUrl":"https://doi.org/10.1007/s41060-023-00453-1","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135816989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-09-25 DOI: 10.1007/s41060-023-00452-2
Stefan Bloemheuvel, Jurgen van den Hoogen, Martin Atzmueller
Abstract Graph neural networks (GNNs) have proven to be an indispensable approach in modeling complex data, in particular spatial temporal data, e.g., sensor data given as time series with corresponding spatial information. Although GNNs provide powerful modeling capabilities on this kind of data, they require adequate input data in terms of both the signal and the underlying graph structure. However, the corresponding graphs are typically not automatically available or predefined, so an ad hoc graph representation needs to be constructed. Yet the construction of the underlying graph structure is often given insufficient attention. Therefore, this paper performs an in-depth analysis of several methods for constructing graphs from a set of sensors attributed with spatial information, i.e., geographical coordinates, or using their respective attached signal data. We apply a diverse set of standard methods for estimating groups and similarities between graph nodes, both location-based and signal-driven, on multiple benchmark datasets for evaluation and assessment. For both areas, we specifically include distance-based, clustering-based, and correlation-based approaches for estimating the relationships between nodes for subsequent graph construction. In addition, we consider two different GNN approaches, i.e., regression and forecasting, to enable a broader experimental assessment. Because typically no predefined graph is given, (ad hoc) graph creation is necessary. Our results indicate how critical it is to factor the step of graph construction into GNN-based research on spatial temporal data. Overall, in our experiments no single approach for graph construction emerged as a clear winner. However, based on the obtained results, our analysis provides specific indications for a specific class of methods. Collectively, the findings highlight the need for researchers to carefully consider graph construction when employing GNNs in the analysis of spatial temporal data.
{"title":"Graph construction on complex spatiotemporal data for enhancing graph neural network-based approaches","authors":"Stefan Bloemheuvel, Jurgen van den Hoogen, Martin Atzmueller","doi":"10.1007/s41060-023-00452-2","DOIUrl":"https://doi.org/10.1007/s41060-023-00452-2","url":null,"abstract":"Abstract Graph neural networks (GNNs) haven proven to be an indispensable approach in modeling complex data, in particular spatial temporal data, e.g., relating to sensor data given as time series with according spatial information. Although GNNs provide powerful modeling capabilities on such kind of data, they require adequate input data in terms of both signal and the underlying graph structures. However, typically the according graphs are not automatically available or even predefined, such that typically an ad hoc graph representation needs to be constructed. However, often the construction of the underlying graph structure is given insufficient attention. Therefore, this paper performs an in-depth analysis of several methods for constructing graphs from a set of sensors attributed with spatial information, i.e., geographical coordinates, or using their respective attached signal data. We apply a diverse set of standard methods for estimating groups and similarities between graph nodes as location-based as well as signal-driven approaches on multiple benchmark datasets for evaluation and assessment. Here, for both areas, we specifically include distance-based, clustering-based, as well as correlation-based approaches for estimating the relationships between nodes for subsequent graph construction. In addition, we consider two different GNN approaches, i.e., regression and forecasting in order to enable a broader experimental assessment. Typically, no predefined graph is given, such that (ad hoc) graph creation is necessary. Here, our results indicate the criticality of factoring in the crucial step of graph construction into GNN-based research on spatial temporal data. Overall, in our experimentation no single approach for graph construction emerged as a clear winner. However, in our analysis we are able to provide specific indications based on the obtained results, for a specific class of methods. Collectively, the findings highlight the need for researchers to carefully consider graph construction when employing GNNs in the analysis of spatial temporal data.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135816920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning approach to predict geomechanical properties of rocks from well logs","authors":"None Rohit, Shri Ram Manda, Aditya Raj, Nagababu Andraju","doi":"10.1007/s41060-023-00451-3","DOIUrl":"https://doi.org/10.1007/s41060-023-00451-3","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136154592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new generalization of the zero-truncated negative binomial distribution by a Lagrange expansion with associated regression model and applications","authors":"Mohanan Monisha, Radhakumari Maya, Muhammed Rasheed Irshad, Christophe Chesneau, Damodaran Santhamani Shibu","doi":"10.1007/s41060-023-00449-x","DOIUrl":"https://doi.org/10.1007/s41060-023-00449-x","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135307743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}