Data and information management: latest articles

Patterns in paradata preferences among the makers and reusers of archaeological data
Pub Date : 2024-12-01 DOI: 10.1016/j.dim.2024.100077
Isto Huvila, Lisa Andersson, Olle Sköld
Knowledge of data reusers' and makers' preferences for data that describe processes and practices (paradata) remains limited, especially concerning broader patterns of such priorities. The aim of this study is to address this gap. Drawing on an exploratory factor analysis of a survey of makers and users of archaeological data, the study investigates 1) what patterns related to types of informational content can be identified in data makers' and users' views of the usefulness of specific types of paradata, 2) how the patterns differ between data makers and users, and 3) how the patterns can be explained in terms of information needs and preferences. The findings show that paradata preferences are patterned and that data makers' and data users' ideas of what is useful differ. However, the differences are limited to details that make data-related processes and practices understandable and do not extend to the broader patterns of what types of information are needed. We identified five broad categories of uses for paradata (Data collection procedures and tools, Data in context, Standards and guidelines, Credentials, Data processing) and corresponding, applicable types of paradata. The findings also point to indicative possibilities of linking paradata preferences to orientational, contextualising and content-oriented data practices. From a practical perspective, this study underlines the importance of approaching paradata not as a monolith but rather as an arrangement that is structured by different understandings of (para)data and how it is acted upon. Instead of caring for paradata in general, it is crucial to engage with specific types of paradata for different data practices.
Keywords: paradata, archaeology, data management, data reuse, research data management.
Data and information management, Volume 8, Issue 4, Article 100077
Citations: 0
Human-AI interaction research agenda: A user-centered perspective
Pub Date : 2024-12-01 DOI: 10.1016/j.dim.2024.100078
Tingting Jiang, Zhumo Sun, Shiting Fu, Yan Lv
The rapid growth of artificial intelligence (AI) has given rise to the field of Human-AI Interaction (HAII). This study meticulously reviewed the research themes, theoretical foundations, and methodological frameworks of the HAII field, aiming to construct a comprehensive overview of this field and provide robust support for future investigations. HAII research themes include human-AI collaboration, competition, conflict, and symbiosis. Theories drawn from communication, psychology, and sociology support these studies, while the employed methods include both self-reporting and observational approaches commonly utilized in user studies. It is suggested that future research should broaden its focus to encompass diverse user groups, AI roles, and tasks. Moreover, it is necessary to develop multi-disciplinary theories and integrate multi-level research methods to support the sustained development of the field. This study not only furnishes indispensable theoretical and practical insights for forthcoming research endeavors but also catalyzes the realization of a future distinguished by seamless interaction between humans and AI.
Data and information management, Volume 8, Issue 4, Article 100078
Citations: 0
Impact factor does not predict long-term article impact across 15 journals
Pub Date : 2024-12-01 DOI: 10.1016/j.dim.2024.100079
Bjarne Bartlett, Carter M. Zamora, Jon-Paul Bingham, Amy S Ebesu Hubbard, Michelle Tseng, Bryan Runck, Michael Kantar
Academic journals are ranked using a variety of methods with the most common metric being ‘journal impact factor’. Authors who publish in journals with higher impact factors are deemed to contribute more to their discipline. However, the impact factor of a journal does not indicate how long a specific article stays in the scientific discourse, and metrics that measure the length of time articles within a journal continue to be cited are not typically used. We examined citations of 443,732 research articles [786,064 total] between 1980 and 2020 across 15 journals. We explored the range of longevity values found across different journals as well as the relationship between impact factor and longevity. We found no relationship between impact factor and longevity, indicating that immediate attention to an article is not correlated with longer-term impact. In the set of journals that we examined, articles published in some journals (e.g., Ecology, Genetics) continued to be cited at a steady rate long beyond their initial publication date. This slow but steady citation accumulation resulted in the total citations in these journals approaching those of higher impact journals (e.g., Science, Nature) within the length of a typical academic career (30–40 years).
Data and information management, Volume 8, Issue 4, Article 100079
Citations: 0
How readers attentive and inattentive to task-related information scan: An eye-tracking study
Pub Date : 2024-12-01 DOI: 10.1016/j.dim.2024.100073
Jing Chen, Lu Zhang, Quan Lu

Purpose

Attending to task-related information should be the highest priority for all readers engaged in task-reading. Comparing the scanning behaviors of attentive and inattentive readers can shed new light on their sequential cognitive processes, but this has seldom been studied. This study investigates the readers' global patterns, which span the whole length of scanpaths, and further compares local tactics, local strategies, and local strategy transitions, which relate to isolated regions of a scanpath.

Design/methodology/approach

An eye-tracking experiment was designed around a regular-style reading system with question, navigation, and text areas on its interface, and two types of task, namely fact-finding (FF) and content understanding (CU). Twenty-four participants were placed into attentive reader (AR) or inattentive reader (IAR) groups according to their fixation duration on task-related paragraphs. A global sequence analysis algorithm, Needleman-Wunsch, was applied to uncover global patterns across the whole length of scanpaths (whole-scanpaths). A local sequence analysis method based on frequent sub-scanpaths was adopted to extract local tactics specific to the reader and task. Coding was performed to identify local strategies by classifying local tactics. A local strategy transition was further identified as a sequence of frequent local strategies at the beginning, middle, and ending phases.
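
As a rough illustration of the global alignment step, the sketch below applies Needleman-Wunsch to two scanpaths encoded as strings of area-of-interest letters. The encoding (Q for the question area, N for the navigating area, T for the text area), the scoring values, and the example scanpaths are assumptions made here for illustration; the abstract does not state the scoring scheme the authors used.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two symbols
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], out_a[::-1], out_b[::-1]


# Hypothetical scanpaths: one area-of-interest letter per fixation.
reader_1 = "QNTTQ"
reader_2 = "QTTQ"
score, aligned_1, aligned_2 = needleman_wunsch(reader_1, reader_2)
print(score, "".join(aligned_1), "".join(aligned_2))
```

A higher score means two whole-scanpaths visit the interface areas in a more similar order; pairwise scores of this kind could then be compared within and between the AR and IAR groups.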

Findings

Whole-scanpaths of AR significantly differed from those of IAR, despite the absence of global patterns for each group. Five types of local strategy were identified, namely locating information (LI), evaluating and verifying text relevance (EVR), navigation heuristics (NH), synthesizing information (SI), and contextual clues (CC). AR applied all types in both tasks, whereas IAR applied only two types and stuck with EVR. Furthermore, two types of local strategy transition were identified: comprehensive exploration and iterative content evaluation. AR employed the former with the linear feature in FF and the spiral feature in CU, while IAR employed the latter in both tasks.

Originality

This study advances the knowledge of dynamic cognitive processing by contrasting readers who are attentive and inattentive to task-related information. It provides an objective analysis approach for obtaining global patterns, local tactics, local strategies, and local strategy transitions, which can offer new insights into automatically classifying readers. The results also generate detailed and valuable guidance for improving reading system design and training readers.
Data and information management, Volume 8, Issue 4, Article 100073
Citations: 0
Empirical insights into the interaction effects of groups at high risk of depression on online social platforms with NLP-based sentiment analysis
Pub Date : 2024-12-01 DOI: 10.1016/j.dim.2024.100080
Yi Xiao, Yutong Yang, Haozhe Xu, Shijuan Li
With the proliferation of digital technology and the increasing prevalence of social media, some users at high risk of depression have opted to seek solace, acceptance, and assistance in online communities. However, the extant research is deficient in terms of the segmentation of groups, particularly subcultural groups. By analyzing the “Super Hashtags” and “Tree Hole” groups on Sina Weibo from January to March 2023 using a crawler and the ERNIE 3.0-Base model for sentiment analysis, the study uncovers distinct sentiment profiles and interaction patterns, revealing significant correlations between interaction metrics and sentiment levels. The findings indicate that while there are no significant differences in sentiment levels between the two communities, the “Tree Hole” community exhibits greater sentiment variability. Moreover, the study identifies that interaction behaviors are closely linked to sentiment states, emphasizing the importance of understanding the complex dynamics between online interactions and mental well-being. These insights contribute to the development of more effective support mechanisms within online platforms for individuals at risk of depression.
Data and information management, Volume 8, Issue 4, Article 100080
Citations: 0
Erratum regarding missing Declaration of Competing Interest statements in previously published articles (Volume 6, Issues 1–4)
Pub Date : 2024-09-18 DOI: 10.1016/j.dim.2024.100085
Data and information management, Volume 8, Issue 4, Article 100085
Citations: 0
Responsibility toward society: A review and prospect of Savolainen's everyday information practice
Pub Date : 2024-09-01 DOI: 10.1016/j.dim.2024.100070

The emphasis on social phenomena that defines the Everyday Information Practice (EIP) domain sets it apart from information behavior fields. This study highlights the importance of researching everyday information practices in contemporary social-cultural contexts by using Savolainen's EIP-related models as examples. A synopsis of the characteristics of earlier studies in terms of research contexts, participants, research questions, and research methods was created by evaluating the pertinent studies using EIP-related models. A trend of social responsibility-focused EIP research was presented, along with recommendations for future research in the field of EIP from the perspectives of participants and research methods.

Data and information management, Volume 8, Issue 3, Article 100070. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2543925124000068/pdfft?md5=5a4d2516d88e2f08572ccfc67abb9576&pid=1-s2.0-S2543925124000068-main.pdf
Citations: 0
Does internet use affect public risk perception? — From the perspective of political participation
Pub Date : 2024-09-01 DOI: 10.1016/j.dim.2023.100059

Internet use has resulted in the flow and interweaving of risks and increased the difficulty of risk governance. Strengthening research on public risk perception can not only make up for the shortcomings of traditional government-centered risk governance research but also improve the capacity for risk governance. Using data from the Chinese Social Survey (CSS) and a mediation test run with the PROCESS plug-in for SPSS, this paper explores the mechanism by which Internet use influences public risk perception, as well as the mediating effect of different types of political participation. The results show that Internet use has a significantly positive impact on comprehensive public risk perception. Network political participation significantly enhances public risk perception, while traditional political participation significantly reduces it. In addition, network political participation plays a mediating role in the relationship between Internet use and public risk perception.
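
The mediation logic described above can be sketched with the product-of-coefficients approach, where the indirect effect is the path from the predictor to the mediator (a) multiplied by the path from the mediator to the outcome (b). This is only a minimal illustration: it is not the PROCESS plug-in, it omits the bootstrap confidence intervals such tools typically report, and the variable names and numbers are synthetic assumptions, not values from the CSS.

```python
def cross(u, v):
    """Sum of centered cross-products of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_mediation(x, m, y):
    """Ordinary least squares estimates for a single-mediator model."""
    sxx, smm = cross(x, x), cross(m, m)
    sxm, sxy, smy = cross(x, m), cross(x, y), cross(m, y)
    a = sxm / sxx        # path a: slope of m regressed on x
    total = sxy / sxx    # total effect: slope of y regressed on x alone
    # Regress y on x and m together by solving the 2x2 normal equations.
    det = sxx * smm - sxm ** 2
    direct = (smm * sxy - sxm * smy) / det   # coefficient of x
    b = (sxx * smy - sxm * sxy) / det        # path b: coefficient of m
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Synthetic, illustrative data only (hypothetical variable names).
internet_use = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
network_participation = [1.0, 0.0, 2.0, 3.0, 3.0, 5.0]
risk_perception = [0.5 * x + 2.0 * m
                   for x, m in zip(internet_use, network_participation)]

effects = simple_mediation(internet_use, network_participation, risk_perception)
print(effects)
```

In ordinary least squares the total effect equals the direct effect plus the indirect effect, which gives a quick consistency check on any such decomposition.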

Data and information management, Volume 8, Issue 3, Article 100059. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2543925123000335/pdfft?md5=963a1f5301e23f6905ef0c6d6fe962ed&pid=1-s2.0-S2543925123000335-main.pdf
Citations: 0
Adaptive K-means clustering based under-sampling methods to solve the class imbalance problem
Pub Date : 2024-09-01 DOI: 10.1016/j.dim.2023.100064

Class imbalance is a common problem in machine learning. It refers to an imbalance in the quantity of data collected, where one class has significantly more samples than another, which can negatively affect the classification efficiency of algorithms. Under-sampling methods address class imbalance by reducing the quantity of data in the majority class, thereby producing a balanced dataset. Traditional under-sampling methods based on k-means clustering either use one fixed value of k (the number of clusters) for all datasets or determine it directly from the quantity of data in the minority or majority class. This paper proposes an adaptive k-means clustering under-sampling algorithm that calculates an appropriate k for each dataset. After clustering the majority class dataset into k clusters, our algorithm calculates the distances between the data within each cluster and the cluster centroids from two perspectives and selects data based on these distances. Subsequently, the selected subset of the majority class dataset is combined with the minority class dataset to generate a new balanced dataset, which is then used for classification algorithms. The performance of our algorithm is evaluated on 45 datasets. Experimental results demonstrate that our algorithm can dynamically determine an appropriate k for different datasets and output a balanced dataset, thus enhancing the classification efficiency of machine learning algorithms. This work can provide new algorithmic ensemble strategies for addressing the class imbalance problem.
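
To make the general idea concrete, here is a minimal sketch of k-means-based under-sampling in plain Python. It is a simplification under stated assumptions: k is passed in by hand rather than chosen adaptively, and points are kept by a single criterion (closeness to their cluster centroid, taken round-robin across clusters), whereas the paper describes an adaptive k and a two-perspective distance rule that the abstract does not specify. The function names and the toy data are hypothetical.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm with seeded random initial centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = assign(points, centroids)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign(points, centroids)


def undersample_majority(majority, n_keep, k, seed=0):
    """Keep n_keep majority points, taking those nearest each centroid first."""
    n_keep = min(n_keep, len(majority))
    centroids, clusters = kmeans(majority, k, seed=seed)
    ranked = [sorted(members, key=lambda p, c=c: dist2(p, c))
              for c, members in zip(centroids, clusters)]
    kept, rank = [], 0
    while len(kept) < n_keep:
        for members in ranked:  # round-robin so every cluster is represented
            if rank < len(members) and len(kept) < n_keep:
                kept.append(members[rank])
        rank += 1
    return kept


# Toy data: 24 majority points in two grids, 6 minority points.
majority = ([(float(x), float(y)) for x in range(4) for y in range(3)]
            + [(x + 10.0, float(y)) for x in range(4) for y in range(3)])
minority = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (6.0, 6.0), (5.5, 6.0), (6.0, 5.5)]

kept = undersample_majority(majority, n_keep=len(minority), k=2)
balanced = kept + minority
print(len(kept), len(minority))
```

Selecting points near centroids keeps representative majority samples from every cluster, so the reduced set still covers the regions the majority class occupies.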

Data and information management, Volume 8, Issue 3, Article 100064. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2543925123000384/pdfft?md5=25a3920a1a4e803650366fa56c8a9827&pid=1-s2.0-S2543925123000384-main.pdf
Citations: 0
Improved detection of transient events in wide area sky survey using convolutional neural networks
Pub Date : 2024-09-01 DOI: 10.1016/j.dim.2023.100035

Data science aims to keep pace with data-intensive lifestyles and the demand for decision support, which has become common in domains such as medicine, education and other smart solutions. High-quality data analysis is therefore essential for accurate and effective downstream exploitation. This also holds for astronomical surveys such as GOTO (Gravitational-wave Optical Transient Observer), where a large amount of raw data is collected daily. GOTO is one of the recognised projects that search for transient events with a new breed of optical survey telescopes that can survey the sky faster and deeper. This is accomplished by comparing night-specific data with a reference so that new bright sources are obtained for further study. However, the sheer volume of data makes it difficult to sift by eye, so an automated system is required. Many conventional machine-learning models have been sub-optimal for this task, as true positives can hardly be recognised because the data are imbalanced. This motivates the exploration of convolutional neural networks (CNNs) for this binary classification problem. Building on existing technologies, the paper reports the original application of a basic CNN model to a representative dataset that was designed and generated within the GOTO project. In addition to the improvement over previous works, this empirical study includes details of parameter analysis, which will be useful for practice and further investigation.
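The abstract's remark that true positives are hard to recognise in imbalanced data can be made concrete with a small sketch. It does not reproduce the paper's CNN or its GOTO dataset. The label counts below (10 real transients among 1000 candidates) and the helper names `confusion_counts` and `summarize` are hypothetical, chosen only to show why accuracy alone is a misleading score for this kind of binary classification.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Accuracy, recall and precision; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical candidate list: 10 real transients (1) among 1000 detections.
y_true = [1] * 10 + [0] * 990

# A degenerate classifier that labels every candidate as not a transient.
always_negative = [0] * len(y_true)

report = summarize(y_true, always_negative)
print(report)
```

The degenerate classifier scores high accuracy while finding none of the real transients, so recall and precision on the positive class are the more informative measures when comparing models on such data.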

Citations: 0