Journal of the Association for Information Science and Technology: latest articles, page 3

Valuing curation infrastructures

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-13 DOI: 10.1002/asi.70015

Morgan F. Wofford, Andrea K. Thomer, Libby Hemphill, Katherine Polasek, Elizabeth Yakel

This study uses a theoretical lens of infrastructural dimensions to examine stakeholders' perceptions of the value of curation, focusing on the social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). Drawing on 67 interviews with both internal (ICPSR staff) and external (funders, data producers, and reusers) stakeholders, we analyze how value is ascribed to curation across technical, organizational, and social components of infrastructure. We identify five key ways interviewees conceptualized the value of curation infrastructures: supporting sustainability and durability, enabling research efficiency, fostering trust, building community, and advancing data equity. Our findings highlight the role of curation in knowledge generation by reframing curation as infrastructure rather than a set of discrete practices. First, we clarify how transparency operates in dual and sometimes conflicting ways: as both understandability and invisibility, shaping trust in and access to data repositories. Second, we demonstrate how data equity is increasingly perceived by stakeholders as a core infrastructural value, enacted through practices that lower barriers to access. Finally, we surface the persistent challenges in evaluating and funding curation infrastructures due to their long time horizons and often-invisible nature. This work advocates recognizing and funding curation infrastructures as essential for long-term scientific and societal progress.

Vol. 76, No. 12, pp. 1607-1624. PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70015

Citations: 0

Classifying ultra-short scientific texts using a hybrid hierarchical multi-label classification framework

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-13 DOI: 10.1002/asi.70018

Dengsheng Wu, Huidong Wu, Fan Meng, Jianping Li

Scientific text classification is essential for efficiently organizing and assimilating scientific knowledge. However, existing methods struggle to classify ultra-short scientific texts due to their limited content and complex hierarchical labeling. To overcome these challenges, we introduce the BERT-HMCN framework, which combines Bidirectional Encoder Representations from Transformers (BERT) with a Hierarchical Multi-label Classification Network (HMCN). This framework introduces a novel level-fixed fine-tuning strategy that strengthens the connection between text semantics and hierarchical labels, enhancing the representation of ultra-short texts. We evaluated BERT-HMCN's performance on a dataset of 75,065 program titles from the National Natural Science Foundation of China. Our results show that BERT-HMCN outperforms existing models in both overall performance and hierarchical accuracy. We also conducted a comparative analysis with autoregressive large language models (LLMs), illustrating the strengths of each in different contexts. Further analysis confirms the effectiveness and robustness of the BERT-HMCN framework. We discuss its theoretical contributions and practical applications, underscoring the broader implications of these results in scientific text classification and other related fields.

Vol. 76, No. 12, pp. 1625-1646.

Citations: 0

Researchers' data processing descriptions—Understanding paradata creation practices and their underpinning instrumentalities

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-06 DOI: 10.1002/asi.70003

Isto Huvila, Lisa Andersson, Olle Sköld

Researchers increasingly share data, both on their own initiative and as a result of requirements by funding agencies and publishers. For data to be accessible and reusable, it must be understandable. While typical metadata covers rudimentary information about data, data re-users often need more contextual information, including paradata informative of data-related practices and processes. To better understand the practices and types of data descriptions researchers produce, this paper analyzes 33 interviews with researchers and professionals working with archeological data in different capacities. We identified five data description practices: (1) prescribing, (2) keeping track, (3) describing (what was done, i.e., processes; structures, techniques, and methods; principles, rationales, and decisions; and limitations of data), (4) flagging, and (5) publishing, formatting, and making available. Some of these practices evince integrated paradata creation, where paradata generation is tightly incorporated into the enactment of specific research methods; others evince standalone paradata creation, prompted by aspirations to produce specific types of outputs. The findings suggest that the underpinning instrumentalities, and the extent to which paradata creation is integral to research practice, are central when developing means to support paradata generation and when identifying where to find paradata and how to manage it.

Vol. 76, No. 11, pp. 1570-1590. PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70003

Citations: 0

Understanding discrepancies in the coverage of OpenAlex: The case of China

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-06 DOI: 10.1002/asi.70013

Mengxue Zheng, Lili Miao, Yi Bu, Vincent Larivière

Citation indexes play a crucial role in understanding how science is produced, disseminated, and used. However, these databases often face a critical trade-off: those offering extensive and high-quality coverage are typically proprietary, whereas publicly accessible datasets frequently exhibit fragmented coverage and inconsistent data quality. OpenAlex was developed to address this challenge, providing a freely available database with broad open coverage, with a particular emphasis on non-English-speaking countries. Yet, few studies have assessed the quality of the OpenAlex dataset. This paper assesses OpenAlex's coverage of China's papers, which shows an abnormal trend, and compares it with that of other countries that do not have English as their main language. Our analysis reveals that while OpenAlex increases the coverage of China's publications, primarily those disseminated by a national database, this coverage is incomplete and discontinuous when compared to other countries' records in the database. We observe similar issues in other non-English-speaking countries, with coverage varying across regions. These findings indicate that although OpenAlex expands coverage of research outputs, continuity issues persist and disproportionately affect certain countries. We emphasize the need for researchers to use OpenAlex data cautiously, being mindful of its potential limitations in cross-national analyses.

Vol. 76, No. 11, pp. 1591-1601. PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70013

Citations: 0

Quality teamwork on Wikipedia: Interpersonal communication networks as a social capital booster

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-05 DOI: 10.1002/asi.70014

Agnieszka Rychwalska, Szymon Talaga, Karolina Ziembowicz, Dariusz Jemielniak

With the growing ubiquity of virtual collaboration, it is important to understand what contributes to effective online teamwork. In large online social systems, public, task-related discussions are vital for effectiveness, but direct, interpersonal communication may also play a role. We hypothesize that the positive effects of direct communication on online teamwork may be due to its role in building social capital. To test this proposition, we analyzed network properties of interpersonal communication among Wikipedia editors, employing novel measures from network science (Effective Information) to measure social capital. We discovered that groups producing high-quality articles have communication networks that allow for local sharing of knowledge as well as for integration of information among the whole group: a structure promoting high social capital. Our results underscore the importance of direct communication for groups collaborating online and suggest that platforms for such communities should allow for ample one-on-one interactions.

Vol. 76, No. 11, pp. 1553-1569.

Citations: 0

Research data management services in academic libraries to support the research data life cycle: A systematic review. An Annual Review of Information Science and Technology (ARIST) paper

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-05 DOI: 10.1002/asi.70008

Richard Cheng Yong Ho, Suei Nee Wong, Patsy Chia, Chris Tang, Magdeline Tao Tao Ng

Academic libraries play an increasingly crucial role in providing services, information, education, and infrastructure support related to research data management (RDM). This systematic review aims to provide a comprehensive and critical analysis of the state of RDM services offered by academic libraries worldwide. Using systematic review methodology, the paper examines 89 empirical studies to answer four research questions: (1) what types of RDM services academic libraries implement; (2) what infrastructure, workflows, and resources are used to support these services; (3) why libraries implement these RDM services; and (4) how effective, if at all, these RDM services are in supporting the research data life cycle. This review highlights the critical reasons academic libraries provide RDM services and how they implement these services through partnerships, infrastructure, and systems, and by adapting to new workflows within the library. The findings also examine the balance between institutional contexts, researchers' needs, and the library resources required to provide these RDM services. By investigating these questions, the review provides recommendations and guidance for academic libraries interested in implementing RDM services in their own library and institutional contexts.

Vol. 77, No. 1, pp. 272-300. PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70008

Citations: 0

Which is the cited source? A new perspective on article evaluation based on semantic similarity—Citation contribution attribution

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-01 DOI: 10.1002/asi.70012

Siluo Yang, Lijuan Wu, Biyao Wu, Yanhui Song

Citation analysis, an essential method in bibliometrics and scientometrics, has been widely applied to assess scholarly publications. Traditionally, studies attribute citation contributions directly to cited papers, overlooking indirect citations. Citation cascade research improves on this by assuming papers inherit contributions from their citation generations but does not consider the source of the citation content (CC). We argue that citation contributions should be attributed to the source of the CC, which includes not only the cited paper but also its references. This study introduces a semantic similarity-driven approach (CCA_SS_RS) to allocate citation contributions. CCA_SS_RS evaluates semantic similarity between CC in the citing paper (CC_FP) and the reference span in the cited paper (RS_FP), as well as between CC_FP and CCs associated with the cited paper's references (CC_{FP_R}). If similarity between CC_FP and CC_{FP_Ri} exceeds or equals that between CC_FP and RS_FP, the i-th reference is credited; otherwise, the cited paper receives full credit. Tested on the CL-SciSumm 2017 dataset, CCA_SS_RS outperformed three established methods in identifying implicit cited sources, enabling references to receive varying contributions based on semantic similarity. This study highlights the significant impact of citation contribution attribution on paper evaluation and ranking.
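
The attribution rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words cosine similarity is only a stand-in for whatever semantic similarity model the paper uses, the function and parameter names (`attribute_citation`, `cc_fp`, `rs_fp`, `cc_fp_refs`) are hypothetical, and the similarity-proportional split among credited references is an assumption, since the abstract does not say how credit is divided.

```python
import math
import re
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (assumed stand-in for a semantic model)."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def attribute_citation(cc_fp: str, rs_fp: str, cc_fp_refs: dict[str, str]) -> dict[str, float]:
    """Allocate credit for one citation.

    cc_fp:      citation content in the citing paper (CC_FP)
    rs_fp:      reference span in the cited paper (RS_FP)
    cc_fp_refs: reference id -> citation content tied to that reference (CC_{FP_Ri})
    """
    baseline = cosine_similarity(cc_fp, rs_fp)
    # Rule from the abstract: reference i is credited when its similarity
    # to CC_FP exceeds or equals the similarity between CC_FP and RS_FP.
    credited = {}
    for ref_id, text in cc_fp_refs.items():
        score = cosine_similarity(cc_fp, text)
        if score >= baseline:
            credited[ref_id] = score
    if not credited:
        # Otherwise the cited paper receives full credit.
        return {"cited_paper": 1.0}
    total = sum(credited.values())
    if total == 0:
        # Degenerate case (all similarities are zero): split equally.
        return {ref_id: 1.0 / len(credited) for ref_id in credited}
    # ASSUMPTION: credit is shared in proportion to similarity.
    return {ref_id: score / total for ref_id, score in credited.items()}
```

For example, calling `attribute_citation` with a reference whose citation content matches the citing sentence more closely than the cited paper's own reference span credits that reference; when no reference reaches the baseline, the result is `{"cited_paper": 1.0}`.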

Vol. 76, No. 11, pp. 1532-1552.

Citations: 0

Machine learning on blockchain data: A systematic mapping study. An Annual Review of Information Science and Technology (ARIST) paper

IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the Association for Information Science and Technology

Pub Date : 2025-08-01 DOI: 10.1002/asi.70009

Georgios Palaiokrassas, Sarah Bouraga, Leandros Tassiulas

Blockchain technology has drawn growing attention in the literature and in practice. It generates considerable amounts of data and has thus been a topic of interest for machine learning (ML). The objective of this paper is to provide a comprehensive review of the state of the art on ML applied to on-chain data. This work aims to systematically identify, analyze, and classify the literature on ML applied to blockchain data, which allows us to discover the fields where more effort should be placed in future research. A systematic mapping study was conducted to identify the relevant literature. Ultimately, 211 articles were selected and classified according to various dimensions, specifically the domain use case, the blockchain, the data, and the ML models. The largest share of the papers (43.35%) falls within the Anomaly use case. Ethereum (46.31%) was the blockchain that drew the most attention. A dataset consisting of more than 1,000,000 data points was used by 29.06% of the papers. Classification (43.84%) was the ML task most often applied to on-chain data. The results confirm that ML applied to on-chain data is a relevant and growing topic of interest both in the literature and in practice. Researchers have studied use cases such as address classification, anomaly detection, cryptocurrency price prediction, performance evaluation, and smart contract vulnerability detection. Nevertheless, some open challenges and gaps remain, which can lead to future research directions. Specifically, we identify novel ML algorithms, the lack of a standardization framework, blockchain scalability issues, and cross-chain interactions as areas worth exploring in the future.

Volume 77, Issue 1, pp. 224-271.

Citations: 0

Semantic primitives and compositionality: An Annual Review of Information Science and Technology (ARIST) paper

IF 4.3 · Zone 2 (Management) · Q2 · Computer Science, Information Systems

Journal of the Association for Information Science and Technology

Pub Date : 2025-07-31 DOI: 10.1002/asi.70011

Birger Hjørland

The term semantic primitives refers to a set of basic, atomic concepts from which all other (compound) concepts are constructed. It presupposes the principle of compositionality—the idea that complex items or expressions can be formed by combining simpler constituents. Both notions are of particular relevance to knowledge organization (KO), where concepts are understood to be the primary objects of organization in knowledge organization systems (KOS). Semantic primitives, therefore, may be viewed as candidates for foundational units in such systems. Moreover, these concepts play important roles in fields such as automatic language processing, lexicography, word sense disambiguation, and artificial intelligence. In KO, they relate to methods such as semantic factoring and facet analysis, while in linguistics they parallel componential analysis. Nevertheless, semantic primitives and compositionality remain controversial, with strong arguments both for and against their very existence. The philosophical assumptions underlying these debates have significant implications for information science and knowledge organization.

Volume 77, Issue 1, pp. 198-223. Open-access PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70011

Citations: 0

Right answers to wrong questions: The dysfunctional nature of information needs

IF 4.3 · Zone 2 (Management) · Q2 · Computer Science, Information Systems

Journal of the Association for Information Science and Technology

Pub Date : 2025-07-31 DOI: 10.1002/asi.70010

Melanie A. Kilian, David Elsweiler, Ian Ruthven

People frequently experience difficulties when seeking information to complete tasks, and they need help to overcome them. Past research on struggles with information needs has focused on unclear information requests, such as ambiguous, under-specified, and ill-defined queries, and on repairing these through user-led strategies (e.g., clarification). In an exploratory qualitative study based on interviews with information clerks, however, we found that well-formed and seemingly reasonable requests can conceal inquirers' misconceptions (e.g., about what information is required for their current task) and can therefore also interfere with information seeking and task completion. Besides being harder to identify than unclear requests, such hidden misconceptions undermine current user-led repair strategies because they lead inquirers to believe they are making appropriate requests. Understanding misconceptions in information seeking, and the requests that conceal them, is therefore essential to building more effective information systems. Our study contributes to this task: it is the first to provide empirical insights into how misconceptions can negatively influence information requests, information-seeking conversations, and task completion. Ultimately, our findings highlight that inquirers' perceived information needs can be an unreliable and even counterproductive basis for task support, implying that researchers and professionals should rethink the prevailing focus on user requests in designing information systems.

Volume 76, Issue 11, pp. 1508-1531. Open-access PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70010

Citations: 0