
Journal of the Association for Information Science and Technology: Latest Articles

Death by AI: Will large language models diminish Wikipedia?
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-01-03 DOI: 10.1002/asi.24975
Christian Wagner, Ling Jiang

We argue that advances in large language models (LLMs) and generative Artificial Intelligence (AI) will diminish the value of Wikipedia, due to a withdrawal by human content producers, who will withhold their efforts, perceiving less need for them and increased “AI competition.” We believe the greatest threat to Wikipedia stems from the fact that Wikipedia is a user-generated product, relying on the “selfish altruism” of its human contributors. Contributors who reduce their contribution efforts as AI pervades the platform will thus leave Wikipedia increasingly dependent on additional AI activity. This, combined with a dynamic where readership creates authorship and readers are being disintermediated, will inevitably cause a vicious cycle leading to a staling of the content and a diminishing value of this venerable knowledge resource.

Journal of the Association for Information Science and Technology, vol. 76, no. 5, pp. 743-751. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asi.24975
Citations: 0
The use of bibliometrics for ranking the all-time greatest music artists
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-30 DOI: 10.1002/asi.24976
Timothy L. Urban

This brief communication presents a novel adaptation of common bibliometric measures to provide a quantitative assessment of an artist's music catalog that incorporates both impact and productivity. Data from Billboard's weekly Hot 100™ music charts are used to rank the all-time greatest artists. Since the sorted data are increasing in value—that is, a number 1 hit is best—a transformation is applied to provide a convex, monotonically decreasing curve. Furthermore, since conventional bibliometrics result in several artists with identical measures, metrics inspired by the multidimensional h- and g-indices are used to rank the artists. We find that this approach provides a simple, yet unbiased, approach for ranking the all-time greatest music artists.
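
The h- and g-index idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the transformation the author applies to chart positions, so the `transform` function below is a simple linear stand-in chosen only for illustration (an assumption, not the paper's method), and the multidimensional variants the paper mentions are not reproduced. The function names and the sample peak positions are hypothetical.

```python
def transform(peak_position: int, chart_size: int = 100) -> int:
    """Hypothetical stand-in: map chart rank 1..chart_size to a score
    where a higher score is better (rank 1 becomes chart_size)."""
    return chart_size + 1 - peak_position


def h_index(scores) -> int:
    """Largest h such that at least h items have a score of at least h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for position, score in enumerate(ranked, start=1):
        if score >= position:
            h = position
        else:
            break
    return h


def g_index(scores) -> int:
    """Largest g (bounded by the number of items) such that the top g
    items together score at least g squared."""
    ranked = sorted(scores, reverse=True)
    running_total = 0
    g = 0
    for position, score in enumerate(ranked, start=1):
        running_total += score
        if running_total >= position * position:
            g = position
    return g


# Made-up peak chart positions for one artist's songs.
peaks = [1, 1, 5, 40, 99, 100]
scores = [transform(p) for p in peaks]
artist_h = h_index(scores)
artist_g = g_index(scores)
```

Because many artists can share the same h value, a ranking built this way would still need a tie-breaker, which is the gap the abstract says the multidimensional indices are meant to fill.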

Journal of the Association for Information Science and Technology, vol. 76, no. 6, pp. 843-847.
Citations: 0
A study of drag-and-drop query refinement and query history visualization for mobile exploratory search
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-30 DOI: 10.1002/asi.24977
Mohammad Hasan Payandeh, Orland Hoeber, Miriam Boon, Dale Storie, Veronica Ramshaw

When undertaking complex search scenarios, the underlying information need cannot be satisfied by finding a single optimal resource; instead, searchers need to engage in exploratory search processes to find multiple resources by iteratively revising and reformulating their queries. This process of query refinement is particularly challenging when using a mobile device, where typing is difficult. Furthermore, in mobile search contexts, interruptions can lead to searchers losing track of what they were doing. To address these challenges, we designed a public digital library search interface for mobile devices that includes two novel features: drag-and-drop query refinement and query history visualization. To assess the value of this interface compared to a typical baseline, we conducted a controlled laboratory study with 32 participants that included pursuing complex search scenarios, being interrupted in the midst of the search, and resuming the search after the interruption. While participants took more time, they generated longer queries and reported positive subjective opinions about the usability of the exploratory search and task resumption features, along with a greater increase in certainty. These findings show the value of leveraging new touch-based interaction mechanisms within mobile search contexts, and the benefits that visualization can bring to supporting search task resumption.

Journal of the Association for Information Science and Technology, vol. 76, no. 6, pp. 848-866. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asi.24977
Citations: 0
Spoken conversational search: Evaluating the effect of system clarifications on user experience through Wizard-of-Oz study
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-29 DOI: 10.1002/asi.24974
Souvick Ghosh, Chirag Shah

Prior research in human–computer interaction suggests that system-level clarifications are necessary for understanding user intent and communicating effectively with the user. Such clarifications or explanations could contain the system's abstract knowledge of the search or a functional description of the search process (queries and information sources employed). While these interactions may aid the user and the agent in better understanding each other, very few studies have explored the influence of such clarifications on the users' search experience. This research examines whether and how system-level clarifications (or explanations) affect the user experience when searching through spoken dialogues. We analyzed user satisfaction and preferences in systems with and without explicit clarifications in a within-subjects Wizard-of-Oz user study. We recruited 25 participants and collected user–system interaction data for 50 search sessions. The user feedback was collected using pre- and post-task surveys and exit interviews. Statistical and qualitative analysis of user responses yielded some interesting findings. A Wilcoxon signed-rank test found that explicit system-level clarifications had no positive influence on the user's search experience; instead, the overall search experience degraded with system clarifications (Z = −2.066, p = 0.04). The user interview data provided valuable insights into how and when clarifications should be offered to the user.
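
To make the reported statistic easier to read, here is a minimal sketch of how a Wilcoxon signed-rank Z value and a two-sided p-value can be computed for paired ratings. It uses the plain normal approximation with average ranks for ties and no tie or continuity correction; the abstract does not say which variant the authors used, so this is an assumption. The function names and the sample ratings are hypothetical and are not the study's data.

```python
import math


def signed_rank_z(before, after):
    """Z statistic of the Wilcoxon signed-rank test (normal approximation).
    Zero differences are dropped; tied absolute differences share an
    average rank. No tie or continuity correction is applied."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up satisfaction ratings for six participants.
without_clarifications = [7, 7, 7, 7, 7, 7]
with_clarifications = [6, 5, 4, 3, 2, 1]
z = signed_rank_z(without_clarifications, with_clarifications)
p = two_sided_p(z)
```

A negative Z here means ratings were lower in the second condition. Under the same approximation, `two_sided_p(-2.066)` comes out just under 0.04, in line with the p = 0.04 reported in the abstract.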

Journal of the Association for Information Science and Technology, vol. 76, no. 5, pp. 819-839.
Citations: 0
College students' credibility assessments of GenAI-generated information for academic tasks: An interview study
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-28 DOI: 10.1002/asi.24978
Wonchan Choi, Hyerin Bak, Jiaxin An, Yan Zhang, Besiki Stvilia

The study explored college students' use of generative artificial intelligence (GenAI) tools, such as ChatGPT, for academic tasks and their perceptions and behaviors in assessing the credibility of GenAI-generated information. Semistructured interviews were conducted with 25 college students in the United States. Interview transcripts were analyzed using the qualitative content analysis method. The study identified various types of academic tasks for which students used ChatGPT, including writing, programming, and learning. Guided by two models of credibility assessment (Hilligoss and Rieh, 2008; Metzger, 2007), we identified six factors influencing students' motivation and ability to assess the credibility of GenAI-generated information (e.g., task salience, social pressure). We also identified 9 constructs (e.g., refinedness, explainability), 5 heuristics (e.g., inter- and intrasystem consistency heuristics), and 10 cues (e.g., version and tone) used by students to assess the credibility of GenAI-generated information. This study provides theoretical and empirical findings regarding students' use of GenAI tools in the academic context and credibility evaluation of the system outputs using rich, qualitative interview data.

Journal of the Association for Information Science and Technology, vol. 76, no. 6, pp. 867-883.
Citations: 0
Using the S-DIKW framework to transform data visualization into data storytelling
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-24 DOI: 10.1002/asi.24973
Angelica Lo Duca, Kate McDowell

Communicating insights from data effectively requires design skills, technical knowledge, and experience. Data must be accurately represented with aesthetically pleasing visuals and engaging text to effectively communicate to the intended audience. Data storytelling has received much attention lately, but as of yet, it does not have a theoretical and practical foundation in information science. A data story adds context, narrative, and structure to the visual representation of data, providing audiences with character, plot, and a holistic experience of narrative. This paper proposes a methodological approach to transform a data visualization into a data story based on the Data-Information-Knowledge-Wisdom (DIKW) pyramid and the S-DIKW Framework. Starting from the bottom of the pyramid, the proposed approach defines a strategy to represent insights extracted from data. Data is then turned into information by identifying character(s) facing a problem, adding textual and graphic content; information is turned into knowledge by organizing what happens as a plot. Finally, a call to wise action—always informed by cultural and community values—completes the storytelling transformation to create a data story. This article contributes to the theoretical understanding of data stories as emerging information forms, supporting richer understandings of a story as information in the information sciences.

Journal of the Association for Information Science and Technology, vol. 76, no. 5, pp. 803-818. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asi.24973
Citations: 0
Evolution of the “long-tail” concept for scientific data: An Annual Review of Information Science and Technology (ARIST) paper
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-15 DOI: 10.1002/asi.24967
Gretchen R. Stahlman, Inna Kouper

This review paper explores the evolution of discussions about “long-tail” scientific data in the scholarly literature. The “long-tail” concept, originally used to explain trends in digital consumer goods, was first applied to scientific data in 2007 to refer to a vast array of smaller, heterogeneous data collections that cumulatively represent a substantial portion of scientific knowledge. However, these datasets, often referred to as “long-tail data,” are frequently mismanaged or overlooked due to inadequate data management practices and institutional support. This paper examines the changing landscape of discussions about long-tail data over time, situated within broader ecosystems of research data management and the natural interplay between “big” and “small” data. The review also bridges discussions on data curation in Library & Information Science (LIS) and domain-specific contexts, contributing to a more comprehensive understanding of the long-tail concept's utility for effective data management outcomes. The review aims to provide a more comprehensive understanding of this concept, its terminological diversity in the literature, and its utility for guiding data management, overall informing current and future information science research and practice.

Journal of the Association for Information Science and Technology, vol. 77, no. 1, pp. 3-22.
Citations: 0
Beyond decomposition: Hierarchical dependency management in multi-document question answering
IF 4.3 Zone 2 (Management) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-12-13 DOI: 10.1002/asi.24971
Xiaoyan Zheng, Zhi Li, Qianglong Chen, Yin Zhang

When using retrieval-augmented generation (RAG) to handle multi-document question answering (MDQA) tasks, it is beneficial to decompose complex queries into multiple simpler ones to enhance retrieval results. However, previous strategies always employ a one-shot approach to question decomposition, overlooking dependencies among subquestions and failing to ensure that the derived subqueries are single-hop. To overcome this challenge, we introduce a novel framework called DSRC-QCS. Decompose-solve-renewal-cycle (DSRC) is an iterative multi-hop question processing module. The key idea of DSRC involves using a unique symbol to achieve hierarchical dependency management and employing a cyclical process of question decomposition, solving, and renewal to continuously generate and resolve all single-hop subquestions. Query-chain selector (QCS) functions as a voting mechanism that effectively utilizes the reasoning process of DSRC to assess and select solutions. We compare DSRC-QCS against five RAG approaches across three datasets and three LLMs. DSRC-QCS demonstrates superior performance. Compared to the Direct Retrieval method, DSRC-QCS improves the average F1 score by 17.36% with Alpaca-7b, 10.83% with LLaMa2-Chat-7b, and 11.88% with GPT-3.5-Turbo. We also conduct ablation studies to validate the performance of both DSRC and QCS and explore factors influencing the effectiveness of DSRC. We have included all prompts in the Appendix.
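
The solve-and-renew part of the cycle described above can be sketched as a toy loop. This is not the authors' implementation: the abstract does not specify the dependency symbol, so `#k` (meaning "the answer to subquestion k") is a hypothetical choice, the decomposition is given up front rather than produced by an LLM, and a dictionary lookup (`TOY_ANSWERS`) stands in for retrieval plus generation.

```python
import re

# Hypothetical dependency symbol: "#2" stands for the answer to subquestion 2.
PLACEHOLDER = re.compile(r"#(\d+)")


def solve_cycle(subquestions, answer_single_hop):
    """Repeatedly solve every subquestion with no unresolved placeholder,
    then renew the remaining ones by substituting the new answers."""
    pending = dict(subquestions)
    answers = {}
    while pending:
        ready = [qid for qid, text in pending.items()
                 if not PLACEHOLDER.search(text)]
        if not ready:
            raise ValueError("circular or unresolved dependency")
        # Solve step: each ready subquestion is single-hop.
        for qid in ready:
            answers[qid] = answer_single_hop(pending.pop(qid))
        # Renewal step: fill solved answers into dependent subquestions.
        for qid in list(pending):
            pending[qid] = PLACEHOLDER.sub(
                lambda m: answers.get(int(m.group(1)), m.group(0)),
                pending[qid],
            )
    return answers


# Toy stand-in for a retriever plus LLM answering single-hop questions.
TOY_ANSWERS = {
    "What is the capital of France?": "Paris",
    "What river runs through Paris?": "Seine",
}

subquestions = {
    1: "What is the capital of France?",
    2: "What river runs through #1?",
}
resolved = solve_cycle(subquestions, TOY_ANSWERS.__getitem__)
```

Subquestion 2 cannot be answered until subquestion 1 is solved and its answer substituted, which is the dependency a one-shot decomposition would miss.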

Journal of the Association for Information Science and Technology, vol. 76, no. 5, pp. 770-789.
Citations: 0
Dynamic algorithmic awareness based on FAT evaluation: Heuristic intervention and multidimensional prediction
IF 4.3 | Zone 2, Management | Q2 Computer Science, Information Systems | Pub Date: 2024-12-06 | DOI: 10.1002/asi.24969
Jing Liu, Dan Wu, Guoye Sun, Yuyang Deng

With the widespread use of algorithms and artificial intelligence (AI) technologies, understanding the process of human–algorithm interaction is becoming increasingly crucial. From the human perspective, algorithmic awareness is recognized as a significant factor influencing how users evaluate algorithms and engage with them. A formative study first identified four dimensions of algorithmic awareness: conceptions awareness (AC), data awareness (AD), functions awareness (AF), and risks awareness (AR). Subsequently, we implemented a heuristic intervention and collected data on users' algorithmic awareness and FAT (fairness, accountability, and transparency) evaluation in both pre-test and post-test stages (N = 622). We verified the dynamics of algorithmic awareness and FAT evaluation through fuzzy clustering and identified three patterns of change in FAT evaluation: “Stable high rating pattern,” “Variable medium rating pattern,” and “Unstable low rating pattern.” Using the clustering results and FAT evaluation scores, we trained classification models to predict the different dimensions of algorithmic awareness with five machine learning techniques: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and XGBoost (XGB). Experimental results show that, compared with the other models, SVM predicts the four dimensions of algorithmic awareness with better results and interpretability, achieving F1 scores of 0.6377, 0.6780, 0.6747, and 0.75. These findings hold great potential for informing human-centered algorithmic practices and HCI design.
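
For readers unfamiliar with the F1 metric quoted in the abstract above, the following is a minimal sketch of how a binary F1 score is computed, written in plain Python. The toy labels and the helper name `f1_score` are hypothetical and are not taken from the paper; the only figures from the source are the four F1 values, and their mean is computed here purely as illustrative arithmetic, not as a result reported by the authors.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1, the harmonic mean of precision and recall, for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (NOT data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
toy_f1 = f1_score(y_true, y_pred)  # precision = recall = 4/5

# The four F1 values quoted in the abstract; the mean is derived here
# for illustration only and is not a figure stated by the authors.
reported_f1 = [0.6377, 0.6780, 0.6747, 0.75]
mean_reported = sum(reported_f1) / len(reported_f1)

print(f"toy F1: {toy_f1:.4f}")
print(f"mean of quoted F1 scores: {mean_reported:.4f}")
```

Because F1 balances precision against recall, it is a common choice when class sizes are uneven, which may be why it is used to compare the five classifiers here; the abstract itself does not state the reason.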

Journal of the Association for Information Science and Technology, vol. 76, no. 4, pp. 718-739. Published 2024-12-06. DOI: 10.1002/asi.24969
Citations: 0
Science for whom? The influence of the regional academic circuit on gender inequalities in Latin America
IF 4.3 | Zone 2, Management | Q2 Computer Science, Information Systems | Pub Date: 2024-11-27 | DOI: 10.1002/asi.24972
Carolina Pradier, Diego Kozlowski, Natsumi S. Shokida, Vincent Larivière

The Latin-American scientific community has achieved significant progress towards gender parity, with nearly equal representation of women and men scientists. Nevertheless, women continue to be underrepresented in scholarly communication. Throughout the 20th century, Latin America established its academic circuit, focusing on research topics of regional significance. Through an analysis of scientific publications, this article explores the relationship between gender inequalities in science and the integration of Latin-American researchers into the regional and global academic circuits between 1993 and 2022. We find that women are more likely to engage in the regional circuit, while men are more active within the global circuit. This trend is attributed to a thematic alignment between women's research interests and issues specific to Latin America. Furthermore, our results reveal that the mechanisms contributing to gender differences in symbolic capital accumulation vary between circuits. Women's work achieves equal or greater recognition compared to men's within the regional circuit, but generally garners less attention in the global circuit. Our findings suggest that policies aimed at strengthening the regional academic circuit would encourage scientists to address locally relevant topics while simultaneously fostering gender equality in science.

Journal of the Association for Information Science and Technology, vol. 76, no. 5, pp. 790-802. Published 2024-11-27. DOI: 10.1002/asi.24972. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asi.24972
Citations: 0