Searching across diverse information platforms, such as digital humanities archives, academic digital libraries, and encyclopedias, poses challenges in managing the queries issued to each platform and synthesizing the resources discovered. While search result aggregation interfaces address this problem, how best to present the search results from different platforms in the search engine results page remains an open question. In this research, we implemented three common approaches and developed a new technique for aggregating search results across three platforms: Europeana, our University's academic library, and Wikipedia. The three common approaches (1) use tabs to switch between the platforms, (2) interleave results from each platform producing a single list, and (3) use a bento box approach to group results from each platform. The new technique organizes the search results into thematic clusters irrespective of their source platform. We designed a controlled laboratory study using a within-subjects design and exploratory search tasks conducted in the context of digital humanities searching. We collected data from 32 student participants, focusing on utility, perceived value, and diversity of saved resources. This study provides evidence that thematic clustering can be a beneficial aggregation approach, opening opportunities for studying different ways of representing and visualizing aggregated search results.
A study of search result aggregation approaches for the digital humanities. Milad Momeni, Orland Hoeber. Journal of the Association for Information Science and Technology, 76(11), 1488-1507. Published 2025-07-21. DOI: https://doi.org/10.1002/asi.70006 PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70006
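The interleaving approach named in the abstract above merges the ranked lists from each platform into a single list. The abstract does not say which interleaving scheme was used, so the following is only a minimal sketch that assumes a simple round-robin merge; the function name `interleave` and the sample result identifiers are hypothetical.

```python
from itertools import zip_longest


def interleave(*ranked_lists):
    """Round-robin merge: rank 1 from every platform, then rank 2, and so on.

    Assumption: plain round-robin; the study may use a different scheme.
    """
    missing = object()  # sentinel that cannot collide with a real result
    merged = []
    for tier in zip_longest(*ranked_lists, fillvalue=missing):
        # Skip the padding added for platforms that ran out of results.
        merged.extend(item for item in tier if item is not missing)
    return merged


# Hypothetical ranked results from the three platforms.
europeana = ["E1", "E2", "E3"]
library = ["L1", "L2"]
wikipedia = ["W1"]

print(interleave(europeana, library, wikipedia))
# ['E1', 'L1', 'W1', 'E2', 'L2', 'E3']
```

Round-robin keeps each platform's internal ranking intact, but it ignores relevance scores across platforms, which is one reason other aggregation layouts are worth comparing.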
Mengjia Wu, Gunnar Sivertsen, Lin Zhang, Fan Qi, Yi Zhang
The classification of research according to its aims has been a longstanding focus in the fields of quantitative science studies and R&D statistics. Since 1963, the Organization for Economic Co-operation and Development (OECD) has employed a classical distinction among basic, applied, and experimental research. Building on this framework, our previous work highlighted the utility of differentiating between scientific and societal progress as two primary research objectives. This distinction enabled the quantitative analysis of scientific publication abstracts and the development of an automated method for large-scale classification. In the current study, we systematically evaluate text classification techniques, including traditional text mining models, classification tools, BERT-based language models, and decoder-only large language models (LLMs) such as ChatGPT. Our findings show that the fine-tuned GPT-4o-mini model performs best among single-model approaches. However, traditional and BERT-based models outperform it in certain fine-grained classification tasks. Leveraging majority voting strategies to combine their strengths yields performance comparable to closed-source GPT models. A case study on 10 biomedical journals further validates the method, demonstrating strong alignment between journal scopes, model predictions, and outputs generated by the fine-tuned GPT-4o-mini model. These results highlight the robustness and practical effectiveness of the proposed methodology for nuanced research aim classification.
Scaling research aim identification: Language models for classifying scientific and societal-oriented studies. Mengjia Wu, Gunnar Sivertsen, Lin Zhang, Fan Qi, Yi Zhang. Journal of the Association for Information Science and Technology, 76(11), 1470-1487. Published 2025-07-17. DOI: https://doi.org/10.1002/asi.70004 PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70004
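The majority voting strategy mentioned in the abstract above combines the labels predicted by several classifiers for the same abstract. The abstract does not describe the voting rule in detail, so this is a minimal sketch that assumes plain plurality voting with ties broken in favor of the earliest voter; the function name `majority_vote` and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Assumption: ties go to the label that appears first in `predictions`,
    so callers can order models by trust.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs of three classifiers for one abstract.
votes = ["societal", "scientific", "societal"]
print(majority_vote(votes))  # societal
```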
Novelty is a crucial criterion in the peer-review process for evaluating academic papers. Traditionally, it is judged by experts or measured by unique reference combinations. Both methods have limitations: experts have limited knowledge, and the effectiveness of the combination method is uncertain. Moreover, it is unclear whether unique citations truly measure novelty. Large language models (LLMs) possess a wealth of knowledge, while human experts possess judgment abilities that LLMs lack. Therefore, our research integrates the knowledge and abilities of LLMs and human experts to address the limitations of novelty assessment. One of the most common types of novelty in academic papers is the introduction of new methods. In this paper, we propose leveraging human knowledge and LLMs to assist pre-trained language models (PLMs, e.g., BERT) in predicting the method novelty of papers. Specifically, we extract sentences related to the novelty of the academic paper from peer-review reports and use an LLM to summarize the methodology section of the paper; both are then used to fine-tune PLMs. In addition, we design a text-guided fusion module with novel Sparse-Attention to better integrate human and LLM knowledge. We compared our proposed method with a large number of baselines, and extensive experiments demonstrate that it achieves superior performance.
Automated novelty evaluation of academic paper: A collaborative approach integrating human and large language model knowledge. Wenqing Wu, Chengzhi Zhang, Yi Zhao. Journal of the Association for Information Science and Technology, 76(11), 1452-1469. Published 2025-07-15. DOI: https://doi.org/10.1002/asi.70005
Simon Wakeling, Monica Lestari Paramita, Stephen Pinfield
It has long been recognized that there are issues with the appropriateness of citations in the academic literature. Citations of sources that do not support the statement they are cited against are known as quotation errors, and there have been many previous studies of their prevalence. The vast majority of these studies rely on researchers evaluating the accuracy of citations in a small sample of the literature, and show large variation in quotation error rates. In this article, we report a novel approach to assessing quotation accuracy via an online survey in which 2648 corresponding authors of articles evaluated a real-world citation of their work. Respondents were also asked to categorize the perceived purpose of the citation and to state what action, if any, they take when encountering inaccurate citations of their work. We found a quotation error rate of 16.6%, with no significant difference across academic disciplines, suggesting that variation in previous studies may be a result of methodological differences. Only 11.3% of respondents indicated they had taken action after encountering an inaccurate citation of their work. This work reveals reasons contributing to inaccurate quotations and issues with citation practices, and suggests areas for future research.
How do authors perceive the way their work is cited? Findings from a large-scale survey on quotation accuracy. Simon Wakeling, Monica Lestari Paramita, Stephen Pinfield. Journal of the Association for Information Science and Technology, 76(10), 1396-1410. Published 2025-07-08. DOI: https://doi.org/10.1002/asi.70000 PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70000
To unravel the linguistic dynamics of science communication on social media, this study presents a large-scale, cross-disciplinary analysis of language use in over 21 million Twitter mentions of 6.7 million scientific publications. While English dominates—accounting for 90.8% of all mentions and serving as a bridging language for the international dissemination of research—90 non-English languages contribute to a rich and diverse multilingual ecosystem. A strong alignment is observed between the language of non-English publications and their corresponding Twitter mentions, particularly for languages such as Japanese and Spanish, reflecting linguistic proximity and regional engagement. Importantly, non-English tweets achieve user engagement levels comparable to those written in English, whereas tweets lacking meaningful textual content consistently receive lower interaction. These findings highlight the inherently multilingual nature of science communication on Twitter and underscore the importance of incorporating non-English activities into altmetric analyses to ensure a more inclusive and equitable understanding of global scientific discourse.
The Tower of Babel in science communication on social media: An analysis of linguistic diversity in Twitter mentions of scientific publications. Yanqing Zhang, Zhichao Fang. Journal of the Association for Information Science and Technology, 76(11), 1431-1451. Published 2025-06-20. DOI: https://doi.org/10.1002/asi.70002 PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70002
Classification schemes are a key way of organizing bibliographic knowledge, yet the way that classification schemes communicate their information to classifiers receives little attention. This article takes a novel approach by exploring the visual aspects contained within classification schemes. The research uses a classification scheme analysis methodology. Three different classification scheme phenomena are discussed in terms of their visualization: hierarchy, notation, and notes. Indentation is found to be a significant—and implicit—method of communicating hierarchy to classifiers and offers intriguing solutions to the issues of transmuting from two dimensions into one. The visual elements of notation reveal a strong separation between notation and class, while the visual elements of notes illuminate a varying narrative around the position of notes in the classification scheme. A categorization system for visual elements in classification schemes is presented. Model 1 proffers visual elements as a fourth plane of classification, which extends and remodels Ranganathan's Three Planes of Work. Model 2 shows how visual elements could fit into classification scheme versioning. Ultimately, looking at visual aspects of classification schemes is a novel way of thinking about knowledge organization and can help us to better understand—and ultimately, to better use—classification schemes.
The visual, the textual, and the one-dimensional: An exploration of the visual elements of bibliographic classification schemes. Deborah Lee. Journal of the Association for Information Science and Technology, 76(10), 1411-1427. Published 2025-06-20. DOI: https://doi.org/10.1002/asi.70001 PDF: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.70001
Jiawei Xu, Zhihan Zheng, Chao Min, Win-bin Huang, Yi Bu
While facilitating science, interdisciplinary research (IDR) imposes a heavier cognitive burden on researchers than unidisciplinary research (UDR). Yet little is known about the patterns of knowledge integration and the diffusion structures of IDR. Here we adopt a causal inference strategy, namely propensity score matching, with all journal publications from 2005 in Microsoft Academic Graph to better understand the IDR effect in various research fields. We use the diversity of a paper's reference fields as a proxy for its interdisciplinarity and estimate the effect of a research article being IDR on its knowledge integration and diffusion, measured by its high-order citation/reference cascade. We find that, in disciplines where IDR articles are less popular, such as mathematics, physics, and chemistry, IDR needs a more extensive knowledge base than UDR to gain a similar number of citations. In disciplines where IDR articles are more popular, for example, psychology, geology, biology, and economics, a small knowledge base is enough for a high-impact IDR article. As to knowledge diffusion, whether an article is IDR or UDR, a more extensive knowledge base leads to stronger knowledge diffusion ability. The findings imply potential drawbacks of purely interdisciplinarity-oriented research policy; rather, policies may need to vary across disciplines.
Knowledge integration and diffusion structures of interdisciplinary research: A large-scale analysis based on propensity score matching. Jiawei Xu, Zhihan Zheng, Chao Min, Win-bin Huang, Yi Bu. Journal of the Association for Information Science and Technology, 76(9), 1210-1226. Published 2025-06-04. DOI: https://doi.org/10.1002/asi.25014
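The abstract above uses the diversity of a paper's reference fields as a proxy for interdisciplinarity but does not name the diversity measure. As an illustration only, the sketch below assumes the Gini-Simpson index (one minus the sum of squared field shares); the function name `field_diversity` and the sample field labels are hypothetical, and the published study may use a different measure.

```python
from collections import Counter


def field_diversity(reference_fields):
    """Gini-Simpson diversity of the fields cited by one paper.

    Assumption: this index stands in for whatever measure the study uses.
    Returns 0.0 when all references share one field and approaches 1.0
    as references spread evenly over many fields.
    """
    total = len(reference_fields)
    if total == 0:
        return 0.0
    counts = Counter(reference_fields)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical reference lists, one field label per cited paper.
unidisciplinary = ["physics", "physics", "physics", "physics"]
interdisciplinary = ["mathematics", "mathematics", "physics", "biology"]

print(field_diversity(unidisciplinary))    # 0.0
print(field_diversity(interdisciplinary))  # 0.625
```

A score like this could then serve as the treatment indicator (for example, by thresholding) before matching IDR and UDR papers on their propensity scores.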
Given that historical maps (HM) are represented by a complex network of symbols, their semantics cannot be easily and directly understood. To extract the embedded knowledge, scholars have developed semantic organization for different types of HM. However, the construction of semantic organization for HM is challenging due to problems of semantic clutter, semantic loss, and semantic ambiguity. To resolve these problems, this paper proposes a semantic organization system which includes classification, representation, and association mechanisms for HM. The intent is to achieve semantic ordering, semantic enhancement, and semantic association. As a means to verify the proposed semantic organization system, this paper develops an HM knowledge question and answer (Q&A) system. Experimental results show that the Q&A system outperformed Baidu (Wenxinyiyan) and GPT-4o in terms of precision and recall.
Semantic organization for historical maps: Classification, representation, association. Qi Xiaoying, Alton Y. K. Chua, Yang Haiping. Journal of the Association for Information Science and Technology, 76(10), 1374-1395. Published 2025-05-26. DOI: https://doi.org/10.1002/asi.25023
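The abstract above compares Q&A systems by precision and recall. For readers unfamiliar with the two metrics, here is a minimal sketch of the standard set-based definitions; the function name `precision_recall` and the sample answer identifiers are hypothetical and do not come from the study's evaluation data.

```python
def precision_recall(returned, relevant):
    """Set-based precision and recall for one question.

    precision = correct returned answers / all returned answers
    recall    = correct returned answers / all relevant answers
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical answers: the system returns four items, two of which are correct.
system_answers = ["a", "b", "c", "d"]
gold_answers = ["a", "b", "e"]

precision, recall = precision_recall(system_answers, gold_answers)
print(f"precision={precision:.2f} recall={recall:.2f}")
# precision=0.50 recall=0.67
```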
Reusing research data can effectively reduce efforts in data collection and enhance the replicability of evaluation experiments, especially for small laboratories and research teams studying human-centered systems. Building a sustainable data reuse process and culture relies on frameworks that encompass policies, standards, roles, and responsibilities, all of which must address the diverse needs of data providers, curators, and reusers. This study investigated data reuse practices of experienced researchers in Interactive Information Retrieval (IIR), a field where data reuse has been strongly advocated but still remains a challenge. We conducted 21 semi-structured in-depth interviews with IIR researchers from varying demographic backgrounds, institutions, and career stages about their motivations, experiences, and concerns regarding data reuse. We uncovered the rationales, criteria, and strategies they used in reusability assessments, as well as the challenges they faced when attempting to reuse research data in their studies. These empirical findings enrich ongoing discussions about the reusability of user-generated data and research resources and help promote community-level data reuse culture and standards in both traditional and emerging IIR research fields.
The landscape of data reuse in interactive information retrieval: Motivations, sources, and evaluation of reusability. Tianji Jiang, Wenqi Li, Jiqun Liu. Journal of the Association for Information Science and Technology, 76(9), 1258-1276. Published 2025-05-23. DOI: https://doi.org/10.1002/asi.25020
This study explores the spatiotemporal relationship between usage data (measured by PDF downloads and HTML views) and topic popularity (measured by the number of publications) in scientific literature. Using a panel dataset of over 2.3 million papers and 130 million usage records from IEEE Xplore, we develop a theoretical framework grounded in attention economy theory and the competitive exclusion principle. Using a fixed effects model, the instrumental variable method, and the spatial Durbin model, we find that heavier usage of a topic greatly increases its future popularity, while usage of related topics has a negative impact. This study provides solid preliminary evidence for using usage data in detecting research hotspots. Additionally, it proposes two methods for constructing spatial weight matrices based on topic semantic vectors, offering a concrete pathway for integrating spatial econometrics with spatial scientometrics.
The spatiotemporal relationship between usage data and topic popularity in scientific literature. Xianwen Wang, Wencan Tian, Ruonan Cai, Zhichao Fang. Journal of the Association for Information Science and Technology, 76(9), 1241-1257. Published 2025-05-23. DOI: https://doi.org/10.1002/asi.25019
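The abstract above mentions spatial weight matrices built from topic semantic vectors without describing the two construction methods. The sketch below is therefore only one plausible construction, assumed for illustration: cosine similarity between topic vectors, negative values clipped to zero, a zero diagonal, and row normalization as is conventional for spatial econometric models. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def spatial_weights(vectors):
    """Row-normalized weight matrix: w[i][j] is topic j's share of topic i's
    total semantic similarity to all other topics.

    Assumption: cosine similarity defines "semantic proximity" here.
    """
    n = len(vectors)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a topic is not its own neighbor
                w[i][j] = max(cosine(vectors[i], vectors[j]), 0.0)
        total = sum(w[i])
        if total:
            w[i] = [x / total for x in w[i]]
    return w


# Toy 2-dimensional topic vectors (hypothetical).
topics = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
for row in spatial_weights(topics):
    print([round(x, 3) for x in row])
```

With these toy vectors, the middle topic splits its weight evenly between the other two, while each outer topic puts all of its weight on the middle one because the outer topics are orthogonal to each other.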