Pub Date : 2024-06-04 DOI: 10.1007/s11192-024-05053-9
Abdelghani Maddi, Emmanuel Monneau, Catherine Guaspare-Cartron, Floriana Gargiulo, Michel Dubois
The Streetlight Effect is an observation bias that occurs when individuals search for something only where it is easiest to look. Despite the significant development of Post-Publication Peer Review (PPPR) in recent years, facilitated in part by platforms such as PubPeer, the existing literature has not examined whether PPPR is affected by this type of bias, in other words, whether PPPR mainly concerns publications to which researchers have direct access (e.g., to analyze image duplications). In this study, we compare the Open Access (OA) structures of publishers and journals among 51,882 publications commented on in PubPeer with those of the publications indexed in the OpenAlex database (N = 156,700,177). Our findings indicate that OA journals are 33% more prevalent in PubPeer than in the global total (52% for the most commented journals). This result can be attributed to disciplinary bias in PubPeer, which overrepresents medical and biological research (fields that exhibit higher levels of openness). However, after normalization, the results reveal that PPPR does not exhibit a Streetlight Effect: within the same discipline, OA publications are on average 16% less prevalent in PubPeer than in the global total. These results suggest that the process of scientific self-correction operates independently of publication access status.
Abdelghani Maddi, Emmanuel Monneau, Catherine Guaspare-Cartron, Floriana Gargiulo, Michel Dubois. "Streetlight effect in PubPeer comments: are Open Access publications more scrutinized?" Scientometrics, 2024-06-04. https://doi.org/10.1007/s11192-024-05053-9
Pub Date : 2024-06-03 DOI: 10.1007/s11192-024-05045-9
Sandra Miguel, Claudia M. González, Zaida Chinchilla-Rodríguez
This study aims to identify and compare the national scope of research at the country level for two groups of countries: Latin America and the Caribbean (LAC) and a group of countries at the forefront of developing mainstream science (WORLD). We explore whether similar or different patterns arise between the two groups at the global and disciplinary levels, as reflected in their proportion of research related to local perspectives or topics. We find that Latin American and Caribbean countries present a greater proportion of local production. The tendency to publish nationally oriented research is related to disciplinary fields. Even though English is the dominant language of publication, the lingua franca is more likely to appear in the national scope of research, especially for Latin American and Caribbean countries but also for the other non-Anglophone countries. Some implications and limitations for further studies are described.
Sandra Miguel, Claudia M. González, Zaida Chinchilla-Rodríguez. "Towards a new approach to analyzing the geographical scope of national research. An exploratory analysis at the country level." Scientometrics, 2024-06-03. https://doi.org/10.1007/s11192-024-05045-9
Pub Date : 2024-06-03 DOI: 10.1007/s11192-024-05061-9
Solanki Gupta, Vivek Kumar Singh, Sumit Kumar Banshal
Altmetrics, or alternative metrics, refer to newer kinds of events around scholarly articles, such as the number of times an article is read, tweeted, or mentioned in blog posts. These metrics have gained considerable popularity during the last few years and are now being collected and used in several ways, ranging from an early measure of article impact to a potential indicator of the societal relevance of research. However, several studies have cautioned against the use of altmetrics on account of the quality and reliability of altmetric data, which may be more prone to manipulation and artificial inflation. This study proposes a framework based on the application of Benford’s Law to evaluate the quality of altmetric data. A large altmetric data sample is considered and its fit with Benford’s Law is computed. The analysis is performed by plotting the empirical data distributions against the theoretical Benford distribution, and by employing relevant statistical measures and tests. Results for the fit on the first and second leading digits of altmetric data show conformity to Benford’s distribution. To further explore the usefulness of the framework, the altmetric data are subjected to artificial manipulations through a systematic process and the fits to Benford’s Law are reassessed to see whether there are distortions. The results and analysis suggest that a framework based on Benford’s Law can be used to test the quality of altmetric data. Relevant implications of the research are discussed.
Solanki Gupta, Vivek Kumar Singh, Sumit Kumar Banshal. "Altmetric data quality analysis using Benford’s law." Scientometrics, 2024-06-03. https://doi.org/10.1007/s11192-024-05061-9
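As a rough illustration of the kind of first-digit check the altmetrics abstract above describes, the sketch below compares an observed leading-digit distribution with Benford's Law, P(d) = log10(1 + 1/d). It is not the authors' framework: the function names, the use of mean absolute deviation as the fit measure, and the sample data (powers of 2, chosen only because they are known to follow Benford's Law closely) are assumptions made for illustration.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Benford's Law probability that the leading digit equals `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def first_digit(n: int) -> int:
    """Leading decimal digit of a non-zero integer."""
    return int(str(abs(n))[0])


def first_digit_distribution(counts):
    """Observed share of each leading digit among the positive counts."""
    digits = [first_digit(c) for c in counts if c > 0]
    tally = Counter(digits)
    return {d: tally.get(d, 0) / len(digits) for d in range(1, 10)}


def mean_absolute_deviation(observed):
    """Average absolute gap between observed and Benford shares (assumed fit measure)."""
    return sum(abs(observed[d] - benford_expected(d)) for d in range(1, 10)) / 9


# Hypothetical stand-in for altmetric counts (e.g. mentions per article).
sample = [2 ** k for k in range(1, 201)]
observed = first_digit_distribution(sample)
mad = mean_absolute_deviation(observed)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected(d), 3))
print("mean absolute deviation:", round(mad, 4))
```

A small deviation suggests conformity; artificially inflated counts would be expected to push the observed shares away from the Benford shares, which is the distortion the abstract says is reassessed after manipulation.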
Pub Date : 2024-06-03 DOI: 10.1007/s11192-024-05058-4
Abhijit Thakuria, Dipen Deka
The study used Latent Dirichlet Allocation (LDA) topic modeling to identify prevalent latent topics within Open Access (OA) Library and Information Science (LIS) journals from 2013 to 2022. Eight core OA Scopus-indexed journals were selected based on their SJR scores and DOAJ listing. Titles, abstracts, and keywords of 2589 articles were extracted from the Scopus database. R software packages were used to perform data analysis and LDA topic modeling. The optimal number of topics k was determined to be 9. The analysis revealed that in 53.89% of documents a single topic accounts for more than 50% of the content (θ > 0.50). Notably, ‘Scholarly Communication’ and ‘Information Systems, Models and Frameworks’ emerged as dominant topics with the highest proportions of research literature in the corpus. The topic ‘Scholarly Communication’ experienced significant growth with an average annual growth rate (AAGR) of 4.37%, while ‘Collection development and E-resources’ exhibited the lowest research proportion and a negative AAGR of −4.22%. Additionally, topics such as ‘Information Seeking Behaviour and User Studies’, ‘User Education and Information Literacy’, and ‘Information Retrieval and Systematic Review’ remained stable and persistent. Conversely, research on traditional topics like ‘Librarianship and Profession’, ‘Bibliometrics’ and ‘Medical Library and Health Information’ showed a gradual decline. The LDA topic modeling approach unveiled previously unknown or unexplored topics in open access LIS research literature, enhancing our understanding of emerging trends.
Abhijit Thakuria, Dipen Deka. "A decadal study on identifying latent topics and research trends in open access LIS journals using topic modeling approach." Scientometrics, 2024-06-03. https://doi.org/10.1007/s11192-024-05058-4
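The two summary statistics quoted in the topic-modeling abstract above (the share of documents dominated by one topic, θ > 0.50, and the average annual growth rate) can be sketched as follows. The abstract does not state how AAGR was computed, so this assumes the common definition, the mean of year-over-year percentage changes. The function names and all numbers are hypothetical and are not taken from the study.

```python
def dominant_topic_share(theta, threshold=0.50):
    """Percentage of documents whose largest topic proportion exceeds `threshold`.

    `theta` is a list of per-document topic proportion vectors, as produced by
    an LDA model (each vector sums to 1).
    """
    dominated = sum(1 for doc in theta if max(doc) > threshold)
    return dominated / len(theta) * 100


def average_annual_growth_rate(yearly_values):
    """Mean of year-over-year percentage changes (assumed AAGR definition)."""
    if len(yearly_values) < 2:
        raise ValueError("need at least two yearly values")
    rates = [
        (current - previous) / previous * 100
        for previous, current in zip(yearly_values, yearly_values[1:])
    ]
    return sum(rates) / len(rates)


# Hypothetical document-topic proportions for four documents and three topics.
theta = [
    [0.70, 0.20, 0.10],
    [0.40, 0.35, 0.25],
    [0.10, 0.10, 0.80],
    [0.50, 0.30, 0.20],  # exactly 0.50 does not exceed the threshold
]
print(dominant_topic_share(theta))

# Hypothetical yearly proportions of one topic in the corpus.
print(average_annual_growth_rate([100.0, 110.0, 121.0]))
```

A negative result from `average_annual_growth_rate` corresponds to a declining topic, as reported for ‘Collection development and E-resources’.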
Pub Date : 2024-06-03 DOI: 10.1007/s11192-024-05054-8
Congying Wang, Brent Jesiek, Wei Zhang
Congying Wang, Brent Jesiek, Wei Zhang. "Elevating international collaboration and academic outcomes through strategic research funding: a bibliometric analysis of China Scholarship Council funded publications." Scientometrics, 2024-06-03. https://doi.org/10.1007/s11192-024-05054-8
Pub Date : 2024-05-28 DOI: 10.1007/s11192-024-05041-z
Vladimir Batagelj
The standard and fractional projections are extended from binary two-mode networks to weighted two-mode networks. Some interesting properties of the extended projections are proved.
Vladimir Batagelj. "On weighted two-mode network projections." Scientometrics, 2024-05-28. https://doi.org/10.1007/s11192-024-05041-z
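To make the two projections named in the abstract above concrete, here is a minimal sketch for a small works-by-authors matrix. The abstract does not give the paper's weighted definitions, so this simply applies the usual binary-case constructions to a weighted matrix: the standard projection as the cross product of the columns, and the fractional projection as the same product after each row is normalized to sum to 1. The function names and the example weights are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def column_cross_product(m: Matrix) -> Matrix:
    """Return M^T M: entry (i, j) sums m[w][i] * m[w][j] over all rows w."""
    n_cols = len(m[0])
    return [
        [sum(row[i] * row[j] for row in m) for j in range(n_cols)]
        for i in range(n_cols)
    ]


def row_normalize(m: Matrix) -> Matrix:
    """Divide each row by its sum so that every non-empty row sums to 1."""
    normalized = []
    for row in m:
        total = sum(row)
        normalized.append([x / total if total else 0.0 for x in row])
    return normalized


# Hypothetical weighted two-mode network: rows are works, columns are authors.
works_by_authors = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
    [1.0, 1.0, 1.0],
]

standard = column_cross_product(works_by_authors)
fractional = column_cross_product(row_normalize(works_by_authors))
total_fractional = sum(sum(row) for row in fractional)

print(standard)
print(total_fractional)
```

Under this construction each work contributes a total weight of exactly 1 to the fractional projection, so the entries sum to the number of non-empty works, whereas in the standard projection works with many or heavily weighted authors contribute more.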
Pub Date : 2024-05-28 DOI: 10.1007/s11192-024-04983-8
Jamal El-Ouahi
Funding acknowledgments are important objects of study in the context of science funding. This study uses a mixed-methods approach to analyze the funding acknowledgments found in 2.3 million scientific publications published between 2008 and 2021 by authors affiliated with research institutions in the Middle East and North Africa (MENA). The aim is to identify the major funders, assess their contribution to national scientific publications, and gain insights into the funding mechanism in relation to collaboration and publication. Publication data from the Web of Science is examined to provide key insights about funding activities. Saudi Arabia and Qatar lead the region, as about half of their publications include acknowledgments to funding sources. Most MENA countries exhibit strong linkages with foreign agencies, mainly due to a high level of international collaboration. The distinction between domestic and international publications reveals some differences in funding structures. For instance, funding in Turkey and Iran is dominated by one or two major funders, whereas a few other countries, such as Saudi Arabia, have multiple funders. Iran and Kuwait are examples of countries where research is mainly funded by domestic funders. The government and academic sectors mainly fund scientific research in MENA, whereas the industry sector plays little or no role in research funding. Lastly, the qualitative analyses provide more context on the complex funding mechanism. The findings of this study contribute to a better understanding of the funding structure in MENA countries and provide insights to funders and research managers to evaluate the funding landscape.
Jamal El-Ouahi. "Research funding in the Middle East and North Africa: analyses of acknowledgments in scientific publications indexed in the Web of Science (2008–2021)." Scientometrics, 2024-05-28. https://doi.org/10.1007/s11192-024-04983-8
Pub Date : 2024-05-27 DOI: 10.1007/s11192-024-05026-y
Ariel Alexi, Teddy Lazebnik, Ariel Rosenfeld
Online academic profiles are used by scholars to reflect a desired image to their online audience. In Google Scholar, scholars can select a subset of co-authors for presentation in a central location on their profile using a social feature called the “co-authorship panel”. In this work, we examine whether scientometrics and reciprocality can explain the observed selections. To this end, we scrape and thoroughly analyze a novel set of 120,000 Google Scholar profiles, ranging across four disciplines and various academic institutions. Our results suggest that scholars tend to favor co-authors with higher scientometrics over others for inclusion in their co-authorship panels. Interestingly, the higher a scholar’s own scientometrics, the weaker the tendency to include co-authors with high scientometrics. Furthermore, we find that reciprocality is central in explaining scholars’ selections.
Ariel Alexi, Teddy Lazebnik, Ariel Rosenfeld. "The scientometrics and reciprocality underlying co-authorship panels in Google Scholar profiles." Scientometrics, 2024-05-27. https://doi.org/10.1007/s11192-024-05026-y
Pub Date : 2024-05-27 DOI: 10.1007/s11192-024-05048-6
Yingyi Zhang, Chengzhi Zhang
Billions of scientific papers create the need to identify the essential parts of this massive body of text. Scientific research proceeds from posing problems to applying methods. To capture the main idea of scientific papers, we focus on extracting problem and method sentences. Annotating sentences within scientific papers is labor-intensive, resulting in small-scale datasets that limit the amount of information models can learn. This limited information leads models to rely heavily on specific forms, which in turn reduces their generalization capabilities. This paper addresses the problems caused by small-scale datasets from three perspectives: increasing dataset scale, reducing dependence on specific forms, and enriching the information within sentences. To implement the first two ideas, we introduce the concept of formulaic expression (FE) desensitization and propose FE desensitization-based data augmenters to generate synthetic data and reduce models’ reliance on FEs. For the third idea, we propose a context-enhanced transformer that uses context to measure the importance of words in target sentences and to reduce noise in the context. Furthermore, this paper conducts experiments using large language model (LLM) based in-context learning (ICL) methods. Quantitative and qualitative experiments demonstrate that our proposed models achieve a higher macro F1 score than the baseline models on two scientific paper datasets, with improvements of 3.71% and 2.67%, respectively. The LLM-based ICL methods are found to be unsuitable for the task of problem and method extraction.
Yingyi Zhang, Chengzhi Zhang. "Extracting problem and method sentence from scientific papers: a context-enhanced transformer using formulaic expression desensitization." Scientometrics, 2024-05-27. https://doi.org/10.1007/s11192-024-05048-6
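The macro F1 score used as the evaluation metric in the abstract above is the unweighted mean of the per-class F1 scores, so rare classes such as problem or method sentences count as much as frequent ones. A minimal sketch follows; the label names and the toy predictions are hypothetical and are not taken from the paper's datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sentence labels: gold annotations versus model predictions.
gold = ["problem", "method", "other", "other"]
predicted = ["problem", "other", "other", "other"]

print(round(macro_f1(gold, predicted), 4))
```

In this toy case the missed method sentence gives that class an F1 of 0, which lowers the macro average far more than it would lower plain accuracy.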