Patterns: Latest Articles

A systematic survey of natural language processing for the Greek language.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-21 | eCollection Date: 2025-11-14 | DOI: 10.1016/j.patter.2025.101313
Juli Bakagianni, Kanella Pouli, Maria Gavriilidou, John Pavlopoulos

Comprehensive monolingual natural language processing (NLP) surveys are essential for assessing language-specific challenges, resource availability, and research gaps. However, existing surveys often lack standardized methodologies, leading to selection bias and fragmented coverage of NLP tasks and resources. This study introduces a generalizable framework for systematic monolingual NLP surveys. Our approach integrates a structured search protocol to minimize bias, an NLP task taxonomy for classification, and language resource taxonomies to identify potential benchmarks and highlight opportunities for improving resource availability. We apply this framework to Greek NLP (2012-2023), providing an in-depth analysis of its current state, task-specific progress, and resource gaps. The survey results are publicly available and are regularly updated to provide an evergreen resource. This systematic survey of Greek NLP serves as a case study, demonstrating the effectiveness of our framework and its potential for broader application to other not-so-well-resourced languages as regards NLP.

Patterns, volume 6, issue 11, article 101313. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12715428/pdf/
Citations: 0
Pyomo: Accidentally outrunning the bear.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101311
Miranda Mundt, William E Hart, Emma S Johnson, Bethany Nicholson, John D Siirola

Pyomo is an open-source optimization modeling software that has undergone significant evolution since its inception in 2008. Pyomo has evolved to enhance flexibility, solver integration, and community engagement. Modern collaborative tools for open-source software have facilitated the development of new Pyomo functionality and improved our development process through automated testing and performance-tracking pipelines. However, Pyomo faces challenges typical of research software, including resource limitations and knowledge retention. The Pyomo team's commitment to better development practices and community engagement reflects a proactive approach to these issues. We describe Pyomo's development journey, highlighting both successes and failures, in the hopes that other open-source research software packages may benefit from our experiences.

Patterns, volume 6, issue 7, article 101311. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416079/pdf/
Citations: 0
A conversation with research software engineers at the International Brain Laboratory.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101315
Mayo Faulkner, Miles Wells

Open-source software is the lifeblood of many modern research projects, allowing researchers to push boundaries, build collaborations, and work transparently. The International Brain Laboratory (IBL), a group of more than twenty labs working together to understand the neuroscience of decision-making, uses open-source software and other open science practices extensively to advance its research. Here, we interview two of the IBL's research software engineers to learn more about their career paths and how they view open-source development.

Patterns, volume 6, issue 7, article 101315. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416082/pdf/
Citations: 0
The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101314
Jose Guadalupe Hernandez, Anil Kumar Saini, Attri Ghosh, Jason H Moore

The tree-based pipeline optimization tool (TPOT) is one of the earliest automated machine learning (ML) frameworks developed for optimizing ML pipelines, with an emphasis on addressing the complexities of biomedical research. TPOT uses genetic programming to explore a diverse space of pipeline structures and hyperparameter configurations in search of optimal pipelines. Here, we provide a comparative overview of the conceptual similarities and implementation differences between the previous and latest versions of TPOT, focusing on two key aspects: (1) the representation of ML pipelines and (2) the underlying algorithm driving pipeline optimization. We also highlight TPOT's application across various medical and healthcare domains, including disease diagnosis, adverse outcome forecasting, and genetic analysis. Additionally, we propose future directions for enhancing TPOT by integrating contemporary ML techniques and recent advancements in evolutionary computation.

Patterns, volume 6, issue 7, article 101314. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416094/pdf/
Citations: 0
Open-source software for data science.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101324
Andrew L Hufton
Patterns, volume 6, issue 7, article 101324. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416077/pdf/
Citations: 0
The future of research software is the future of research.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101322
Neil P Chue Hong, Selina Aragon, Simon Hettrick, Caroline Jay

The use of software is near-ubiquitous in research, yet it is still underrecognized despite changes in policy and practice. Notwithstanding many successful initiatives to improve the culture around research software, the authors argue that it is essential that the development of research software anticipates changes in the research landscape and continues to support the many different people who use it.

Patterns, volume 6, issue 7, article 101322. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416084/pdf/
Citations: 0
Open-source models for development of data and metadata standards.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101316
Ariel Rokem, Vani Mandava, Nicoleta Cristea, Anshul Tambay, Kristofer Bouchard, Carolina Berys-Gonzalez, Andy Connolly

Machine learning and artificial intelligence promise to accelerate research and understanding across many scientific disciplines. Harnessing the power of these techniques requires aggregating scientific data. In tandem, the importance of open data for reproducibility and scientific transparency is gaining recognition, and data are increasingly available through digital repositories. Leveraging efforts from disparate data collection sources, however, requires interoperable and adaptable standards for data description and storage. Through the synthesis of experiences in astronomy, high-energy physics, earth science, and neuroscience, we contend that the open-source software (OSS) model provides significant benefits for standard creation and adaptation. We highlight resultant issues, such as balancing flexibility vs. stability and utilizing new computing paradigms and technologies, that must be considered from both the user and developer perspectives to ensure pathways for recognition and sustainability. We recommend supporting and recognizing the development and maintenance of OSS data standards and software consistent with widely adopted scientific tools.

Patterns, volume 6, issue 7, article 101316. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416081/pdf/
Citations: 0
Bioconductor: Planning a third decade of comprehensive support for genomic data science.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101319
Vincent J Carey

This opinion piece discusses the Bioconductor project for open-source bioinformatics and the engineering concepts underlying its effectiveness to date. Since the inception of Bioconductor in 2002 with 15 software packages devoted to analysis of DNA microarrays, it has grown into an ecosystem of ∼3,000 packages contributed by more than 1,000 developers. Aspects of the history and commitments are reviewed here to contribute to thinking about the design and orchestration of future open-source software projects.

Patterns, volume 6, issue 7, article 101319. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416078/pdf/
Citations: 0
ASReview LAB v.2: Open-source text screening with multiple agents and a crowd of experts.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-03 | eCollection Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101318
Jonathan de Bruin, Peter Lombaers, Casper Kaandorp, Jelle Teijema, Timo van der Kuil, Berke Yazan, Angie Dong, Rens van de Schoot

ASReview LAB v.2 introduces an advancement in AI-assisted systematic reviewing by enabling collaborative screening with multiple experts ("a crowd of oracles") using a shared AI model. The platform supports multiple AI agents within the same project, allowing users to switch between fast general-purpose models and domain-specific, semantic, or multilingual transformer models. Leveraging the SYNERGY benchmark dataset, performance has improved significantly, showing a 24.1% reduction in loss compared to version 1 through model improvements and hyperparameter tuning. ASReview LAB v.2 follows user-centric design principles and offers reproducible, transparent workflows. It logs key configuration and annotation data while balancing full model traceability with efficient storage. Future developments include automated model switching based on performance metrics, noise-robust learning, and ensemble-based decision-making.

Patterns, volume 6, issue 7, article 101318. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416088/pdf/
Citations: 0
OpenML: Insights from 10 years and more than a thousand papers.
IF 7.4 | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-07-03 | eCollection Date: 2025-07-11 | DOI: 10.1016/j.patter.2025.101317
Bernd Bischl, Giuseppe Casalicchio, Taniya Das, Matthias Feurer, Sebastian Fischer, Pieter Gijsbers, Subhaditya Mukherjee, Andreas C Müller, László Németh, Luis Oala, Lennart Purucker, Sahithya Ravi, Jan N van Rijn, Prabhant Singh, Joaquin Vanschoren, Jos van der Velde, Marcel Wever

OpenML is an open-source platform that democratizes machine-learning evaluation by enabling anyone to share datasets in uniform standards, define precise machine-learning tasks, and automatically share detailed workflows and model evaluations. More than just a platform, OpenML fosters a collaborative ecosystem where scientists create new tools, launch initiatives, and establish standards to advance machine learning. Over the past decade, OpenML has inspired over 1,500 publications across diverse fields, from scientists releasing new datasets and benchmarking new models to educators teaching reproducible science. Looking back, we detail and describe the platform's impact by looking at usage and citations. We share lessons from a decade of building, maintaining, and expanding OpenML, highlighting how rich metadata, collaborative benchmarking, and open interfaces have enhanced research and interoperability. Looking ahead, we cover ongoing efforts to expand OpenML's capabilities and integrate with other platforms, informing a broader vision for open-science infrastructure for machine learning.

Patterns, volume 6, issue 7, article 101317. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416095/pdf/
Citations: 0