DOME Registry: Implementing community-wide recommendations for reporting supervised machine learning in biology

Authors:
Omar Abdelghani Attafi (Department of Biomedical Sciences, University of Padova, Italy)
Damiano Clementel (Department of Biomedical Sciences, University of Padova, Italy)
Konstantinos Kyritsis (Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece)
Emidio Capriotti (Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy)
Gavin Farrell (ELIXIR Hub, Hinxton, Cambridge, UK)
Styliani-Christina Fragkouli (Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece; Department of Biology, National and Kapodistrian University of Athens, Athens, Greece)
Leyla Jael Castro (ZB Med Information Centre for Life Sciences, Cologne, Germany)
András Hatos (Department of Oncology, Geneva University Hospitals, Geneva, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; Swiss Cancer Center Léman, Lausanne, Switzerland)
Tom Lenaerts (Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Vrije Universiteit Brussel, Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, Street Belgium; Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Brussels, Belgium)
Stanislav Mazurenko (Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic; International Clinical Research Centre, St Anne's Hospital, Brno, Czech Republic)
Soroush Mozaffari (Department of Biomedical Sciences, University of Padova, Italy)
Franco Pradelli (Department of Biomedical Sciences, University of Padova, Italy)
Patrick Ruch (HES-SO - HEG Geneva, Geneva, Switzerland; SIB Swiss Institute of Bioinformatics, Geneva, Switzerland)
Castrense Savojardo (Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy)
Paola Turina (Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy)
Federico Zambelli (Dept of Biosciences, University of Milan, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy)
Damiano Piovesan (Department of Biomedical Sciences, University of Padova, Italy)
Alexander Miguel Monzon (Department of Information Engineering, University of Padova, Italy)
Fotis Psomopoulos (Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece)
Silvio C. E. Tosatto (Department of Biomedical Sciences, University of Padova, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy)

Venue: arXiv - QuanBio - Other Quantitative Biology
Publication date: 2024-08-14
Identifier: arxiv-2408.07721 (https://doi.org/arxiv-2408.07721)
Abstract
Supervised machine learning (ML) is used extensively in biology and deserves
closer scrutiny. The DOME recommendations aim to enhance the validation and
reproducibility of ML research by establishing standards for key aspects such
as data handling and processing, optimization, evaluation, and model
interpretability. The recommendations help to ensure that key details are
reported transparently by providing a structured set of questions. Here, we
introduce the DOME Registry (URL: registry.dome-ml.org), a database that allows
scientists to manage and access comprehensive DOME-related information on
published ML studies. The registry uses external resources like ORCID, APICURON
and the Data Stewardship Wizard to streamline the annotation process and ensure
comprehensive documentation. By assigning unique identifiers and DOME scores to
publications, the registry fosters a standardized evaluation of ML methods.
Future plans include continuing to grow the registry through community
curation, improving the DOME score definition and encouraging publishers to
adopt DOME standards, promoting transparency and reproducibility of ML in the
life sciences.
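
To make the idea of a structured set of questions with an attached score more concrete, the following is a minimal sketch in Python. It is an illustration only: the section names are taken from the aspects listed in the abstract, while the class `DomeAnnotation`, its fields, the question keys, the identifier value, and the `completeness` measure are all hypothetical. The abstract does not define the registry's schema or the DOME score, so nothing here should be read as the registry's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Section names assumed from the aspects the abstract lists; hypothetical.
SECTIONS = ("data", "optimization", "model", "evaluation")


@dataclass
class DomeAnnotation:
    """Illustrative record for one publication; not the registry's real schema."""

    identifier: str  # placeholder standing in for a registry-assigned identifier
    # Maps section name -> {question key -> free-text answer}.
    answers: dict[str, dict[str, str]] = field(default_factory=dict)

    def completeness(self) -> float:
        """Fraction of questions with a non-empty answer.

        An assumed stand-in for a score; the real DOME score may differ.
        """
        all_answers = [
            answer
            for section in SECTIONS
            for answer in self.answers.get(section, {}).values()
        ]
        if not all_answers:
            return 0.0
        answered = sum(1 for answer in all_answers if answer.strip())
        return answered / len(all_answers)


# Made-up example entry: five questions, three of them answered.
entry = DomeAnnotation(
    identifier="example-0001",
    answers={
        "data": {"provenance": "public benchmark", "splits": ""},
        "optimization": {"algorithm": "random forest"},
        "model": {"interpretability": ""},
        "evaluation": {"metrics": "accuracy, MCC"},
    },
)
print(f"{entry.identifier}: {entry.completeness():.2f}")
```

A completeness fraction is used here only because it is the simplest measure that rewards transparent reporting; any weighting of sections or questions would be a further assumption.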