在转向无机物的同时,使 InChI 具有 FAIR 和可持续性

IF 4.3 3区 材料科学 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC ACS Applied Electronic Materials Pub Date : 2024-09-04 DOI:10.1039/d4fd00145a
Gerd Blanke, Jan Brammer, Djordje Baljozovic, Nauman Khan, Frank Lange, Felix Bänsch, Clare A. Tovee, Ulrich Schatzschneider, Richard M Hartshorn, Sonja Herres-Pawlis
{"title":"在转向无机物的同时,使 InChI 具有 FAIR 和可持续性","authors":"Gerd Blanke, Jan Brammer, Djordje Baljozovic, Nauman Khan, Frank Lange, Felix Bänsch, Clare A. Tovee, Ulrich Schatzschneider, Richard M Hartshorn, Sonja Herres-Pawlis","doi":"10.1039/d4fd00145a","DOIUrl":null,"url":null,"abstract":"The InChI (International Chemical Identifier) standard stands as a cornerstone in chemical informatics, facilitating the structure-based identification and exchange of chemical compounds across various platforms and databases. The InChI as a unique canonical line notation has made chemical structures searchable on the internet at a broad scale. The largest repositories working with InChIs contain more than 1 billion structures. Central to the functionality of the InChI is its codebase, which orchestrates a series of intricate steps to generate unique identifiers for chemical compounds. Up to now, these steps have been sparsely documented and the InChI algorithm had to be seen as a black box. For the new v1.07 release, the code has been analyzed and the major steps documented, more than 3000 bugs and security issues, as well as nearly 60 Google OSS-Fuzz issues have been fixed. New test systems have been implemented that allow users to directly test the code developments. The move to GitHub has not only made the development more transparent but will also enable external contributors to join the further development of the InChI code. Motivation for this modernisation was the urgency to treat molecular inorganic compounds by the InChI in a meaningful way. Until now, no classic string representation fulfills this need of molecular inorganic chemistry. The connection of metal bonds is by definition disconnected which makes most inorganic InChIs meaningless at the moment. Herein, we propose new routines to remedy this problem in the representation of molecular inorganic compounds by the InChI.","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Making the InChI FAIR and sustainable while moving to Inorganics\",\"authors\":\"Gerd Blanke, Jan Brammer, Djordje Baljozovic, Nauman Khan, Frank Lange, Felix Bänsch, Clare A. Tovee, Ulrich Schatzschneider, Richard M Hartshorn, Sonja Herres-Pawlis\",\"doi\":\"10.1039/d4fd00145a\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The InChI (International Chemical Identifier) standard stands as a cornerstone in chemical informatics, facilitating the structure-based identification and exchange of chemical compounds across various platforms and databases. The InChI as a unique canonical line notation has made chemical structures searchable on the internet at a broad scale. The largest repositories working with InChIs contain more than 1 billion structures. Central to the functionality of the InChI is its codebase, which orchestrates a series of intricate steps to generate unique identifiers for chemical compounds. Up to now, these steps have been sparsely documented and the InChI algorithm had to be seen as a black box. For the new v1.07 release, the code has been analyzed and the major steps documented, more than 3000 bugs and security issues, as well as nearly 60 Google OSS-Fuzz issues have been fixed. New test systems have been implemented that allow users to directly test the code developments. The move to GitHub has not only made the development more transparent but will also enable external contributors to join the further development of the InChI code. Motivation for this modernisation was the urgency to treat molecular inorganic compounds by the InChI in a meaningful way. Until now, no classic string representation fulfills this need of molecular inorganic chemistry. The connection of metal bonds is by definition disconnected which makes most inorganic InChIs meaningless at the moment. Herein, we propose new routines to remedy this problem in the representation of molecular inorganic compounds by the InChI.\",\"PeriodicalId\":3,\"journal\":{\"name\":\"ACS Applied Electronic Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Electronic Materials\",\"FirstCategoryId\":\"92\",\"ListUrlMain\":\"https://doi.org/10.1039/d4fd00145a\",\"RegionNum\":3,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1039/d4fd00145a","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

InChI(国际化学标识符)标准是化学信息学的基石,有助于在各种平台和数据库中进行基于结构的化合物识别和交换。InChI 作为一种独特的规范行符号,使化学结构可以在互联网上进行广泛搜索。使用 InChIs 的最大资源库包含 10 亿多个结构。InChI 功能的核心是其代码库,它协调了一系列复杂的步骤来生成化合物的唯一标识符。到目前为止,对这些步骤的记录还很少,InChI 算法只能被看作是一个黑盒子。在新发布的 v1.07 版中,对代码进行了分析,并记录了主要步骤,修复了 3000 多个错误和安全问题,以及近 60 个 Google OSS-Fuzz 问题。此外,还实施了新的测试系统,允许用户直接测试代码开发。迁移到 GitHub 不仅使开发工作更加透明,还能让外部贡献者加入到 InChI 代码的进一步开发中。InChI之所以要进行现代化改造,是因为迫切需要以一种有意义的方式来处理分子无机化合物。到目前为止,还没有一种经典的字符串表示法能满足分子无机化学的这一需求。根据定义,金属键的连接是断开的,这使得大多数无机 InChI 目前毫无意义。在此,我们提出了新的例程,以弥补用 InChI 表示分子无机化合物的这一问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Making the InChI FAIR and sustainable while moving to Inorganics
The InChI (International Chemical Identifier) standard stands as a cornerstone in chemical informatics, facilitating the structure-based identification and exchange of chemical compounds across various platforms and databases. The InChI as a unique canonical line notation has made chemical structures searchable on the internet at a broad scale. The largest repositories working with InChIs contain more than 1 billion structures. Central to the functionality of the InChI is its codebase, which orchestrates a series of intricate steps to generate unique identifiers for chemical compounds. Up to now, these steps have been sparsely documented and the InChI algorithm had to be seen as a black box. For the new v1.07 release, the code has been analyzed and the major steps documented, more than 3000 bugs and security issues, as well as nearly 60 Google OSS-Fuzz issues have been fixed. New test systems have been implemented that allow users to directly test the code developments. The move to GitHub has not only made the development more transparent but will also enable external contributors to join the further development of the InChI code. Motivation for this modernisation was the urgency to treat molecular inorganic compounds by the InChI in a meaningful way. Until now, no classic string representation fulfills this need of molecular inorganic chemistry. The connection of metal bonds is by definition disconnected which makes most inorganic InChIs meaningless at the moment. Herein, we propose new routines to remedy this problem in the representation of molecular inorganic compounds by the InChI.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
7.20
自引率
4.30%
发文量
567
期刊最新文献
Vitamin B12: prevention of human beings from lethal diseases and its food application. Current status and obstacles of narrowing yield gaps of four major crops. Cold shock treatment alleviates pitting in sweet cherry fruit by enhancing antioxidant enzymes activity and regulating membrane lipid metabolism. Removal of proteins and lipids affects structure, in vitro digestion and physicochemical properties of rice flour modified by heat-moisture treatment. Investigating the impact of climate variables on the organic honey yield in Turkey using XGBoost machine learning.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1