GestaltMatcher Database - A global reference for facial phenotypic variability in rare human diseases.

Hellen Lesmann, Alexander Hustinx, Shahida Moosa, Hannah Klinkhammer, Elaine Marchi, Pilar Caro, Ibrahim M Abdelrazek, Jean Tori Pantel, Merle Ten Hagen, Meow-Keong Thong, Rifhan Azwani Binti Mazlan, Sok Kun Tae, Tom Kamphans, Wolfgang Meiswinkel, Jing-Mei Li, Behnam Javanmardi, Alexej Knaus, Annette Uwineza, Cordula Knopp, Tinatin Tkemaladze, Miriam Elbracht, Larissa Mattern, Rami Abou Jamra, Clara Velmans, Vincent Strehlow, Maureen Jacob, Angela Peron, Cristina Dias, Beatriz Carvalho Nunes, Thainá Vilella, Isabel Furquim Pinheiro, Chong Ae Kim, Maria Isabel Melaragno, Hannah Weiland, Sophia Kaptain, Karolina Chwiałkowska, Miroslaw Kwasniewski, Ramy Saad, Sarah Wiethoff, Himanshu Goel, Clara Tang, Anna Hau, Tahsin Stefan Barakat, Przemysław Panek, Amira Nabil, Julia Suh, Frederik Braun, Israel Gomy, Luisa Averdunk, Ekanem Ekure, Gaber Bergant, Borut Peterlin, Claudio Graziano, Nagwa Gaboon, Moisés Fiesco-Roa, Alessandro Mauro Spinelli, Nina-Maria Wilpert, Prasit Phowthongkum, Nergis Güzel, Tobias B Haack, Rana Bitar, Andreas Tzschach, Agusti Rodriguez-Palmero, Theresa Brunet, Sabine Rudnik-Schöneborn, Silvina Noemi Contreras-Capetillo, Ava Oberlack, Carole Samango-Sprouse, Teresa Sadeghin, Margaret Olaya, Konrad Platzer, Artem Borovikov, Franziska Schnabel, Lara Heuft, Vera Herrmann, Renske Oegema, Nour Elkhateeb, Sheetal Kumar, Katalin Komlosi, Khoushoua Mohamed, Silvia Kalantari, Fabio Sirchia, Antonio F Martinez-Monseny, Matthias Höller, Louiza Toutouna, Amal Mohamed, Amaia Lasa-Aranzasti, John A Sayer, Nadja Ehmke, Magdalena Danyel, Henrike Sczakiel, Sarina Schwartzmann, Felix Boschann, Max Zhao, Ronja Adam, Lara Einicke, Denise Horn, Kee Seang Chew, Choy Chen Kam, Miray Karakoyun, Ben Pode-Shakked, Aviva Eliyahu, Rachel Rock, Teresa Carrion, Odelia Chorin, Yuri A Zarate, Marcelo Martinez Conti, Mert Karakaya, Moon Ley Tung, Bharatendu Chandra, Arjan Bouman, Aime Lumaka, Naveed Wasif, Marwan Shinawi, Patrick R Blackburn, Tianyun Wang, Tim Niehues, Axel Schmidt, Regina Rita Roth, Dagmar Wieczorek, Ping Hu, Rebekah L Waikel, Suzanna E Ledgister Hanchard, Gehad Elmakkawy, Sylvia Safwat, Frédéric Ebstein, Elke Krüger, Sébastien Küry, Stéphane Bézieau, Annabelle Arlt, Eric Olinger, Felix Marbach, Dong Li, Lucie Dupuis, Roberto Mendoza-Londono, Sofia Douzgou Houge, Denisa Weis, Brian Hon-Yin Chung, Christopher C Y Mak, Hülya Kayserili, Nursel Elcioglu, Ayca Aykut, Peli Özlem Şimşek-Kiper, Nina Bögershausen, Bernd Wollnik, Heidi Beate Bentzen, Ingo Kurth, Christian Netzer, Aleksandra Jezela-Stanek, Koen Devriendt, Karen W Gripp, Martin Mücke, Alain Verloes, Christian P Schaaf, Christoffer Nellåker, Benjamin D Solomon, Markus M Nöthen, Ebtesam Abdalla, Gholson J Lyon, Peter M Krawitz, Tzung-Chien Hsieh
medRxiv : the preprint server for health sciences. Published 2024-10-08. DOI: 10.1101/2023.06.06.23290887 (https://doi.org/10.1101/2023.06.06.23290887). PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/2b/fe/nihpp-2023.06.06.23290887v1.PMC10371103.pdf

Abstract

The significant phenotypic variability of the human face is the most important factor complicating the work of dysmorphologists. Next-Generation Phenotyping (NGP) tools, which assist clinicians in recognizing characteristic syndromic patterns, are particularly challenged when confronted with patients from populations that differ from their training data. We therefore systematically analyzed the impact of genetic ancestry on facial dysmorphism. For that purpose, we established the GestaltMatcher Database (GMDB) as a reference dataset of medical images of patients with rare genetic disorders from around the world. We collected 10,980 frontal facial images (more than a quarter previously unpublished) from 8,346 patients, representing 581 rare disorders. Although the predominant ancestry is still European (67%), data from underrepresented populations have increased considerably through global collaborations (19% Asian and 7% African). This includes previously unpublished reports for more than 40% of the African patients. The NGP analysis of this diverse dataset revealed characteristic performance differences that depend on the composition of the training and test sets and correspond to genetic relatedness. For clinical use of NGP, incorporating non-European patients markedly improved GestaltMatcher performance: the top-5 accuracy increased by 11.29%. Importantly, this improvement in delineating the correct disorder from a facial portrait was achieved without decreasing performance on European patients. By design, GMDB complies with the FAIR principles by rendering the curated medical data findable, accessible, interoperable, and reusable. GMDB can therefore also serve as data for training and benchmarking. In summary, our study of facial dysmorphism in a global sample revealed considerable cross-ancestral phenotypic variability that confounds NGP and should be counteracted by international efforts to increase data diversity. GMDB will serve as a vital reference database for clinicians and a transparent training set for advancing NGP technology.
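
For readers unfamiliar with the metric, top-5 accuracy counts a case as correct when the true disorder appears among the five highest-ranked predictions. The following is a minimal illustrative sketch only: the function name `top_k_accuracy`, the toy rankings, and the single-letter disorder labels are assumptions made for this example and are not taken from GMDB or the GestaltMatcher code.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int = 5,
) -> float:
    """Return the fraction of cases whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("ranked_predictions and true_labels must have the same length")
    if not true_labels:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranking, truth in zip(ranked_predictions, true_labels)
        if truth in ranking[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: each inner list ranks disorder labels from most to least likely.
rankings = [
    ["A", "B", "C", "D", "E", "F"],  # true label "A" is ranked 1st: hit
    ["B", "C", "D", "E", "F", "A"],  # true label "A" is ranked 6th: miss for k=5
    ["C", "A", "B", "D", "E", "F"],  # true label "E" is ranked 5th: hit
    ["F", "E", "D", "C", "B", "A"],  # true label "G" is not ranked at all: miss
]
truths = ["A", "A", "E", "G"]

print(top_k_accuracy(rankings, truths, k=5))
```

Under this definition, a model can gain top-5 accuracy without changing its single best guess, which is why the metric suits a setting where clinicians review a short list of candidate disorders.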

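GestaltMatcher Database - a FAIR database for medical imaging data of rare disorders.
The value of computer-assisted image analysis has been demonstrated in several studies. The performance of artificial intelligence tools such as GestaltMatcher improves with the size and diversity of the training set, but correctly labeled training data are currently the biggest bottleneck in developing Next-Generation Phenotyping (NGP) applications. We therefore developed the GestaltMatcher Database (GMDB), a database of machine-readable medical image data that complies with the FAIR principles and improves the openness and accessibility of scientific findings in medical genetics. An entry in GMDB consists of a medical image, such as a portrait, an X-ray, or a fundoscopy, together with machine-readable meta-information, such as clinical features encoded in HPO terminology or disease-causing mutations reported in HGVS format. Initially, data were collected mainly by curators gathering images from the literature. Currently, clinicians and individuals recruited from patient support groups contribute their previously unpublished data. For this patient-centered approach, we developed a digital consent form. GMDB is a modern publication medium for case reports that complements preprints, for example on medRxiv. To enable inter-cohort comparisons, we implemented a research feature in GMDB that computes the pairwise syndromic similarity between hand-picked cases. Through a community-driven effort, we compiled images of more than 7,533 cases with 792 disorders in GMDB. Most of the data come from 2,058 publications. In addition, about 1,018 frontal images of 498 previously unpublished cases were obtained. The web interface supports gene-centered and phenotype-centered queries as well as infinite scrolling in the gallery. Digital consent has led to increasing adoption of the approach by patients. The research application in GMDB was used to generate syndromic similarity matrices to characterize two novel phenotypes (CSNK2B, PSMC3). GMDB is the first FAIR database for NGP, in which data are findable, accessible, interoperable, and reusable. It is a repository for medical images that cannot be included in medRxiv. GMDB thereby connects clinicians who share an interest in particular phenotypes and improves the performance of AI.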
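
The pairwise similarity matrix mentioned above can be illustrated with a small sketch. The text does not specify how GMDB represents cases or which similarity measure it uses, so the sketch assumes that each case is a numeric feature vector and that cosine similarity is the measure; the function names, case identifiers, and vectors are all hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def pairwise_similarity(cases: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Build a symmetric matrix (as nested dicts) of similarities between all cases."""
    ids = list(cases)
    return {a: {b: cosine_similarity(cases[a], cases[b]) for b in ids} for a in ids}


# Hypothetical hand-picked cases with made-up two-dimensional feature vectors.
cases = {
    "case_1": [1.0, 0.0],
    "case_2": [0.0, 1.0],
    "case_3": [1.0, 1.0],
}
matrix = pairwise_similarity(cases)

for case_id, row in matrix.items():
    print(case_id, {other: round(value, 3) for other, value in row.items()})
```

In such a matrix, cases that share a phenotype would be expected to show higher mutual similarity than unrelated cases, which is the kind of pattern a cohort comparison looks for.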