Pub Date: 2024-12-01 | Epub Date: 2024-12-11 | DOI: 10.1200/CCI.24.00123
Zacharie Hamilton, Aseem Aseem, Zhengjia Chen, Noor Naffakh, Natalie M Reizine, Frank Weinberg, Shikha Jain, Larry G Kessler, Vijayakrishna K Gadi, Christopher Bun, Ryan H Nguyen
Purpose: Precision oncology in non-small cell lung cancer (NSCLC) relies on biomarker testing for clinical decision making. Despite its importance, challenges like the lack of genomic oncology training, nonstandardized biomarker reporting, and a rapidly evolving treatment landscape hinder its practice. Generative artificial intelligence (AI), such as ChatGPT, offers promise for enhancing clinical decision support. Effective performance metrics are crucial to evaluate these models' accuracy and their propensity for producing incorrect or hallucinated information. We assessed various ChatGPT versions' ability to generate accurate next-generation sequencing reports and treatment recommendations for NSCLC, using a novel Generative AI Performance Score (G-PS), which considers accuracy, relevancy, and hallucinations.
Methods: We queried ChatGPT versions for first-line NSCLC treatment recommendations with a Food and Drug Administration-approved targeted therapy, using a zero-shot prompt approach for eight oncogenes. Responses were assessed against National Comprehensive Cancer Network (NCCN) guidelines for accuracy, relevance, and hallucinations, with the G-PS yielding scores from -1 (all hallucinations) to 1 (fully NCCN-compliant recommendations). G-PS was designed as a composite measure with a base score for correct recommendations (weighted for preferred treatments) and a penalty for hallucinations.
Results: Analyzing 160 responses, generative pre-trained transformer (GPT)-4 outperformed GPT-3.5, showing higher base score (90% v 60%; P < .01) and fewer hallucinations (34% v 53%; P < .01). GPT-4's overall G-PS was significantly higher (0.34 v -0.15; P < .01), indicating superior performance.
Conclusion: This study highlights the rapid improvement of generative AI in matching treatment recommendations with biomarkers in precision oncology. Although the hallucination rate improved in the GPT-4 model, future generative AI use in clinical care requires high accuracy with minimal to no room for hallucinations. The G-PS represents a novel metric for quantifying generative AI utility in health care against national guidelines, with potential for adaptation beyond precision oncology.
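As described, the G-PS combines a base score for correct recommendations (weighted toward NCCN-preferred treatments) with a hallucination penalty, spanning -1 to 1. A minimal sketch of such a composite score, with assumed weights and per-response normalization (the paper defines the exact scheme):

```python
def g_ps(recommendations, preferred_weight=1.0, other_weight=0.5):
    """Illustrative G-PS: base credit for correct recommendations,
    weighted toward NCCN-preferred treatments, minus a penalty for
    hallucinated ones. The weights and normalization are assumptions
    for illustration; the paper defines the exact scheme.

    Each element of `recommendations` is labeled 'preferred' (correct
    and NCCN-preferred), 'other' (correct but not preferred), or
    'hallucinated'.
    """
    if not recommendations:
        return 0.0
    base = sum(preferred_weight if r == "preferred" else other_weight
               for r in recommendations if r != "hallucinated")
    penalty = sum(1.0 for r in recommendations if r == "hallucinated")
    # Normalizing by the number of recommendations keeps the score in
    # [-1, 1]: all hallucinations -> -1, all preferred picks -> 1.
    return (base - penalty) / len(recommendations)
```

Under this normalization, a response that is entirely hallucinated scores -1 and one consisting only of preferred NCCN treatments scores 1, matching the range stated in the abstract.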
Title: Comparative Analysis of Generative Pre-Trained Transformer Models in Oncogene-Driven Non-Small Cell Lung Cancer: Introducing the Generative Artificial Intelligence Performance Score. JCO Clinical Cancer Informatics. 2024;8:e2400123.
Pub Date: 2024-12-01 | Epub Date: 2024-12-11 | DOI: 10.1200/CCI.24.00126
Li-Ching Chen, Travis Zack, Arda Demirci, Madhumita Sushil, Brenda Miao, Corynn Kasap, Atul Butte, Eric A Collisson, Julian C Hong
Purpose: We examined the effectiveness of proprietary and open large language models (LLMs) in detecting disease presence, location, and treatment response in pancreatic cancer from radiology reports.
Methods: We analyzed 203 deidentified radiology reports, manually annotated for disease status, location, and indeterminate nodules needing follow-up. Using generative pre-trained transformer (GPT)-4, GPT-3.5-turbo, and open models such as Gemma-7B and Llama3-8B, we employed strategies such as ablation and prompt engineering to boost accuracy. Discrepancies between human and model interpretations were reviewed by a secondary oncologist.
Results: Among 164 patients with pancreatic tumors, GPT-4 showed the highest accuracy in inferring disease status, achieving 75.5% correctness (F1-micro). The open models Mistral-7B and Llama3-8B performed comparably, with accuracies of 68.6% and 61.4%, respectively. Mistral-7B excelled at deriving correct inferences directly from objective findings. Most tested models demonstrated proficiency in identifying disease-containing anatomic locations from a list of choices, with GPT-4 and Llama3-8B showing near-parity in precision and recall for disease site identification. However, open models struggled to differentiate benign from malignant postsurgical changes, affecting their precision in identifying findings indeterminate for cancer. A secondary review occasionally favored GPT-3.5's interpretations, indicating variability in human judgment.
Conclusion: LLMs, especially GPT-4, are proficient in deriving oncologic insights from radiology reports. Their performance is enhanced by effective summarization strategies, demonstrating their potential in clinical support and health care analytics. This study also underscores the possibility of zero-shot open model utility in environments where proprietary models are restricted. Finally, by providing a set of annotated radiology reports, this paper presents a valuable data set for further LLM research in oncology.
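The 75.5% correctness figure above is reported as micro-averaged F1. Micro-averaging pools true positives, false positives, and false negatives across all classes before computing precision and recall; a minimal sketch:

```python
def f1_micro(y_true, y_pred):
    """Micro-averaged F1 for single-label classification: pool true
    positives, false positives, and false negatives over all classes
    before computing precision and recall."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    # In single-label classification every misclassification counts once
    # as a false positive and once as a false negative, so micro-F1
    # reduces to plain accuracy.
    fp = fn = len(y_true) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because each report receives exactly one status label, micro-F1 here coincides with overall accuracy, which is why the abstract can describe it as "correctness."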
Title: Assessing Large Language Models for Oncology Data Inference From Radiology Reports. JCO Clinical Cancer Informatics. 2024;8:e2400126.
Purpose: Post-sustained virologic response (SVR) screening following clinical guidelines does not address individual risk of hepatocellular carcinoma (HCC). Our aim is to provide tailored screening for patients by using machine learning to predict HCC incidence after SVR.
Methods: Using clinical data from 1,028 patients with SVR, we developed an HCC prediction model using a random survival forest (RSF). Model performance was assessed using Harrell's c-index and validated in an independent cohort of 737 patients with SVR. Shapley additive explanations (SHAP) facilitated feature quantification, whereas optimal cutoffs were determined using maximally selected rank statistics. We used Kaplan-Meier analysis to compare cumulative HCC incidence between risk groups.
Results: We achieved c-index scores and 95% CIs of 0.90 (0.85 to 0.94) and 0.80 (0.74 to 0.85) in the derivation and validation cohorts, respectively, in a model using platelet count, gamma-glutamyl transpeptidase, sex, age, and ALT. Stratification resulted in four risk groups: low, intermediate, high, and very high. The 5-year cumulative HCC incidence rates and 95% CIs for these groups were as follows: derivation: 0% (0 to 0), 3.8% (0.6 to 6.8), 26.2% (17.2 to 34.3), and 54.2% (20.2 to 73.7), respectively, and validation: 0.7% (0 to 1.6), 7.1% (2.7 to 11.3), 5.2% (0 to 10.8), and 28.6% (0 to 55.3), respectively.
Conclusion: The integration of RSF and SHAP enabled accurate HCC risk classification after SVR, which may facilitate individualized HCC screening strategies and more cost-effective care.
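The stratification step maps each patient's RSF-predicted risk score to one of the four groups via cutoffs derived from maximally selected rank statistics. A sketch with hypothetical cutoff values (the actual cutoffs are not reported in the abstract):

```python
import bisect

# Hypothetical cutoffs for illustration; the study derives its cutoffs
# from the derivation cohort using maximally selected rank statistics.
RISK_CUTOFFS = (0.1, 0.3, 0.6)
RISK_GROUPS = ("low", "intermediate", "high", "very high")

def assign_risk_group(risk_score, cutoffs=RISK_CUTOFFS):
    """Map an RSF-predicted risk score to one of four risk groups by
    locating it among the pre-derived cutoffs."""
    return RISK_GROUPS[bisect.bisect_right(cutoffs, risk_score)]
```

Once every patient is assigned a group, the cumulative HCC incidence curves reported above can be compared between groups with Kaplan-Meier analysis.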
Title: Prediction of Hepatocellular Carcinoma After Hepatitis C Virus Sustained Virologic Response Using a Random Survival Forest Model. Authors: Hikaru Nakahara, Atsushi Ono, C Nelson Hayes, Yuki Shirane, Ryoichi Miura, Yasutoshi Fujii, Serami Murakami, Kenji Yamaoka, Hauri Bao, Shinsuke Uchikawa, Hatsue Fujino, Eisuke Murakami, Tomokazu Kawaoka, Daiki Miki, Masataka Tsuge, Shiro Oka. DOI: 10.1200/CCI.24.00108. JCO Clinical Cancer Informatics. 2024;8:e2400108.
Pub Date: 2024-12-01 | Epub Date: 2024-12-20 | DOI: 10.1200/CCI.24.00132
Gurjyot K Doshi, Andrew J Osterland, Ping Shi, Annette Yim, Viviana Del Tejo, Sarah B Guttenplan, Samantha Eiffert, Xin Yin, Lisa Rosenblatt, Paul R Conkling
Purpose: Nivolumab plus ipilimumab (NIVO + IPI) is a first-in-class combination immunotherapy for the treatment of intermediate- or poor (I/P)-risk advanced or metastatic renal cell carcinoma (mRCC). Currently, there are limited real-world data regarding clinical effectiveness beyond 12-24 months from treatment initiation. In this real-world study, treatment patterns and clinical outcomes were evaluated for NIVO + IPI in a community oncology setting.
Methods: A retrospective analysis using electronic medical record data from The US Oncology Network examined patients with I/P-risk clear cell mRCC who initiated first-line (1L) NIVO + IPI between January 4, 2018, and December 31, 2019, with follow-up until June 30, 2022. Baseline demographics, clinical characteristics, treatment patterns, clinical effectiveness, and safety outcomes were assessed descriptively. Overall survival (OS) and real-world progression-free survival (rwPFS) were analyzed using Kaplan-Meier methods.
Results: Among 187 patients identified (median follow-up, 22.4 months), with median age 63 (range, 30-89) years, 74 (39.6%) patients had poor risk and 37 (19.8%) patients had Eastern Cooperative Oncology Group performance status score ≥2. Of 86 patients who received second-line therapy, 54.7% received cabozantinib and 10.5% received pazopanib. The median (95% CI) OS and rwPFS were 38.4 (24.7-46.1) months and 11.1 (7.5-15.0) months, respectively. Treatment-related adverse events (TRAEs) were reported in 89 (47.6%) patients, including fatigue (n = 25, 13.4%) and rash (n = 19, 10.2%).
Conclusion: This study provides data to support the understanding of the real-world utilization and long-term effectiveness of 1L NIVO + IPI in patients with I/P-risk mRCC. TRAE rates were low relative to clinical trials.
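The OS and rwPFS medians above are Kaplan-Meier estimates. For reference, a minimal product-limit estimator handling right-censoring (the data in the example are illustrative, not the study's patient-level data):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.
    times: follow-up time per patient; events: 1 = event observed,
    0 = censored. Returns (event_time, S(t)) pairs at each event time."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    order = sorted(range(len(times)), key=lambda i: times[i])
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = censored = 0
        # Group all patients sharing this follow-up time.
        while i < len(order) and times[order[i]] == t:
            if events[order[i]]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Censored patients leave the risk set without an event.
        at_risk -= deaths + censored
    return curve
```

The median OS or rwPFS is then read off as the first time at which the estimated survival function drops to 0.5 or below.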
Title: Real-World Outcomes in Patients With Metastatic Renal Cell Carcinoma Treated With First-Line Nivolumab Plus Ipilimumab in the United States. JCO Clinical Cancer Informatics. 2024;8:e2400132.
Pub Date: 2024-12-01 | Epub Date: 2024-12-17 | DOI: 10.1200/CCI-24-00196
Bradley D McDowell, Michael A O'Rorke, Mary C Schroeder, Elizabeth A Chrischilles, Christine M Spinka, Lemuel R Waitman, Kelechi Anuforo, Alejandro Araya, Haddyjatou Bah, Jackson Barlocker, Sravani Chandaka, Lindsay G Cowell, Carol R Geary, Snehil Gupta, Benjamin D Horne, Boyd M Knosp, Albert M Lai, Vasanthi Mandhadi, Abu Saleh Mohammad Mosa, Phillip Reeder, Giyung Ryu, Brian Shukwit, Claire Smith, Alexander J Stoddard, Mahanazuddin Syed, Shorabuddin Syed, Bradley W Taylor, Jeffrey J VanWormer
Purpose: Electronic health records (EHRs) comprise a rich source of real-world data for cancer studies, but they often lack critical structured data elements such as diagnosis date and disease stage. Fortunately, such concepts are available from hospital cancer registries. We describe experiences from integrating cancer registry data with EHR and billing data in an interoperable data model across a multisite clinical research network.
Methods: After sites implemented cancer registry data into a tumor table compatible with the PCORnet Common Data Model (CDM), distributed queries were performed to assess quality issues. After these issues were remediated, another query produced descriptive frequencies of cancer types and demographic characteristics, including linked BMI. We also report two current use cases of the new resource.
Results: Eleven sites implemented the tumor table, yielding a resource with data for 572,902 tumors. Institutional and technical barriers were surmounted to accomplish this. Variations in racial and ethnic distributions across the sites were observed; the percent of tumors among Black patients ranged from <1% to 15% across sites, and the percent of tumors among Hispanic patients ranged from 1% to 46% across sites. Current use cases include a pragmatic prospective cohort study of a rare cancer and a retrospective cohort study leveraging body size and chemotherapy dosing.
Conclusion: Integrating cancer registry data with the PCORnet CDM across multiple institutions creates a powerful resource for cancer studies. It provides a wider array of structured, cancer-relevant concepts, and it allows investigators to examine variability in those concepts across many treatment environments. Having the CDM tumor table in place enhances the network's effectiveness for real-world cancer research.
Title: Implementing Cancer Registry Data With the PCORnet Common Data Model: The Greater Plains Collaborative Experience. JCO Clinical Cancer Informatics. 2024;8:e2400196.
Pub Date: 2024-12-01 | Epub Date: 2024-11-27 | DOI: 10.1200/CCI-24-00150
Paul Windisch, Fabio Dennstädt, Carole Koechli, Robert Förster, Christina Schröder, Daniel M Aebersold, Daniel R Zwahlen
Purpose: Extracting inclusion and exclusion criteria in a structured, automated fashion remains a challenge in developing better search functionalities or automating systematic reviews of randomized controlled trials in oncology. The question "Did this trial enroll patients with localized disease, metastatic disease, or both?" could be used to narrow down the number of potentially relevant trials when conducting a search.
Methods: Six hundred trials from high-impact medical journals were classified depending on whether they allowed for the inclusion of patients with localized and/or metastatic disease. Five hundred trials were used to develop and validate three different models, with 100 trials held out for testing. The test set was also used to evaluate the performance of GPT-4o on the same task.
Results: In the test set, a rule-based system using regular expressions achieved F1 scores of 0.72 for the prediction of whether the trial allowed for the inclusion of patients with localized disease and 0.77 for metastatic disease. A transformer-based machine learning (ML) model achieved F1 scores of 0.97 and 0.88, respectively. A combined approach where the rule-based system was allowed to over-rule the ML model achieved F1 scores of 0.97 and 0.89, respectively. GPT-4o achieved F1 scores of 0.87 and 0.92, respectively.
Conclusion: Automatic classification of cancer trials with regard to the inclusion of patients with localized and/or metastatic disease is feasible. Turning the extraction of trial criteria into classification problems could, in selected cases, improve text-mining approaches in evidence-based medicine. Increasingly large language models can reduce or eliminate the need for previous training on the task at the expense of increased computational power and, in turn, cost.
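The rule-based component and its over-rule of the machine learning model can be sketched as follows; the patterns shown are hypothetical stand-ins, since the study's actual regular expressions are not given in the abstract:

```python
import re

# Hypothetical patterns standing in for the study's actual rules, which
# are not reported in the abstract.
METASTATIC_RULE = re.compile(r"\b(metastatic|stage IV)\b", re.IGNORECASE)
LOCALIZED_RULE = re.compile(r"\b(localized|resectable|early[- ]stage)\b",
                            re.IGNORECASE)

def rule_based_metastatic(criteria_text):
    """Return True/False when a rule fires, or None when the rules abstain."""
    if METASTATIC_RULE.search(criteria_text):
        return True
    if LOCALIZED_RULE.search(criteria_text):
        return False
    return None

def combined_prediction(criteria_text, ml_prediction):
    """The combined approach from the abstract: the rule-based system is
    allowed to over-rule the machine learning model whenever a rule fires,
    and the ML prediction is kept otherwise."""
    rule = rule_based_metastatic(criteria_text)
    return ml_prediction if rule is None else rule
```

This shape explains the reported scores: the ML model dominates overall performance, while high-precision rules correct a small number of its errors, nudging the combined F1 from 0.88 to 0.89 for metastatic disease.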
Title: Metastatic Versus Localized Disease as Inclusion Criteria That Can Be Automatically Extracted From Randomized Controlled Trials Using Natural Language Processing. JCO Clinical Cancer Informatics. 2024;8:e2400150.
Pub Date : 2024-12-01Epub Date: 2024-12-10DOI: 10.1200/CCI.23.00263
Dorian Culié, Renaud Schiappa, Sara Contu, Eva Seutin, Tanguy Pace-Loscos, Gilles Poissonnet, Agathe Villarme, Alexandre Bozec, Emmanuel Chamorey
Purpose: Thyroid nodules are common in the general population, and assessing their malignancy risk is the initial step in care. Surgical exploration remains the sole definitive option for indeterminate nodules. Extensive database access is crucial for improving this initial assessment. Our objective was to develop an automated process using convolutional neural networks (CNNs) to extract and structure biomedical insights from electronic health reports (EHRs) in a large thyroid pathology cohort.
Materials and methods: We randomly selected 1,500 patients with thyroid pathology from our cohort for model development and an additional 100 for testing. We then divided the cohort of 1,500 patients into training (70%) and validation (30%) sets. We used EHRs from initial surgeon visits, preanesthesia visits, ultrasound, surgery, and anatomopathology reports. We selected 42 variables of interest and had them manually annotated by a clinical expert. We developed RUBY-THYRO using six distinct CNN models from spaCy, supplemented with keyword extraction rules and postprocessing. Evaluation against a gold standard database included calculating precision, recall, and F1 score.
Results: Performance remained consistent across the test and validation sets, with most variables (30/42) exceeding 90% on precision, recall, and F1 score in both sets. Results varied by variable: pathologic tumor stage achieved 100% precision, recall, and F1 score, versus 45%, 28%, and 32%, respectively, for the number of nodules in the test set. Surgical and preanesthesia reports demonstrated particularly high performance.
Conclusion: Our study successfully implemented a CNN-based natural language processing (NLP) approach for extracting and structuring data from various EHRs in thyroid pathology. This highlights the potential of artificial intelligence-driven NLP techniques for extensive and cost-effective data extraction, paving the way for creating comprehensive, hospital-wide data warehouses.
{"title":"Enhancing Thyroid Pathology With Artificial Intelligence: Automated Data Extraction From Electronic Health Reports Using RUBY.","authors":"Dorian Culié, Renaud Schiappa, Sara Contu, Eva Seutin, Tanguy Pace-Loscos, Gilles Poissonnet, Agathe Villarme, Alexandre Bozec, Emmanuel Chamorey","doi":"10.1200/CCI.23.00263","DOIUrl":"https://doi.org/10.1200/CCI.23.00263","url":null,"abstract":"<p><strong>Purpose: </strong>Thyroid nodules are common in the general population, and assessing their malignancy risk is the initial step in care. Surgical exploration remains the sole definitive option for indeterminate nodules. Extensive database access is crucial for improving this initial assessment. Our objective was to develop an automated process using convolutional neural networks (CNNs) to extract and structure biomedical insights from electronic health reports (EHRs) in a large thyroid pathology cohort.</p><p><strong>Materials and methods: </strong>We randomly selected 1,500 patients with thyroid pathology from our cohort for model development and an additional 100 for testing. We then divided the cohort of 1,500 patients into training (70%) and validation (30%) sets. We used EHRs from initial surgeon visits, preanesthesia visits, ultrasound, surgery, and anatomopathology reports. We selected 42 variables of interest and had them manually annotated by a clinical expert. We developed RUBY-THYRO using six distinct CNN models from SpaCy, supplemented with keyword extraction rules and postprocessing. Evaluation against a gold standard database included calculating precision, recall, and F1 score.</p><p><strong>Results: </strong>Performance remained consistent across the test and validation sets, with the majority of variables (30/42) achieving performance metrics exceeding 90% for all metrics in both sets. 
Results differed according to the variables; pathologic tumor stage score achieved 100% in precision, recall, and F1 score, versus 45%, 28%, and 32% for the number of nodules in the test set, respectively. Surgical and preanesthesia reports demonstrated particularly high performance.</p><p><strong>Conclusion: </strong>Our study successfully implemented a CNN-based natural language processing (NLP) approach for extracting and structuring data from various EHRs in thyroid pathology. This highlights the potential of artificial intelligence-driven NLP techniques for extensive and cost-effective data extraction, paving the way for creating comprehensive, hospital-wide data warehouses.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"8 ","pages":"e2300263"},"PeriodicalIF":3.3,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142830836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
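The keyword-extraction-rule component mentioned in the methods can be illustrated with plain regular expressions. This is a hedged sketch of that layer only — the actual RUBY-THYRO pipeline combines six spaCy CNN models with rules and postprocessing, and the variables and patterns below are hypothetical:

```python
import re

# Hypothetical rules for two of the 42 variables (pT stage, nodule count);
# the study's real annotation schema and rules are not reproduced here.
PT_STAGE = re.compile(r"\bpT([0-4][ab]?)\b")
NODULE_COUNT = re.compile(r"\b(\d+)\s+nodules?\b", re.IGNORECASE)

def extract_variables(report: str) -> dict:
    """Structure free-text pathology report fragments into named variables."""
    stage = PT_STAGE.search(report)
    count = NODULE_COUNT.search(report)
    return {
        "pT_stage": stage.group(1) if stage else None,
        "nodule_count": int(count.group(1)) if count else None,
    }
```

Outputs like these are what get compared against the expert-annotated gold standard to compute per-variable precision, recall, and F1.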
Pub Date : 2024-12-01Epub Date: 2024-12-23DOI: 10.1200/CCI.24.00010
Lie Cai, Thomas M Deutsch, Chris Sidey-Gibbons, Michelle Kobel, Fabian Riedel, Katharina Smetanay, Carlo Fremd, Laura Michel, Michael Golatta, Joerg Heil, Andreas Schneeweiss, André Pfob
Purpose: Toxicity from systemic cancer treatment is a major source of anxiety for patients and a challenge to treatment plans. We aimed to develop machine learning algorithms for upfront prediction of an individual's risk of experiencing treatment-relevant toxicity during the course of treatment.
Methods: Clinical records were retrieved from a single-center, consecutive cohort of patients who underwent neoadjuvant treatment for early breast cancer. We developed and validated machine learning algorithms to predict grade 3 or 4 toxicity (anemia, neutropenia, deviation of liver enzymes, nephrotoxicity, thrombopenia, electrolyte disturbance, or neuropathy). We used 10-fold cross-validation to develop two algorithms (logistic regression with elastic net penalty [GLM] and support vector machines [SVMs]). Algorithm predictions were compared with documented toxicity events and diagnostic performance was evaluated via area under the curve (AUROC).
Results: A total of 590 patients were identified, 432 in the development set and 158 in the validation set. The median age was 51 years, and 55.8% (329 of 590) experienced grade 3 or 4 toxicity. The performance improved significantly when adding referenced treatment information (referenced regimen, referenced summation dose intensity product) in addition to patient and tumor variables: GLM AUROC 0.59 versus 0.75, P = .02; SVM AUROC 0.64 versus 0.75, P = .01.
Conclusion: The individual risk of treatment-relevant toxicity can be predicted using machine learning algorithms. We demonstrate a promising way to improve efficacy and facilitate proactive toxicity management of systemic cancer treatment.
{"title":"Machine Learning to Predict the Individual Risk of Treatment-Relevant Toxicity for Patients With Breast Cancer Undergoing Neoadjuvant Systemic Treatment.","authors":"Lie Cai, Thomas M Deutsch, Chris Sidey-Gibbons, Michelle Kobel, Fabian Riedel, Katharina Smetanay, Carlo Fremd, Laura Michel, Michael Golatta, Joerg Heil, Andreas Schneeweiss, André Pfob","doi":"10.1200/CCI.24.00010","DOIUrl":"10.1200/CCI.24.00010","url":null,"abstract":"<p><strong>Purpose: </strong>Toxicity to systemic cancer treatment represents a major anxiety for patients and a challenge to treatment plans. We aimed to develop machine learning algorithms for the upfront prediction of an individual's risk of experiencing treatment-relevant toxicity during the course of treatment.</p><p><strong>Methods: </strong>Clinical records were retrieved from a single-center, consecutive cohort of patients who underwent neoadjuvant treatment for early breast cancer. We developed and validated machine learning algorithms to predict grade 3 or 4 toxicity (anemia, neutropenia, deviation of liver enzymes, nephrotoxicity, thrombopenia, electrolyte disturbance, or neuropathy). We used 10-fold cross-validation to develop two algorithms (logistic regression with elastic net penalty [GLM] and support vector machines [SVMs]). Algorithm predictions were compared with documented toxicity events and diagnostic performance was evaluated via area under the curve (AUROC).</p><p><strong>Results: </strong>A total of 590 patients were identified, 432 in the development set and 158 in the validation set. The median age was 51 years, and 55.8% (329 of 590) experienced grade 3 or 4 toxicity. 
The performance improved significantly when adding referenced treatment information (referenced regimen, referenced summation dose intensity product) in addition to patient and tumor variables: GLM AUROC 0.59 versus 0.75, <i>P</i> = .02; SVM AUROC 0.64 versus 0.75, <i>P</i> = .01.</p><p><strong>Conclusion: </strong>The individual risk of treatment-relevant toxicity can be predicted using machine learning algorithms. We demonstrate a promising way to improve efficacy and facilitate proactive toxicity management of systemic cancer treatment.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"8 ","pages":"e2400010"},"PeriodicalIF":3.3,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11670908/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142883088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
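The modeling setup described above — elastic-net-penalized logistic regression evaluated with 10-fold cross-validated AUROC — can be sketched with scikit-learn. The data here are simulated stand-ins, not the study's patient, tumor, or referenced-treatment variables:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated cohort: 200 patients, 10 numeric features, binary toxicity label.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Elastic-net logistic regression (the "GLM" of the paper) requires the saga solver.
glm = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000),
)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
aucs = cross_val_score(glm, X, y, cv=cv, scoring="roc_auc")
print(f"mean AUROC over 10 folds: {aucs.mean():.2f}")
```

Comparing such cross-validated AUROCs with and without a feature group (here, the referenced treatment information) is the kind of ablation the reported 0.59 versus 0.75 contrast reflects.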
Pub Date : 2024-12-01Epub Date: 2024-12-03DOI: 10.1200/CCI.24.00056
Joshi Hogenboom, Aiara Lobo Gomes, Andre Dekker, Winette Van Der Graaf, Olga Husson, Leonard Wee
Purpose: Research on rare diseases and atypical health care demographics is often slowed by high interparticipant heterogeneity and overall scarcity of data. Synthetic data (SD) have been proposed as a means of data sharing, enlargement, and diversification, artificially reproducing real phenomena while obscuring real patient data. The utility of SD is actively scrutinized in health care research, but the role of sample size in the actionability of SD is insufficiently explored. We aim to understand the interplay of actionability and sample size by generating SD sets of varying sizes from gradually diminishing amounts of real individuals' data. We evaluate the actionability of SD in a highly heterogeneous and rare demographic: adolescents and young adults (AYAs) with cancer.
Methods: A population-based cross-sectional cohort study of 3,735 AYAs was subsampled at random to produce 13 training data sets of varying sample sizes. We studied four distinct generator architectures built on the open-source Synthetic Data Vault library. Each architecture was used to generate SD of varying sizes on the basis of each of the aforementioned training subsets. SD actionability was assessed by comparing the resulting SD with their respective real data against three metrics: veracity, utility, and privacy concealment.
Results: All examined generator architectures yielded actionable data when generating SD of sizes similar to the real data. Larger SD sample sizes increased veracity but also tended to increase privacy risk. Using fewer training participants led to faster convergence in veracity but partially exacerbated privacy concealment issues.
Conclusion: SD is a potentially promising option for data sharing and data augmentation, yet sample size plays a significant role in its actionability. SD generation should go hand-in-hand with consistent scrutiny, and sample size should be carefully considered in this process.
{"title":"Actionability of Synthetic Data in a Heterogeneous and Rare Health Care Demographic: Adolescents and Young Adults With Cancer.","authors":"Joshi Hogenboom, Aiara Lobo Gomes, Andre Dekker, Winette Van Der Graaf, Olga Husson, Leonard Wee","doi":"10.1200/CCI.24.00056","DOIUrl":"10.1200/CCI.24.00056","url":null,"abstract":"<p><strong>Purpose: </strong>Research on rare diseases and atypical health care demographics is often slowed by high interparticipant heterogeneity and overall scarcity of data. Synthetic data (SD) have been proposed as means for data sharing, enlargement, and diversification, by artificially generating real phenomena while obscuring the real patient data. The utility of SD is actively scrutinized in health care research, but the role of sample size for actionability of SD is insufficiently explored. We aim to understand the interplay of actionability and sample size by generating SD sets of varying sizes from gradually diminishing amounts of real individuals' data. We evaluate the actionability of SD in a highly heterogeneous and rare demographic: adolescents and young adults (AYAs) with cancer.</p><p><strong>Methods: </strong>A population-based cross-sectional cohort study of 3,735 AYAs was subsampled at random to produce 13 training data sets of varying sample sizes. We studied four distinct generator architectures built on the open-source Synthetic Data Vault library. Each architecture was used to generate SD of varying sizes on the basis of each aforementioned training subsets. SD actionability was assessed by comparing the resulting SD with their respective real data against three metrics-veracity, utility, and privacy concealment.</p><p><strong>Results: </strong>All examined generator architectures yielded actionable data when generating SD with sizes similar to the real data. Large SD sample size increased veracity but generally increased privacy risks. 
Using fewer training participants led to faster convergence in veracity, but partially exacerbated privacy concealment issues.</p><p><strong>Conclusion: </strong>SD is a potentially promising option for data sharing and data augmentation, yet sample size plays a significant role in its actionability. SD generation should go hand-in-hand with consistent scrutiny, and sample size should be carefully considered in this process.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"8 ","pages":"e2400056"},"PeriodicalIF":3.3,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11627331/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142774439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
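The veracity and privacy-concealment metrics above can be approximated with very simple proxies. A hedged sketch, assuming toy numeric arrays rather than the study's AYA cohort or the Synthetic Data Vault's own evaluators: veracity as agreement of marginal means, and privacy as the smallest distance from any synthetic record to a real one (near-zero distances suggest a synthetic row nearly copies a real patient):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: "real" cohort and "synthetic" sample with the same schema
# (e.g., age in years and a continuous biomarker), 300 rows each.
real = rng.normal(loc=[50.0, 1.6], scale=[12.0, 0.3], size=(300, 2))
synthetic = rng.normal(loc=[50.0, 1.6], scale=[12.0, 0.3], size=(300, 2))

def veracity_gap(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Crude veracity proxy: largest absolute gap between column means (lower is better)."""
    return float(np.abs(real.mean(axis=0) - synthetic.mean(axis=0)).max())

def min_nn_distance(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Crude privacy proxy: smallest Euclidean distance from a synthetic row to a real row."""
    d = np.linalg.norm(synthetic[:, None, :] - real[None, :, :], axis=2)
    return float(d.min())
```

Tracking how proxies like these move as the synthetic sample grows mirrors the paper's finding that larger SD sets gain veracity while privacy risk creeps up.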