Samuel O Adeosun, Afua B Faibille, Aisha N Qadir, Jerotich T Mutwol, Taylor McMannen
Research in Social & Administrative Pharmacy. Published 2024-11-05. DOI: 10.1016/j.sapharm.2024.10.009 (https://doi.org/10.1016/j.sapharm.2024.10.009)
A deep neural network model for classifying pharmacy practice publications into research domains.
Background: Pharmacy practice faculty research profiles extend beyond the clinical and social domains, which are core elements of pharmacy practice. However, as highlighted by journal editors in the Granada Statements, there is no consensus on these terms. Four domains of pharmacy practice faculty research are proposed: clinical, education, social & administrative, and basic & translational.
Objectives: To develop a classifier for categorizing pharmacy practice faculty publications into the four proposed domains, and to compare the model with the zero-shot performance of state-of-the-art, general-purpose large language models (gpLLMs).
Methods: One thousand abstracts from documents published between 2018 and 2021 by pharmacy practice faculty were reviewed, labelled, and used to screen and fine-tune several Bidirectional Encoder Representations from Transformers (BERT) models. The selected model was compared with the zero-shot performance of seven state-of-the-art gpLLMs, including ChatGPT-4o, Gemini-1.5-Pro, Claude-3.5, LLAMA-3.1 and Mistral Large, using 80 randomly selected abstracts from 2023 publications labelled with ≥80% consensus by all authors. Classification metrics included F1, recall, precision and accuracy, and reproducibility was measured with Cohen's kappa. A use case was demonstrated by testing the null hypothesis that the research domain distribution of faculty publications was independent of the pandemic.
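The metrics named in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the short domain labels, the function names, and the choice of macro-averaging are all assumptions made for illustration, since the abstract does not state how the per-class scores were aggregated.

```python
from collections import Counter

# Short labels for the four proposed domains (naming is an assumption).
DOMAINS = ["clinical", "education", "social", "translational"]


def macro_metrics(y_true, y_pred, labels=DOMAINS):
    """Macro-averaged precision, recall, F1, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
    }


def cohen_kappa(run_a, run_b):
    """Cohen's kappa between two label sequences, e.g. two runs of one model."""
    n = len(run_a)
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n
    count_a, count_b = Counter(run_a), Counter(run_b)
    # Agreement expected by chance, from each run's label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:  # both runs emit one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)
```

Under this reading, reproducibility is checked by classifying the same abstracts twice and passing both label lists to `cohen_kappa`; identical outputs give a kappa of 1.0.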
Results: The model, the Pharmacy Practice Research Domain Classifier (PPRDC), achieved 5-fold stratified cross-validation scores of 89.4 ± 1.7, 90.2 ± 2.2, 89.0 ± 1.7, and 95.5 ± 0.6 for F1, recall, precision and accuracy, respectively. PPRDC produced perfectly reproducible classifications (Cohen's kappa = 1.0) and outperformed the zero-shot performance of all gpLLMs. F1 scores were 96.2 ± 1.6, 92.7 ± 1.2, 85.8 ± 3.2, and 83.1 ± 9.8 for the education, clinical, social, and translational domains, respectively.
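As a rough illustration of how scores of the form "mean ± spread" can arise from stratified 5-fold cross-validation, the sketch below assigns samples to folds class by class and summarizes per-fold scores. The helper names are hypothetical, and treating the ± value as a sample standard deviation is an assumption; the abstract does not say which measure of spread was reported.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def summarize(scores):
    """Format per-fold scores as 'mean ± sample standard deviation'."""
    return f"{mean(scores):.1f} ± {stdev(scores):.1f}"
```

Each fold serves once as the held-out set while the model is trained on the other four; the five held-out scores are then passed to `summarize`.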
Conclusions: PPRDC (https://sadeosun-pprdc.streamlit.app) performed better than gpLLMs in this abstract classification task. Among several other impacts, PPRDC opens a new frontier in bibliometric studies; it will also advance the goals of the Granada Statements by aiding authors in journal selection and journal editors in article prioritization decisions.
About the journal:
Research in Social and Administrative Pharmacy (RSAP) is a quarterly publication featuring original scientific reports and comprehensive review articles in the social and administrative pharmaceutical sciences. Topics of interest include outcomes evaluation of products, programs, or services; pharmacoepidemiology; medication adherence; direct-to-consumer advertising of prescription medications; disease state management; health systems reform; drug marketing; medication distribution systems such as e-prescribing; web-based pharmaceutical/medical services; drug commerce and re-importation; and health professions workforce issues.