Automated data analysis of unstructured grey literature in health research: A mapping review

IF 5 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Research Synthesis Methods | Pub Date: 2023-12-19 | DOI: 10.1002/jrsm.1692
Lena Schmidt, Saleh Mohamed, Nick Meader, Jaume Bacardit, Dawn Craig
Citations: 0

Abstract

The amount of grey literature and ‘softer’ intelligence from social media or websites is vast. Given the long lead-times of producing high-quality peer-reviewed health information, this creates demand for new ways to provide prompt input for secondary research. To our knowledge, this is the first review of automated data extraction methods or tools for health-related grey literature and soft data, with a focus on (semi)automating horizon scans, health technology assessments (HTA), evidence maps, or other literature reviews. We searched six databases to cover both health- and computer-science literature. After deduplication, 10% of the search results were screened by two reviewers, and the remainder was single-screened up to an estimated 95% sensitivity; screening was stopped early after an additional 1000 results yielded no new includes. All full texts were retrieved, screened, and extracted by a single reviewer, and 10% were checked in duplicate. We included 84 papers covering automation for health-related social media, internet fora, news, patents, government agencies and charities, or trial registers. From each paper, we extracted data about important functionalities for users of the tool or method; information about the level of support and reliability; and information about practical challenges and research gaps. Poor availability of code, data, and usable tools leads to low transparency regarding performance and to duplication of work. Given the vast amounts of data and the opportunities these tools offer to expedite research, financial implications, scalability, integration into downstream workflows, and meaningful evaluations should be carefully planned before starting to develop a tool.
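The early-stopping step mentioned in the abstract (stop once a further run of results yields no new includes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `screen_with_stopping_rule`, the `is_relevant` callback, and the `patience` parameter are hypothetical, the records are assumed to arrive in some fixed order, and the sketch does not attempt the 95% sensitivity estimate, whose method the abstract does not describe.

```python
from typing import Callable, Iterable, List, Tuple


def screen_with_stopping_rule(
    records: Iterable[str],
    is_relevant: Callable[[str], bool],
    patience: int = 1000,
) -> Tuple[List[str], int]:
    """Screen records in order and stop after `patience` consecutive excludes.

    Hypothetical helper: `is_relevant` stands in for a reviewer's
    include/exclude decision on a single record.
    """
    includes: List[str] = []
    screened = 0
    since_last_include = 0
    for record in records:
        screened += 1
        if is_relevant(record):
            includes.append(record)
            since_last_include = 0  # a new include resets the counter
        else:
            since_last_include += 1
        if since_last_include >= patience:
            break  # no new includes within the last `patience` records
    return includes, screened


# Toy usage with a small patience value so the rule triggers quickly.
toy_records = ["a1", "x", "a2", "x", "x", "x", "a3"]
found, n_screened = screen_with_stopping_rule(
    toy_records, lambda r: r.startswith("a"), patience=3
)
print(found, n_screened)
```

In the toy run the record `a3` is never reached, which shows the trade-off such a rule makes: screening effort is saved at the cost of possibly missing late includes.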

Source journal: Research Synthesis Methods
Categories: MATHEMATICAL & COMPUTATIONAL BIOLOGY; MULTIDISCIPLINARY SCIENCES
CiteScore: 16.90
Self-citation rate: 3.10%
Articles published: 75
About the journal: Research Synthesis Methods is a peer-reviewed journal that focuses on the development and dissemination of methods for conducting systematic research synthesis. Its aim is to advance the knowledge and application of research synthesis methods across disciplines. The journal provides a platform for exchanging ideas and knowledge related to designing, conducting, analyzing, interpreting, reporting, and applying research synthesis. While research synthesis is most commonly practiced in the health and social sciences, the journal also welcomes contributions from other fields to enrich the methodologies used across scientific disciplines. By bridging disciplines, it aims to foster collaboration and cross-fertilization of ideas, and ultimately to improve the quality and effectiveness of research synthesis methods. It is intended for researchers, practitioners, and other stakeholders involved in research synthesis.