Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00169-9
Sumali Bajaj SM , Siyu Chen DPhil , Richard Creswell DPhil , Reshania Naidoo MD , Joseph L-H Tsui MSc , Olumide Kolade BSc , George Nicholson DPhil , Brieuc Lehmann PhD , James A Hay PhD , Prof Moritz U G Kraemer DPhil , Ricardo Aguas PhD , Prof Christl A Donnelly ScD , Tom Fowler FFPH , Prof Susan Hopkins FMedSci , Liberty Cantrell MSc , Prabin Dahal DPhil , Prof Lisa J White PhD , Kasia Stepniewska PhD , Merryn Voysey DPhil , Ben Lambert DPhil
Background
Understanding underlying mechanisms of heterogeneity in test-seeking and reporting behaviour during an infectious disease outbreak can help to protect vulnerable populations and guide equity-driven interventions. The COVID-19 pandemic probably exerted different stresses on individuals in different sociodemographic groups and ensuring fair access to and usage of COVID-19 tests was a crucial element of England's testing programme. We aimed to investigate the relationship between sociodemographic factors and COVID-19 testing behaviours in England during the COVID-19 pandemic.
Methods
We did a population-based study of COVID-19 testing behaviours with mass COVID-19 testing data for England and data from community prevalence surveillance surveys (REACT-1 and ONS-CIS) from Oct 1, 2020, to March 30, 2022. We used mass testing data for lateral flow device (LFD; data for approximately 290 million tests performed and reported) and PCR (data for approximately 107 million tests performed and returned from the laboratory) tests made available for the general public and provided by date and self-reported age and ethnicity at the lower tier local authority (LTLA) level. We also used publicly available data on mean population size estimates for individual LTLAs, and data on ethnic groups, age groups, and deprivation indices for LTLAs. We did not have access to REACT-1 or ONS-CIS prevalence data disaggregated by sex or gender. Using a mechanistic causal model to debias the PCR testing data, we obtained estimates of weekly SARS-CoV-2 prevalence by both self-reported ethnic groups and age groups for LTLAs in England. This approach to debiasing the PCR (or LFD) testing data also estimated a testing bias parameter defined as the odds of testing in infected versus not infected individuals, which would be close to zero if the likelihood of test seeking (or seeking and reporting) was the same regardless of infection status. With confirmatory PCR data, we estimated false positivity rates, sensitivity, specificity, and the rate of decline in detection probability subsequent to reporting a positive LFD for PCR tests by sociodemographic groups. We also estimated the daily incidence, allowing us to calculate the fraction of cases captured by the testing programme.
Findings
From March, 2021 onwards, individuals in the most deprived regions reported approximately half as many LFD tests per capita as individuals in the least deprived areas (median ratio 0·50 [IQR 0·44–0·54]). During the period October, 2020, to June, 2021, PCR testing patterns showed the opposite trend, with individuals in the most deprived areas performing almost double the number of PCR tests per capita than those in the least deprived areas (1·8 [1·7–1·9]). Infection prevalences in Asian or Asian British individuals were considerably higher than those of other ethnic groups during the alpha (B.1.1.7) and omicron (B.1.1.529
Title: COVID-19 testing and reporting behaviours in England across different sociodemographic groups: a population-based study using testing data and data from community prevalence surveillance surveys
Journal: Lancet Digital Health, volume 6, issue 11, pages e778-e790
Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00146-8
Joseph E Alderman MB ChB , Maria Charalambides MB ChB , Gagandeep Sachdeva MB ChB , Elinor Laws MB BCh , Joanne Palmer PhD , Elsa Lee MSc , Vaishnavi Menon MB ChB , Qasim Malik MB ChB , Sonam Vadera MB BS , Prof Melanie Calvert PhD , Marzyeh Ghassemi PhD , Melissa D McCradden PhD , Johan Ordish MA , Bilal Mateen MBBS , Prof Charlotte Summers PhD , Jacqui Gath , Rubeta N Matin PhD , Prof Alastair K Denniston PhD , Xiaoxuan Liu PhD
During the COVID-19 pandemic, artificial intelligence (AI) models were created to address health-care resource constraints. Previous research shows that health-care datasets often have limitations, leading to biased AI technologies. This systematic review assessed datasets used for AI development during the pandemic, identifying several deficiencies. Datasets were identified by screening articles from MEDLINE and using Google Dataset Search. 192 datasets were analysed for metadata completeness, composition, data accessibility, and ethical considerations. Findings revealed substantial gaps: only 48% of datasets documented individuals’ country of origin, 43% reported age, and under 25% included sex, gender, race, or ethnicity. Information on data labelling, ethical review, or consent was frequently missing. Many datasets reused data with inadequate traceability. Notably, historical paediatric chest x-rays appeared in some datasets without acknowledgment. These deficiencies highlight the need for better data quality and transparent documentation to lessen the risk that biased AI models are developed in future health emergencies.
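A metadata completeness audit of the kind summarised above can be sketched in a few lines. This is a minimal illustration, not the review's actual protocol: the field names (`country`, `age`, `sex`) and the four example records are hypothetical, and a field counts as documented whenever it is neither missing nor empty.

```python
from typing import Dict, Iterable, List, Mapping, Optional


def metadata_completeness(
    records: Iterable[Mapping[str, Optional[str]]],
    fields: List[str],
) -> Dict[str, float]:
    """Return, for each field, the fraction of records that document it."""
    records = list(records)
    if not records:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / len(records)
        for field in fields
    }


# Hypothetical dataset descriptors; None marks an undocumented attribute.
datasets = [
    {"country": "UK", "age": "18-90", "sex": None},
    {"country": None, "age": "mean 54", "sex": "reported"},
    {"country": "Brazil", "age": None, "sex": None},
    {"country": None, "age": None, "sex": None},
]

report = metadata_completeness(datasets, ["country", "age", "sex"])
```

Reporting the result per field, rather than as one overall score, mirrors how the abstract states its gaps (country of origin, age, and sex or ethnicity separately).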
Title: Revealing transparency gaps in publicly available COVID-19 datasets used for medical artificial intelligence development—a systematic review
Journal: Lancet Digital Health, volume 6, issue 11, pages e827-e847
Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00173-0
Zacharias V Fisches MSc , Michael Ball ScB , Trasias Mukama PhD , Vilim Štih PhD , Nicholas R Payne PhD , Sarah E Hickman PhD , Prof Fiona J Gilbert PhD , Stefan Bunk MSc , Christian Leibig PhD
Background
Integrating artificial intelligence (AI) into mammography screening can support radiologists and improve programme metrics, yet the potential of different strategies for integrating the technology remains understudied. We compared programme-level performance metrics of seven AI integration strategies.
Methods
We performed a retrospective comparative evaluation of seven strategies for integrating AI into mammography screening using datasets generated from screening programmes in Germany (n=1 657 068), the UK (n=223 603) and Sweden (n=22 779). The commercially available AI model used was Vara version 2.10, trained from scratch on German data. We simulated the performance of each strategy in terms of cancer detection rate (CDR), recall rate, and workload reduction, and compared the metrics with those of the screening programmes. We also assessed the distribution of the stages and grades of the cancers detected by each strategy and the AI model's ability to correctly localise those cancers.
Findings
Compared with the German screening programme (CDR 6·32 per 1000 examinations, recall rate 4·11 per 100 examinations), replacement of both readers (standalone AI strategy) achieved a non-inferior CDR of 6·37 (95% CI 6·10–6·64) at a recall rate of 3·80 (95% CI 3·67–3·93), whereas single reader replacement achieved a CDR of 6·49 (6·31–6·67), a recall rate of 4·01 (3·92–4·10), and a 49% workload reduction. Programme-level decision referral achieved a CDR of 6·85 (6·61–7·11), a recall rate of 3·55 (3·43–3·68), and an 84% workload reduction. Compared with the UK programme CDR of 8·19, the reader-level, programme-level, and deferral to single reader strategies achieved CDRs of 8·24 (7·82–8·71), 8·59 (8·12–9·06), and 8·28 (7·86–8·71), without increasing recall and while reducing workload by 37%, 81%, and 95%, respectively. On the Swedish dataset, programme-level decision referral increased the CDR by 17·7% without increasing recall and while reducing reading workload by 92%.
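The programme metrics quoted in these findings follow simple definitions, sketched below. The counts are hypothetical and were picked only so that the outputs land on the German programme rates quoted above; the workload definition (fraction of human reads avoided) is an assumption, because the abstract does not spell out how workload reduction was computed.

```python
def cancer_detection_rate(cancers_detected: int, examinations: int) -> float:
    """Screen-detected cancers per 1000 examinations."""
    return 1000.0 * cancers_detected / examinations


def recall_rate(recalls: int, examinations: int) -> float:
    """Recalled examinations per 100 examinations."""
    return 100.0 * recalls / examinations


def workload_reduction(human_reads_with_ai: int, human_reads_without_ai: int) -> float:
    """Fraction of human reads avoided (assumed definition)."""
    return 1.0 - human_reads_with_ai / human_reads_without_ai


# Hypothetical counts for illustration only.
examinations = 100_000
cdr = cancer_detection_rate(cancers_detected=632, examinations=examinations)
recall = recall_rate(recalls=4_110, examinations=examinations)
saved = workload_reduction(human_reads_with_ai=102_000, human_reads_without_ai=200_000)
```

Keeping the per-1000 and per-100 scaling inside the helper functions avoids the common mistake of comparing a detection rate and a recall rate on the same denominator.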
Interpretation
The decision referral strategies offered the largest improvements in cancer detection rates and reduction in recall rates, and all strategies except normal triaging showed potential to improve screening metrics.
Funding
Vara.
Title: Strategies for integrating artificial intelligence into mammography screening programmes: a retrospective simulation analysis
Journal: Lancet Digital Health, volume 6, issue 11, pages e803-e814
Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00172-9
Arunashis Sau PhD , Libor Pastika MBBS , Ewa Sieliwonczyk PhD , Konstantinos Patlatzoglou PhD , Antônio H Ribeiro PhD , Kathryn A McGurk PhD , Boroumand Zeidaabadi BSc , Henry Zhang BSc , Krzysztof Macierzanka BSc , Prof Danilo Mandic PhD , Prof Ester Sabino MD , Luana Giatti PhD , Prof Sandhi M Barreto PhD , Lidyane do Valle Camelo PhD , Prof Ioanna Tzoulaki PhD , Prof Declan P O'Regan PhD , Prof Nicholas S Peters MD , Prof James S Ware PhD , Prof Antonio Luiz P Ribeiro PhD , Daniel B Kramer MD , Fu Siong Ng PhD
Background
Artificial intelligence (AI)-enabled electrocardiography (ECG) can be used to predict risk of future disease and mortality but has not yet been adopted into clinical practice. Existing model predictions do not have actionability at an individual patient level, explainability, or biological plausibility. We sought to address these limitations of previous AI-ECG approaches by developing the AI-ECG risk estimator (AIRE) platform.
Methods
The AIRE platform was developed in a secondary care dataset (Beth Israel Deaconess Medical Center [BIDMC]) of 1 163 401 ECGs from 189 539 patients with deep learning and a discrete-time survival model to create a patient-specific survival curve with a single ECG. Therefore, AIRE predicts not only risk of mortality, but also time-to-mortality. AIRE was validated in five diverse, transnational cohorts from the USA, Brazil, and the UK (UK Biobank [UKB]), including volunteers, primary care patients, and secondary care patients.
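In a discrete-time survival model, a patient-specific survival curve follows from the per-interval hazards as S(t_k) = (1 - h_1)(1 - h_2)...(1 - h_k). The sketch below shows only this generic relationship. It says nothing about AIRE's network architecture or time grid, which the abstract does not describe, and the hazard values are hypothetical.

```python
from typing import List, Sequence


def survival_curve(hazards: Sequence[float]) -> List[float]:
    """Convert per-interval conditional hazards into cumulative survival probabilities."""
    curve: List[float] = []
    surviving = 1.0
    for hazard in hazards:
        if not 0.0 <= hazard <= 1.0:
            raise ValueError("each hazard must be a probability in [0, 1]")
        surviving *= 1.0 - hazard  # survive this interval given survival so far
        curve.append(surviving)
    return curve


# Hypothetical hazards for three consecutive follow-up intervals.
curve = survival_curve([0.1, 0.2, 0.5])
```

Because the output is a full curve rather than a single score, both the risk of death by a given time and an estimate of time-to-mortality can be read from it.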
Findings
AIRE accurately predicts risk of all-cause mortality (BIDMC C-index 0·775, 95% CI 0·773–0·776; C-indices on external validation datasets 0·638–0·773), future ventricular arrhythmia (BIDMC C-index 0·760, 95% CI 0·756–0·763; UKB C-index 0·719, 95% CI 0·635–0·803), future atherosclerotic cardiovascular disease (0·696, 0·694–0·698; 0·643, 0·624–0·662), and future heart failure (0·787, 0·785–0·789; 0·768, 0·733–0·802). Through phenome-wide and genome-wide association studies, we identified candidate biological pathways for the prediction of increased risk, including changes in cardiac structure and function, and genes associated with cardiac structure, biological ageing, and metabolic syndrome.
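The C-index values reported here measure how often a model ranks the patient who has the event earlier as higher risk. A minimal sketch of Harrell's concordance index for right-censored data follows. It is a generic textbook version rather than the study's evaluation code: it ignores tied event times, and the four-patient example is hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs in which the earlier event has the higher risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event indicators (1 = event, 0 = censored), and risks.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.6, 0.7, 0.1],
)
```

A value of 0·5 corresponds to random ranking and 1·0 to perfect ranking, which gives a scale for reading the 0·638–0·787 range in the findings.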
Interpretation
AIRE is an actionable, explainable, and biologically plausible AI-ECG risk estimation platform that has the potential for use worldwide across a wide range of clinical contexts for short-term and long-term risk estimation.
Funding
British Heart Foundation, National Institute for Health and Care Research, and Medical Research Council.
Title: Artificial intelligence-enabled electrocardiogram for mortality and cardiovascular risk estimation: a model development and validation study
Journal: Lancet Digital Health, volume 6, issue 11, pages e791-e802
Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00221-8
The Lancet Digital Health
Title: Lifting the veil on health datasets
Journal: Lancet Digital Health, volume 6, issue 11, page e772
Pub Date: 2024-10-23
DOI: 10.1016/S2589-7500(24)00154-7
Jun Ma PhD , Yao Zhang PhD , Song Gu MSc , Cheng Ge MSc , Shihao Mae BSc , Adamo Young MSc , Cheng Zhu PhD , Prof Xin Yang PhD , Prof Kangkang Meng PhD , Ziyan Huang BSc , Fan Zhang MSc , Yuanke Pan MSc , Shoujin Huang BSc , Jiacheng Wang PhD , Mingze Sun PhD , Prof Rongguo Zhang PhD , Dengqiang Jia PhD , Jae Won Choi MD , Natália Alves MSc , Bram de Wilde PhD , Prof Bo Wang PhD
Deep learning has shown great potential to automate abdominal organ segmentation and quantification. However, most existing algorithms rely on expert annotations and do not have comprehensive evaluations in real-world multinational settings. To address these limitations, we organised the FLARE 2022 challenge to benchmark fast, low-resource, and accurate abdominal organ segmentation algorithms. We first constructed an intercontinental abdomen CT dataset from more than 50 clinical research groups. We then independently validated that deep learning algorithms achieved a median dice similarity coefficient (DSC) of 90·0% (IQR 87·4–91·3%) by use of 50 labelled images and 2000 unlabelled images, which can substantially reduce manual annotation costs. The best-performing algorithms successfully generalised to holdout external validation sets, achieving a median DSC of 89·4% (85·2–91·3%), 90·0% (84·3–93·0%), and 88·5% (80·9–91·9%) on North American, European, and Asian cohorts, respectively. These algorithms show the potential to use unlabelled data to boost performance and alleviate annotation shortages for modern artificial intelligence models.
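The dice similarity coefficient used as the headline metric is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it for flattened binary masks. The example masks are hypothetical, and returning 1.0 when both masks are empty is one common convention, not necessarily the one used in the challenge.

```python
from typing import Sequence


def dice_similarity(prediction: Sequence[int], reference: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(prediction, reference) if p and r)
    total = sum(1 for p in prediction if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical five-voxel masks: two voxels overlap, each mask marks three.
dsc = dice_similarity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

For multi-organ segmentation, the same function would typically be applied per organ label and then summarised, for example as the median across cases as reported above.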
Title: Unleashing the strengths of unlabelled data in deep learning-assisted pan-cancer abdominal organ quantification: the FLARE22 challenge
Journal: Lancet Digital Health, volume 6, issue 11, pages e815-e826
Pub Date: 2024-10-16
DOI: 10.1016/S2589-7500(24)00192-4
Timo O Nieder PhD , Janis Renner MSc , Susanne Sehner MSc , Amra Pepić PhD , Prof Antonia Zapf PhD , Martin Lambert MD , Prof Peer Briken MD , Arne Dekker PhD
Background
Transgender and gender diverse (TGD) people in remote areas face challenges accessing health-care services, including mental health care and gender-affirming medical treatment, which can be associated with psychological distress. In this study, we aimed to evaluate the effectiveness of a 4-month TGD-informed e-health intervention to improve psychological distress among TGD people from remote areas in northern Germany.
Methods
In a randomised controlled trial done at a single centre in Germany, adults (aged ≥18 years) who met criteria for gender incongruence or gender dysphoria and who lived at least 50 km outside of Hamburg in one of the northern German federal states were recruited and randomly assigned (1:1) to i2TransHealth intervention or a wait list control group. Randomisation was performed with the use of a computer-based code. Due to the nature of the intervention, study participants and clinical staff were aware of treatment allocation, but researchers responsible for data analysis were masked to allocation groups. Study participants in the intervention group (service users) started the i2TransHealth intervention immediately after completing the baseline survey after enrolment. Participants assigned to the control group waited 4 months before they were able to access i2TransHealth services or regular care. The primary outcome was difference in the Brief Symptom Inventory (BSI)-18 summary score between baseline and 4 months, assessed using a linear model analysis. The primary outcome was assessed in the intention-to-treat (ITT) population, which included all randomly assigned participants. The trial was registered with ClinicalTrials.gov, NCT04290286.
Findings
Between May 12, 2020, and May 2, 2022, 177 TGD people were assessed for eligibility, of whom 174 were included in the ITT population (n=90 in the intervention group, n=84 in the control group). Six participants did not provide data for the primary outcome at 4 months, and thus 168 people were included in the analysis population (88 participants in the intervention group and 80 participants in the control group). At 4 months, in the intervention group, the adjusted mean change in BSI-18 from baseline was –0·65 (95% CI –2·25 to 0·96; p=0·43) compared with 2·34 (0·65 to 4·02; p=0·0069) in the control group. Linear model analysis identified a significant difference at 4 months between the groups with regard to change in BSI-18 summary scores from baseline (between-group difference –2·98 [95% CI –5·31 to –0·65]; p=0·012). Adverse events were rare: there were two suicide attempts and one participant was admitted to hospital in the intervention group, and in the control group, there was one case of self-harm and one case of self-harm followed by hospital admission.
{"title":"Effect of the i2TransHealth e-health intervention on psychological distress among transgender and gender diverse adults from remote areas in Germany: a randomised controlled trial","authors":"Timo O Nieder PhD , Janis Renner MSc , Susanne Sehner MSc , Amra Pepić PhD , Prof Antonia Zapf PhD , Martin Lambert MD , Prof Peer Briken MD , Arne Dekker PhD","doi":"10.1016/S2589-7500(24)00192-4","DOIUrl":"10.1016/S2589-7500(24)00192-4","url":null,"PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"6 12","pages":"Pages e883-e893"},"PeriodicalIF":23.8,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142478075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-09 DOI: 10.1016/S2589-7500(24)00220-6
{"title":"Correction to Lancet Digit Health 2024; published online Sept 17. https://doi.org/10.1016/S2589-7500(24)00143-2","authors":"","doi":"10.1016/S2589-7500(24)00220-6","DOIUrl":"10.1016/S2589-7500(24)00220-6","url":null,"abstract":"","PeriodicalId":48534,"journal":{"name":"Lancet Digital Health","volume":"6 11","pages":"Page e777"},"PeriodicalIF":23.8,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142394367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}