Pub Date: 2024-08-08 DOI: 10.1016/j.patter.2024.101038
Recently, a surge in image manipulations in scientific publications has led to numerous retractions, highlighting the importance of image integrity. Although forensic detectors for image duplication and synthesis have been researched, the detection of image splicing in scientific publications remains largely unexplored. Splicing detection is more challenging than duplication detection because no reference images are available, and more difficult than synthesis detection because the tampered areas are smaller. Moreover, disruptive factors in scientific images, such as artifacts, abnormal patterns, and noise, produce misleading features that resemble splicing traces, which makes the task difficult. In addition, the scarcity of high-quality datasets of spliced scientific images has limited progress. Therefore, we propose the uncertainty-guided refinement network (URN) to mitigate these disruptive factors. We also construct a dataset for image splicing detection (SciSp) with 1,290 spliced images, built by collecting scientific images and manually splicing them. Comprehensive experiments demonstrate the URN’s superior splicing detection performance.
Title: Exposing image splicing traces in scientific publications via uncertainty-guided refinement. Journal: Patterns.
Pub Date: 2024-08-07 DOI: 10.1016/j.patter.2024.101039
Changes in body mass are key indicators of health in humans and animals and are routinely monitored in animal husbandry and preclinical studies. In rodent studies, the current method of manually weighing the animal on a balance causes at least two issues. First, directly handling the animal induces stress, possibly confounding studies. Second, these data are static, limiting continuous assessment and obscuring rapid changes. A non-invasive, continuous method of monitoring animal mass would have utility in multiple biomedical research areas. We combine computer vision with statistical modeling to demonstrate the feasibility of determining mouse body mass by using video data. Our methods determine mass with a 4.8% error across genetically diverse mouse strains with varied coat colors and masses. This error is low enough to replace manual weighing in most mouse studies. We conclude that visually determining rodent mass enables non-invasive, continuous monitoring, improving preclinical studies and animal welfare.
Title: Highly accurate and precise determination of mouse mass using computer vision. Journal: Patterns.
Pub Date: 2024-08-01 DOI: 10.1016/j.patter.2024.101031
The amount of biomedical data continues to grow rapidly. However, collecting data from multiple sites for joint analysis remains challenging due to security, privacy, and regulatory concerns. To overcome this challenge, we use federated learning, which enables distributed training of neural network models over multiple data sources without sharing data. Each site trains the neural network over its private data for some time and then shares the neural network parameters (i.e., weights and/or gradients) with a federation controller, which in turn aggregates the local models and sends the resulting community model back to each site, and the process repeats. Our federated learning architecture, MetisFL, provides strong security and privacy. First, sample data never leave a site. Second, neural network parameters are encrypted before transmission and the global neural model is computed under fully homomorphic encryption. Finally, we use information-theoretic methods to limit information leakage from the neural model to prevent a “curious” site from performing model inversion or membership attacks. We present a thorough evaluation of the performance of secure, private federated learning in neuroimaging tasks, including for predicting Alzheimer’s disease and for brain age gap estimation (BrainAGE) from magnetic resonance imaging (MRI) studies in challenging, heterogeneous federated environments where sites have different amounts of data and statistical distributions.
Title: A federated learning architecture for secure and private neuroimaging analysis. Journal: Patterns.
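The training loop described in the abstract above (local training at each site, parameter sharing with a controller, aggregation, redistribution) can be sketched in a few lines. This is a minimal illustration, not MetisFL's implementation: the toy linear model, the function names `train_locally`, `aggregate`, and `federated_round`, and the sample-size-weighted averaging rule are all assumptions made for the example, and the encryption and leakage-limiting steps mentioned in the abstract are omitted entirely.

```python
from typing import List, Tuple

Weights = List[float]                # [bias, slope] of a toy linear model (assumption)
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the site


def train_locally(weights: Weights, data: Dataset,
                  lr: float = 0.05, steps: int = 20) -> Weights:
    """Run a few gradient-descent steps on one site's private data."""
    b, w = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error of y = b + w * x.
        grad_b = sum((b + w * x - y) for x, y in data) * 2 / n
        grad_w = sum((b + w * x - y) * x for x, y in data) * 2 / n
        b -= lr * grad_b
        w -= lr * grad_w
    return [b, w]


def aggregate(local_models: List[Weights], sizes: List[int]) -> Weights:
    """Controller step: sample-size-weighted average of local parameters.

    Weighted averaging is an assumption here; the abstract only says the
    controller aggregates the local models.
    """
    total = sum(sizes)
    dim = len(local_models[0])
    return [sum(m[i] * s for m, s in zip(local_models, sizes)) / total
            for i in range(dim)]


def federated_round(community: Weights, sites: List[Dataset]) -> Weights:
    """One round: send the community model out, train locally, aggregate."""
    local_models = [train_locally(list(community), data) for data in sites]
    return aggregate(local_models, [len(data) for data in sites])


if __name__ == "__main__":
    # Two sites holding different amounts of data generated by y = 1 + 2x.
    site_a = [(x, 1 + 2 * x) for x in (0.0, 0.5, 1.0, 1.5)]
    site_b = [(x, 1 + 2 * x) for x in (2.0, 2.5)]
    model = [0.0, 0.0]
    for _ in range(200):
        model = federated_round(model, [site_a, site_b])
    print(model)
```

Only the parameter lists cross the site boundary in this sketch; the `(x, y)` samples stay inside `train_locally`, which mirrors the property that sample data never leave a site.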
Pub Date: 2024-08-01 DOI: 10.1016/j.patter.2024.101027
The present perspective outlines how epistemically baseless and ethically pernicious paradigms are recycled back into the scientific literature via machine learning (ML) and explores connections between these two dimensions of failure. We hold up the renewed emergence of physiognomic methods, facilitated by ML, as a case study in the harmful repercussions of ML-laundered junk science. A summary and analysis of several such studies is delivered, with attention to the means by which unsound research lends itself to social harms. We explore some of the many factors contributing to poor practice in applied ML. In conclusion, we offer resources for research best practices to developers and practitioners.
Title: The reanimation of pseudoscience in machine learning and its ethical repercussions. Journal: Patterns.
Pub Date: 2024-07-25 DOI: 10.1016/j.patter.2024.101030
The “Reversal Curse” describes the inability of autoregressive decoder large language models (LLMs) to deduce “B is A” from “A is B,” assuming that B and A are distinct and can be uniquely identified from each other. This logical failure suggests limitations in using generative pretrained transformer (GPT) models for tasks like constructing knowledge graphs. Our study revealed that a bidirectional LLM, bidirectional encoder representations from transformers (BERT), does not suffer from this issue. To investigate further, we focused on more complex deductive reasoning by training encoder and decoder LLMs to perform union and intersection operations on sets. While both types of models managed tasks involving two sets, they struggled with operations involving three sets. Our findings underscore the differences between encoder and decoder models in handling logical reasoning. Thus, selecting BERT or GPT should depend on the task’s specific needs, utilizing BERT’s bidirectional context comprehension or GPT’s sequence prediction strengths.
Title: Exploring the reversal curse and other deductive logical reasoning in BERT and GPT-based large language models. Journal: Patterns.
Pub Date: 2024-07-19 DOI: 10.1016/j.patter.2024.101023
The complexity and cost of training machine learning models have made cloud-based machine learning as a service (MLaaS) attractive for businesses and researchers. MLaaS eliminates the need for in-house expertise by providing pre-built models and infrastructure. However, it raises data privacy and model security concerns, especially in medical fields like protein fold recognition. We propose a secure three-party computation-based MLaaS solution for privacy-preserving protein fold recognition, protecting both sequence and model privacy. Our efficient private building blocks enable complex operations privately, including addition, multiplication, multiplexer with a different methodology, most-significant bit, modulus conversion, and exact exponential operations. We demonstrate our privacy-preserving recurrent kernel network (RKN) solution, showing that it matches the performance of non-private models. Our scalability analysis indicates linear scalability with RKN parameters, making it viable for real-world deployment. This solution holds promise for converting other medical domain machine learning algorithms to privacy-preserving MLaaS using our building blocks.
Title: A privacy-preserving approach for cloud-based protein fold recognition. Journal: Patterns.
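To give a feel for what a private building block such as addition can look like, here is a small sketch based on additive secret sharing, a standard technique in secure multi-party computation. The abstract does not say which sharing scheme or party roles the authors use, so the scheme, the 64-bit ring size, and the function names `share`, `add_shares`, and `reconstruct` are assumptions for illustration and may differ from the protocol in the paper.

```python
import secrets
from typing import List

MODULUS = 2 ** 64  # ring size; an assumption for this sketch


def share(value: int, parties: int = 3) -> List[int]:
    """Split value into random additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def add_shares(a: List[int], b: List[int]) -> List[int]:
    """Private addition: each party adds the two shares it holds, locally."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def reconstruct(shares: List[int]) -> int:
    """Recombine all shares to reveal the value."""
    return sum(shares) % MODULUS


if __name__ == "__main__":
    x_shares = share(20)
    y_shares = share(22)
    print(reconstruct(add_shares(x_shares, y_shares)))
```

Addition needs no communication because the shares are linear in the secret. Operations such as multiplication or most-significant-bit extraction require interaction between the parties and are not covered by this sketch.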
Pub Date: 2024-07-19 DOI: 10.1016/j.patter.2024.101029
Building energy modeling (BEM) is fundamental for achieving optimized energy control, resilient retrofit designs, and sustainable urbanization to mitigate climate change. However, traditional BEM requires detailed building information, expert knowledge, substantial modeling efforts, and customized case-by-case calibrations. This process must be repeated for every building, thereby limiting its scalability. To address these limitations, we developed a modularized neural network incorporating physical priors (ModNN), which is improved by its model structure incorporating heat balance equations, physically consistent model constraints, and data-driven modular design that can allow for multiple-building applications through model sharing and inheritance. We demonstrated its scalability in four cases: load prediction, indoor environment modeling, building retrofitting, and energy optimization. This approach provides guidance for future BEM by incorporating physical priors into data-driven models without extensive modeling efforts, paving the way for large-scale BEM, energy management, retrofit designs, and buildings-to-grid integration.
Title: Modularized neural network incorporating physical priors for future building energy modeling. Journal: Patterns.
Pub Date: 2024-07-19 DOI: 10.1016/j.patter.2024.101025
Multidimensional reconstruction of brain attractors from electroencephalography (EEG) data enables the analysis of geometric complexity and interactions between signals in state space. Utilizing resting-state data from young and older adults, we characterize periodic (traditional frequency bands) and aperiodic (broadband exponent) attractors according to their geometric complexity and shared dynamical signatures, which we refer to as a geometric cross-parameter coupling. Alpha and aperiodic attractors are the least complex, and their global shapes are shared among all other frequency bands, affording alpha and aperiodic greater predictive power. Older adults show lower geometric complexity but greater coupling, resulting from dedifferentiation of gamma activity. The form and content of resting-state thoughts were further associated with the complexity of attractor dynamics. These findings support a process-developmental perspective on the brain’s dynamic core, whereby more complex information differentiates out of an integrative and global geometric core.
Title: EEG spectral attractors identify a geometric core of brain dynamics. Journal: Patterns.
Pub Date: 2024-07-18 DOI: 10.1016/j.patter.2024.101024
In the rapidly evolving field of bioimaging, the integration and orchestration of findable, accessible, interoperable, and reusable (FAIR) image analysis workflows remains a challenge. We introduce BIOMERO (bioimage analysis in OMERO), a bridge connecting OMERO, a renowned bioimaging data management platform; FAIR workflows; and high-performance computing (HPC) environments. BIOMERO facilitates seamless execution of FAIR workflows, particularly for large datasets from high-content or high-throughput screening. BIOMERO empowers researchers by eliminating the need for specialized knowledge, enabling scalable image processing directly from OMERO. BIOMERO notably supports the sharing and utilization of FAIR workflows between OMERO, Cytomine/BIAFLOWS, and other bioimaging communities. BIOMERO will promote the widespread adoption of FAIR workflows, emphasizing reusability, across the realm of bioimaging research. Its user-friendly interface will empower users, including those without technical expertise, to seamlessly apply these workflows to their datasets, democratizing the utilization of AI by the broader research community.
Title: BIOMERO: A scalable and extensible image analysis framework. Journal: Patterns.
Pub Date: 2024-07-12 DOI: 10.1016/j.patter.2024.100974
Sarthak Pati, Sourav Kumar, Amokh Varma, Brandon Edwards, Charles Lu, Liangqiong Qu, Justin J. Wang, Anantharaman Lakshminarayanan, Shih-han Wang, Micah J. Sheller, Ken Chang, Praveer Singh, Daniel L. Rubin, Jayashree Kalpathy-Cramer, Spyridon Bakas
Artificial intelligence (AI) shows potential to improve health care by leveraging data to build models that can inform clinical workflows. However, access to large quantities of diverse data is needed to develop robust generalizable models. Data sharing across institutions is not always feasible due to legal, security, and privacy concerns. Federated learning (FL) allows for multi-institutional training of AI models, obviating data sharing, albeit with different security and privacy concerns. Specifically, insights exchanged during FL can leak information about institutional data. In addition, FL can introduce issues when there is limited trust among the entities performing the compute. With the growing adoption of FL in health care, it is imperative to elucidate the potential risks. We thus summarize privacy-preserving FL literature in this work with special regard to health care. We draw attention to threats and review mitigation approaches. We anticipate this review to become a health-care researcher’s guide to security and privacy in FL.
Title: Privacy preservation for federated learning in health care. Journal: Patterns.