[This corrects the article DOI: 10.3389/frai.2024.1431156.].
Cell-penetrating peptides (CPPs) are highly effective at passing through eukaryotic membranes with various cargo molecules, such as drugs, proteins, nucleic acids, and nanoparticles, without causing significant harm. Owing to their unique chemical properties, CPP-based drug delivery systems are being developed for cancer, genetic disorders, and diabetes. Wet-lab experiments in drug discovery are time-consuming and expensive. Machine learning (ML) techniques can enhance and accelerate drug discovery, provided the underlying data are accurate and of high quality. ML classifiers, such as support vector machines (SVM), random forests (RF), gradient-boosted decision trees (GBDT), and different types of artificial neural networks (ANN), are commonly used for CPP prediction, with cross-validation for performance evaluation. These ML strategies improve functional CPP prediction when trained on CPP datasets produced by high-throughput sequencing and computational methods. This review focuses on several ML-based CPP prediction tools. We discuss CPP uptake mechanisms to explain how CPPs cross cell membranes. A comparative analysis of diverse CPP prediction methods was conducted based on their algorithms, dataset size, feature encoding, software utilities, assessment metrics, and prediction scores. The performance of CPP prediction was evaluated on independent datasets using accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC). In conclusion, this review will encourage the use of ML algorithms for finding effective CPPs, which will have a positive impact on future research on drug delivery and therapeutics.
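To make the shared evaluation protocol of these tools concrete, the sketch below cross-validates the classifier families named in the review on a placeholder featurized peptide dataset and reports accuracy, sensitivity, specificity, and MCC. The feature matrix, labels, and model settings are illustrative assumptions, not data or hyperparameters from any of the reviewed studies.

```python
# Minimal sketch: cross-validated CPP classification with the metrics used in this review.
# The feature matrix X (e.g., amino acid composition descriptors) and labels y are
# hypothetical placeholders, not data from any reviewed tool.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import matthews_corrcoef, confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
X = rng.random((200, 20))          # placeholder peptide descriptors
y = rng.integers(0, 2, size=200)   # placeholder CPP / non-CPP labels

models = {
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=200),
    "GBDT": GradientBoostingClassifier(),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    accs, sens, specs, mccs = [], [], [], []
    for train_idx, test_idx in cv.split(X, y):
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        tn, fp, fn, tp = confusion_matrix(y[test_idx], pred).ravel()
        accs.append(accuracy_score(y[test_idx], pred))
        sens.append(tp / (tp + fn))
        specs.append(tn / (tn + fp))
        mccs.append(matthews_corrcoef(y[test_idx], pred))
    print(f"{name}: ACC={np.mean(accs):.3f} SN={np.mean(sens):.3f} "
          f"SP={np.mean(specs):.3f} MCC={np.mean(mccs):.3f}")
```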
Packed columns are commonly used in post-combustion processes to capture CO2 emissions by providing enhanced contact area between a CO2-laden gas and a CO2-absorbing solvent. To study and optimize solvent-based post-combustion carbon capture systems (CCSs), computational fluid dynamics (CFD) can be used to model the liquid-gas countercurrent flow hydrodynamics in these columns and derive key determinants of CO2-capture efficiency. However, the large design space of these systems hinders the application of CFD for design optimization due to its high computational cost. In contrast, data-driven modeling approaches can produce fast surrogates to study large-scale physics problems. We build our surrogates using MeshGraphNets (MGN), a graph neural network framework that efficiently learns and produces mesh-based simulations. We apply MGN to a random packed column modeled with over 160K graph nodes and a design space consisting of three key input parameters: solvent surface tension, inlet velocity, and contact angle. Our models can adapt to a wide range of these parameters and accurately predict the complex interactions within the system at rates over 1700 times faster than CFD, affirming their practicality in downstream design optimization tasks. This underscores the robustness and versatility of MGN in modeling complex fluid dynamics for large-scale CCS analyses.
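To illustrate the kind of learned message passing that an MGN-style surrogate performs over a simulation mesh, the sketch below implements one residual edge/node update block in plain PyTorch on a toy random graph. The tensor sizes, MLP widths, and feature contents (flow fields plus the three design parameters) are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of one MGN-style message-passing step in plain PyTorch.
# Node features could hold flow-field quantities plus the three design parameters
# (surface tension, inlet velocity, contact angle); edge features could hold
# relative mesh positions. Shapes and MLP sizes here are illustrative only.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=128):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim), nn.LayerNorm(out_dim))

class MGNBlock(nn.Module):
    def __init__(self, node_dim=128, edge_dim=128):
        super().__init__()
        self.edge_fn = mlp(2 * node_dim + edge_dim, edge_dim)  # update edges from their endpoints
        self.node_fn = mlp(node_dim + edge_dim, node_dim)      # update nodes from aggregated edges

    def forward(self, nodes, edges, senders, receivers):
        # nodes: [N, node_dim], edges: [E, edge_dim], senders/receivers: [E] int indices
        edge_in = torch.cat([nodes[senders], nodes[receivers], edges], dim=-1)
        new_edges = edges + self.edge_fn(edge_in)                       # residual edge update
        agg = torch.zeros(nodes.size(0), new_edges.size(-1), device=nodes.device)
        agg = agg.index_add(0, receivers, new_edges)                    # sum incoming messages
        new_nodes = nodes + self.node_fn(torch.cat([nodes, agg], dim=-1))
        return new_nodes, new_edges

# Toy usage with a tiny random graph (the real columns used over 160K nodes).
N, E, D = 1000, 4000, 128
nodes, edges = torch.randn(N, D), torch.randn(E, D)
senders, receivers = torch.randint(0, N, (E,)), torch.randint(0, N, (E,))
block = MGNBlock()
nodes, edges = block(nodes, edges, senders, receivers)
```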
The rapid advancement of artificial intelligence (AI) has introduced transformative opportunities in oncology, enhancing the precision and efficiency of tumor diagnosis and treatment. This review examines recent advancements in AI applications across tumor imaging diagnostics, pathological analysis, and treatment optimization, with a particular focus on breast cancer, lung cancer, and liver cancer. By synthesizing findings from peer-reviewed studies published over the past decade, this paper analyzes the role of AI in enhancing diagnostic accuracy, streamlining therapeutic decision-making, and personalizing treatment strategies. Additionally, this paper addresses challenges related to AI integration into clinical workflows and regulatory compliance. As AI continues to evolve, its applications in oncology promise further improvements in patient outcomes, though additional research is needed to address its limitations and ensure ethical and effective deployment.
Traditional Chinese medicine (TCM) has long utilized tongue diagnosis as a crucial method for assessing the condition of the internal organs. This study aims to modernize this ancient practice by developing an automated system for analyzing tongue images in relation to the heart, liver, spleen, lung, and kidney, collectively known as the "five viscera" in TCM. We propose a novel tongue image partitioning algorithm that divides the tongue into four regions associated with these specific organs, according to TCM principles. These partitioned regions are then processed by our newly developed OrganNet, a specialized neural network designed to focus on organ-specific features. Our method simulates the TCM diagnostic process while leveraging modern machine learning techniques. To support this research, we have created a comprehensive tongue image dataset specifically tailored for this five-viscera pattern assessment. Results demonstrate the effectiveness of our approach in accurately identifying correlations between tongue regions and visceral conditions. This study bridges TCM practices with contemporary technology, potentially enhancing diagnostic accuracy and efficiency in both TCM and modern medical contexts.
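As a toy illustration of the region-then-network idea, the sketch below uses a crude rectangular split of the tongue image standing in for the paper's partitioning algorithm, and feeds each region to a small per-region CNN branch. The region geometry, branch architecture, and fusion step are placeholders, not the published OrganNet design.

```python
# Toy sketch of region partitioning followed by per-region CNN branches.
# Region geometry and network sizes are assumptions, not the published OrganNet.
import torch
import torch.nn as nn

def partition_tongue(img):
    """Split a [C, H, W] tongue image into four coarse regions.

    Placeholder mapping (root/center/margins/tip); the paper's algorithm is
    contour-based rather than a fixed grid.
    """
    C, H, W = img.shape
    return {
        "root":    img[:, : H // 4, :],
        "center":  img[:, H // 4 : 3 * H // 4, W // 4 : 3 * W // 4],
        "margins": img[:, H // 4 : 3 * H // 4, :],
        "tip":     img[:, 3 * H // 4 :, :],
    }

class RegionBranch(nn.Module):
    """One small CNN branch per region; outputs a fixed-length feature vector."""
    def __init__(self, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(16 * 8 * 8, out_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x.unsqueeze(0))  # add batch dimension

img = torch.rand(3, 256, 256)  # placeholder tongue image
branches = {name: RegionBranch() for name in ["root", "center", "margins", "tip"]}
features = [branches[name](region) for name, region in partition_tongue(img).items()]
fused = torch.cat(features, dim=-1)  # fused feature for downstream classification
```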
This study explores the evolving role of social media in the spread of misinformation during the Ukraine-Russia conflict, with a focus on how artificial intelligence (AI) contributes to the creation of deceptive war imagery. Specifically, the research examines the relationship between color patterns, defined by look-up tables (LUTs), in war-related visuals and their perceived authenticity, highlighting the economic, political, and social ramifications of such manipulative practices. AI technologies have significantly advanced the production of highly convincing, yet artificial, war imagery, blurring the line between fact and fiction. An experimental project is proposed to train a generative AI model capable of creating war imagery that mimics real-life footage. By analyzing the success of this experiment, the study aims to establish a link between specific color patterns and the likelihood of images being perceived as authentic. This could shed light on the mechanics of visual misinformation and manipulation. Additionally, the research investigates the potential of a serverless AI framework to advance both the generation and detection of fake news, marking a pivotal step in the fight against digital misinformation. Ultimately, the study seeks to contribute to ongoing debates on the ethical implications of AI in information manipulation and to propose strategies to combat these challenges in the digital era.
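As a minimal sketch of the color-pattern analysis the proposal describes, the code below applies a per-channel 1D LUT to an image and measures the resulting shift in RGB histograms. The "warm" LUT, the histogram distance, and the random test image are illustrative assumptions, not the proposed pipeline or any measured grade from actual war footage.

```python
# Toy sketch: apply a per-channel 1D LUT and compare color histograms.
import numpy as np

def apply_lut(img, lut):
    """img: uint8 array [H, W, 3]; lut: [256, 3] per-channel mapping."""
    out = np.empty_like(img)
    for c in range(3):
        out[..., c] = lut[img[..., c], c]
    return out

def channel_histograms(img, bins=32):
    return np.stack([np.histogram(img[..., c], bins=bins, range=(0, 255),
                                  density=True)[0] for c in range(3)])

def histogram_distance(img_a, img_b):
    """L1 distance between per-channel color histograms of two images."""
    return np.abs(channel_histograms(img_a) - channel_histograms(img_b)).sum()

# Placeholder "warm" LUT that lifts reds and suppresses blues; a real analysis
# would use LUTs estimated from genuine and generated imagery.
ramp = np.arange(256, dtype=np.float64)
warm_lut = np.stack([np.clip(ramp * 1.1, 0, 255),
                     ramp,
                     np.clip(ramp * 0.9, 0, 255)], axis=1).astype(np.uint8)

real_img = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
graded_img = apply_lut(real_img, warm_lut)
print("Histogram shift introduced by the LUT:", histogram_distance(real_img, graded_img))
```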
Given close relationships between ocular structure and ophthalmic disease, ocular biometry measurements (including axial length, lens thickness, anterior chamber depth, and keratometry values) may be leveraged as features in the prediction of eye diseases. However, ocular biometry measurements are often stored as PDFs rather than as structured data in electronic health records. Thus, time-consuming and laborious manual data entry is required for using biometry data as a disease predictor. Herein, we used two separate models, PaddleOCR and Gemini, to extract eye-specific biometric measurements from 2,965 Lenstar, 104 IOL Master 500, and 3,616 IOL Master 700 optical biometry reports. For each patient eye, our text extraction pipeline, referred to as Ocular Biometry OCR, involves (1) cropping the report to the biometric data, (2) extracting the text via the optical character recognition model, (3) post-processing the metrics and values into key-value pairs, (4) correcting erroneous angles within the pairs, (5) computing the number of errors or missing values, and (6) selecting the window-specific results with the fewest errors or missing values. To ensure the models' predictions could be put into a machine learning-ready format, artifacts were removed from categorical text data through manual modification where necessary. Performance was evaluated by scoring PaddleOCR and Gemini results. In the absence of ground truth, higher scoring indicated greater inter-model reliability, assuming that an equal value between models indicated an accurate result. The detection scores, measuring the number of valid values (i.e., not missing or erroneous), were Lenstar: 0.990, IOLM 500: 1.000, and IOLM 700: 0.998. The similarity scores, measuring the number of equal values, were Lenstar: 0.995, IOLM 500: 0.999, and IOLM 700: 0.999. The agreement scores, combining detection and similarity scores, were Lenstar: 0.985, IOLM 500: 0.999, and IOLM 700: 0.998. The IOLM 500 reports were annotated for ground truth; in this case, higher scoring indicated greater model-to-annotator accuracy. PaddleOCR-to-Annotator achieved scores of detection: 1.000, similarity: 0.999, and agreement: 0.999. Gemini-to-Annotator achieved scores of detection: 1.000, similarity: 1.000, and agreement: 1.000. Scores range from 0 to 1. While PaddleOCR and Gemini demonstrated high agreement, PaddleOCR offered slightly better performance upon reviewing quantitative and qualitative results.
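To show how the detection, similarity, and agreement scores described above could be computed from two models' extracted key-value pairs, here is a minimal sketch on hypothetical values for a single eye. The exact way agreement combines detection and similarity (a simple product here) and the numeric tolerance are assumptions, not the paper's formulas.

```python
# Minimal sketch of the inter-model scoring, computed over the key-value pairs
# extracted by two OCR models for one report. The agreement formula is an
# illustrative assumption, not the paper's definition.
def detection_score(values):
    """Fraction of expected fields that are present and parseable (not None)."""
    return sum(v is not None for v in values.values()) / len(values)

def similarity_score(values_a, values_b, tol=1e-3):
    """Fraction of shared fields where both models report (numerically) equal values."""
    keys = values_a.keys() & values_b.keys()
    equal = sum(
        values_a[k] is not None and values_b[k] is not None
        and abs(values_a[k] - values_b[k]) <= tol
        for k in keys
    )
    return equal / len(keys)

def agreement_score(values_a, values_b):
    """Combine detection and similarity (simple product, as an illustrative choice)."""
    det = min(detection_score(values_a), detection_score(values_b))
    return det * similarity_score(values_a, values_b)

# Hypothetical axial length / ACD / keratometry values from two models for one eye.
paddle = {"AL_mm": 23.91, "ACD_mm": 3.12, "K1_D": 43.25, "K2_D": 44.00}
gemini = {"AL_mm": 23.91, "ACD_mm": 3.12, "K1_D": 43.25, "K2_D": None}

print(agreement_score(paddle, gemini))
```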
Introduction: As artificial intelligence systems such as large language models (LLMs) and natural language processing advance, the need to evaluate their utility within medicine and medical education grows. As medical research publications continue to grow exponentially, AI systems offer valuable opportunities to condense and synthesize information, especially in underrepresented areas such as sleep medicine. The present study aims to compare the summarization capacity of LLM-generated summaries of sleep medicine research article abstracts with that of summaries generated by medical students (humans), and to evaluate whether the research content and readability of the summarized material are retained comparably.
Methods: A collection of three AI-generated and human-generated summaries of sleep medicine research article abstracts was shared with 19 study participants (medical students) attending a sleep medicine conference. Participants were blinded to which summaries were human- or LLM-generated. After reading both the human- and AI-generated research summaries, participants completed a 1-5 Likert-scale survey on the readability of the presented writings. Participants also answered article-specific multiple-choice questions evaluating their comprehension of the summaries, as a representation of the quality of content retained by the AI-generated summaries.
Results: An independent-samples t-test comparing participants' Likert readability ratings of the AI-generated and human-generated summaries revealed no significant difference (p = 0.702). A chi-squared test of proportions revealed no significant association between summary type and comprehension performance (χ2 = 1.485, p = 0.223), and a McNemar test revealed no significant association between summary type and the proportion of correct responses to the comprehension multiple-choice questions (p = 0.289).
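For readers less familiar with these tests, the sketch below runs the same three analyses (independent-samples t-test, chi-squared test of proportions, McNemar's test) on placeholder data with scipy and statsmodels; the ratings, counts, and paired table are invented and do not reproduce the study's results.

```python
# Sketch of the three reported tests on placeholder data (not the study's data).
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(1)

# Placeholder 1-5 Likert readability ratings for AI vs. human summaries.
likert_ai = rng.integers(1, 6, size=19)
likert_human = rng.integers(1, 6, size=19)
t_stat, p_t = stats.ttest_ind(likert_ai, likert_human)

# Placeholder 2x2 counts of correct/incorrect answers by summary type.
counts = np.array([[40, 17],    # AI summaries: correct, incorrect
                   [45, 12]])   # human summaries: correct, incorrect
chi2, p_chi, _, _ = stats.chi2_contingency(counts, correction=False)

# Placeholder paired table: rows = correct on AI summary, cols = correct on human summary.
paired = np.array([[30, 10],
                   [5, 12]])
p_mcnemar = mcnemar(paired, exact=True).pvalue

print(f"t-test p={p_t:.3f}, chi-squared p={p_chi:.3f}, McNemar p={p_mcnemar:.3f}")
```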
Discussion: Limitations of this study included the small number of participants and potential user bias: participants were attending a sleep medicine conference, and the study summaries were all drawn from sleep medicine journals. Lastly, the summaries did not include graphs, numbers, or pictures, and thus were limited in the material they could convey. While the present analysis did not demonstrate a significant difference in readability or content quality between the AI- and human-generated summaries, the limitations of the present study indicate that more research is needed to objectively measure, and further define, the strengths and weaknesses of AI models in condensing medical literature into efficient and accurate summaries.
Background: Detecting programmed death ligand 1 (PD-L1) expression based on immunohistochemical (IHC) staining is an important guide for the treatment of lung cancer with immune checkpoint inhibitors. However, this method has problems such as high staining costs, tumor heterogeneity, and subjective differences among pathologists. Therefore, the application of deep learning models to segment and quantitatively predict PD-L1 expression in digital sections of Hematoxylin and eosin (H&E) stained lung squamous cell carcinoma is of great significance.
Methods: We constructed a dataset comprising H&E-stained digital sections of lung squamous cell carcinoma and used a Transformer Unet (TransUnet) deep learning network with an encoder-decoder design to segment PD-L1 negative and positive regions and quantitatively predict the tumor cell positive score (TPS).
Results: The results showed that the Dice similarity coefficient (DSC) and intersection over union (IoU) of the deep learning model for PD-L1 expression segmentation of H&E-stained digital slides of lung squamous cell carcinoma were 80% and 72%, respectively, which were better than those of the other seven cutting-edge segmentation models. The root mean square error (RMSE) of the quantitative TPS prediction was 26.8, and the intraclass correlation coefficient with the gold standard was 0.92 (95% CI: 0.90-0.93), which was better than the consistency between the results of five pathologists and the gold standard.
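The segmentation and TPS metrics reported above can be computed as in the sketch below, which uses placeholder masks and TPS values rather than the study's slides; the array shapes and numbers are illustrative only.

```python
# Sketch of the reported metrics on placeholder arrays: Dice similarity coefficient
# and IoU for binary PD-L1-positive masks, and RMSE for predicted TPS values.
import numpy as np

def dice(pred, gt, eps=1e-8):
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum() + eps)

def iou(pred, gt, eps=1e-8):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / (union + eps)

def rmse(pred, gt):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(gt)) ** 2)))

rng = np.random.default_rng(0)
pred_mask = rng.random((512, 512)) > 0.5   # placeholder predicted PD-L1+ mask
gt_mask = rng.random((512, 512)) > 0.5     # placeholder reference mask

pred_tps = [10.0, 35.0, 60.0, 80.0]        # placeholder predicted TPS (%)
gold_tps = [5.0, 40.0, 55.0, 90.0]         # placeholder pathologist gold standard

print(f"DSC={dice(pred_mask, gt_mask):.3f}, IoU={iou(pred_mask, gt_mask):.3f}, "
      f"RMSE={rmse(pred_tps, gold_tps):.1f}")
```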
Conclusion: The deep learning model is capable of segmenting and quantitatively predicting PD-L1 expression in H&E-stained digital sections of lung squamous cell carcinoma, which has significant implications for the application and guidance of immune checkpoint inhibitor treatments. The code is available at https://github.com/Baron-Huang/PD-L1-prediction-via-HE-image.
Social media platforms, including X, Facebook, and Instagram, host millions of daily users, giving rise to bots: automated programs disseminating misinformation and ideologies with tangible real-world consequences. While bot detection on platform X has been the focus of many deep learning models with adequate results, most approaches neglect the graph structure of social media relationships and often rely on hand-engineered architectures. Our work introduces the implementation of a Neural Architecture Search (NAS) technique, namely Deep and Flexible Graph Neural Architecture Search (DFG-NAS), tailored to Relational Graph Convolutional Neural Networks (RGCNs), for the task of bot detection on platform X. Our model constructs a graph that incorporates both the user relationships and their metadata. Then, DFG-NAS is adapted to automatically search for the optimal configuration of Propagation and Transformation functions in the RGCNs. Our experiments are conducted on the TwiBot-20 dataset, constructing a graph with 229,580 nodes and 227,979 edges. We study the five architectures with the highest performance during the search and achieve an accuracy of 85.7%, surpassing state-of-the-art models. Our approach not only addresses the bot detection challenge but also advocates for the broader implementation of NAS models in neural network design automation.
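To give a feel for what a Propagation/Transformation search space looks like, the toy sketch below enumerates length-4 P/T sequences over a homogeneous graph, where a P step smooths features over the adjacency structure and a T step applies a learned MLP, and scores each candidate with a placeholder fitness. It is a simplification under stated assumptions: the paper searches over relational GCNs on the TwiBot-20 graph with a genetic-style NAS, which is not reproduced here.

```python
# Toy sketch of a P/T-sequence search space on a homogeneous graph.
import itertools
import random
import torch
import torch.nn as nn

def propagate(x, adj):
    """One P step: mean aggregation of neighbor features (row-normalized adjacency)."""
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
    return adj @ x / deg

class PTNet(nn.Module):
    def __init__(self, sequence, dim=32):
        super().__init__()
        self.sequence = sequence
        self.mlps = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for op in sequence if op == "T"]
        )

    def forward(self, x, adj):
        t_idx = 0
        for op in self.sequence:
            if op == "P":
                x = propagate(x, adj)
            else:
                x = self.mlps[t_idx](x)
                t_idx += 1
        return x

# Enumerate all length-4 P/T sequences and "evaluate" each with a placeholder score;
# a real search would train and validate each candidate on the bot-detection task.
N, D = 500, 32
x = torch.randn(N, D)
adj = (torch.rand(N, N) < 0.01).float()
candidates = ["".join(seq) for seq in itertools.product("PT", repeat=4)]
scores = {seq: random.random() for seq in candidates}   # placeholder fitness
best = max(scores, key=scores.get)
print("Best P/T sequence (placeholder evaluation):", best, PTNet(best)(x, adj).shape)
```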