Cancer classification is crucial for effective patient treatment, and in recent years various classification methods based on protein expression levels have emerged. However, existing methods oversimplify the problem: they assume that all protein-protein interactions have the same strength, and they neglect the indirect influence that proteins exert on one another through intermediates. To address these limitations, GATDE employs a graph attention network enhanced with diffusion on protein-protein interactions. By constructing a weighted protein-protein interaction network, GATDE captures the varying strengths of these interactions, and it uses a diffusion process to assess multi-hop influences between proteins. This information is then incorporated into the graph attention network, yielding accurate cancer classification. Experimental results on breast cancer and pan-cancer datasets demonstrate that GATDE outperforms current leading methods. Case studies further validate the effectiveness of the diffusion process and the attention mechanism, highlighting GATDE's robustness and potential for real-world applications.
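To make the idea of diffusion over a weighted interaction network concrete, the following is a minimal sketch and not GATDE's actual implementation. The abstract does not state which diffusion operator is used; personalized PageRank diffusion is assumed here purely for illustration, and the function name `ppr_diffusion`, the restart probability `alpha`, and the three-protein toy network are all hypothetical.

```python
import numpy as np


def ppr_diffusion(weights: np.ndarray, alpha: float = 0.15) -> np.ndarray:
    """Closed-form personalized PageRank diffusion on a weighted graph.

    ASSUMPTION: PPR is one common diffusion choice; the abstract does not
    say which operator GATDE uses. `alpha` is a hypothetical restart
    probability.
    """
    # Treat interaction strengths as undirected by symmetrizing.
    w = (weights + weights.T) / 2.0
    degree = w.sum(axis=1)
    # Guard against isolated proteins, whose degree is zero.
    inv_degree = np.divide(1.0, degree, out=np.zeros_like(degree), where=degree > 0)
    # Column-stochastic transition matrix T = W D^{-1}.
    transition = w * inv_degree[np.newaxis, :]
    n = w.shape[0]
    # S = alpha * (I - (1 - alpha) T)^{-1} sums influence over all path lengths.
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * transition)


# Toy network (made-up weights): protein 0 interacts with 1, and 1 with 2.
# Proteins 0 and 2 have no direct interaction.
weights = np.array(
    [
        [0.0, 0.9, 0.0],
        [0.9, 0.0, 0.4],
        [0.0, 0.4, 0.0],
    ]
)
diffused = ppr_diffusion(weights)
print(np.round(diffused, 3))
```

In this toy example the diffused matrix assigns a nonzero score between proteins 0 and 2 even though they share no direct edge, which is the kind of multi-hop influence the abstract refers to. A matrix of this form could then serve as edge weights or as an attention bias in a graph attention layer; how GATDE actually combines the two is not specified in the abstract.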
Arabidopsis thaliana synthesizes various medicinal compounds and serves as a model plant for medicinal plant research. Single-cell transcriptomics technologies are essential for understanding the developmental trajectory of plant roots, facilitating the analysis of how medicinal compounds are synthesized and accumulated in different cell subpopulations. Although methods for interpreting single-cell transcriptomics data in Arabidopsis are advancing rapidly, precisely annotating cell identity remains challenging because marker genes are lacking for certain cell types. In this work, we trained a machine learning system, AtML, on sequencing datasets from six cell subpopulations, comprising 6000 cells in total, to predict Arabidopsis root cell stages and to identify biomarkers through complete model interpretability. Testing on an external dataset showed that AtML achieved 96.50% accuracy and 96.51% recall. Using the interpretability provided by AtML, we identified 160 important marker genes, which contribute to the understanding of cell type annotation. In conclusion, AtML efficiently identifies Arabidopsis root cell stages, providing a new tool for elucidating the mechanisms of medicinal compound accumulation in Arabidopsis roots.
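For readers unfamiliar with how accuracy and recall can differ in a six-class setting, the following minimal sketch computes both on made-up labels. It assumes macro-averaged recall, which the abstract does not specify; the helper names and the toy label arrays are hypothetical and are not AtML's data or results.

```python
import numpy as np


def accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Fraction of cells whose predicted class matches the true class."""
    return float(np.mean(y_true == y_pred))


def macro_recall(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Unweighted mean of per-class recall.

    ASSUMPTION: macro averaging; the abstract does not state how recall
    was averaged across the six cell subpopulations.
    """
    classes = np.unique(y_true)
    # Recall for class c: share of true class-c cells predicted as c.
    per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(per_class))


# Made-up labels for six classes (0 to 5); not taken from the paper.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 5])
y_pred = np.array([0, 0, 1, 0, 2, 2, 1, 3, 4, 5])

acc = accuracy(y_true, y_pred)
rec = macro_recall(y_true, y_pred)
print(f"accuracy={acc:.4f}, macro recall={rec:.4f}")
```

Because macro recall weights every class equally regardless of how many cells it contains, it can differ from accuracy whenever errors are concentrated in particular classes.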