Introduction: Malaria remains a major global health burden and motivates fast, reliable in silico prioritization of antimalarial (AM) peptide candidates. Designing such peptides is challenging due to the vast search space, scarce or noisy supervision, and potential out-of-distribution miscalibration of computational scores. Prior pipelines typically rank existing sequences rather than generate new candidates under explicit design constraints with calibrated, risk-aware decision rules.
Methods: We propose a constraint-guided generate-then-classify framework. A low-data generator (an optimized variant of CTCM-Neo) proposes de novo sequences within APD3-derived windows for net charge, GRAVY, and Boman index. A frozen, temperature-scaled protein language-model classifier (ConformaX-PEP) outputs calibrated probabilities for predicted antimalarial activity and hemolysis, and a split-conformal gate with risk level α = 0.1 converts these scores into accept/reject decisions at fixed operating thresholds p_act ≥ 0.78 and p_hemo ≤ 0.20.
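The accept/reject step can be illustrated with a minimal sketch. The abstract does not specify how the conformal gate is constructed, so the following assumes a standard split-conformal procedure for a binary label: the nonconformity score is one minus the probability assigned to the true label, and a candidate is accepted only when its prediction set is exactly {active} and both fixed thresholds hold. The function names (`conformal_quantile`, `calibrate`, `prediction_set`, `accept`) are hypothetical and are not taken from the authors' code; only α = 0.1 and the two thresholds come from the text.

```python
import math

ALPHA = 0.1        # risk level stated in the text
P_ACT_MIN = 0.78   # activity threshold stated in the text
P_HEMO_MAX = 0.20  # hemolysis threshold stated in the text


def conformal_quantile(scores, alpha=ALPHA):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        # Too few calibration points for this alpha: no label can be excluded.
        return float("inf")
    return sorted(scores)[rank - 1]


def calibrate(cal_probs, cal_labels, alpha=ALPHA):
    """Assumed nonconformity score: 1 - probability of the true label."""
    scores = [1 - (p if y == 1 else 1 - p)
              for p, y in zip(cal_probs, cal_labels)]
    return conformal_quantile(scores, alpha)


def prediction_set(p_act, q_hat):
    """Labels whose nonconformity score does not exceed the threshold q_hat."""
    labels = set()
    if 1 - p_act <= q_hat:  # score of the label "active"
        labels.add(1)
    if p_act <= q_hat:      # score of the label "inactive" is 1 - (1 - p_act)
        labels.add(0)
    return labels


def accept(p_act, p_hemo, q_hat):
    """Accept only unambiguous 'active' predictions that pass both thresholds."""
    return (prediction_set(p_act, q_hat) == {1}
            and p_act >= P_ACT_MIN
            and p_hemo <= P_HEMO_MAX)
```

In use, `q_hat = calibrate(cal_probs, cal_labels)` would be computed once on a held-out calibration split, after which `accept(p_act, p_hemo, q_hat)` is applied to each generated candidate. Under this construction the usual split-conformal coverage guarantee is marginal and relies on exchangeability between calibration and test data, which is the assumption that out-of-distribution candidates may violate.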
Results: On the initial 322-sequence corpus (52 AM, 200 unlabeled, 70 positive-like), a held-out evaluation achieves AUROC ≈ 0.93, AUPRC ≈ 0.80, and ECE ≈ 0.03, indicating strong discrimination with low calibration error prior to external testing. The method outperforms strong baselines in convergence speed and reliability. On 210 previously unseen peptides (80 AM, 130 non-antimalarial [NM]), two independent runs achieve 92.86% and 93.33% accuracy with balanced precision and recall and good calibration. Hyperparameter sweeps reveal broad, stable optima, supporting reproducibility. Template-based docking with GalaxyPepDock is used strictly as a hypothesis-generating structural sanity check and does not constitute evidence of biological binding or efficacy.
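For readers unfamiliar with the calibration metric, expected calibration error (ECE) can be computed as in the sketch below. The abstract does not state the binning scheme, so this assumes the common equal-width variant for a binary classifier: predictions are grouped by predicted positive probability, and the gap between mean predicted probability and observed positive fraction is averaged with weights proportional to bin size. The function name and the default of 10 bins are illustrative choices, not details from the source.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width binned ECE for binary probabilities (assumed variant).

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : true labels, 0 or 1
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")

    # Assign each prediction to a bin; p == 1.0 falls into the last bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_p = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(mean_p - frac_pos)
    return ece
```

A value near 0.03, as reported above, would mean that predicted probabilities differ from observed frequencies by about three percentage points on average across bins, although the exact figure depends on the number of bins and the binning variant used.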
Discussion: Overall, the framework compresses the search space into a small, risk-bounded set of computationally prioritized candidates and provides a scalable, uncertainty-aware route for downstream experimental follow-up. All results reported here are computational, and antimalarial activity remains to be confirmed experimentally.