Numerous web-based tools have been developed to support large-scale genomics research, but challenges remain because of their limited functionality. We therefore developed VarXOmics, an end-to-end, versatile web server for querying variants and genes, streamlining germline variant analysis, prioritizing variants with multi-omics insights, and providing interactive visualizations. We demonstrated the utility of VarXOmics by analyzing multiple small variants from the whole-genome sequencing data of a breast cancer patient. It prioritized BRCA2 c.3751dup as the most likely pathogenic variant and highlighted disease associations with cell cycle regulation, DNA repair pathways, and type 2 diabetes through multi-omics evidence, gene set enrichment, and network analysis. Overall, VarXOmics serves as a practical genomics platform for researchers and clinicians. It shows potential for identifying pathogenic variants and causal genes, uncovering the molecular mechanisms of disease pathogenesis, and providing valuable references for clinical decision-making and therapeutic strategies, thus advancing precision medicine. VarXOmics is publicly available at https://www.phenomeportal.org/varxomics.
Molecular modeling and simulation play a crucial role in advancing our understanding of protein function at the molecular level, offering insights that complement experimental approaches. In particular, molecular dynamics (MD) simulations with explicit lipid bilayers have become essential for understanding the protein-lipid interactions that regulate the structure, dynamics, and function of membrane proteins. CHARMM-GUI (http://www.charmm-gui.org) is a web-based graphical user interface designed to generate MD simulation systems and input files for various simulation engines. Here, we introduce Quick Bilayer, a new CHARMM-GUI module that provides a streamlined and efficient one-stop platform for assembling protein structures with a diverse set of biologically relevant membrane environments. It features advanced search capabilities that allow users to identify specific lipid types and design bilayers with customized lipid compositions to meet specific research needs. To further enhance usability and scalability, Quick Bilayer now supports a REST-like API that enables seamless integration with backend services. This newly implemented command-line interface allows users to access the module programmatically, facilitating automated workflows and large-scale system generation.
The rise of protein language models (pLMs) is reshaping the landscape of protein prediction. Embeddings are powerful protein representations provided by pLMs, but they come at a cost: generating them requires expensive hardware, and leveraging the models often requires expert knowledge. To some extent, these hurdles limit the ease of use and the benefits of these methods for both experimental and computational biologists. Biocentral aims to provide a free and open embedding-based service that addresses these challenges. We support standardized access to most pLMs currently in use, enabling researchers to generate embeddings, get embedding-based protein feature predictions, and train embedding-based models. Here, we showcase biocentral in a large-scale analysis of the BFVD virus database through biocentral's predict module. We also show how readily biocentral's training module reproduces an existing embedding-based prediction method. The server is accessible through a graphical user interface and a programmatic Application Programming Interface (API) at: https://biocentral.rostlab.org.
The precise localization of proteins within prokaryotic cells is fundamental to understanding their function. However, existing models still struggle with challenging localization classes, such as cell wall or outer membrane proteins. We introduce LocPred-Prok, a novel deep learning framework that redefines performance standards for prokaryotic subcellular localization. LocPred-Prok employs a purpose-built dual-branch architecture that synergistically integrates global and local sequence features extracted from protein language model (pLM) embeddings. On a stringent, homology-partitioned benchmark, LocPred-Prok achieves a state-of-the-art accuracy of 91.2 % and a Matthews Correlation Coefficient (MCC) of 0.889. Critically, it resolves long-standing prediction challenges, demonstrating exceptional performance on notoriously difficult classes like Gram-positive cell wall and Gram-negative outer membrane proteins. It substantially outperforms recent and classic methods across all organismal subgroups, representing a significant leap forward in the field. The LocPred-Prok web server is freely accessible at https://huggingface.co/spaces/isyslab/LocPred-Prok.
The human reference proteome is routinely modeled with predictive tools such as AlphaFold2 and ESMFold. The two methods, based on different procedures, can behave differently depending on the experimental information available for a protein. We previously released a public database that stores pairs of predicted models, allowing us to obtain insights into the two methods and providing a resource where users can select the better model for downstream analysis. Here, we update the database after the latest release of UniProt (2025_04), we functionally characterize the models by mapping Pfam entries on the 3D structures, and we introduce external quality assessment metrics to evaluate and compare the models. We observe that, regardless of the quality and similarity of the predicted models, both AlphaFold2 and ESMFold converge with high pLDDT values in regions covered by Pfam entries. Alpha&ESMhFolds, including all its features, is freely available at https://alpha-esmhfolds.biocomp.unibo.it/.
Heavy metal contamination poses a significant threat to environmental health, agriculture, and microbial ecosystems, necessitating the identification of molecular components that confer resistance. Heavy metal resistance (HMR) proteins enable organisms to survive toxic metal exposure through mechanisms such as efflux transport, enzymatic detoxification, and metal sequestration. However, the diversity and functional overlap of these proteins across taxa present challenges for reliable identification using conventional homology-based methods. Furthermore, current machine learning approaches for resistance gene prediction primarily focus on antibiotics, with no comprehensive resource available for systematically classifying HMR proteins across multiple metals and biological domains. To address this, we developed HMRPred, a machine learning-based predictive framework for the identification of HMR proteins across ten metals of concern: arsenic, cadmium, chromium, copper, iron, lead, mercury, nickel, silver, and zinc. Curated datasets comprising experimentally validated resistance and non-resistance proteins were used to extract a comprehensive set of sequence-derived features, including amino acid composition and physicochemical descriptors. For each metal, optimized classifiers were trained using various machine learning algorithms, achieving high performance with an AUC-ROC of more than 98% in both cross-validation and independent testing. HMRPred is deployed as a web-accessible resource (available at https://hmrpred.streamlit.app/), allowing researchers to submit protein sequences and obtain predictions with confidence scores. By facilitating genome-wide annotation of metal resistance determinants, HMRPred supports applications in bioremediation, environmental microbiology, phytoremediation, and synthetic biology.
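To make the notion of a sequence-derived feature concrete, the following minimal sketch computes amino acid composition, one of the feature types named above. It is an illustration only and not the HMRPred feature pipeline; the function name `amino_acid_composition` and the choice to ignore non-standard residues are assumptions made for this example.

```python
from collections import Counter
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper for illustration: non-standard characters (X, B, gaps)
    are ignored, which is an assumption rather than a documented HMRPred rule.
    """
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for absent residues, so every key gets a value.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    features = amino_acid_composition("MKCHHLLC")
    # A fixed-length, 20-dimensional vector that a classifier could consume.
    print([round(features[aa], 3) for aa in AMINO_ACIDS])
```

A vector of this kind has the same length for every protein, which is what allows sequences of different lengths to be fed to a standard classifier.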
Understanding how mutations affect protein function remains critical yet challenging, particularly for variants in clinical databases lacking experimental characterisation and for intrinsically disordered regions. Current computational approaches often operate as black boxes, providing predictions without sufficient transparency or quality assessment of the underlying data. Here we present ProteoCast, a user-friendly web server that predicts variant effects through evolutionary constraint analysis and structural context integration. ProteoCast provides a three-tier variant classification (impactful, mild, neutral) to help prioritise mutations for clinical interpretation and experimental validation. It incorporates multiple sequence alignment quality controls to ensure prediction reliability and flag positions with insufficient evolutionary information. Beyond single-variant classification, ProteoCast employs a novel segmentation approach based on mutational sensitivity to identify functional linear peptides in disordered regions. Interactive visualisations guide users through results interpretation, from variant-level predictions to protein-wide functional landscapes. Evaluation on 63,000 ClinVar variants demonstrates 77 % sensitivity and 87 % specificity for pathogenicity prediction, with performance maintained across species (85 % accuracy on Drosophila lethal mutations). ProteoCast successfully identifies twice as many functional motifs in intrinsically disordered regions compared to conservation-based phylogenetic methods. Predictions can be tuned to specific conformations, such as bound forms in protein complexes, for improved accuracy and interpretability. With its transparent, unsupervised methodology and computational efficiency (minutes per protein), ProteoCast democratises access to variant effect prediction and functional site discovery for the broader research community. The web server is freely available at: https://proteocast.ijm.fr/.
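For readers less familiar with the reported metrics, the short sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical, chosen only so that the arithmetic mirrors the percentages quoted above; they are not the actual ClinVar evaluation counts, and the function name is an assumption.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of pathogenic variants detected.
    specificity = TN / (TN + FP): fraction of benign variants correctly cleared.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts (per 100 pathogenic and 100 benign variants).
    sens, spec = sensitivity_specificity(tp=77, fn=23, tn=87, fp=13)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```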
The recent development of highly accurate protein structure prediction tools has led to a rapid expansion in the scope of computational structural biology, enabling a much wider range of modelling studies than ever before. These new in silico opportunities help life science researchers understand how proteins interact with their environment and support the design of new molecules with desired properties. Ultimately, they have broad applications, e.g. in medicine, drug discovery or engineering. To ensure reproducibility and to facilitate data exchange and reuse, predicted structures or computed structure models can be stored using ModelCIF, a rich data representation designed to include the atomic coordinates and metadata. The previously published version of ModelCIF (1.4.4; 2022-12-21) mainly covered protein structure predictions generated by homology and ab initio modelling. In this work, we present an extension of the ModelCIF (https://github.com/ihmwg/ModelCIF) data standard and its associated tools. This extension supports important new use cases, including modelling protein-ligand and protein-protein interactions, sampling multiple conformational states and designing proteins de novo. We define guidelines for storage and validation of modelling results for those use cases by applying new and existing ModelCIF categories to capture protocols, inputs and outputs. Additionally, we outline updates to the software tools and resources that implement these new standards and provide functionality for model generation, validation, archiving, and visualisation. By enabling consistent metadata capture across different modelling workflows, this framework aims to support the FAIR dissemination of computational models, thereby promoting reproducibility and reusability in downstream applications.
Accurate determination of initial reaction rates (v0) is essential for characterizing enzyme function, designing inhibitors, and modeling biological systems. Traditional methods rely on linear approximations that are valid for reaction phases that are difficult to capture, and an excess of substrate over enzyme does not ensure constant rates. To overcome these limitations, we developed ACCU-RATES, a user-friendly web tool that heuristically analyzes product accumulation or substrate depletion curves containing at least two time points. Using a differential form of the Michaelis-Menten equation, ACCU-RATES numerically fits progress curves to interpolate v0, enabling precise determination of the Michaelis constant (Km) and limiting rate (V). Simulations across diverse scenarios, including data noise and low sampling rates, show that ACCU-RATES delivers reliable, user-independent parameter estimates without relying on linear phases. Compared to existing methods, it offers superior accuracy and robustness against assay interferences, with applications in inhibitor discovery, synthetic biology, and biomarker assays. ACCU-RATES is freely available at https://accu-rates.i3s.up.pt.
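The idea of fitting a progress curve with the differential Michaelis-Menten equation, dP/dt = V(S0 - P)/(Km + S0 - P), can be sketched as follows. This is a rough illustration and not the ACCU-RATES algorithm, which is not described here: the function names, the RK4 integrator, the coordinate pattern search, and the starting-guess heuristic are all assumptions, and the sketch further assumes that P(0) = 0 and that the initial substrate concentration S0 is known.

```python
import math


def simulate_product(v_max, km, s0, times, substeps=4):
    """Integrate dP/dt = V*(S0 - P)/(Km + S0 - P) with classical RK4.

    `times` must be sorted ascending; P is assumed to be 0 at t = 0.
    """
    def rate(p):
        s = max(s0 - p, 0.0)  # substrate cannot become negative
        return v_max * s / (km + s)

    t, p, out = 0.0, 0.0, []
    for target in times:
        h = (target - t) / substeps
        if h > 0:
            for _ in range(substeps):
                k1 = rate(p)
                k2 = rate(p + 0.5 * h * k1)
                k3 = rate(p + 0.5 * h * k2)
                k4 = rate(p + h * k3)
                p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t = target
        out.append(p)
    return out


def fit_progress_curve(times, product, s0, tol=1e-4):
    """Least-squares fit of V and Km; returns (V, Km, v0).

    Uses a simple coordinate pattern search in log-parameter space
    (an illustrative optimizer choice, keeps V and Km positive).
    """
    def sse(log_v, log_km):
        model = simulate_product(math.exp(log_v), math.exp(log_km), s0, times)
        return sum((m - y) ** 2 for m, y in zip(model, product))

    # Heuristic start: Km = S0 and V = 2 * first observed slope,
    # so the implied v0 = V*S0/(Km + S0) equals that slope.
    first = next(i for i, t in enumerate(times) if t > 0)
    slope = max(product[first] / times[first], 1e-12)
    x = [math.log(2.0 * slope), math.log(s0)]
    best = sse(*x)

    step = 0.5
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = sse(*trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step *= 0.5  # refine the search once no neighbor is better

    v_max, km = math.exp(x[0]), math.exp(x[1])
    return v_max, km, v_max * s0 / (km + s0)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known parameters.
    times = [3.0 * i for i in range(11)]
    data = simulate_product(1.0, 5.0, 10.0, times)
    v_max, km, v0 = fit_progress_curve(times, data, s0=10.0)
    print(f"V={v_max:.3f}  Km={km:.3f}  v0={v0:.3f}")
```

Because the whole curve is fitted, v0 is obtained from the fitted parameters as V·S0/(Km + S0) rather than from a slope drawn through an assumed linear phase. For real, noisy data a dedicated optimizer and uncertainty estimates would be preferable to this pattern search.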

