The advent of powerful machine learning algorithms, together with the availability of large volumes of pharmacological data, has given new fuel to QSAR, opening unprecedented options for deriving highly predictive models to assist the rational design of new bioactive compounds, to screen and prioritize large molecular libraries, and to repurpose drugs toward new clinical uses. Here, we present PoseidonQ (an acronym for Personal Optimization Software for Efficient Implementation and Derivation of Online QSAR), a user-friendly software solution designed to simplify the derivation of QSAR models for drug design and discovery. PoseidonQ incorporates 22 machine learning algorithms, 17 types of molecular fingerprints, and 208 RDKit molecular descriptors and enables the quick derivation of both regression and classification models along with a calculated and easily interpretable applicability domain. Importantly, the platform is automatically linked to the latest version of the ChEMBL database, thus providing streamlined access to large amounts of curated bioactivity data. The user is also given the option of gathering high-quality experimental data based on customizable filtering settings. Notably, PoseidonQ facilitates the deployment of trained QSAR models as web-based applications through seamless integration with Streamlit Cloud and GitHub, empowering users to share, refine, and integrate models effortlessly. Translating QSAR models into web-based applications makes them freely accessible, portable, and ready for screening large volumes of new data without limits. By unifying data preparation, model generation, and deployment into an intuitive workflow, PoseidonQ makes advanced QSAR modeling for drug design and discovery accessible to a wide audience of researchers irrespective of their skill levels.
PoseidonQ bridges the gap between complex machine learning techniques and practical drug discovery applications, enhancing the efficiency, collaboration, and adoption of QSAR approaches in modern drug discovery programs. PoseidonQ is available for Windows and Linux (Ubuntu 22.04) operating systems and can be downloaded for free at https://github.com/Muzatheking12/PoseidonQ.
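To make the notion of an applicability domain concrete, the following is a minimal sketch of one common approach: a query compound is considered in-domain if its nearest training compound exceeds a Tanimoto similarity threshold. This is an assumption for illustration only; the abstract does not state how PoseidonQ computes its applicability domain. The function names, the threshold of 0.3, and the toy fingerprints (sets of on-bit indices) are all hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def in_domain(query, training, threshold=0.3):
    """Return (is_in_domain, best_similarity) for a query fingerprint.

    Assumption: the domain is defined by the nearest training neighbour;
    the threshold is an illustrative value, not a recommended setting.
    """
    best = max((tanimoto(query, fp) for fp in training), default=0.0)
    return best >= threshold, best


# Toy fingerprints standing in for real molecular fingerprints.
training_fps = [
    frozenset({1, 4, 7, 9}),
    frozenset({2, 4, 8}),
    frozenset({1, 7, 9, 12}),
]
similar = frozenset({1, 4, 7})
dissimilar = frozenset({20, 21, 22})

print(in_domain(similar, training_fps))
print(in_domain(dissimilar, training_fps))
```

A prediction for a compound flagged as out-of-domain would be treated as less reliable, because the model has seen nothing structurally close to it during training.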
With the release of AlphaFold3, modeling capabilities have expanded beyond protein structure prediction to embrace the inherent complexity of biomolecular systems, including nucleic acids, ions, small molecules, and their interactions. The increased complexity of these assemblies is reflected in the input file generation process, presenting a significant hurdle for researchers without advanced computational expertise. While AlphaFold Server comes with a user-friendly graphical user interface, it supports only a subset of the features of AlphaFold3. To address this limitation, we present af3cli, an open-source tool designed to facilitate the generation of AlphaFold3 input files, specifically tailored to the standalone version of AlphaFold3 and its unrestricted functionality. Featuring a user-friendly command-line interface and an accompanying Python library, af3cli simplifies the input generation process while maintaining flexibility and customization. It is especially useful for fast (automated) generation of large numbers of input files, since it enables direct incorporation of FASTA files, keeps track of IDs, and validates the JSON file. Through practical examples, we demonstrate its capabilities for constructing input data for diverse biological structures, ranging from simple proteins to complex systems, and show its seamless integration into both manual and automated workflows.
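The kind of bookkeeping described above (reading FASTA records, assigning chain IDs, and emitting JSON) can be sketched with the Python standard library alone. This sketch does not use or reproduce the af3cli API; `parse_fasta` and `build_input` are hypothetical helpers. The JSON field names (`name`, `modelSeeds`, `sequences`, `ccdCodes`, `dialect`, `version`) reflect our understanding of the standalone AlphaFold3 input format and should be checked against the official AlphaFold3 input documentation before use.

```python
import json
from string import ascii_uppercase


def parse_fasta(text):
    """Return the sequences found in FASTA-formatted text, in order."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current:
                sequences.append("".join(current))
            current = []
        else:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def build_input(name, fasta_text, ligand_ccd_codes=(), seeds=(1,)):
    """Assemble an AlphaFold3-style input dictionary (field names assumed).

    Chain IDs are taken from A-Z, so this sketch handles at most 26 entities.
    """
    ids = iter(ascii_uppercase)
    entries = [
        {"protein": {"id": next(ids), "sequence": seq}}
        for seq in parse_fasta(fasta_text)
    ]
    entries += [
        {"ligand": {"id": next(ids), "ccdCodes": [code]}}
        for code in ligand_ccd_codes
    ]
    return {
        "name": name,
        "modelSeeds": list(seeds),
        "sequences": entries,
        "dialect": "alphafold3",
        "version": 1,
    }


# Toy two-chain FASTA input plus one ligand given by its CCD code.
fasta = ">chainA\nMKT\nAYI\n>chainB\nGSH\n"
job = build_input("demo", fasta, ligand_ccd_codes=["ATP"])
print(json.dumps(job, indent=2))
```

Drawing IDs from a single iterator is what keeps them unique across proteins and ligands, which is the part that becomes error-prone when many input files are written by hand.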
Deep learning has revolutionized difficult tasks in chemistry and biology, yet existing language models often treat these domains separately, relying on concatenated architectures and independently pretrained weights. These approaches fail to fully exploit the shared atomic foundations of molecular and protein sequences. Here, we introduce T5ProtChem, a unified model based on the T5 architecture, designed to simultaneously process molecular and protein sequences. Using a new pretraining objective, ProtiSMILES, T5ProtChem bridges the molecular and protein domains, enabling efficient, generalizable protein-chemical modeling. The model achieves state-of-the-art performance in tasks such as binding affinity prediction and reaction prediction, while performing strongly in protein function prediction. Additionally, it supports novel applications, including covalent binder classification and sequence-level adduct prediction. These results demonstrate the versatility of unified language models for drug discovery, protein engineering, and other interdisciplinary efforts in computational biology and chemistry.
The activity of the enzyme JAK3 is modulated by tyrosine phosphorylation, yet the underlying molecular details remain incompletely understood. In this study, we employed a Markov model based on Gaussian accelerated molecular dynamics (GaMD) trajectories, together with correlation network analysis (CNA), to investigate the impact of single phosphorylation (SP) at Y980 (pY980) and double phosphorylation (DP) at Y980/Y981 (pY980/pY981) on the conformational dynamics of JAK3 bound by the inhibitors IZA and MI1. The Markov model analysis indicated that both SP and DP result in fewer conformational states and significantly influence the conformational dynamics of the P-loop, αC-helix, and loop1-loop3, while maintaining the hinge region's high rigidity. The CNA findings revealed that phosphorylation alters the communication network among different structural regions of JAK3, providing a rational explanation for how phosphorylation affects the conformational dynamics of the distant P-loop and loop1-loop3. Moreover, the conformational changes mediated by SP and DP further affect the interactions between the inhibitors and the hot spots (L828, V836, E903, Y904, L905, and L956) of JAK3. This work offers valuable theoretical insights into the molecular mechanisms that regulate JAK3 activity.
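As a generic illustration of how a Markov model is built from a trajectory, the sketch below estimates a row-normalized transition matrix by counting transitions between discrete conformational states at a fixed lag. It is not the authors' protocol: the state assignment, the lag time, and the toy trajectory are assumptions, and a real analysis would first cluster GaMD frames into states and handle reweighting and validation.

```python
from collections import Counter


def transition_matrix(states, n_states, lag=1):
    """Estimate P[i][j], the probability of moving from state i to j after `lag` steps."""
    # Count every observed (state at t, state at t + lag) pair.
    counts = Counter(zip(states[:-lag], states[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        # Rows for states that are never left stay all-zero.
        matrix.append([
            counts[(i, j)] / row_total if row_total else 0.0
            for j in range(n_states)
        ])
    return matrix


# Toy discrete trajectory over three hypothetical conformational states.
traj = [0, 0, 1, 1, 1, 0, 2, 2, 0, 0]
T = transition_matrix(traj, n_states=3)
for row in T:
    print([round(p, 3) for p in row])
```

In such a model, a system that visits fewer distinct states yields a smaller matrix, and the diagonal entries indicate how long each state tends to persist.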