Antibiotic treatment plays an important role in research on the interaction between Wolbachia and insect hosts. This study therefore aimed to identify the most suitable antibiotic and concentration for eliminating Wolbachia from P. xylostella, and to investigate the effects of Wolbachia and antibiotic treatment on the bacterial community of P. xylostella. Our results showed that the Wolbachia strain infecting the P. xylostella population collected in Nepal was plutWB1 of supergroup B. Feeding on 1 mg/mL rifampicin removed Wolbachia infection from P. xylostella after 1 generation of treatment, with relatively low toxicity. Among the 29 adult P. xylostella samples in our study (10 WU samples, 10 WA samples, and 9 WI samples), 52.5% of the sequences belonged to Firmicutes and 47.5% to Proteobacteria, and the dominant genera were mainly Carnobacterium (46.2%), Enterobacter (10.1%), and Enterococcus (6.2%). Moreover, once Wolbachia had been removed with antibiotics and the insects had been returned to normal conditions for 10 generations, the treatment no longer significantly affected the bacterial community of P. xylostella. This study provides a theoretical basis for eliminating Wolbachia from P. xylostella, a reference for eliminating Wolbachia from other Wolbachia-infected insect species, and a basis for studying the extent and duration of the effects of antibiotic treatment on the bacterial community of P. xylostella.
Hepatocellular carcinoma (HCC) is the most common primary malignancy of the liver. Although the RNA modification N6-methyladenosine (m6A) has been reported to be involved in HCC carcinogenesis, early diagnostic markers and promising personalized therapeutic targets are still lacking. In this study, we found that 19 m6A regulators and 34 co-expressed lncRNAs were significantly upregulated in HCC samples; based on these factors, we established a prognostic signature for HCC associated with 9 lncRNAs and 19 m6A regulators using LASSO Cox regression analysis. Kaplan-Meier survival estimates revealed correlations between the risk scores and patients' overall survival (OS) in the training and validation datasets. ROC curve analysis demonstrated that the risk score has satisfactory prediction efficiency in both the training and validation datasets. Multivariate Cox proportional hazards regression analysis indicated that the risk score was an independent risk factor in both the training and validation datasets. In addition, the risk score could distinguish HCC samples from normal non-cancerous samples, as well as HCC samples of different pathological grades. Finally, 232 mRNAs were co-expressed with these 9 lncRNAs according to GSE101685 and GSE112790; these mRNAs were enriched in cell cycle and cell metabolic activities, drug metabolism, liver disease-related pathways, and important cancer-related pathways such as p53, MAPK, Wnt, and RAS. The expression of the 9 lncRNAs was significantly higher in HCC samples than in the neighboring non-cancerous samples. Altogether, using Consensus Clustering, PCA, the ESTIMATE algorithm, a LASSO regression model, Kaplan-Meier survival assessment, ROC curve analysis, and multivariate Cox proportional hazards regression analysis, we established a prognostic marker consisting of 9 m6A regulator-related lncRNAs that may have prognostic and diagnostic potential for HCC.
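As a rough illustration of how a LASSO Cox signature of this kind is commonly applied, the sketch below computes a linear risk score from signature coefficients and a Kaplan-Meier survival estimate for a group of patients. The lncRNA names, coefficients, and follow-up data are hypothetical placeholders, not values from the study, and the linear form of the score is an assumption that the abstract does not confirm.

```python
def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes.

    Assumption: the score is a weighted sum, as is typical for LASSO Cox
    signatures; the abstract does not state the exact formula.
    """
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical two-lncRNA signature and one patient's expression values.
example_coefficients = {"lncA": 0.5, "lncB": -0.2}
example_expression = {"lncA": 2.0, "lncB": 1.0}
example_score = risk_score(example_expression, example_coefficients)

# Hypothetical follow-up data for one risk group (time in months).
example_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, patients would be split into high-risk and low-risk groups by some cutoff on the score (a median split is one common but assumed choice), and a curve would be estimated for each group.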
B12D-Like is a member of the B12D domain-containing protein family, which includes several transmembrane proteins in plants. In this study, the cDNA of PgB12D-Like from Pennisetum glaucum subsp. monodii (Maire) Brunken was sequenced and characterized. The 446-bp cDNA of PgB12D-Like encodes a deduced protein of 95 amino acids. The PgB12D-Like protein contains a B12D domain and a transmembrane helix embedded in the mitochondrial membrane. Analysis of cis-regulatory elements revealed binding sites in the putative PgB12D-Like promoter for various transcription factors involved in responses to stress, light, and plant hormones. Several proteins involved in floral organ development, such as agamous-like proteins and squamosa promoter binding proteins, were also found to have binding sites in the PgB12D-Like promoter. Real-time PCR revealed high expression of PgB12D-Like in flowers during heading, whereas expression was lowest in 4-day-old seedling shoots. Moreover, cold, drought, and heat stress upregulated PgB12D-Like, whereas gibberellic acid downregulated its expression in seedlings. The present study helps to uncover the function of B12D-Like in response to plant hormones and abiotic stress during P. glaucum development.
Lineage-specific genes can contribute to the emergence and evolution of novel traits and adaptations. Tardigrades are animals that have adapted to tolerate extreme conditions by undergoing a form of cryptobiosis called anhydrobiosis, a physical transformation to an inactive desiccated state. While studies to understand the genetics underlying the interspecies diversity in anhydrobiotic transitions have identified tardigrade-specific genes and family expansions involved in this process, the contributions of species-specific genes to the variation in tardigrade development and cryptobiosis are less clear. We used previously published transcriptomes throughout development and anhydrobiosis (5 embryonic stages, 7 juvenile stages, active adults, and tun adults) to assess the transcriptional biases of different classes of genes between 2 tardigrade species, Hypsibius exemplaris and Ramazzottius varieornatus. We also used the transcriptomes of 2 other tardigrades, Echiniscoides sigismundi and Richtersius coronifer, and data from 3 non-tardigrade species (Adineta vaga, Drosophila melanogaster, and Caenorhabditis elegans) to help identify lineage-specific genes. We found that lineage-specific genes have generally low and narrow expression but are enriched among biased genes in different stages of development depending on the species. Biased genes tend to be specific to early and late development, but there is little overlap in functional enrichment of biased genes between species. Gene expansions in the 2 tardigrades also involve families with different functions despite homologous genes being expressed during anhydrobiosis in both species. Our results demonstrate the interspecific variation in transcriptional contributions and biases of lineage-specific genes during development and anhydrobiosis in 2 tardigrades.
Background: Statistical methods developed to address various questions in single-cell datasets show increased variability across different parameter regimes. To further delineate the robustness of commonly used methods for single-cell RNA-Seq (scRNA-Seq), we aimed to comprehensively review scRNA-Seq analysis workflows in the setting of dimension reduction, clustering, and trajectory inference.
Methods: We used datasets with temporal single-cell transcriptomic profiles from public repositories. Combining multiple methods at each level of the workflow, we performed over 6,000 analyses and evaluated the results of clustering and pseudotime estimation using the adjusted Rand index and rank correlation metrics. We further integrated neural network methods to assess whether models with increased complexity show an increased bias/variance trade-off.
Results: Combinatorial workflows showed that non-linear dimension reduction techniques such as t-SNE and UMAP are sensitive to initial preprocessing steps; hence, clustering results on the dimension-reduced space of single-cell datasets should be used carefully. Similarly, pseudotime estimation methods that depend on previous non-linear dimension reduction steps can result in highly variable trajectories. In contrast, methods that avoid non-linearity, such as WOT, can result in repeatable inferences of temporal gene expression dynamics. Furthermore, imputation methods do not substantially improve the repeatability of clustering or trajectory inference results. By comparison, the choice of normalization method has a larger effect on downstream analysis, with ScTransform reducing variability overall.
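The two evaluation metrics named in the Methods, the adjusted Rand index for clustering and a rank correlation for pseudotime, can be sketched in plain Python as follows. The abstract does not say which rank correlation was used; Spearman's is assumed here purely for illustration, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare
    # Count pairs of cells that share a cluster in both, in A, and in B.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate partitions (e.g. one cluster each)
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_correlation(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

For example, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` returns 1.0, because the two partitions are identical up to label names. Comparing the cluster labels or pseudotime orderings produced by two workflow variants in this way gives one repeatability score per pair of runs.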
SARS-CoV-2, responsible for the current COVID-19 pandemic that has claimed over 5.0 million lives, belongs to a class of enveloped viruses that undergo rapid evolutionary adjustments under selection pressure. Numerous SARS-CoV-2 variants have emerged, posing a serious challenge to the global vaccination effort and COVID-19 management. The evolutionary dynamics of this virus are only beginning to be explored. In this work, we analysed 1.79 million spike glycoprotein sequences of SARS-CoV-2 and found that the virus is fine-tuning the spike with numerous amino acid insertions and deletions (indels). Indels appear to confer a selective advantage: the proportion of sequences with indels has steadily increased over time and currently exceeds 89%, with similar trends across countries and variants. There were as many as 420 unique indel positions and 447 unique combinations of indels. Despite their high frequency, indels resulted in only minimal alteration of N-glycosylation sites, including both gain and loss. Because indels and point mutations are positively correlated and sequences with indels have significantly more point mutations, indels have implications for the evolutionary dynamics of the SARS-CoV-2 spike glycoprotein.
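A minimal sketch of how indels might be called and tallied is shown below. It assumes each spike sequence has already been aligned to a reference, with `-` marking gaps; the alignment step, the function names, and the toy sequences are illustrative assumptions and do not reproduce the study's pipeline.

```python
def find_indels(aligned_ref, aligned_query):
    """List indel runs in a query sequence aligned to a reference.

    Returns (kind, ref_position, length) tuples. ref_position is 1-based in
    the ungapped reference: the first deleted residue for a deletion, or the
    residue after which extra residues appear for an insertion.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    indels = []
    current = None  # [kind, position, length] of the run being extended
    ref_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r == "-" and q == "-":
            continue  # gap-only column (can occur in a multiple alignment)
        if r != "-":
            ref_pos += 1
        if r == "-":
            kind = "insertion"
        elif q == "-":
            kind = "deletion"
        else:
            kind = None
        if kind is not None and current is not None and current[0] == kind:
            current[2] += 1  # extend the current run
            continue
        if current is not None:
            indels.append(tuple(current))
            current = None
        if kind is not None:
            current = [kind, ref_pos, 1]
    if current is not None:
        indels.append(tuple(current))
    return indels


def fraction_with_indels(aligned_pairs):
    """Fraction of (aligned_ref, aligned_query) pairs with at least one indel."""
    pairs = list(aligned_pairs)
    if not pairs:
        return 0.0
    return sum(1 for ref, qry in pairs if find_indels(ref, qry)) / len(pairs)
```

With the toy input `find_indels("MFVFLVLL", "MF--LVLL")`, the result is `[("deletion", 3, 2)]`. Collecting these tuples across all sequences would yield the unique indel positions and combinations, and grouping `fraction_with_indels` by collection date would give a proportion over time.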
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder by the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information about amino acids, thereby modeling the underlying principles of protein sequences in a way that is insensitive to species context. We used an existing pre-trained protein embedding method and characterized its molecular function prediction performance in detail, first to advance the understanding of protein language models, and second to identify areas for improvement. We then applied the model in a transfer learning task by training a function predictor on the embeddings of annotated protein sequences from one training species and making predictions for the proteins of several test species at varying evolutionary distances. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.