The widely used 10X Genomics technology for single-cell RNA sequencing (scRNA-seq) captures RNA 3' ends. Thus, some reads contain part of the non-templated polyadenosine tails, providing direct evidence for the sites of 3' end cleavage and polyadenylation on the respective RNAs. Taking advantage of this property, we recently developed the SCINPAS workflow to infer polyadenylation sites (PASs) from scRNA-seq data. Here, we used this workflow to construct version 3.0 (v3.0, https://polyasite.unibas.ch/) of the PolyASite Atlas from a large compendium of publicly available human, mouse and worm scRNA-seq datasets obtained from healthy tissues. As the resolution of scRNA-seq was too low for robust detection of cell-level differences in PAS usage, we aggregated samples by tissue of origin to construct tissue-level catalogs of PASs. These provide qualitatively new information about PAS usage compared with previous PAS catalogs, which were based on bulk 3' end sequencing experiments performed primarily in cell lines. In the new version, we document the stringency level associated with each PAS so that users can balance sensitivity and specificity in their analyses. We also upgraded the integration with the UCSC Genome Browser and developed track hubs that conveniently display pooled and tissue-specific expression of PASs.
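To make the idea of tail-containing reads and tissue-level aggregation concrete, the following is a minimal sketch and not the SCINPAS implementation. It assumes each read has already been aligned and reduced to a hypothetical `Read` record holding the tissue of origin, the coordinate of the last templated base and the unaligned 3' bases; the `min_len` and `min_a_fraction` thresholds are illustrative choices, not values from the workflow.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified read record (not a SCINPAS data structure)."""
    tissue: str
    chrom: str
    strand: str
    cleavage_pos: int    # coordinate of the last templated base
    unaligned_tail: str  # 3' bases that did not align, in transcript orientation


def is_polya_tail(tail: str, min_len: int = 6, min_a_fraction: float = 0.8) -> bool:
    """Treat an unaligned 3' stretch as a poly(A) tail if it is long enough and A-rich.

    Both thresholds are assumptions made for this illustration.
    """
    if len(tail) < min_len:
        return False
    return tail.upper().count("A") / len(tail) >= min_a_fraction


def tissue_level_counts(reads):
    """Pool tail-containing reads by tissue and count support per candidate site."""
    counts = defaultdict(Counter)
    for r in reads:
        if is_polya_tail(r.unaligned_tail):
            counts[r.tissue][(r.chrom, r.strand, r.cleavage_pos)] += 1
    return counts


# Toy input: three liver reads and one brain read.
reads = [
    Read("liver", "chr1", "+", 1000, "AAAAAAAAAA"),
    Read("liver", "chr1", "+", 1000, "AAAAAAGAAA"),  # one mismatch is tolerated
    Read("liver", "chr1", "+", 1200, "ACGT"),        # too short, not A-rich
    Read("brain", "chr1", "+", 1000, "AAAAAAAA"),
]
counts = tissue_level_counts(reads)
for tissue, sites in sorted(counts.items()):
    print(tissue, dict(sites))
```

Pooling by tissue rather than by cell trades resolution for read depth per site, which is the motivation given above for building tissue-level rather than cell-level catalogs.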
Eukaryotic genes can encode multiple distinct transcripts through alternative splicing (AS). Interest in the AS mechanism and its evolution across species has stimulated numerous studies, leading to several databases that provide information on AS and transcriptome data across multiple eukaryotic species. However, existing resources do not offer information on transcript conservation and evolution between genes of multiple species. As with genes, identifying conserved transcripts (those from homologous genes that have retained a similar exon composition) is useful for determining transcript homology relationships, studying transcript functions and reconstructing transcript phylogenies. To address this gap, we have developed TranscriptDB, a database dedicated to studying the conservation and evolution of transcripts within gene families. TranscriptDB offers an extensive catalog of conserved transcripts and phylogenies for 317 annotated eukaryotic species, sourced from Ensembl database version 111. It serves multiple purposes, including the exploration of gene and transcript evolution. Users can access TranscriptDB through various browsing and querying tools, including a user-friendly web interface. The incorporated web servers enable users to retrieve information on transcript evolution using their own data as input. Additionally, a REST application programming interface is available for programmatic data retrieval, and a data directory is available for bulk downloads. TranscriptDB and its resources are freely accessible at https://transcriptdb.cobius.usherbrooke.ca.
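The notion of a conserved transcript as one that retains a similar exon composition across homologous genes can be illustrated with a small sketch. This is not the method used by TranscriptDB; it assumes, purely for illustration, that orthologous exons have already been assigned shared identifiers (`E1`, `E2`, ...) and that similarity is measured with a Jaccard index against an arbitrary threshold. All function names, transcript names and the threshold are hypothetical.

```python
from itertools import product


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two exon-identifier sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def conserved_pairs(gene_a: dict, gene_b: dict, threshold: float = 0.8):
    """Return transcript pairs from two homologous genes with similar exon composition.

    gene_a and gene_b map transcript names to sets of shared exon identifiers.
    The threshold is an assumption made for this illustration.
    """
    pairs = []
    for (ta, exons_a), (tb, exons_b) in product(gene_a.items(), gene_b.items()):
        score = jaccard(frozenset(exons_a), frozenset(exons_b))
        if score >= threshold:
            pairs.append((ta, tb, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: -p[2])


# Toy homologous genes with hypothetical transcripts and exon identifiers.
human = {"h_tx1": {"E1", "E2", "E3", "E4"}, "h_tx2": {"E1", "E2", "E4"}}
mouse = {"m_tx1": {"E1", "E2", "E3", "E4"}, "m_tx2": {"E1", "E4"}}

pairs = conserved_pairs(human, mouse, threshold=0.75)
for ta, tb, score in pairs:
    print(f"{ta} ~ {tb}: {score:.2f}")
```

A set-based score ignores exon order and partial exon overlap, so it is only a first approximation of exon-composition similarity.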