Hepatitis Delta Virus (HDV) is an RNA virus that causes delta hepatitis in humans. Although a large amount of data is available for HDV, retrieving the relevant information is a complicated task. The web database 'HDVDB' provides a comprehensive web resource for HDV. The database covers basic information about HDV and the disease it causes, including genome structure, pathogenesis, epidemiology, symptoms and prevention. It also supplies sequence data and bibliographic information about HDV. A tool, 'siHDV Predict', for designing effective siRNA molecules to control the activity of HDV is also integrated into the database. HDVDB is a user-friendly information system available in the public domain, and it provides annotated information about HDV to research scholars, scientists and the pharmaceutical industry for further study.
This paper studies and reports on a number of clustering methods introduced for the analysis of gene expression data to extract potential relationships among genes. An effective unsupervised method (TDAC) is proposed for the simultaneous detection of outliers and biologically relevant co-expressed patterns. The effectiveness of TDAC is established in comparison with competing algorithms on six publicly available benchmark gene expression datasets, in terms of both internal and external validity measures. The main attractions of TDAC are: (a) it does not require discretisation, (b) it is capable of identifying biologically relevant co-expressed gene patterns as well as outlier gene(s), (c) it is cost-effective in terms of time and space, (d) it does not require the number of clusters a priori, and (e) it is free from the restrictions of using any proximity measure.
Identification of protein complexes is crucial for understanding the principles of cellular organisation and for predicting protein functions. In this paper, a novel protein complex discovery algorithm, IPCIPG, is proposed based on the integration of a Protein-Protein Interaction (PPI) network and gene expression data. IPCIPG is a local search algorithm with two versions: IPCIPG-n for identifying non-overlapping clusters and IPCIPG-o for detecting overlapping clusters. Experimental results on the yeast PPI network show that IPCIPG identifies protein complexes with specific biological meaning more effectively, precisely and comprehensively than six other algorithms: HUNTER, HC-PIN, CMC, SPICi, MOCDE and MCL.
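The abstract does not state how IPCIPG combines the two data sources. As a minimal sketch of one common way to integrate a PPI network with gene expression data, and not necessarily the method used by IPCIPG, each interaction can be weighted by the Pearson correlation of the two proteins' expression profiles, and weakly co-expressed edges can be dropped before clustering. The function names, the toy profiles and the 0.5 threshold below are all hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def weight_edges(edges: List[Edge],
                 expression: Dict[str, List[float]]) -> Dict[Edge, float]:
    """Assign each PPI edge the co-expression of its two endpoints."""
    return {(u, v): pearson(expression[u], expression[v]) for u, v in edges}


# Hypothetical toy data: three proteins measured under four conditions.
expression = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}
edges = [("A", "B"), ("A", "C")]

weights = weight_edges(edges, expression)
# Assumed threshold: keep only edges whose endpoints are clearly co-expressed.
kept = [edge for edge, w in weights.items() if w >= 0.5]
print(weights, kept)
```

A local search for complexes could then be run on the weighted or filtered network; that step is left out here because the abstract gives no details about it.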
Investigating essential proteins is valuable for understanding cellular life, for drug design and for other practical purposes. In most current studies, essential proteins are mined from protein-protein interaction (PPI) networks using diverse topological features. In this study, we investigate from a new perspective what kinds of proteins tend to be essential. The investigation implies that protein essentiality is correlated with protein domains, which are the functional, structural and evolutionary units of proteins. Proteins with a larger Number of Domain Types (NDT) tend to be essential. Analyses on 22 species show that the number of essential proteins identified by NDT is much larger than the number identified in ten random selections. Considering this structural feature makes us less dependent on network data and thus enables us to investigate protein essentiality in more species with incomplete and/or inconsistent network data.
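The NDT idea can be illustrated with a short sketch: count the distinct domain types annotated on each protein and rank proteins by that count, treating the top of the ranking as candidate essential proteins. The code assumes that repeated copies of the same domain count once, which is one reading of "domain types". The protein names, the domain identifiers and the cut-off of two candidates are hypothetical toy values, not data from the study.

```python
from typing import Dict, List, Tuple


def number_of_domain_types(domains: List[str]) -> int:
    """Count distinct domain types; repeated copies of a domain count once."""
    return len(set(domains))


def rank_by_ndt(annotations: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Rank proteins by descending NDT, breaking ties by protein name."""
    scores = [(protein, number_of_domain_types(domains))
              for protein, domains in annotations.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical toy annotations: protein -> list of domain identifiers.
annotations = {
    "P1": ["PF00069", "PF00017", "PF00018"],
    "P2": ["PF00069", "PF00069"],  # two copies of one domain type
    "P3": [],                      # no annotated domain
    "P4": ["PF00076", "PF00017"],
}

ranking = rank_by_ndt(annotations)
# Assumed cut-off: treat the two highest-ranked proteins as candidates.
top_candidates = [protein for protein, _ in ranking[:2]]
print(ranking)
print(top_candidates)
```

Because the ranking needs only domain annotations, it can be computed for species whose interaction networks are incomplete, which is the advantage the abstract highlights.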