In health care and medicine, it has proven challenging to correctly diagnose many diseases with complicated and interfering symptoms, including arrhythmia. However, with the evolution of artificial intelligence (AI) techniques, the diagnosis and prognosis of arrhythmia have become easier for physicians and practitioners using only an electrocardiogram (ECG) examination. This review presents a synthesis of the studies conducted in the last 12 years to predict the occurrence of arrhythmia by automatically classifying different heartbeat rhythms. From a variety of academic research databases, 40 studies were selected for analysis: 29 applied deep learning methods (72.5%), 9 addressed the problem with machine learning methods (22.5%), and 2 combined deep learning and machine learning to predict arrhythmia (5%). The use of AI for arrhythmia diagnosis is emerging in the literature, although some challenging issues remain, such as the explainability of deep learning methods and the computational resources needed to achieve high performance. However, with the continuous development of cloud platforms and quantum computing for AI, we can achieve a breakthrough in arrhythmia diagnosis.
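The reported split of the 40 selected studies can be checked with a few lines of arithmetic. The sketch below is only an illustration; the category labels are shorthand introduced here, not terms defined by the review.

```python
# Counts of selected studies per method family, as reported in the abstract.
# The dictionary keys are shorthand labels chosen for this sketch.
study_counts = {
    "deep learning": 29,
    "machine learning": 9,
    "combined": 2,
}

total_studies = sum(study_counts.values())

# Share of each family as a percentage of all selected studies.
shares = {name: 100 * n / total_studies for name, n in study_counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The three shares add up to 100%, which confirms that the categories are mutually exclusive in the review's count.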
{"title":"A Literature Review: ECG-Based Models for Arrhythmia Diagnosis Using Artificial Intelligence Techniques.","authors":"Abir Boulif, Bouchra Ananou, Mustapha Ouladsine, Stéphane Delliaux","doi":"10.1177/11779322221149600","DOIUrl":"https://doi.org/10.1177/11779322221149600","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/2c/a1/10.1177_11779322221149600.PMC9926384.pdf","platform":"Semanticscholar","paperid":"9291423","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231167977
Kumar Vishal, Piplu Bhuiyan, Junxia Qi, Yang Chen, Jubiao Zhang, Fen Yang, Juxue Li
Individuals with type 2 diabetes (T2D) and obesity have a higher risk of developing Alzheimer disease (AD), and increasing evidence indicates a link between impaired immune signaling pathways and the development of AD. However, the shared cellular mechanisms and molecular signatures among these 3 diseases remain unknown. The purpose of this study was to uncover similar molecular markers and pathways involved in obesity, T2D, and AD using bioinformatics and a network biology approach. First, we investigated 3 RNA sequencing (RNA-seq) gene expression data sets and identified 224 differentially expressed genes (DEGs) shared by obesity, T2D, and AD. Gene ontology and pathway enrichment analyses revealed that the shared DEGs were mainly enriched in immune and inflammatory signaling pathways. In addition, we constructed a protein-protein interaction network to find hub genes, which have not previously been identified as playing a critical role in these 3 diseases. Furthermore, the transcription factors and protein kinases regulating the DEGs shared by obesity, T2D, and AD were also identified. Finally, we suggested potential drug candidates as possible therapeutic interventions for the 3 diseases. The results of this bioinformatics analysis provide a new understanding of the potential links between obesity, T2D, and AD pathologies.
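One simple way to picture the "shared DEGs" step is as a set intersection across the three per-disease DEG lists. The following is a minimal sketch under that assumption; the function name and the gene symbols are hypothetical placeholders, not results from the study, and the authors' actual pipeline may differ.

```python
def shared_degs(*deg_lists):
    """Return the genes present in every per-disease DEG list.

    Each argument is an iterable of gene identifiers that were called
    differentially expressed in one data set.
    """
    if not deg_lists:
        return set()
    return set.intersection(*(set(genes) for genes in deg_lists))


# Hypothetical toy inputs; real lists would come from an RNA-seq
# differential expression analysis of each disease data set.
obesity_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
t2d_degs = ["GENE_B", "GENE_C", "GENE_E"]
ad_degs = ["GENE_C", "GENE_B", "GENE_F"]

common = shared_degs(obesity_degs, t2d_degs, ad_degs)
print(sorted(common))
```

Only the genes in the intersection would then be passed on to enrichment analysis and network construction.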
{"title":"Unraveling the Mechanism of Immunity and Inflammation Related to Molecular Signatures Crosstalk Among Obesity, T2D, and AD: Insights From Bioinformatics Approaches.","authors":"Kumar Vishal, Piplu Bhuiyan, Junxia Qi, Yang Chen, Jubiao Zhang, Fen Yang, Juxue Li","doi":"10.1177/11779322231167977","DOIUrl":"https://doi.org/10.1177/11779322231167977","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/c2/a4/10.1177_11779322231167977.PMC10134115.pdf","platform":"Semanticscholar","paperid":"9386555","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231160397
Arthur Yosef, Eli Shnaider, Moti Schneider, Michael Gurevich
In this study, we introduce an artificial intelligence method for addressing the batch effect in transcriptome data. The method has several clear advantages in comparison with the alternative methods presently in use. Batch effect refers to the discrepancy between gene expression data series measured under different conditions. While the data from the same batch (measurements performed under the same conditions) are compatible, combining various batches into 1 data set is problematic because of incompatible measurements. Therefore, it is necessary to correct the combined data (normalization) before performing biological analysis. There are numerous methods that attempt to correct data sets for batch effect. These methods rely on various assumptions regarding the distribution of the measurements. Forcing the data elements into a presupposed distribution can severely distort biological signals, thus leading to incorrect results and conclusions. The wider the discrepancy between the assumed data distribution and the actual distribution, the greater the biases introduced by such "correction methods." We introduce a heuristic method to reduce batch effect. The method does not rely on any assumptions regarding the distribution and the behavior of data elements. Hence, it does not introduce any new biases in the process of correcting the batch effect. It strictly maintains the integrity of measurements within the original batches.
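The concern that assumption-based corrections can distort biological signal can be illustrated with a toy example. The sketch below applies per-batch mean-centering, a generic correction chosen here purely for illustration; it is not the heuristic method proposed by the authors, and the expression values are made up.

```python
from statistics import mean


def center_per_batch(batches):
    """Subtract each batch's own mean from its measurements.

    This implicitly assumes that every batch should share the same mean,
    which is exactly the kind of distributional assumption the abstract
    warns about.
    """
    return {
        name: [value - mean(values) for value in values]
        for name, values in batches.items()
    }


# Hypothetical expression of one gene. Batch A holds only control samples and
# batch B only disease samples, so the gap between the batch means mixes a
# technical offset with a genuine biological difference.
batches = {
    "A": [5.0, 6.0, 7.0],
    "B": [8.0, 9.0, 10.0],
}

centered = center_per_batch(batches)

gap_before = mean(batches["B"]) - mean(batches["A"])
gap_after = mean(centered["B"]) - mean(centered["A"])

print(f"gap before centering: {gap_before}")
print(f"gap after centering: {gap_after}")
```

After centering, the gap between the batches vanishes entirely, so any real disease-related difference is removed together with the technical offset. Note that the relative differences within each batch are unchanged, which is the within-batch integrity the abstract asks a correction to preserve.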
{"title":"Normalization of Large-Scale Transcriptome Data Using Heuristic Methods.","authors":"Arthur Yosef, Eli Shnaider, Moti Schneider, Michael Gurevich","doi":"10.1177/11779322231160397","DOIUrl":"https://doi.org/10.1177/11779322231160397","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/1a/e0/10.1177_11779322231160397.PMC10068970.pdf","platform":"Semanticscholar","paperid":"9612102","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231164828
Humaira Aziz Sawal, Shagufta Nighat, Tanzeela Safdar, Laiba Anees
Protein modelling plays a vital role in the drug discovery process. TANK-binding kinase 1-binding protein 1, also called an adapter protein, is encoded by the gene TBK1 in Homo sapiens. It is found in lungs, small intestine, leukocytes, heart, placenta, muscle, kidney, lower level of thymus, and brain. It has a number of protein-binding sites to which TBK1 and IKBKE bind, performing immunomodulatory, antiproliferative, and antiviral innate immunity functions that release different types of interferons. Our study predicts a comparative model of its 3-dimensional (3D) structure using different bioinformatics tools, which will be helpful for future studies. The reactivity and stability of the protein were evaluated physicochemically, and through domain determination and secondary structure prediction, using the bioinformatics tools ProtParam, Pfam, and SOPMA, respectively. Robetta, an ab initio approach, I-TASSER, and AlphaFold were used for 3D structure prediction, and the models were validated using the SAVESv6.0 (PROCHECK) server. The best 3D structure of TBK1-binding protein 1 was predicted using the Robetta software. After unveiling the 3D structure of this novel protein, we concluded that the structure will help to identify its roles beyond antiviral innate immunity. By producing torsion in its 3D structure, researchers will be able to determine whether this protein is involved in any disease, because previous studies have not associated it with any disease.
{"title":"Comparative In Silico Analysis and Functional Characterization of TANK-Binding Kinase 1-Binding Protein 1.","authors":"Humaira Aziz Sawal, Shagufta Nighat, Tanzeela Safdar, Laiba Anees","doi":"10.1177/11779322231164828","DOIUrl":"https://doi.org/10.1177/11779322231164828","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/a2/f9/10.1177_11779322231164828.PMC10074619.pdf","platform":"Semanticscholar","paperid":"9641105","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231182054
Souad Kartti, El Mehdi Bouricha, Oumaima Zarrik, Youssef Aghlallou, Chaimaa Mounjid, Rachid ELJaoudi, Lahcen Belyamani, Azeddine Ibrahimi, Basma El Khannoussi
The increasing commercialization of new gene panels based on next-generation sequencing for clinical research has significantly improved our understanding of breast cancer genetics and has led to the discovery of new mutation variants. The study included 16 unselected Moroccan breast cancer patients tested with a multi-gene panel (HEVA screen panel) using Illumina MiSeq, followed by Sanger sequencing to validate the most relevant mutation. Mutational analysis revealed the presence of 13 mutations (11 single-nucleotide polymorphisms [SNPs] and 2 indels), and 6 of the 11 identified SNPs were predicted to be pathogenic. One of the 6 pathogenic mutations was c.7874G>C, a heterozygous SNP in the HD-OB domain of the BRCA2 gene, which leads to an arginine-to-threonine change at codon 2625 of the protein. This work describes the first case of a patient with breast cancer harboring this pathogenic variant and analyzes its functional impact using molecular docking and molecular dynamics simulation. Further experimental investigations are needed to validate its pathogenicity and to verify its association with breast cancer.
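The link between the coding position c.7874 and codon 2625 follows from simple arithmetic, assuming the usual HGVS convention that coding positions count from the A of the ATG start codon as position 1. The sketch below works through it and then asks which arginine codons become threonine under a G>C change at that position in the codon. It uses the standard genetic code only; it does not look up the actual BRCA2 reference sequence, and the helper names are introduced here for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codon_position(cds_pos):
    """Map a 1-based coding position to (codon number, base within codon)."""
    index = cds_pos - 1
    return index // 3 + 1, index % 3 + 1


codon_number, base_in_codon = codon_position(7874)
print(codon_number, base_in_codon)

# Apply G>C at that base to every arginine codon and translate the result.
outcomes = {}
for codon, aa in CODON_TABLE.items():
    if aa == "R" and codon[base_in_codon - 1] == "G":
        mutated = codon[: base_in_codon - 1] + "C" + codon[base_in_codon:]
        outcomes[codon] = (mutated, CODON_TABLE[mutated])

# Arginine codons for which the substitution yields threonine.
arg_to_thr = sorted(c for c, (_, aa) in outcomes.items() if aa == "T")
print(arg_to_thr)
```

Position 7874 is the second base of codon 2625, and only the AGA and AGG arginine codons give threonine after a G>C change there, which is consistent with the amino acid change reported in the abstract.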
{"title":"Targeted Gene Panel Sequencing Unveiled New Pathogenic Mutations in Patients With Breast Cancer.","authors":"Souad Kartti, El Mehdi Bouricha, Oumaima Zarrik, Youssef Aghlallou, Chaimaa Mounjid, Rachid ELJaoudi, Lahcen Belyamani, Azeddine Ibrahimi, Basma El Khannoussi","doi":"10.1177/11779322231182054","DOIUrl":"https://doi.org/10.1177/11779322231182054","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/75/b4/10.1177_11779322231182054.PMC10291397.pdf","platform":"Semanticscholar","paperid":"10664466","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231154148
Renan Passos Freire, Jorge Enrique Hernandez-Gonzalez, Eliana Rosa Lima, Miriam Fussae Suzuki, João Ezequiel de Oliveira, Lucas Simon Torai, Paolo Bartolini, Carlos Roberto Jorge Soares
Arapaima gigas, known as Pirarucu in Brazil, is one of the largest freshwater fish in the world. Some individuals can reach 3 m in length and weigh up to 200 kg. Due to extinction risks and its economic value, the species has been a focus of preservation and reproduction studies. Thyrotropin (TSH) is a glycoprotein hormone formed by 2 subunits, α and β, whose main activity is related to the synthesis of the thyroid hormones (THs) T3 and T4. In this work, we present a combination of bioinformatics tools to identify Arapaima gigas βTSH (ag-βTSH), model its molecular structure, and express the recombinant heterodimer form in mammalian cells. The combination of computational biology based on genome-related information, in silico molecular cloning, and modeling yielded an ag-βTSH sequence that was confirmed by reverse transcriptase-polymerase chain reaction (RT-PCR) and transient expression in human embryonic kidney (HEK293F) cells. Molecular cloning of ag-βTSH retrieved 146 amino acids with a signal peptide of 21 amino acid residues and 6 disulfide bonds. The sequence shows similarity to those of 39 fish species, ranging between 43.1% and 81.6%, with extremely conserved domains such as the cystine knot motif and the N-glycosylation site. The Arapaima gigas thyrotropin (ag-TSH) model, predicted by AlphaFold, was used in molecular dynamics simulations with the Scleropages formosus receptor, providing values of the free energies ΔG_bind and ΔG_PMF similar to those of the Homo sapiens model. The recombinant expression in HEK293F cells reached a yield of 25 mg/L, and the product was characterized via chromatographic and physical-chemical techniques. This work shows that other Arapaima gigas proteins could be studied in a similar way, using the combination of these techniques, recovering more information from its genome and improving the reproduction and preservation of this prehistoric fish.
{"title":"Molecular Cloning and AlphaFold Modeling of Thyrotropin (ag-TSH) From the Amazonian Fish Pirarucu (<i>Arapaima gigas</i>).","authors":"Renan Passos Freire, Jorge Enrique Hernandez-Gonzalez, Eliana Rosa Lima, Miriam Fussae Suzuki, João Ezequiel de Oliveira, Lucas Simon Torai, Paolo Bartolini, Carlos Roberto Jorge Soares","doi":"10.1177/11779322231154148","DOIUrl":"https://doi.org/10.1177/11779322231154148","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/a0/da/10.1177_11779322231154148.PMC9926385.pdf","platform":"Semanticscholar","paperid":"10798468","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231182768
Nurul Jadid, Nur Laili Alfina Rosidah, Muhammad Rifqi Nur Ramadani, Indah Prasetyowati, Noor Nailis Sa'adah, Aulia Febrianti Widodo, Dwi Oktafitria
Reutealis trisperma, which belongs to the family Euphorbiaceae, is currently used for biodiesel production, and rapid development in plant-based biofuel production has increased demand for it. However, massive utilization of bio-industrial plants has led to conservation issues. Moreover, genetic information on R trisperma, which is crucial for developmental, physiological, and molecular studies, is still limited. Studying gene expression is essential to explain plant physiological processes. Nonetheless, this technique requires sensitive and precise measurement of messenger RNA (mRNA). In addition, the presence of internal control genes is important to avoid bias. Therefore, collecting and preserving genetic data for R trisperma is indispensable. In this study, we aimed to evaluate the application of the plastid loci rbcL and matK as DNA barcodes of R trisperma for use in conservation programs. In addition, we isolated and cloned the RtActin (RtACT) gene fragment for use in gene expression studies. Sequence information was analyzed in silico by comparison with other Euphorbiaceae plants. For actin fragment isolation, reverse-transcription polymerase chain reaction was used. Molecular cloning of RtActin was performed using the pTA2 plasmid before sequencing. We successfully isolated and cloned 592 and 840 bp of the RtrbcL and RtmatK gene fragments, respectively. The RtrbcL barcoding marker, rather than the RtmatK plastidial marker, provided discriminative molecular phylogenetic data for R trisperma. We also isolated 986 bp of the RtACT gene fragment. Our phylogenetic analysis demonstrated that the RtACT fragment is closely related to the Vernicia fordii Actin gene (97% identity). Our results suggest that RtrbcL could be further developed and used as a barcoding marker for R trisperma. Moreover, the RtACT gene could be further investigated for use in gene expression studies of the plant.
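A figure such as "97% identity" is typically obtained by comparing two aligned sequences column by column. The abstract does not state how identity was computed, so the following is only a minimal sketch of one common definition (matches divided by ungapped alignment columns); the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must already be aligned and therefore of equal length.
    Columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100 * matches / len(columns)


# Hypothetical toy alignment, not taken from the RtACT or Vernicia fordii data.
print(percent_identity("ATGC-A", "ATGAGA"))
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so reported identities are only comparable when the same convention is used.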
{"title":"Plastid DNA Barcoding and <i>RtActin</i> cDNA Fragment Isolation of <i>Reutealis Trisperma</i>: A Promising Bioresource for Biodiesel Production.","authors":"Nurul Jadid, Nur Laili Alfina Rosidah, Muhammad Rifqi Nur Ramadani, Indah Prasetyowati, Noor Nailis Sa'adah, Aulia Febrianti Widodo, Dwi Oktafitria","doi":"10.1177/11779322231182768","DOIUrl":"https://doi.org/10.1177/11779322231182768","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/8a/9c/10.1177_11779322231182768.PMC10286179.pdf","platform":"Semanticscholar","paperid":"10298531","PMCID":"OA"}
Pub Date: 2023-01-01 DOI: 10.1177/11779322231152972
Haitao Zhao, Sujay Datta, Zhong-Hui Duan
Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph that encodes the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far greater than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of the GGM has become a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although it performs well on low-dimensional data sets, is computationally expensive and inefficient, or even unable to work directly, on genome-wide gene expression data sets. In this study, the Monte Carlo Gaussian graphical model (MCGGM) method is proposed to learn global genetic networks. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate that the proposed method has a strong ability to decode interactions with high conditional dependence among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. Among the gene interactions with high interdependence in the estimated global networks, most of the predicted gene-gene interactions have been reported in the literature as playing important roles in different human cancers. The results also validate the ability and reliability of the proposed method to identify high conditional dependence among genes in large-scale data sets.
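The sample-learn-integrate loop described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the abstract does not say how subnetworks are integrated, so the rule used here (keep an edge if it is selected in at least a given fraction of the draws in which both genes were sampled together) is an assumption, and all function and parameter names are hypothetical. The subnetwork learner is passed in as a callback; in practice it would wrap a graphical lasso fit on the expression columns of the sampled genes, whereas the demo uses a toy stand-in so the block stays dependency-free.

```python
import random
from collections import Counter
from itertools import combinations


def monte_carlo_network(genes, learn_subnetwork, subset_size, n_draws,
                        min_freq=0.5, seed=0):
    """Approximate a global network from randomly sampled subnetworks.

    genes: list of gene identifiers.
    learn_subnetwork: callback taking a list of genes and returning an
        iterable of (gene, gene) edges learned on that subset.
    Returns a dict mapping each retained edge to its selection frequency.
    """
    rng = random.Random(seed)
    co_sampled = Counter()  # how often each gene pair was drawn together
    selected = Counter()    # how often each pair was returned as an edge

    for _ in range(n_draws):
        subset = rng.sample(genes, subset_size)
        for pair in combinations(sorted(subset), 2):
            co_sampled[pair] += 1
        for a, b in learn_subnetwork(subset):
            selected[tuple(sorted((a, b)))] += 1

    # Assumed integration rule: threshold the per-pair selection frequency.
    return {
        pair: selected[pair] / n
        for pair, n in co_sampled.items()
        if selected[pair] / n >= min_freq
    }


# Toy stand-in for graphical lasso: it returns the "true" edges whose two
# genes both fall inside the sampled subset.
TRUE_EDGES = {("g1", "g2"), ("g2", "g3"), ("g4", "g5")}


def toy_learner(subset):
    members = set(subset)
    return [edge for edge in TRUE_EDGES
            if edge[0] in members and edge[1] in members]


genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
network = monte_carlo_network(genes, toy_learner, subset_size=4, n_draws=200)
print(sorted(network))
```

The design keeps each graphical lasso problem small (only `subset_size` variables), which is the point of sampling subnetworks when the full gene set is too large to fit directly. The frequency threshold is one plausible way to combine the draws; other integration rules would fit the same loop.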
{"title":"An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method.","authors":"Haitao Zhao, Sujay Datta, Zhong-Hui Duan","doi":"10.1177/11779322231152972","DOIUrl":"https://doi.org/10.1177/11779322231152972","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4e/ca/10.1177_11779322231152972.PMC9972065.pdf","platform":"Semanticscholar","paperid":"10823900","PMCID":"OA"}
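The three steps named in the MCGGM abstract above (Monte Carlo sampling of subnetworks, graphical lasso on each subnetwork, integration into a global network) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `mcggm_sketch`, the uniform random sampling of gene subsets, the fixed penalty `alpha`, and the integration rule (averaging absolute partial correlations over the draws in which a gene pair was sampled together) are all assumptions made for illustration. It relies on NumPy and scikit-learn's `GraphicalLasso`, and assumes no gene has zero variance.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso


def mcggm_sketch(X, subset_size=5, n_draws=40, alpha=0.1, seed=0):
    """Approximate a global network from graphical-lasso fits on random gene subsets.

    X is an (n_samples, n_genes) expression matrix. The return value is an
    (n_genes, n_genes) matrix of absolute partial correlations, averaged over
    the draws in which each gene pair was sampled together (illustrative rule).
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    # Standardize each gene so a single penalty is comparable across subsets.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    score_sum = np.zeros((n_genes, n_genes))
    pair_count = np.zeros((n_genes, n_genes))

    for _ in range(n_draws):
        # Monte Carlo step: draw a random gene subset (a candidate subnetwork).
        idx = rng.choice(n_genes, size=subset_size, replace=False)
        # Graphical lasso step: sparse precision matrix for this subset only.
        P = GraphicalLasso(alpha=alpha, max_iter=200).fit(Z[:, idx]).precision_
        # Partial correlation from precision: -p_ij / sqrt(p_ii * p_jj).
        d = np.sqrt(np.diag(P))
        partial = -P / np.outer(d, d)
        np.fill_diagonal(partial, 0.0)
        score_sum[np.ix_(idx, idx)] += np.abs(partial)
        pair_count[np.ix_(idx, idx)] += 1

    # Integration step: average per pair; pairs never sampled together stay 0.
    return np.divide(score_sum, pair_count,
                     out=np.zeros_like(score_sum), where=pair_count > 0)
```

Each graphical lasso fit sees only `subset_size` genes, so its cost stays small no matter how many genes the full data set contains. The trade-off in this sketch is that a gene pair receives a score only if it is drawn together at least once, so the number of draws has to grow with the number of genes.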
Pub Date : 2023-01-01 DOI: 10.1177/11779322231162777
Qurshid Hasan Khan
MicroRNAs (miRNAs) are single-stranded, endogenous, non-coding RNAs of 20-24 nucleotides that play a significant role in post-transcriptional gene regulation. Various conserved and novel miRNAs have been characterized, especially in plant species whose genomes are well characterized; however, information on miRNAs in economically important plants such as pea (Pisum sativum L.) is limited. In this study, I identified conserved and novel miRNAs, along with their targets, in garden pea leaf samples by analyzing next-generation sequencing (NGS) data. The raw data obtained from NGS were processed, and 1.38 million high-quality non-redundant reads were retained for analysis; this large number of reads indicates a large and diverse small RNA population in pea leaves. After analysis of the deep sequencing data, 255 conserved and 11 novel miRNAs were identified in the garden pea leaf sample. The targets of the conserved and novel miRNAs were predicted using the psRNATarget tool. Further, functional annotation of the miRNA targets was performed using Blast2GO software, and the target gene products were predicted. The miRNA target gene products, along with their GO_IDs (Gene Ontology Identifiers), were categorized into biological processes, cellular components, and molecular functions. The information obtained from this study will provide genomic resources that will help in understanding miRNA-mediated post-transcriptional gene regulation in garden peas.
{"title":"Identification of Conserved and Novel MicroRNAs with their Targets in Garden Pea (Pisum Sativum L.) Leaves by High-Throughput Sequencing.","authors":"Qurshid Hasan Khan","doi":"10.1177/11779322231162777","DOIUrl":"https://doi.org/10.1177/11779322231162777","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/cb/37/10.1177_11779322231162777.PMC10068972.pdf","platform":"Semanticscholar","paperid":"9263176","PMCID":"OA"}
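The read-processing step mentioned in the pea miRNA abstract above (reducing raw NGS reads to non-redundant reads before miRNA identification) can be illustrated with a small Python sketch. The abstract does not describe the actual pipeline, so everything here is an assumption for illustration: the function name `collapse_reads`, the use of the 20-24 nucleotide miRNA length range as a filter, and the rule of discarding reads with ambiguous bases.

```python
from collections import Counter


def collapse_reads(reads, min_len=20, max_len=24):
    """Collapse identical reads and keep those in the miRNA length range.

    Returns a dict mapping each unique sequence to its read count.
    Reads containing characters other than A, C, G, T, or U are dropped.
    """
    # Normalize case and whitespace, then count identical sequences.
    counts = Counter(read.strip().upper() for read in reads)
    allowed = set("ACGTU")
    return {
        seq: n
        for seq, n in counts.items()
        if min_len <= len(seq) <= max_len and set(seq) <= allowed
    }
```

The number of keys in the returned dictionary is the non-redundant read count, and the values keep the abundance of each sequence for any later expression-based filtering.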
Pub Date : 2022-11-09 eCollection Date: 2022-01-01 DOI: 10.1177/11779322221136002
Sazzad Shahrear, Maliha Afroj Zinnia, Md Rabi Us Sany, Abul Bashar Mir Md Khademul Islam
Vibrio parahaemolyticus, an aquatic pathogen, is a major concern in the shrimp aquaculture industry. Several strains of this pathogen are responsible for causing acute hepatopancreatic necrosis disease as well as other serious illnesses, both of which result in severe economic losses. The genome sequences of two pathogenic strains of V. parahaemolyticus, MSR16 and MSR17, isolated from Bangladesh, have been reported to gain a better understanding of their diversity and virulence. However, the prevalence of hypothetical proteins (HPs) makes it challenging to obtain a comprehensive understanding of the pathogenesis of V. parahaemolyticus. The aim of the present study is to provide a functional annotation of the HPs, using several in silico tools, to elucidate their role in pathogenesis. The exploration of protein domains and families, similarity searches against proteins with known function, gene ontology enrichment, and protein-protein interaction analysis of the HPs led to functional assignment with a high level of confidence for 656 proteins out of a pool of 2631 proteins. The in silico approach used in this study was important for accurately assigning function to HPs and inferring interactions with proteins with previously described functions. The HPs with predicted functions were categorized into various groups, such as enzymes involved in small-compound biosynthesis pathways, iron-binding proteins, antibiotic resistance proteins, and other proteins. Several proteins with potential druggability were identified among them. In addition, the HPs were screened for virulence factors, which led to the identification of proteins that have the potential to be exploited as vaccine candidates. The findings of the study will be useful in gaining a better understanding of the molecular mechanisms of bacterial pathogenesis. They may also provide insight into the process of evaluating promising targets for the development of drugs and vaccines against V. parahaemolyticus.
{"title":"Functional Analysis of Hypothetical Proteins of Vibrio parahaemolyticus Reveals the Presence of Virulence Factors and Growth-Related Enzymes With Therapeutic Potential.","authors":"Sazzad Shahrear, Maliha Afroj Zinnia, Md Rabi Us Sany, Abul Bashar Mir Md Khademul Islam","doi":"10.1177/11779322221136002","DOIUrl":"https://doi.org/10.1177/11779322221136002","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights"},"PeriodicalIF":5.8,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/d5/47/10.1177_11779322221136002.PMC9661560.pdf","platform":"Semanticscholar","paperid":"40468611","PMCID":"OA"}