
Cancer Informatics: Latest Articles

Erratum to "TP53 and its Regulatory Genes as Prognosis of Cutaneous Melanoma".
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231202605

[This corrects the article DOI: 10.1177/11769351231177267.].

Citations: 0
Immunogenetic Profiles and Associations of Breast, Cervical, Ovarian, and Uterine Cancers.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351221148588
Lisa M James, Apostolos P Georgopoulos

It is increasingly recognized that the human immune response influences cancer risk, progression, and survival; consequently, there is growing interest in the influence on cancer of human leukocyte antigen (HLA) genes, which play a critical role in initiating the immune response. Recent evidence documented clustering of cancers based on immunogenetic profiles, such that breast and ovarian cancers clustered together, as did uterine and cervical cancers. Here we extend that line of research to evaluate the HLA profiles of those 4 cancers and their associations. Specifically, we evaluated the associations between the frequencies of 127 HLA alleles and the population prevalences of breast, ovarian, cervical, and uterine cancer in 14 countries in Continental Western Europe. Factor analysis and hierarchical clustering were used to evaluate groupings of cancers based on their immunogenetic profiles. The results documented highly similar immunogenetic profiles for breast and ovarian cancers that were characterized predominantly by protective HLA effects. In addition, highly similar immunogenetic profiles were observed for cervical and uterine cancers that were, conversely, characterized by susceptibility effects. In light of the role of HLA in host immune system protection against non-self antigens, these findings suggest that certain cancers may be associated with similar contributory factors such as viral oncoproteins or neoantigens.
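The grouping idea can be illustrated with a minimal sketch. It is not the authors' factor analysis or clustering code: the profile values are invented, only 5 of the 127 alleles are represented, and Pearson correlation is an assumed similarity measure. Each hypothetical profile holds one association value per HLA allele (negative for a protective effect, positive for susceptibility), and the most similar pair is the one an agglomerative clustering would merge first.

```python
from itertools import combinations
from math import sqrt

# Hypothetical immunogenetic profiles (invented values, 5 alleles only).
# Negative = protective association, positive = susceptibility association.
PROFILES = {
    "breast":   [-0.5, -0.3, -0.4, 0.1, -0.2],
    "ovarian":  [-0.4, -0.3, -0.5, 0.2, -0.1],
    "cervical": [0.4, 0.2, 0.5, -0.1, 0.3],
    "uterine":  [0.5, 0.3, 0.4, -0.2, 0.2],
}


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def most_similar_pair(profiles):
    """Return the pair of cancers whose profiles correlate most strongly."""
    return max(
        combinations(profiles, 2),
        key=lambda pair: pearson(profiles[pair[0]], profiles[pair[1]]),
    )
```

With these invented values, the breast and ovarian profiles correlate positively with each other and negatively with the cervical and uterine profiles, which mirrors the two groupings described in the abstract.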

Citations: 3
Prevalence of Breast Cancer Subtypes Among Different Ethnicities and Bangladeshi Women: Demographic, Clinicopathological, and Integrated Cancer Informatics Analysis.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351221148584
Diganta Islam, Md Shihabul Islam, Sanjida Islam Dorin, Jesmin

Background: The molecular subtyping of breast cancer is related to estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). The present study aimed to systematically analyze the expression, function, and prognostic value of ER, PR, HER2, and their prevalence in different ethnic groups and among Bangladeshi breast cancer (BC) patients.

Method: This study included 25 BC patients and 25 healthy controls, aged between 25 and 70 years. The study characteristics were compared using ANOVA and Chi-square tests. In addition, the multi-omics dataset of 775 BC patients from TCGA was analyzed for ER, PR, and HER2 in breast cancer subtypes and compared among different ethnicities.

Results: Most Bangladeshi (BD) breast cancer patients were ⩾40 years old at diagnosis, had only a histopathological diagnosis (P-value .004), and had no history of mammography or other pathological tests. For treatment, most had received only chemotherapy (P-value .004) and no hormone therapy (P-value <.001). The majority of patients (>60%) had stage II cancer, and 40% had the TNBC subtype. The BC ethnicity-stratified data of ER, PR, and HER2 indicated a strong correlation across all ethnicities (P-value 4.99e-35; P-value 3.79e-18). The subtype-stratified data indicated a higher percentage of Luminal A (58.3%) in Caucasians, whereas Luminal B (24.3%) and HER2 (25.2%) subtypes were more frequent in Asians and TNBC (36.0%) in Africans. However, TNBC was significantly more frequent in BD patients (52.2%) than in Asians (14.8%) (P-value <.001). The overall survival analysis of BC subtypes demonstrated that the Luminal B (P-value .005) and HER2-enriched (P-value .015) subtypes were significantly more aggressive and were dominant in the Asian population.

Conclusion: A significant association was found between BC subtypes and ethnicity, including among Bangladeshi women; these findings might aid in prevention, management, and raising awareness of risk factors in the near future.

Citations: 0
Prevalence of Iron Deficiency and its Association With Breast Cancer in Premenopausal Compared to Postmenopausal Women in Al Ahsa, Saudi Arabia.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231172589
Mohammad Al Khamees, Aymen A Alqurain, Abdulmonem A Alsaleh, Yousef A Alhashem, Nida AlSaffar, Noura N Alibrahim, Fardus A Aljunibi, Zaheda Alradwan, Nesreen Almohammade, Bader AlAlwan

Iron is an essential cofactor needed for the normal function of various enzymes; its depletion increases DNA damage and genomic instability, impairs innate and adaptive immunity, and promotes tumor development. It is also linked to tumorigenesis of breast cancer cells through enhancing mammary tumor growth and metastasis. There are insufficient data describing this association in Saudi Arabia. This study aims to determine the prevalence of iron deficiency and its association with breast cancer among premenopausal and postmenopausal women referred to a breast cancer screening center in Al Ahsa, Eastern Province of Saudi Arabia. Age, hemoglobin level, iron level, and history of anemia or iron deficiency were collected from patients' medical records. The included participants were grouped based on their age into premenopausal (<50 years) or postmenopausal (⩾50 years). Low Hb was defined as Hb below 12 g/dL and low total serum iron as a level below 8 μmol/L. Logistic regression was used to compute the association between having a positive cancer screening test (radiological or histocytological) and participants' lab results. The results are presented as odds ratios and 95% confidence intervals. Three hundred fifty-seven women were included, 77% (n = 274) of whom were premenopausal. A history of iron deficiency was more common in this group than in the postmenopausal group (149 [60%] vs 25 [30%], P = .001). Across the entire cohort, the risk of having a positive radiological cancer screening test was positively associated with age (OR = 1.04, 95% CI 1.02-1.06) and negatively associated with iron level (OR = 0.9, 95% CI 0.86-0.97). This study is the first to propose an association between iron deficiency and breast cancer among young Saudi females. This could suggest iron level as a new risk factor that may be used by clinicians to assess breast cancer risk.
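As a reading aid for the odds ratios above, here is a minimal sketch of how a logistic-regression coefficient maps to an odds ratio with an approximate 95% Wald interval, and how a per-unit odds ratio compounds over several units. The function names are hypothetical, the age grouping follows the cut-off stated in the abstract, and the assumption that age was modeled per year is ours; the abstract does not state the unit.

```python
from math import exp


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and approximate 95% Wald CI from a coefficient and its SE."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, assuming a log-linear effect."""
    return or_per_unit ** units


def menopausal_group(age_years):
    """Age-based grouping used in the study: <50 vs >=50 years."""
    return "premenopausal" if age_years < 50 else "postmenopausal"
```

Under those assumptions, `scale_odds_ratio(1.04, 10)` is about 1.48, so an odds ratio of 1.04 per year would correspond to roughly 48% higher odds of a positive screening test per decade of age.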

Citations: 0
Patient Identification and Tumor Identification Management: Quality Program in a Cancer Multicentric Clinical Data Warehouse.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231172609
Karine Pallier, Olivier Prot, Simone Naldi, Francisco Silva, Thierry Denis, Olivier Giry, Sophie Leobon, Elise Deluche, Nicole Tubiana-Mathieu

Background: The Regional Basis of Solid Tumor (RBST), a clinical data warehouse, centralizes information related to cancer patient care in 5 health establishments in 2 French departments.

Purpose: To develop algorithms matching heterogeneous data to "real" patients and "real" tumors with respect to patient identification (PI) and tumor identification (TI).

Methods: A Neo4j graph database programmed in Java was used to build the RBST with data from ~20 000 patients. The PI algorithm, which uses the Levenshtein distance, was based on the regulatory criteria for identifying a patient. The TI algorithm was built on 6 characteristics: tumor location, laterality, date of diagnosis, histology, primary status, and metastatic status. Given the heterogeneous nature and semantics of the collected data, the creation of repositories (organ, synonym, and histology repositories) was required. The TI algorithm used the Dice coefficient to match tumors.
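The two similarity measures named in the Methods can be sketched as follows. This is an illustration in Python, not the RBST code (which the abstract describes as Java with Neo4j), and the abstract does not say what the Dice coefficient was computed over; applying it to sets of tokens is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (char_a != char_b),  # substitution (0 if equal)
            ))
        prev = curr
    return prev[-1]


def dice_coefficient(x: set, y: set) -> float:
    """Dice coefficient 2|X ∩ Y| / (|X| + |Y|); defined as 1.0 for two empty sets."""
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))
```

For example, `levenshtein("kitten", "sitting")` is 3, and two token sets sharing one of two tokens each have a Dice coefficient of 0.5.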

Results: Patients matched if there was complete agreement of the given name, surname, sex, and day/month/year of birth. These parameters were assigned weights of 28%, 28%, 21%, and 23% (with 18% for year, 2.5% for month, and 2.5% for day), respectively. The algorithm had a sensitivity of 99.69% (95% confidence interval [CI] [98.89%, 99.96%]) and a specificity of 100% (95% CI [99.72%, 100%]). The TI algorithm used the repositories; weights were assigned to the diagnosis date and associated organ (37.5% and 37.5%, respectively), laterality (16%), histology (5%), and metastatic status (4%). This algorithm had a sensitivity of 71% (95% CI [62.68%, 78.25%]) and a specificity of 100% (95% CI [94.31%, 100%]).
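The patient-identification weights reported above can be turned into a small scoring sketch. The field names and the exact-agreement comparison are assumptions made for illustration; the abstract states the weights and the complete-agreement rule but not how partial similarity (for example, via the Levenshtein distance) entered the score.

```python
# Weights from the abstract, stored in tenths of a percent so the sum is exact:
# 28% + 28% + 21% + 18% + 2.5% + 2.5% = 100%.
PI_WEIGHTS = {
    "given_name": 280,
    "surname": 280,
    "sex": 210,
    "birth_year": 180,
    "birth_month": 25,
    "birth_day": 25,
}


def pi_score(record_a, record_b):
    """Weighted agreement between two patient records, in percent."""
    total = sum(
        weight
        for field, weight in PI_WEIGHTS.items()
        if record_a[field] == record_b[field]
    )
    return total / 10


def is_same_patient(record_a, record_b):
    """Records match only when every weighted field agrees (score of 100%)."""
    return pi_score(record_a, record_b) == 100.0
```

Two hypothetical records that differ only in the day of birth would score 97.5% and therefore would not be merged under the complete-agreement rule.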

Conclusion: The RBST encompasses 2 quality controls: PI and TI. It facilitates the implementation of transversal structuring and assessments of the performance of the provided care.

Citations: 0
Novel Biomarker Prediction for Lung Cancer Using Random Forest Classifiers.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231167992
Lavanya C, Pooja S, Abhay H Kashyap, Abdur Rahaman, Swarna Niranjan, Vidya Niranjan

Lung cancer is considered the most common and the deadliest cancer type. It is mainly of 2 types: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). NSCLC accounts for about 85% of cases, while SCLC accounts for only about 14%. Over the last decade, functional genomics has arisen as a revolutionary tool for studying genetics and uncovering changes in gene expression. RNA-Seq has been applied to investigate the rare and novel transcripts that aid in discovering genetic changes that occur in tumours due to different lung cancers. Although RNA-Seq helps to understand and characterise the gene expression involved in lung cancer diagnostics, discovering the biomarkers remains a challenge. Classification models help uncover and classify the biomarkers based on gene expression levels across the different lung cancers. The current research concentrates on computing transcript statistics from gene transcript files with a normalised fold change of genes and identifying quantifiable differences in gene expression levels between the reference genome and lung cancer samples. The collected data were analysed, and machine learning models were developed to classify genes as causing NSCLC, causing SCLC, causing both, or causing neither. An exploratory data analysis was performed to identify the probability distribution and principal features. Due to the limited number of features available, all of them were used in predicting the class. To address the imbalance in the dataset, the Near Miss under-sampling algorithm was applied. For classification, the research primarily focused on 4 supervised machine learning algorithms (Logistic Regression, KNN classifier, SVM classifier, and Random Forest classifier); additionally, 2 ensemble algorithms were considered (XGBoost and AdaBoost). Of these, based on the weighted metrics considered, the Random Forest classifier, with 87% accuracy, was considered the best-performing algorithm and was therefore used to predict the biomarkers causing NSCLC and SCLC. The imbalance and limited features in the dataset restrict any further improvement in the model's accuracy or precision. In the present study, using the gene expression values (LogFC, P value) as the feature set in the Random Forest classifier, BRAF, KRAS, NRAS, and EGFR are predicted to be possible biomarkers causing NSCLC, and ATF6, ATF3, PGDFA, PGDFD, PGDFC, and PIP5K1C are predicted to be possible biomarkers causing SCLC from the transcriptome analysis. After fine-tuning, the model gave a precision of 91.3% and a recall of 91%. Some of the common biomarkers predicted for both NSCLC and SCLC were CDK4, CDK6, BAK1, CDKN1A, and DDB2.
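The under-sampling step can be illustrated with a minimal pure-Python sketch. The abstract does not say which Near Miss variant or which library was used; the version below assumes the NearMiss-1 rule (keep the majority-class samples with the smallest mean distance to their k nearest minority-class samples) and treats each gene as a hypothetical 2-feature point such as (LogFC, P value).

```python
import math


def near_miss_1(majority, minority, k=3):
    """Under-sample `majority` to the size of `minority` (NearMiss-1 rule).

    Keeps the majority points whose mean Euclidean distance to their
    k nearest minority points is smallest.
    """
    if not minority:
        raise ValueError("minority class must contain at least one sample")

    def mean_knn_distance(point):
        distances = sorted(math.dist(point, other) for other in minority)
        nearest = distances[:k]  # fewer than k if the minority class is tiny
        return sum(nearest) / len(nearest)

    ranked = sorted(majority, key=mean_knn_distance)
    return ranked[:len(minority)]
```

After this step both classes have the same number of samples, so a classifier such as a random forest is no longer dominated by the majority class, at the cost of discarding majority samples.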

Citations: 3
Trends in Subcutaneous Tumour Height and Impact on Measurement Accuracy.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231165181
Daniel Brough, Hope Amos, Karl Turley, Jake Murkin

Tumour volume is typically calculated using only length and width measurements, with width used as a proxy for height in a 1:1 ratio. When tracking tumour growth over time, important morphological information and measurement accuracy are lost by ignoring height, which we show is a unique variable. Lengths, widths, and heights of 9522 subcutaneous tumours in mice were measured using 3D and thermal imaging. The average height:width ratio was found to be 1:3, proving that using width as a proxy for height overestimates tumour volume. Comparing volumes calculated with and without tumour height to the true volumes of excised tumours showed that the volume formula including height produced volumes 36X more accurate (based on percentage difference). Monitoring the height:width relationship (prominence) across tumour growth curves indicated that prominence varied and that height could change independently of width. Twelve cell lines were investigated individually; the scale of tumour prominence was cell line-dependent, with relatively less prominent tumours (MC38, BL2, LL/2) and more prominent tumours (RENCA, HCT116) detected. Prominence trends across the growth cycle were also dependent on cell line; prominence was correlated with tumour growth in some cell lines (4T1, CT26, LNCaP) but not others (MC38, TC-1, LL/2). When pooled, invasive cell lines produced tumours that were significantly less prominent at volumes >1200 mm³ compared to non-invasive cell lines (P < .001). Modelling was used to show the impact of the increased accuracy gained by including height in volume calculations on several efficacy study outcomes. Variations in measurement accuracy contribute to experimental variation and irreproducibility of data; therefore, we strongly advise researchers to measure height to improve accuracy in tumour studies.
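The overestimation argument can be made concrete with a short sketch. The abstract does not state which volume formula was used; the ellipsoid formula V = (π/6) · L · W · H is an assumption here, and the measurements in the usage note are invented.

```python
from math import pi


def ellipsoid_volume(length, width, height=None):
    """Ellipsoid volume (pi/6) * L * W * H.

    If height is omitted, width is used as a 1:1 proxy for height,
    which is the two-measurement convention described in the abstract.
    """
    if height is None:
        height = width
    return pi / 6 * length * width * height


def overestimation_factor(length, width, height):
    """How many times larger the width-as-height estimate is than the
    estimate that uses the measured height."""
    return ellipsoid_volume(length, width) / ellipsoid_volume(length, width, height)
```

For a hypothetical tumour of 12 mm × 9 mm × 3 mm (a 1:3 height:width ratio), the width-as-height estimate is 3 times the height-based estimate. This geometric factor is separate from the 36X accuracy figure in the abstract, which was measured against excised tumours.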

Cancer Informatics, vol. 22, article 11769351231165181 (published 2023-01-01). PDF (PubMed Central): https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/2a/51/10.1177_11769351231165181.PMC10126793.pdf
Citations: 2
Different Tumor Types Share a Common Nuclear Map of Chromosome Territories.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351221148592
Fritz F Parl

Different tumor types are characterized by unique histopathological patterns, including distinctive nuclear architectures. I hypothesized that the difference in nuclear appearance is reflected in different nuclear maps of chromosome territories, the discrete regions occupied by individual chromosomes in the interphase nucleus. To test this hypothesis, I used interchromosomal translocations (ITLs) as an analytical tool to map chromosome territories in 11 different tumor types from the TCGA PanCancer database encompassing 6003 tumors with 5295 ITLs. For each chromosome, I determined the number and percentage of all ITLs for any given tumor type. Chromosomes were ranked according to the frequency and percentage of ITLs per chromosome. The ranking showed similar patterns for all tumor types. Chromosomes 1, 8, 11, 17, and 19 were ranked in the top quarter, accounting for 35.2% of the 5295 ITLs, whereas chromosomes 13, 15, 18, 21, and X were in the bottom quarter, accounting for only 10.5% of ITLs. The correlation between the chromosome ranking in the total group of 6003 tumors and the ranking in individual tumor types was significant, with P values ranging from < .0001 to .0033. Thus, contrary to my hypothesis, different tumor types share a common nuclear map of chromosome territories. Based on the large number of ITLs in 11 different types of malignancy, one can discern a shared pattern of chromosome territories in cancer and propose a probabilistic model placing chromosomes 1, 8, 11, 17, and 19 in the center of the nucleus and chromosomes 13, 15, 18, 21, and X at the periphery.
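
Comparing a pooled chromosome ranking with the ranking in one tumor type can be sketched as a rank correlation. The abstract does not name the statistic used, so Spearman's rho is an assumption here; the function names and the ITL counts below are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Rank values from largest (rank 1) downward, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(counts_a, counts_b):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(counts_a), average_ranks(counts_b))


# Hypothetical ITL counts per chromosome (illustrative values only).
chromosomes = ["1", "8", "11", "13", "17", "21"]
pooled_counts = [420, 390, 360, 110, 380, 90]
one_tumor_type = [40, 44, 35, 12, 37, 9]
print("Spearman rho:", round(spearman_rho(pooled_counts, one_tumor_type), 3))
```

A rho close to 1 would indicate that the tumor type orders the chromosomes much as the pooled data do; a significance test on rho would still be needed to obtain P values of the kind reported.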

Cancer Informatics, vol. 22, article 11769351221148592 (published 2023-01-01). PDF (PubMed Central): https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/cd/06/10.1177_11769351221148592.PMC9903037.pdf
Citations: 0
A Computational Approach to Predict the Role of Genetic Alterations in Methyltransferase Histones Genes With Implications in Liver Cancer.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231161480
Tania Isabella Aravena, Elizabeth Valdés, Nicolás Ayala, Vívian D'Afonseca

Histone methyltransferases (HMTs) comprise a subclass of epigenetic regulators. Dysregulation of these enzymes results in aberrant epigenetic regulation, commonly observed in various tumor types, including hepatocellular adenocarcinoma (HCC). These epigenetic changes may contribute to tumorigenesis. To predict how histone methyltransferase genes and their genetic alterations (somatic mutations, somatic copy number alterations, and gene expression changes) are involved in hepatocellular adenocarcinoma processes, we performed an integrated computational analysis of genetic alterations in 50 HMT genes present in hepatocellular adenocarcinoma. Biological data were obtained from a public repository with 360 samples from patients with hepatocellular carcinoma. Through these biological data, we identified 10 HMT genes (SETDB1, ASH1L, SMYD2, SMYD3, EHMT2, SETD3, PRDM14, PRDM16, KMT2C, and NSD3) with a significant genetic alteration rate (14%) within 360 samples. Of these 10 HMT genes, KMT2C and ASH1L have the highest mutation rates in HCC samples, 5.6% and 2.8%, respectively. Regarding somatic copy number alteration, ASH1L and SETDB1 are amplified in several samples, while SETD3, PRDM14, and NSD3 showed a high rate of large deletion. Finally, SETDB1, SETD3, PRDM14, and NSD3 could play an important role in the progression of hepatocellular adenocarcinoma, since alterations in these genes are associated with decreased patient survival compared with patients in whom these genes are unaltered. Our computational analysis provides new insights that help to understand how HMTs are associated with hepatocellular carcinoma, and provides a basis for future experimental investigations using HMTs as genetic targets against hepatocellular carcinoma.

Cancer Informatics, vol. 22, article 11769351231161480 (published 2023-01-01). PDF (PubMed Central): https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/1e/b4/10.1177_11769351231161480.PMC10064455.pdf
Citations: 1
A Simple Method for Robust and Accurate Intrinsic Subtyping of Breast Cancer.
IF 2 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-01-01 DOI: 10.1177/11769351231159893
Mehdi Hamaneh, Yi-Kuo Yu

Motivation: The PAM50 signature/method is widely used for intrinsic subtyping of breast cancer samples. However, depending on the number and composition of the samples included in a cohort, the method may assign different subtypes to the same sample. This lack of robustness is mainly due to the fact that PAM50 subtracts a reference profile, which is computed using all samples in the cohort, from each sample before classification. In this paper we propose modifications to PAM50 to develop a simple and robust single-sample classifier, called MPAM50, for intrinsic subtyping of breast cancer. Like PAM50, the modified method uses a nearest centroid approach for classification, but the centroids are computed differently, and the distances to the centroids are determined using an alternative method. Additionally, MPAM50 uses unnormalized expression values for classification and does not subtract a reference profile from the samples. In other words, MPAM50 classifies each sample independently, and so avoids the previously mentioned robustness issue.

Results: A training set was employed to find the new MPAM50 centroids. MPAM50 was then tested on 19 independent datasets (obtained using various expression profiling technologies) containing 9637 samples. Overall good agreement was observed between the PAM50- and MPAM50-assigned subtypes with a median accuracy of 0.792, which (we show) is comparable with the median concordance between various implementations of PAM50. Additionally, MPAM50- and PAM50-assigned intrinsic subtypes were found to agree comparably with the reported clinical subtypes. Also, survival analyses indicated that MPAM50 preserves the prognostic value of the intrinsic subtypes. These observations demonstrate that MPAM50 can replace PAM50 without loss of performance. On the other hand, MPAM50 was compared with 2 previously published single-sample classifiers, and with 3 alternative modified PAM50 approaches. The results indicated a superior performance by MPAM50.

Conclusions: MPAM50 is a robust, simple, and accurate single-sample classifier of intrinsic subtypes of breast cancer.
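
The single-sample idea described under Motivation can be sketched as a generic nearest-centroid classifier. This is not the MPAM50 implementation: the abstract does not give the centroids or the distance measure, so the Euclidean distance, the gene names, the subtype labels, and all expression values below are hypothetical placeholders. The point of the sketch is only that the assigned label depends on the sample and a fixed set of centroids, not on the other samples in a cohort.

```python
import math

# Hypothetical fixed centroids: one expression profile per subtype.
GENES = ["gene_a", "gene_b", "gene_c"]
CENTROIDS = {
    "subtype_1": {"gene_a": 9.0, "gene_b": 2.0, "gene_c": 1.5},
    "subtype_2": {"gene_a": 3.0, "gene_b": 8.0, "gene_c": 2.0},
    "subtype_3": {"gene_a": 2.0, "gene_b": 2.5, "gene_c": 9.0},
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify_sample(expression, centroids=CENTROIDS, genes=GENES):
    """Return the label of the centroid closest to one sample.

    No cohort-level reference profile is subtracted, so the result for a
    sample does not change when other samples are added or removed.
    """
    sample_vector = [expression[g] for g in genes]
    best_label, best_distance = None, float("inf")
    for label, centroid in centroids.items():
        distance = euclidean(sample_vector, [centroid[g] for g in genes])
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


sample = {"gene_a": 8.5, "gene_b": 2.5, "gene_c": 1.0}
print(classify_sample(sample))
```

By contrast, a classifier that first subtracts a cohort-derived reference profile would need the whole cohort as input, which is the source of the robustness issue the abstract describes.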

Cancer Informatics, vol. 22, article 11769351231159893 (published 2023-01-01). PDF (PubMed Central): https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/38/68/10.1177_11769351231159893.PMC10052604.pdf
Citations: 0