
2022 26th International Conference Information Visualisation (IV): Latest Papers

Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models
Pub Date : 2022-05-09 DOI: 10.1109/IV56949.2022.00065
Alex Worland, S. Wagle, B. Kovalerchuk
Visualization of Machine Learning (ML) models is an important part of the ML process, enhancing the interpretability and prediction accuracy of the models. This paper proposes a new method, SPC-DT, to visualize Decision Trees (DT) as interpretable models. The method uses a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized in a set of shifted pairs of 2-D Cartesian coordinates as a directed graph. The new method expands and complements the capabilities of existing methods for visualizing DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is to the thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information helps domain experts evaluate and improve DT models and their performance, including avoiding overgeneralization and overfitting. The benefits of the method are demonstrated in case studies using three standard benchmarks.
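The SPC mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' SPC-DT implementation: the function names, the choice of shifts, and the rule of repeating the last value when n is odd are all assumptions made for illustration. The sketch only shows how an n-D point becomes a chain of 2-D nodes joined by directed edges.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def spc_nodes(point: Sequence[float], shifts: Sequence[Point2D]) -> List[Point2D]:
    """Map an n-D point to the 2-D nodes of its SPC graph (hypothetical helper).

    Consecutive values are grouped into pairs (x1, x2), (x3, x4), ...
    Each pair is drawn in its own 2-D Cartesian system, and `shifts`
    gives the origin offset of each system on the shared canvas.
    """
    values = list(point)
    if len(values) % 2 == 1:
        # Assumption: pad an odd-length point by repeating its last value.
        values.append(values[-1])
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    if len(shifts) != len(pairs):
        raise ValueError("need exactly one shift per coordinate pair")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(pairs, shifts)]


def spc_edges(nodes: Sequence[Point2D]) -> List[Tuple[Point2D, Point2D]]:
    """Directed edges from each pair's node to the next pair's node."""
    return list(zip(nodes, nodes[1:]))


# A 6-D point drawn in three coordinate systems shifted 10 units apart on x.
nodes = spc_nodes([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                  shifts=[(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)])
edges = spc_edges(nodes)
print(nodes)
print(edges)
```

Under these assumptions, a 6-D point yields three nodes and two directed edges, so every case in a dataset appears as one short polyline on the canvas.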
Citations: 2
A deep learning and genetic algorithm based feature selection processes on Leukemia Data
Pub Date : 2022-01-25 DOI: 10.1109/IV56949.2022.00074
R. Francese, M. Frasca, M. Risi, G. Tortora
Acute leukemia is divided into two distinct classes: Acute Lymphoblastic Leukemia (ALL) and Acute Myeloid Leukemia (AML). This paper aims to define a feature selection analysis process, based mainly on Deep Learning, for classifying the acute leukemia type. The dataset consists of data from patients affected by either leukemia type, and both types are characterized by the same list of genes for all patients. The analysis uses feature selection techniques to reduce the large number of variables (genes). To this end, we use linear models for differential expression in microarray data and an autoencoder-based unsupervised deep learning model to simplify and speed up the classification. Classification models were then implemented with a deep neural network (DNN), obtaining an accuracy of approximately 92%. The results were compared with those of an approach based on support vector machines (SVM), which gave an accuracy of 87.39%. Another feature selection approach, based on genetic algorithms, was also tested and performed worse. We also conducted a gene enrichment analysis based on the functional annotation of the differentially expressed genes. As a result, a differentially expressed pathway between the two pathologies was detected.
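The feature selection step in the abstract, which reduces the gene set by differential expression between ALL and AML, can be sketched as below. The abstract names linear models for microarray data; this sketch swaps in a plain Welch t-statistic as a simpler stand-in. The function names and the toy expression values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch two-sample t-statistic (a stand-in, not the paper's linear model)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def top_genes(expr_all: Dict[str, List[float]],
              expr_aml: Dict[str, List[float]],
              k: int) -> List[str]:
    """Rank genes by |t| between the two classes and keep the k strongest."""
    scores = {g: abs(welch_t(expr_all[g], expr_aml[g])) for g in expr_all}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy expression values per gene (illustrative only, three patients per class).
expr_all = {
    "gene_a": [1.0, 1.2, 0.9],
    "gene_b": [2.0, 2.1, 1.9],
    "gene_c": [3.0, 3.5, 2.5],
}
expr_aml = {
    "gene_a": [5.0, 5.1, 4.8],
    "gene_b": [2.0, 2.2, 1.8],
    "gene_c": [4.0, 4.5, 3.5],
}
selected = top_genes(expr_all, expr_aml, k=2)
print(selected)
```

In a pipeline like the one described, the genes kept by such a ranking would then feed the downstream autoencoder and classifier; that part is left out here because the abstract gives no architectural details.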
Citations: 0