AVA: An automated and AI-driven intelligent visual analytics framework
Pub Date: 2024-06-01 | DOI: 10.1016/j.visinf.2024.06.002
Jiazhe Wang, Xi Li, Chenlu Li, Di Peng, Arran Zeyu Wang, Yuhui Gu, Xingui Lai, Haifeng Zhang, Xinyue Xu, Xiaoqing Dong, Zhifeng Lin, Jiehui Zhou, Xingyu Liu, Wei Chen
With the rapid growth of the scale and complexity of datasets, creating proper visualizations for large datasets has become increasingly challenging. Although several visualization recommendation systems have been proposed, the lack of practical engineering input remains a major concern for the industrial use of visualization recommendation. In this paper, we propose AVA, an open-source web-based framework for Automated Visual Analytics. AVA contains both empirically driven and insight-driven visualization recommendation methods to meet the demands of creating aesthetic visualizations and understanding expressible insights, respectively. The code is available at https://github.com/antvis/AVA.
{"title":"AVA: An automated and AI-driven intelligent visual analytics framework","authors":"Jiazhe Wang , Xi Li , Chenlu Li , Di Peng , Arran Zeyu Wang , Yuhui Gu , Xingui Lai , Haifeng Zhang , Xinyue Xu , Xiaoqing Dong , Zhifeng Lin , Jiehui Zhou , Xingyu Liu , Wei Chen","doi":"10.1016/j.visinf.2024.06.002","DOIUrl":"10.1016/j.visinf.2024.06.002","url":null,"abstract":"<div><p>With the incredible growth of the scale and complexity of datasets, creating proper visualizations for users becomes more and more challenging in large datasets. Though several visualization recommendation systems have been proposed, so far, the lack of practical engineering inputs is still a major concern regarding the usage of visualization recommendations in the industry. In this paper, we proposed <em>AVA</em>, an open-sourced web-based framework for <strong>A</strong>utomated <strong>V</strong>isual <strong>A</strong>nalytics. AVA contains both empiric-driven and insight-driven visualization recommendation methods to meet the demands of creating aesthetic visualizations and understanding expressible insights respectively. The code is available at <span>https://github.com/antvis/AVA</span><svg><path></path></svg>.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 2","pages":"Pages 106-114"},"PeriodicalIF":3.8,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000226/pdfft?md5=d535cfeb7d4bca4f8b918b02581ff6a3&pid=1-s2.0-S2468502X24000226-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141410971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corrigendum to “An open dataset of data lineage graphs for data governance research” [Vis. Inform. 8 (1) (2024) 1–5]
Pub Date: 2024-06-01 | DOI: 10.1016/j.visinf.2024.04.002
Yunpeng Chen, Ying Zhao, Xuanjing Li, Jiang Zhang, Jiang Long, Fangfang Zhou
{"title":"Corrigendum to “An open dataset of data lineage graphs for data governance research” [Vis. Inform. 8 (1) (2024) 1-5]","authors":"Yunpeng Chen , Ying Zhao , Xuanjing Li , Jiang Zhang , Jiang Long , Fangfang Zhou","doi":"10.1016/j.visinf.2024.04.002","DOIUrl":"https://doi.org/10.1016/j.visinf.2024.04.002","url":null,"abstract":"","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 2","pages":"Page 115"},"PeriodicalIF":3.8,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000159/pdfft?md5=70e77c4a6673309b62e427b282f276e0&pid=1-s2.0-S2468502X24000159-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141486899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IMVis: Visual analytics for influence maximization algorithm evaluation in hypergraphs
Pub Date: 2024-06-01 | DOI: 10.1016/j.visinf.2024.04.006
Jin Xu, Chaojian Zhang, Ming Xie, Xiuxiu Zhan, Luwang Yan, Yubo Tao, Zhigeng Pan
Influence maximization (IM) algorithms play a significant role in hypergraph analysis tasks, such as epidemic control analysis, viral marketing, and social influence analysis, and various IM algorithms have been proposed. The main challenge lies in IM algorithm evaluation, due to the complexity and diversity of the spreading processes of different IM algorithms in different hypergraphs. Existing evaluation methods mainly leverage statistical metrics, such as influence spread, to quantify overall performance, but do not fully unravel spreading characteristics and patterns. In this paper, we propose an exploratory visual analytics system, IMVis, to assist users in exploring and evaluating IM algorithms at the overview, pattern, and node levels. A spreading pattern mining method is first proposed to characterize spreading processes and extract important spreading patterns to facilitate efficient analysis and comparison of IM algorithms. Novel visualization glyphs are designed to comprehensively reveal both temporal and structural features of IM algorithms’ spreading processes in hypergraphs at multiple levels. The effectiveness and usefulness of IMVis are demonstrated through two case studies and expert interviews.
{"title":"IMVis: Visual analytics for influence maximization algorithm evaluation in hypergraphs","authors":"Jin Xu , Chaojian Zhang , Ming Xie , Xiuxiu Zhan , Luwang Yan , Yubo Tao , Zhigeng Pan","doi":"10.1016/j.visinf.2024.04.006","DOIUrl":"10.1016/j.visinf.2024.04.006","url":null,"abstract":"<div><p>Influence maximization (IM) algorithms play a significant role in hypergraph analysis tasks, such as epidemic control analysis, viral marketing, and social influence analysis, and various IM algorithms have been proposed. The main challenge lies in IM algorithm evaluation, due to the complexity and diversity of the spreading processes of different IM algorithms in different hypergraphs. Existing evaluation methods mainly leverage statistical metrics, such as influence spread, to quantify overall performance, but do not fully unravel spreading characteristics and patterns. In this paper, we propose an exploratory visual analytics system, IMVis, to assist users in exploring and evaluating IM algorithms at the overview, pattern, and node levels. A spreading pattern mining method is first proposed to characterize spreading processes and extract important spreading patterns to facilitate efficient analysis and comparison of IM algorithms. Novel visualization glyphs are designed to comprehensively reveal both temporal and structural features of IM algorithms’ spreading processes in hypergraphs at multiple levels. The effectiveness and usefulness of IMVis are demonstrated through two case studies and expert interviews.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 2","pages":"Pages 13-26"},"PeriodicalIF":3.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000172/pdfft?md5=8a25558f06e02bd13aac06e34e54a160&pid=1-s2.0-S2468502X24000172-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141026551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Current state of the art and future directions: Augmented reality data visualization to support decision-making
Pub Date: 2024-06-01 | DOI: 10.1016/j.visinf.2024.05.001
Mengya Zheng, David Lillis, Abraham G. Campbell
Augmented Reality (AR), as a novel data visualization tool, is advantageous in revealing spatial data patterns and data-context associations. Accordingly, recent research has identified AR data visualization as a promising approach to increasing decision-making efficiency and effectiveness. As a result, AR has been applied in various decision support systems to enhance knowledge conveyance and comprehension, in which different data-reality associations are constructed to aid decision-making.
However, how these AR visualization strategies can enhance different decision support datasets has not been thoroughly reviewed. Given the rise of big data in the modern world, this support is critical to decision-making in the coming years. Using AR to embed decision support data and explanation data into the end user’s physical surroundings and focal contexts avoids isolating the human decision-maker from the relevant data. Integrating the decision-maker’s contexts and the DSS support in AR is, however, a difficult challenge. This paper outlines, through a literature review, the current state of the art in AR data visualization for supporting decision-making.
To facilitate publication classification and analysis, the paper proposes a taxonomy that classifies AR data visualizations based on the semantic associations between the AR data and the physical context. Based on this taxonomy and a decision support system taxonomy, 37 publications are classified and analyzed from multiple aspects. One contribution of this literature review is a resulting AR visualization taxonomy that can be applied to decision support systems. Along with this novel tool, the paper discusses the current state of the art in the field and indicates possible future challenges and directions for AR data visualization in support of decision-making.
{"title":"Current state of the art and future directions: Augmented reality data visualization to support decision-making","authors":"Mengya Zheng, David Lillis, Abraham G. Campbell","doi":"10.1016/j.visinf.2024.05.001","DOIUrl":"10.1016/j.visinf.2024.05.001","url":null,"abstract":"<div><p>Augmented Reality (AR), as a novel data visualization tool, is advantageous in revealing spatial data patterns and data-context associations. Accordingly, recent research has identified AR data visualization as a promising approach to increasing decision-making efficiency and effectiveness. As a result, AR has been applied in various decision support systems to enhance knowledge conveying and comprehension, in which the different data-reality associations have been constructed to aid decision-making.</p><p>However, how these AR visualization strategies can enhance different decision support datasets has not been reviewed thoroughly. Especially given the rise of big data in the modern world, this support is critical to decision-making in the coming years. Using AR to embed the decision support data and explanation data into the end user’s physical surroundings and focal contexts avoids isolating the human decision-maker from the relevant data. Integrating the decision-maker’s contexts and the DSS support in AR is a difficult challenge. This paper outlines the current state of the art through a literature review in allowing AR data visualization to support decision-making.</p><p>To facilitate the publication classification and analysis, the paper proposes one taxonomy to classify different AR data visualization based on the semantic associations between the AR data and physical context. Based on this taxonomy and a decision support system taxonomy, 37 publications have been classified and analyzed from multiple aspects. One of the contributions of this literature review is a resulting AR visualization taxonomy that can be applied to decision support systems. Along with this novel tool, the paper discusses the current state of the art in this field and indicates possible future challenges and directions that AR data visualization will bring to support decision-making.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 2","pages":"Pages 80-105"},"PeriodicalIF":3.8,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000202/pdfft?md5=f80d87851c5113d4a9dd7255cbbe2978&pid=1-s2.0-S2468502X24000202-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141045667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VisCI: A visualization framework for anomaly detection and interactive optimization of composite index
Pub Date: 2024-04-20 | DOI: 10.1016/j.visinf.2024.04.001
Zhiguang Zhou, Yize Li, Yuna Ni, Weiwen Xu, Guoting Hu, Ying Lai, Peixiong Chen, Weihua Su
A composite index is typically derived through the weighted aggregation of hierarchical components and is widely used to distill intricate, multidimensional matters in economic and business statistics. However, composite indices inevitably present anomalies at different levels, arising from the calculation and expression processes of the hierarchical components, which impairs the precise depiction of specific economic issues. In this paper, we propose VisCI, a visualization framework for anomaly detection and interactive optimization of composite indices. First, an LSTM-AE model detects anomalies from the lower levels to the higher levels of the composite index. Then, a comprehensive array of visual cues is designed to visualize anomalies, including hierarchy and anomaly visualization. In addition, interactive operations are provided to ensure accurate and efficient index optimization, mitigating the adverse impact of anomalies on index calculation and representation. Finally, we implement a visualization framework with interactive interfaces, facilitating both anomaly detection and intuitive composite index optimization. Case studies based on real-world datasets and expert interviews demonstrate the effectiveness of VisCI in commodity index anomaly exploration and optimization.
{"title":"VisCI: A visualization framework for anomaly detection and interactive optimization of composite index","authors":"Zhiguang Zhou , Yize Li , Yuna Ni , Weiwen Xu , Guoting Hu , Ying Lai , Peixiong Chen , Weihua Su","doi":"10.1016/j.visinf.2024.04.001","DOIUrl":"10.1016/j.visinf.2024.04.001","url":null,"abstract":"<div><p>Composite index is always derived with the weighted aggregation of hierarchical components, which is widely utilized to distill intricate and multidimensional matters in economic and business statistics. However, the composite indices always present inevitable anomalies at different levels oriented from the calculation and expression processes of hierarchical components, thereby impairing the precise depiction of specific economic issues. In this paper, we propose VisCI, a visualization framework for anomaly detection and interactive optimization of composite index. First, LSTM-AE model is performed to detect anomalies from the lower level to the higher level of the composite index. Then, a comprehensive array of visual cues is designed to visualize anomalies, such as hierarchy and anomaly visualization. In addition, an interactive operation is provided to ensure accurate and efficient index optimization, mitigating the adverse impact of anomalies on index calculation and representation. Finally, we implement a visualization framework with interactive interfaces, facilitating both anomaly detection and intuitive composite index optimization. Case studies based on real-world datasets and expert interviews are conducted to demonstrate the effectiveness of our VisCI in commodity index anomaly exploration and anomaly optimization.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 2","pages":"Pages 1-12"},"PeriodicalIF":3.0,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000147/pdfft?md5=e75183915943bf3b9b9ca949c47ab656&pid=1-s2.0-S2468502X24000147-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140793971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DiffMat: Latent diffusion models for image-guided material generation
Pub Date: 2024-03-01 | DOI: 10.1016/j.visinf.2023.12.001
Liang Yuan, Dingkun Yan, Suguru Saito, Issei Fujishiro
Creating realistic materials is essential in the construction of immersive virtual environments. While existing techniques for material capture and conditional generation rely on flash-lit photos, they often produce artifacts when the illumination mismatches the training data. In this study, we introduce DiffMat, a novel diffusion model that integrates the CLIP image encoder and a multi-layer, cross-attention denoising backbone to generate latent materials from images under various illuminations. Using a pre-trained StyleGAN-based material generator, our method converts these latent materials into high-resolution SVBRDF textures, a process that enables a seamless fit into the standard physically based rendering pipeline, reducing the requirements for vast computational resources and expansive datasets. DiffMat surpasses existing generative methods in terms of material quality and variety, and shows adaptability to a broader spectrum of lighting conditions in reference images.
{"title":"DiffMat: Latent diffusion models for image-guided material generation","authors":"Liang Yuan , Dingkun Yan , Suguru Saito , Issei Fujishiro","doi":"10.1016/j.visinf.2023.12.001","DOIUrl":"10.1016/j.visinf.2023.12.001","url":null,"abstract":"<div><p>Creating realistic materials is essential in the construction of immersive virtual environments. While existing techniques for material capture and conditional generation rely on flash-lit photos, they often produce artifacts when the illumination mismatches the training data. In this study, we introduce DiffMat, a novel diffusion model that integrates the CLIP image encoder and a multi-layer, cross-attention denoising backbone to generate latent materials from images under various illuminations. Using a pre-trained StyleGAN-based material generator, our method converts these latent materials into high-resolution SVBRDF textures, a process that enables a seamless fit into the standard physically based rendering pipeline, reducing the requirements for vast computational resources and expansive datasets. DiffMat surpasses existing generative methods in terms of material quality and variety, and shows adaptability to a broader spectrum of lighting conditions in reference images.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 1","pages":"Pages 6-14"},"PeriodicalIF":3.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X24000019/pdfft?md5=fb0200304a9b292debbf18a3162d10e8&pid=1-s2.0-S2468502X24000019-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139396034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Empirically evaluating virtual reality’s effect on reservoir engineering tasks
Pub Date: 2024-03-01 | DOI: 10.1016/j.visinf.2023.11.002
Bryson Lawton, Nanjia Wang, Steven Samoil, Parisa Daeijavad, Siqi Xie, Zhangxin Chen, Frank Maurer
To help determine in what ways virtual reality (VR) technologies may benefit reservoir engineering workflows, we conducted a usability study on a prototype VR tool for reservoir model analysis tasks. By leveraging the strengths of VR technologies, the tool aims to advance reservoir analysis workflows beyond conventional methods by improving how one understands, analyzes, and interacts with reservoir model visualizations. To evaluate this approach, the study presented herein was conducted with reservoir engineering experts who used the VR tool to perform three common reservoir model analysis tasks: spatial filtering of model cells using movable planes, cross-comparison of multiple models, and well path planning. Our study found that accomplishing these tasks with the VR tool was generally regarded as easier, quicker, more effective, and more intuitive than with traditional model analysis software, while maintaining a low perceived task workload on average. Overall, participants provided positive feedback on using VR for reservoir engineering tasks, and VR was found to improve multi-model cross-analysis and rough object manipulation in 3D. This indicates the potential for VR to outperform conventional means for some work tasks, and participants expressed that it would be best utilized as an addition to the current software in their reservoir model analysis workflows. Some concerns were voiced, however, about full adoption of VR into their work; these would be best addressed before such adoption takes place.
{"title":"Empirically evaluating virtual reality’s effect on reservoir engineering tasks","authors":"Bryson Lawton , Nanjia Wang , Steven Samoil , Parisa Daeijavad , Siqi Xie , Zhangxin Chen , Frank Maurer","doi":"10.1016/j.visinf.2023.11.002","DOIUrl":"https://doi.org/10.1016/j.visinf.2023.11.002","url":null,"abstract":"<div><p>To help determine in what ways virtual reality (VR) technologies may benefit reservoir engineering workflows, we conducted a usability study on a prototype VR tool for performing reservoir model analysis tasks. By leveraging the strengths of VR technologies, this tool’s aim is to help advance reservoir analysis workflows beyond conventional methods by improving how one understands, analyzes, and interacts with reservoir model visualizations. To evaluate our tool’s VR approach to this, the study presented herein was conducted with reservoir engineering experts who used the VR tool to perform three common reservoir model analysis tasks: the spatial filtering of model cells using movable planes, the cross-comparison of multiple models, and well path planning. Our study found that accomplishing these tasks with the VR tool was generally regarded as easier, quicker, more effective, and more intuitive than traditional model analysis software while maintaining a feeling of low task workload on average. Overall, participants provided positive feedback regarding their experience with using VR to perform reservoir engineering work tasks, and in general, it was found to improve multi-model cross-analysis and rough object manipulation in 3D. This indicates the potential for VR to be better than conventional means for some work tasks and participants also expressed they could see it best utilized as an addition to current software in their reservoir model analysis workflows. There were, however, some concerns voiced when considering the full adoption of VR into their work that would be best first addressed before this took place.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 1","pages":"Pages 26-46"},"PeriodicalIF":3.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X23000542/pdfft?md5=1b711e1ac53d26ef09020082f01a69a6&pid=1-s2.0-S2468502X23000542-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140339085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Malicious webshell family dataset for webshell multi-classification research
Pub Date: 2024-03-01 | DOI: 10.1016/j.visinf.2023.06.008
Ying Zhao, Shenglan Lv, Wenwei Long, Yilun Fan, Jian Yuan, Haojin Jiang, Fangfang Zhou
Malicious webshells currently present tremendous threats to cloud security. Most relevant studies and open webshell datasets treat malicious webshell defense as a binary classification problem, that is, identifying whether a webshell is malicious or benign. However, fine-grained multi-classification is urgently needed to enable precise responses and active defenses against malicious webshell threats. This paper introduces a malicious webshell family dataset named MWF to facilitate webshell multi-classification research. The dataset contains 1359 malicious webshell samples originally obtained from the cloud servers of Alibaba Cloud. Each sample is provided with a family label, and samples of the same family generally present similar characteristics or behaviors. The dataset has a total of 78 families and 22 outliers. Moreover, this paper introduces the human-machine collaboration process adopted to remove benign or duplicate samples, address privacy issues, and determine the family of each sample. It also compares the distinguishing features of the MWF dataset with previous datasets and summarizes potential application areas in cloud security and in generalized sequence, graph, and tree data analytics and visualization.
{"title":"Malicious webshell family dataset for webshell multi-classification research","authors":"Ying Zhao , Shenglan Lv , Wenwei Long , Yilun Fan , Jian Yuan , Haojin Jiang , Fangfang Zhou","doi":"10.1016/j.visinf.2023.06.008","DOIUrl":"10.1016/j.visinf.2023.06.008","url":null,"abstract":"<div><p>Malicious webshells currently present tremendous threats to cloud security. Most relevant studies and open webshell datasets consider malicious webshell defense as a binary classification problem, that is, identifying whether a webshell is malicious or benign. However, a fine-grained multi-classification is urgently needed to enable precise responses and active defenses on malicious webshell threats. This paper introduces a malicious webshell family dataset named MWF to facilitate webshell multi-classification researches. This dataset contains 1359 malicious webshell samples originally obtained from the cloud servers of Alibaba Cloud. Each of them is provided with a family label. The samples of the same family generally present similar characteristics or behaviors. The dataset has a total of 78 families and 22 outliers. Moreover, this paper introduces the human–machine collaboration process that is adopted to remove benign or duplicate samples, address privacy issues, and determine the family of each sample. This paper also compares the distinguished features of the MWF dataset with previous datasets and summarizes the potential applied areas in cloud security and generalized sequence, graph, and tree data analytics and visualization.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 1","pages":"Pages 47-55"},"PeriodicalIF":3.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X23000335/pdfft?md5=0e04b6b31402572c03a419f9b7597a47&pid=1-s2.0-S2468502X23000335-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75769287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autoencoder-based conditional optimal transport generative adversarial network for medical image generation
Pub Date: 2024-03-01 | DOI: 10.1016/j.visinf.2023.11.001
Jun Wang, Bohan Lei, Liya Ding, Xiaoyin Xu, Xianfeng Gu, Min Zhang
Medical image generation has recently garnered significant interest among researchers. However, the primary generative models, such as Generative Adversarial Networks (GANs), often encounter challenges during training, including mode collapse. To address these issues, we propose the AE-COT-GAN model (Autoencoder-based Conditional Optimal Transport Generative Adversarial Network) for the generation of medical images belonging to specific categories. The training process of our model comprises three fundamental components. First, we employ an autoencoder model to obtain a low-dimensional manifold representation of real images. Second, we apply extended semi-discrete optimal transport to map the Gaussian noise distribution to the latent space distribution and obtain corresponding labels effectively; this procedure generates new latent codes with known labels. Finally, we integrate a GAN to further train the decoder to generate medical images. To evaluate the performance of the AE-COT-GAN model, we conducted experiments on two medical image datasets, DermaMNIST and BloodMNIST, and compared its performance with state-of-the-art generative models. Results show that AE-COT-GAN achieves excellent performance in generating medical images and effectively addresses the common issues associated with traditional GANs.
{"title":"Autoencoder-based conditional optimal transport generative adversarial network for medical image generation","authors":"Jun Wang , Bohan Lei , Liya Ding , Xiaoyin Xu , Xianfeng Gu , Min Zhang","doi":"10.1016/j.visinf.2023.11.001","DOIUrl":"https://doi.org/10.1016/j.visinf.2023.11.001","url":null,"abstract":"<div><p>Medical image generation has recently garnered significant interest among researchers. However, the primary generative models, such as Generative Adversarial Networks (GANs), often encounter challenges during training, including mode collapse. To address these issues, we proposed the AE-COT-GAN model (Autoencoder-based Conditional Optimal Transport Generative Adversarial Network) for the generation of medical images belonging to specific categories. The training process of our model comprises three fundamental components. The training process of our model encompasses three fundamental components. First, we employ an autoencoder model to obtain a low-dimensional manifold representation of real images. Second, we apply extended semi-discrete optimal transport to map Gaussian noise distribution to the latent space distribution and obtain corresponding labels effectively. This procedure leads to the generation of new latent codes with known labels. Finally, we integrate a GAN to train the decoder further to generate medical images. To evaluate the performance of the AE-COT-GAN model, we conducted experiments on two medical image datasets, namely DermaMNIST and BloodMNIST. The model’s performance was compared with state-of-the-art generative models. Results show that the AE-COT-GAN model had excellent performance in generating medical images. Moreover, it effectively addressed the common issues associated with traditional GANs.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 1","pages":"Pages 15-25"},"PeriodicalIF":3.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X23000529/pdfft?md5=3af566b28e15f895521e10dc5d8d1dbc&pid=1-s2.0-S2468502X23000529-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140339084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TCEVis: Visual analytics of traffic congestion influencing factors based on explainable machine learning
Pub Date: 2024-03-01 | DOI: 10.1016/j.visinf.2023.11.003
Jialu Dong, Huijie Zhang, Meiqi Cui, Yiming Lin, Hsiang-Yun Wu, Chongke Bi
Traffic congestion is becoming increasingly severe as a result of urbanization, which not only impedes people’s ability to travel but also hinders the economic development of cities. Modeling the correlation between congestion and its influencing factors using machine learning methods makes it possible to quickly identify congested road segments. Due to the intrinsic black-box character of machine learning models, it is difficult for experts to trust the decision results of road congestion prediction models and understand the significance of congestion-causing factors. In this paper, we present a model interpretability method to investigate the potential causes of traffic congestion and quantify the importance of various influencing factors using the SHAP method. Due to the multidimensionality of these factors, it can be challenging to visually represent the impact of all factors. In response, we propose TCEVis, an interactive visual analytics system that enables multi-level exploration of road conditions. Through three case studies utilizing actual data, we demonstrate that the TCEVis system offers advantages for assisting traffic managers in analyzing the causes of traffic congestion and elucidating the significance of various influencing factors.
{"title":"TCEVis: Visual analytics of traffic congestion influencing factors based on explainable machine learning","authors":"Jialu Dong , Huijie Zhang , Meiqi Cui , Yiming Lin , Hsiang-Yun Wu , Chongke Bi","doi":"10.1016/j.visinf.2023.11.003","DOIUrl":"https://doi.org/10.1016/j.visinf.2023.11.003","url":null,"abstract":"<div><p>Traffic congestion is becoming increasingly severe as a result of urbanization, which not only impedes people’s ability to travel but also hinders the economic development of cities. Modeling the correlation between congestion and its influencing factors using machine learning methods makes it possible to quickly identify congested road segments. Due to the intrinsic black-box character of machine learning models, it is difficult for experts to trust the decision results of road congestion prediction models and understand the significance of congestion-causing factors. In this paper, we present a model interpretability method to investigate the potential causes of traffic congestion and quantify the importance of various influencing factors using the SHAP method. Due to the multidimensionality of these factors, it can be challenging to visually represent the impact of all factors. In response, we propose TCEVis, an interactive visual analytics system that enables multi-level exploration of road conditions. Through three case studies utilizing actual data, we demonstrate that the TCEVis system offers advantages for assisting traffic managers in analyzing the causes of traffic congestion and elucidating the significance of various influencing factors.</p></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 1","pages":"Pages 56-66"},"PeriodicalIF":3.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468502X23000530/pdfft?md5=71c05bc362850cbe9f83fb75c6e85e7f&pid=1-s2.0-S2468502X23000530-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140341083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}