How to Reject a VIS Paper, or Not?
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3594817
Min Chen, David Ebert, Theresa-Marie Rhyne
While it is necessary for most (if not all) visualization and visual analytics (VIS) publication venues to use peer review processes to assure the quality of the papers they publish, it is also necessary for the VIS community to appraise and improve the quality of its peer review processes from time to time. In recent years, rejecting a VIS paper seems to have become rather easy, as many rejection reasons are available for criticizing a given paper. In this article, we analyze possible causes of this phenomenon and recommend possible remedies. In particular, over the past decades, the visualization field has grown rapidly to include many types of contributions and specialized research areas. Given this large landscape of topics, we need to ensure that good contributions within each area are reviewed properly, published, and built upon to make significant advances in the area concerned. It is therefore crucial that our review processes apply criteria specific to each area and do not expect individual publications to satisfy many review criteria designed for other areas. In this way, we hope VIS review processes will enable more VIS research with X factors (original, innovative, significant, impactful, rigorous, insightful, or inspirational) to be published promptly, allowing VIS researchers and practitioners to make even more impactful contributions to data sciences.
{"title":"How to Reject a VIS Paper, or Not?","authors":"Min Chen, David Ebert, Theresa-Marie Rhyne","doi":"10.1109/MCG.2025.3594817","DOIUrl":"10.1109/MCG.2025.3594817","url":null,"abstract":"<p><p>While it is necessary for most (if not all) visualization and visual analytics (VIS) publication venues to use peer review processes to assure the quality of the papers to be published, it is also necessary for the VIS community to appraise and improve the quality of peer review processes from time to time. In recent years, rejecting a VIS paper seems to have become rather easy, as many rejection reasons are available to criticize a given paper. In this article, we analyze possible causes of this phenomenon and recommend possible remedies. In particular, over the past decades, the visualization field has rapidly grown to include many types of contributions and specialized research areas. Given this large landscape of topics, we need to ensure that good contributions within each area are reviewed properly, published, and built upon to make significant advancement in the area concerned. Therefore, it is crucial that our review process applies specific criteria for each area and does not expect individual publications to satisfy many review criteria designed for other areas. In this way, we hope VIS review processes will enable more VIS research with X factors (original, innovative, significant, impactful, rigorous, insightful, or inspirational) to be published promptly, allowing VIS researchers and practitioners to make even more impactful contributions to data sciences.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"45 6","pages":"101-111"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145497561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FashionCook: A Visual Analytics System for Human-AI Collaboration in Fashion E-Commerce Design
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3597849
Yuheng Shao, Shiyi Liu, Gongyan Chen, Ruofei Ma, Xingbo Wang, Quan Li
Fashion e-commerce design requires the integration of creativity, functionality, and responsiveness to user preferences. While AI offers valuable support, generative models often miss the nuances of user experience, and task-specific models, although more accurate, lack transparency and real-world adaptability, especially with complex multimodal data. These issues reduce designers' trust and hinder effective AI integration. To address this, we present FashionCook, a visual analytics system designed to support human-AI collaboration in the context of fashion e-commerce. The system bridges communication among model builders, designers, and marketers by providing transparent model interpretations, "what-if" scenario exploration, and iterative feedback mechanisms. We validate the system through two real-world case studies and a user study, demonstrating how FashionCook enhances collaborative workflows and improves design outcomes in data-driven fashion e-commerce environments.
{"title":"FashionCook: A Visual Analytics System for Human-AI Collaboration in Fashion E-Commerce Design.","authors":"Yuheng Shao, Shiyi Liu, Gongyan Chen, Ruofei Ma, Xingbo Wang, Quan Li","doi":"10.1109/MCG.2025.3597849","DOIUrl":"10.1109/MCG.2025.3597849","url":null,"abstract":"<p><p>Fashion e-commerce design requires the integration of creativity, functionality, and responsiveness to user preferences. While AI offers valuable support, generative models often miss the nuances of user experience, and task-specific models, although more accurate, lack transparency and real-world adaptability-especially with complex multimodal data. These issues reduce designers' trust and hinder effective AI integration. To address this, we present FashionCook, a visual analytics system designed to support human-AI collaboration in the context of fashion e-commerce. The system bridges communication among model builders, designers, and marketers by providing transparent model interpretations, \"what-if\" scenario exploration, and iterative feedback mechanisms. We validate the system through two real-world case studies and a user study, demonstrating how FashionCook enhances collaborative workflows and improves design outcomes in data-driven fashion e-commerce environments.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":"61-75"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144979300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How Visually Literate Are Large Language Models? Reflections on Recent Advances and Future Directions
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3605029
Alexander Bendeck, John Stasko, Rahul C Basole, Francesco Ferrise
Large language models (LLMs) are now being applied to the tasks of visualization generation and understanding, demonstrating these models' ability to be "visually literate." On the generation side, LLMs have shown promise in powering natural language interfaces for visualization authoring, while also suffering from usability and inconsistency issues. On the interpretation side, models (especially vision-language models) can answer basic questions about visualizations, synthesize visual and textual information, and detect misleading visual designs. However, models also tend to struggle with certain analytic tasks, and their takeaways from reading visualizations often differ from those of humans. We aim to both illuminate the state of the art in LLMs' visualization literacy and speculate on where such work may, and perhaps ought to, take us next.
{"title":"How Visually Literate Are Large Language Models? Reflections on Recent Advances and Future Directions.","authors":"Alexander Bendeck, John Stasko, Rahul C Basole, Francesco Ferrise","doi":"10.1109/MCG.2025.3605029","DOIUrl":"10.1109/MCG.2025.3605029","url":null,"abstract":"<p><p>Large language models (LLMs) are now being applied to the tasks of visualization generation and understanding, demonstrating these models' ability to be \"visually literate.\" On the generation side, LLMs have shown promise in powering natural languages' interfaces for visualization authoring while also suffering from usability and inconsistency issues. On the interpretation side, models (especially vision-language models) can answer basic questions about visualizations, synthesize visual and textual information, and detect misleading visual designs. However, models also tend to struggle with certain analytic tasks, and their takeaways from reading visualizations often differ from those of humans. We aim to both illuminate the state of the art in LLMs' visualization literacy and speculate on where such work may, and perhaps ought to, take us next.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"45 6","pages":"120-129"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145497505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MuCHEx: A Multimodal Conversational Debugging Tool for Interactive Visual Exploration of Hierarchical Object Classification
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3598204
Reza Shahriari, Yichi Yang, Danish Nisar Ahmed Tamboli, Michael Perez, Yuheng Zha, Jinyu Hou, Mingkai Deng, Eric D Ragan, Jaime Ruiz, Daisy Zhe Wang, Zhiting Hu, Eric Xing
Object recognition is a fundamental challenge in computer vision, particularly for fine-grained object classification, where classes differ in minor features. Improved fine-grained object classification requires a teaching system with numerous classes and instances of data. As the number of hierarchical levels and instances grows, debugging these models becomes increasingly complex. Moreover, different types of debugging tasks require varying approaches, explanations, and levels of detail. We present MuCHEx, a multimodal conversational system that blends natural language and visual interaction for interactive debugging of hierarchical object classification. Natural language allows users to flexibly express high-level questions or debugging goals without needing to navigate complex interfaces, while adaptive explanations surface only the most relevant visual or textual details based on the user's current task. This multimodal approach combines the expressiveness of language with the precision of direct manipulation, enabling context-aware exploration during model debugging.
{"title":"MuCHEx: A Multimodal Conversational Debugging Tool for Interactive Visual Exploration of Hierarchical Object Classification.","authors":"Reza Shahriari, Yichi Yang, Danish Nisar Ahmed Tamboli, Michael Perez, Yuheng Zha, Jinyu Hou, Mingkai Deng, Eric D Ragan, Jaime Ruiz, Daisy Zhe Wang, Zhiting Hu, Eric Xing","doi":"10.1109/MCG.2025.3598204","DOIUrl":"10.1109/MCG.2025.3598204","url":null,"abstract":"<p><p>Object recognition is a fundamental challenge in computer vision, particularly for fine-grained object classification, where classes differ in minor features. Improved fine-grained object classification requires a teaching system with numerous classes and instances of data. As the number of hierarchical levels and instances grows, debugging these models becomes increasingly complex. Moreover, different types of debugging tasks require varying approaches, explanations, and levels of detail. We present MuCHEx, a multimodal conversational system that blends natural language and visual interaction for interactive debugging of hierarchical object classification. Natural language allows users to flexibly express high-level questions or debugging goals without needing to navigate complex interfaces, while adaptive explanations surface only the most relevant visual or textual details based on the user's current task. This multimodal approach combines the expressiveness of language with the precision of direct manipulation, enabling context-aware exploration during model debugging.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":"76-88"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144838637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AnchorTextVis: A Visual Analytics Approach for Fast Comparison of Text Embeddings
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3598262
Jingzhen Zhang, Hongjiang Lv, Zhibin Niu
Visual comparison of text embeddings is crucial for analyzing semantic differences and comparing embedding models. Existing methods fail to maintain visual consistency in comparative regions and lack AI-assisted analysis, leading to high cognitive load and time-consuming exploration. In this article, we propose AnchorTextVis, a visual analytics approach that combines AnchorMap (our dynamic projection algorithm, which balances spatial quality and temporal coherence) with large language models (LLMs) to preserve users' mental map and accelerate the exploration process. We introduce the use of comparable dimensionality reduction algorithms that maintain visual consistency, such as AnchorMap from our previous work and Joint t-SNE. Building on this foundation, we leverage LLMs to compare and summarize, offering users insights. For quantitative comparisons, we define two complementary metrics: shared k-nearest neighbors (KNN) and coordinate distance. We also design intuitive representations and rich interactive tools for comparing clusters of texts as well as individual texts. We demonstrate the effectiveness and usefulness of our approach through three case studies and expert feedback.
{"title":"AnchorTextVis: A Visual Analytics Approach for Fast Comparison of Text Embeddings.","authors":"Jingzhen Zhang, Hongjiang Lv, Zhibin Niu","doi":"10.1109/MCG.2025.3598262","DOIUrl":"10.1109/MCG.2025.3598262","url":null,"abstract":"<p><p>Visual comparison of text embeddings is crucial for analyzing semantic differences and comparing embedding models. Existing methods fail to maintain visual consistency in comparative regions and lack AI-assisted analysis, leading to high cognitive loads and time-consuming exploration processes. In this article, we propose AnchorTextVis, a visual analytics approach based on AnchorMap-our dynamic projection algorithm balancing spatial quality and temporal coherence and large language models (LLMs) to preserve users' mental map and accelerate the exploration process. We introduce the use of comparable dimensionality reduction algorithms that maintain visual consistency, such as AnchorMap from our previous work and Joint t-SNE. Building on this foundation, we leverage LLMs to compare and summarize, offering users insights. For quantitative comparisons, we define two complementary metrics, Shared k-nearest neighbors (KNN) and Coordinate distance. Besides, we have also designed intuitive representation and rich interactive tools to compare clusters of texts and individual texts. We demonstrate the effectiveness and usefulness of our approach through three case studies and expert feedback.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":"29-43"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144849685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Agentic Visualization: Extracting Agent-Based Design Patterns From Visualization Systems
Pub Date: 2025-11-01 | DOI: 10.1109/MCG.2025.3607741
Vaishali Dhanoa, Anton Wolter, Gabriela Molina León, Hans-Jörg Schulz, Niklas Elmqvist
Autonomous agents powered by large language models are transforming artificial intelligence (AI), creating an imperative for the visualization field. However, our field's focus on a human in the sensemaking loop raises critical questions about autonomy, delegation, and coordination for agentic visualization that preserves human agency while amplifying analytical capabilities. This article addresses these questions by reinterpreting existing visualization systems with semiautomated or fully automatic AI components through an agentic lens. Based on this analysis, we extract a collection of design patterns for agentic visualization, covering agentic roles, communication, and coordination. These patterns provide a foundation for future agentic visualization systems that effectively harness AI agents while maintaining human insight and control.
{"title":"Agentic Visualization: Extracting Agent-Based Design Patterns From Visualization Systems.","authors":"Vaishali Dhanoa, Anton Wolter, Gabriela Molina Leon, Hans-Jorg Schulz, Niklas Elmqvist","doi":"10.1109/MCG.2025.3607741","DOIUrl":"10.1109/MCG.2025.3607741","url":null,"abstract":"<p><p>Autonomous agents powered by large language models are transforming artificial intelligence (AI), creating an imperative for the visualization area. However, our field's focus on a human in the sensemaking loop raises critical questions about autonomy, delegation, and coordination for such agentic visualization that preserve human agency while amplifying analytical capabilities. This article addresses these questions by reinterpreting existing visualization systems with semiautomated or fully automatic AI components through an agentic lens. Based on this analysis, we extract a collection of design patterns for agentic visualization, including agentic roles, communication, and coordination. These patterns provide a foundation for future agentic visualization systems that effectively harness AI agents while maintaining human insight and control.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":"89-100"},"PeriodicalIF":1.4,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145031020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MidSurfer: A Parameter-Free Approach for Mid-Surface Extraction From Segmented Volumetric Data
Pub Date: 2025-10-23 | DOI: 10.1109/MCG.2025.3624572
Eva Boneš, Dawar Khan, Ciril Bohak, Benjamin A Barad, Danielle A Grotjahn, Ivan Viola, Thomas Theußl
This paper presents MidSurfer, a novel parameter-free method for extracting mid-surfaces from segmented volumetric data. The method generates uniformly triangulated, smooth meshes that accurately capture structural features. The process begins with the Ridge Field Transformation step, which transforms the segmented input data, followed by the Mid-Polyline Extraction Algorithm, which operates on individual volume slices. Depending on the connectivity of components, this step yields either a single polyline segment or multiple segments representing the structural features. These segments form a coherent series, creating a backbone of regularly spaced points that represents the mid-surface. Subsequently, we employ a Polyline Zipper Algorithm for triangulation, which connects these polyline segments across neighboring slices, yielding a detailed triangulated mid-surface mesh. Results show that this method outperforms previous techniques in versatility, simplicity, and accuracy. Our approach is publicly available as a ParaView plugin at https://github.com/kaust-vislab/MidSurfer.
{"title":"MidSurfer: A Parameter-Free Approach for Mid-Surface Extraction From Segmented Volumetric Data.","authors":"Eva Bones, Dawar Khan, Ciril Bohak, Benjamin A Barad, Danielle A Grotjahn, Ivan Viola, Thomas Theusl","doi":"10.1109/MCG.2025.3624572","DOIUrl":"https://doi.org/10.1109/MCG.2025.3624572","url":null,"abstract":"<p><p>This paper presents MidSurfer, a novel parameter-free method for extracting mid-surfaces from segmented volumetric data. The method generates uniformly triangulated, smooth meshes that accurately capture structural features. The process begins with the Ridge Field Transformation step that transforms the segmented input data, followed by the Mid-Polyline Extraction Algorithm that works on individual volume slices. Based on the connectivity of components, this step can result in either single or multiple polyline segments that represent the structural features. These segments form a coherent series, creating a backbone of regularly spaced points representing the mid-surface. Subsequently, we employ a Polyline Zipper Algorithm for triangulation that connects these polyline segments across neighboring slices, yielding a detailed triangulated mid-surface mesh. Results show that this method outperforms previous techniques in versatility, simplicity, and accuracy. Our approach is publicly available as a ParaView plugin at https://github.com/kaust-vislab/MidSurfer.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145356842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards softerware: Enabling personalization of interactive data representations for users with disabilities
Pub Date: 2025-09-12 | DOI: 10.1109/MCG.2025.3609294
Frank Elavsky, Marita Vindedal, Ted Gies, Patrick Carrington, Dominik Moritz, Øystein Moseng
Accessible design for some may still produce barriers for others. This tension, called access friction, creates challenges for both designers and end users with disabilities. To address this, we present the concept of softerware, a system design approach that gives end users the agency to meaningfully customize and adapt interfaces to their needs. To apply softerware to visualization, we assembled 195 data visualization customization options centered on the barriers we expect users with disabilities to experience. We built a prototype that applies a subset of these options and interviewed practitioners for feedback. Lastly, we conducted a design probe study with blind and low-vision accessibility professionals to learn more about their challenges and visions for softerware. We observed access frictions between our participants' designs, and they expressed that for softerware to succeed, current and future systems must be designed with accessible defaults, interoperability, persistence, and respect for a user's perceived effort-to-outcome ratio.
{"title":"Towards softerware: Enabling personalization of interactive data representations for users with disabilities.","authors":"Frank Elavsky, Marita Vindedal, Ted Gies, Patrick Carrington, Dominik Moritz, Oystein Moseng","doi":"10.1109/MCG.2025.3609294","DOIUrl":"https://doi.org/10.1109/MCG.2025.3609294","url":null,"abstract":"<p><p>Accessible design for some may still produce barriers for others. This tension, called access friction, creates challenges for both designers and end-users with disabilities. To address this, we present the concept of softerware, a system design approach that provides end users with agency to meaningfully customize and adapt interfaces to their needs. To apply softerware to visualization, we assembled 195 data visualization customization options centered on the barriers we expect users with disabilities will experience. We built a prototype that applies a subset of these options and interviewed practitioners for feedback. Lastly, we conducted a design probe study with blind and low vision accessibility professionals to learn more about their challenges and visions for softerware. We observed access frictions between our participant's designs and they expressed that for softerware's success, current and future systems must be designed with accessible defaults, interoperability, persistence, and respect for a user's perceived effort-to-outcome ratio.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145056047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtual Staging of Indoor Panoramic Images via Multi-task Learning and Inverse Rendering
Pub Date: 2025-09-03 | DOI: 10.1109/MCG.2025.3605806
Uzair Shah, Sara Jashari, Muhammad Tukur, Mowafa Househ, Jens Schneider, Giovanni Pintore, Enrico Gobbetti, Marco Agus
Capturing indoor environments with 360° images provides a cost-effective method for creating immersive content. However, virtual staging (removing existing furniture and inserting new objects with realistic lighting) remains challenging. We present VISPI (Virtual Staging Pipeline for Single Indoor Panoramic Images), a framework that enables interactive restaging of indoor scenes from a single panoramic image. Our approach combines multi-task deep learning with real-time rendering to extract geometric, semantic, and material information from cluttered scenes. The system includes: i) a vision transformer that simultaneously predicts depth, normals, semantics, albedo, and material properties; ii) spherical Gaussian lighting estimation; iii) real-time editing for interactive object placement; iv) stereoscopic multi-center-of-projection generation for head-mounted display exploration. The framework processes input through two pathways: extracting clutter-free representations for virtual staging and estimating material properties, including metallic and roughness signals. We evaluate VISPI on the Structured3D and FutureHouse datasets, demonstrating applications in real estate visualization, interior design, and virtual environment creation.
{"title":"Virtual Staging of Indoor Panoramic Images via Multi-task Learning and Inverse Rendering.","authors":"Uzair Shah, Sara Jashari, Muhammad Tukur, Mowafa Househ, Jens Schneider, Giovanni Pintore, Enrico Gobbetti, Marco Agus","doi":"10.1109/MCG.2025.3605806","DOIUrl":"https://doi.org/10.1109/MCG.2025.3605806","url":null,"abstract":"<p><p>Capturing indoor environments with 360° images provides a cost-effective method for creating immersive content. However, virtual staging - removing existing furniture and inserting new objects with realistic lighting - remains challenging. We present VISPI (Virtual Staging Pipeline for Single Indoor Panoramic Images), a framework that enables interactive restaging of indoor scenes from a single panoramic image. Our approach combines multi-task deep learning with real-time rendering to extract geometric, semantic, and material information from cluttered scenes. The system includes: i) a vision transformer that simultaneously predicts depth, normals, semantics, albedo, and material properties; ii) spherical Gaussian lighting estimation; iii) real-time editing for interactive object placement; iv) stereoscopic Multi-Center-Of-Projection generation for Head Mounted Display exploration. The framework processes input through two pathways: extracting clutter-free representations for virtual staging and estimating material properties including metallic and roughness signals. We evaluate VISPI on Structured3D and FutureHouse datasets, demonstrating applications in real estate visualization, interior design, and virtual environment creation.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tooth Completion and Reconstruction in Digital Orthodontics
Pub Date: 2025-09-02 | DOI: 10.1109/MCG.2025.3605266
Hao Yu, Longdu Liu, Shuangmin Chen, Shiqing Xin, Changhe Tu
In the field of digital orthodontics, dental models with complete roots are essential digital assets, particularly for visualization and treatment planning. However, intraoral scans typically capture only dental crowns, leaving roots missing. In this paper, we introduce a meticulously designed algorithmic pipeline to complete dental models while preserving crown geometry and mesh topology. Our pipeline begins with learning-based point cloud completion applied to existing dental crowns. We then reconstruct a complete tooth model, encompassing both the crown and root, to guide subsequent processing steps. Next, we restore the crown's original geometry and mesh topology using a strong Delaunay meshing structure; the correctness of this approach has been thoroughly established in existing literature. Finally, we optimize the transition region between crown and root using bi-harmonic smoothing. A key advantage of our approach is that the completed tooth model accurately maintains the geometry and mesh topology of the original crown, while also ensuring high-quality triangulation of dental roots.
{"title":"Tooth Completion and Reconstruction in Digital Orthodontics.","authors":"Hao Yu, Longdu Liu, Shuangmin Chen, Shiqing Xin, Changhe Tu","doi":"10.1109/MCG.2025.3605266","DOIUrl":"https://doi.org/10.1109/MCG.2025.3605266","url":null,"abstract":"<p><p>In the field of digital orthodontics, dental models with complete roots are essential digital assets, particularly for visualization and treatment planning. However, intraoral scans typically capture only dental crowns, leaving roots missing. In this paper, we introduce a meticulously designed algorithmic pipeline to complete dental models while preserving crown geometry and mesh topology. Our pipeline begins with learning-based point cloud completion applied to existing dental crowns. We then reconstruct a complete tooth model, encompassing both the crown and root, to guide subsequent processing steps. Next, we restore the crown's original geometry and mesh topology using a strong Delaunay meshing structure; the correctness of this approach has been thoroughly established in existing literature. Finally, we optimize the transition region between crown and root using bi-harmonic smoothing. A key advantage of our approach is that the completed tooth model accurately maintains the geometry and mesh topology of the original crown, while also ensuring high-quality triangulation of dental roots.</p>","PeriodicalId":55026,"journal":{"name":"IEEE Computer Graphics and Applications","volume":"PP ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144979308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}