
Latest publications in Computers & Graphics-UK

Choreographing multi-degree of freedom behaviors in large-scale crowd simulations
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-23 | DOI: 10.1016/j.cag.2024.104051
Kexiang Huang , Gangyi Ding , Dapeng Yan , Ruida Tang , Tianyu Huang , Nuria Pelechano

This study introduces a novel framework for choreographing multi-degree of freedom (MDoF) behaviors in large-scale crowd simulations. The framework integrates multi-objective optimization with spatio-temporal ordering to effectively generate and control diverse MDoF crowd behavior states. We propose a set of evaluation criteria for assessing the aesthetic quality of crowd states and employ multi-objective optimization to produce crowd states that meet these criteria. Additionally, we introduce time offset functions and interpolation progress functions to perform complex and diversified behavior state interpolations. Furthermore, we designed a user-centric interaction module that allows for intuitive and flexible adjustments of crowd behavior states through sketching, spline curves, and other interactive means. Qualitative tests and quantitative experiments on the evaluation criteria demonstrate the effectiveness of this method in generating and controlling MDoF behaviors in crowds. Finally, case studies, including real-world applications in the Opening Ceremony of the 2022 Beijing Winter Olympics, validate the practicality and adaptability of this approach.
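The time offset and interpolation progress functions mentioned above can be sketched minimally. The specific function shapes below (a radial wave-front offset and smoothstep easing) are illustrative assumptions, not the paper's actual formulas:

```python
import math

def interpolation_progress(t: float) -> float:
    """Smoothstep easing: maps raw progress t in [0, 1] to eased progress."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def time_offset(agent_pos, origin, wave_speed=1.0):
    """Radial time offset: agents farther from the origin start later,
    producing a wave that sweeps across the crowd."""
    dx = agent_pos[0] - origin[0]
    dy = agent_pos[1] - origin[1]
    return math.hypot(dx, dy) / wave_speed

def agent_state(start_state, end_state, global_t, agent_pos, origin, duration=1.0):
    """Interpolate one agent's behavior state vector with a per-agent offset."""
    local_t = (global_t - time_offset(agent_pos, origin)) / duration
    w = interpolation_progress(local_t)
    return tuple(s + w * (e - s) for s, e in zip(start_state, end_state))
```

With a radial offset, an agent at the wave origin finishes its transition before a distant agent has begun, which is the kind of spatio-temporal ordering the framework choreographs.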

Citations: 0
US-Net: U-shaped network with Convolutional Attention Mechanism for ultrasound medical images
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-23 | DOI: 10.1016/j.cag.2024.104054
Xiaoyu Xie , Pingping Liu , Yijun Lang , Zhenjie Guo , Zhongxi Yang , Yuhao Zhao

Ultrasound imaging, characterized by low contrast, high noise, and interference from surrounding tissues, poses significant challenges in lesion segmentation. To tackle these issues, we introduce an enhanced U-shaped network that incorporates several novel features for precise, automated segmentation. Firstly, our model utilizes a convolution-based self-attention mechanism to establish long-range dependencies in feature maps, crucial for small dataset applications, accompanied by a soft thresholding method for noise reduction. Secondly, we employ multi-sized convolutional kernels to enrich feature processing, coupled with curvature calculations to accentuate edge details via a soft-attention approach. Thirdly, an advanced skip connection strategy is implemented in the UNet architecture, integrating information entropy to assess and utilize texture-rich channels, thereby improving semantic detail in the encoder before merging with decoder outputs. We validated our approach using a newly curated dataset, VPUSI (Vascular Plaques Ultrasound Images), alongside the established datasets, BUSI, TN3K and DDTI. Comparative experiments on these datasets show that our model outperforms existing state-of-the-art techniques in segmentation accuracy.
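The soft thresholding step used for noise reduction is a standard shrinkage operator; a minimal sketch (the threshold value here is arbitrary, and in the network it would typically be learned per channel):

```python
import numpy as np

def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
    """Shrink values toward zero by tau; responses smaller than tau
    (likely noise) are zeroed, larger ones are attenuated."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

Applied to feature maps, this suppresses low-magnitude activations that are dominated by speckle noise while preserving the sign and relative magnitude of strong responses.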

Citations: 0
ShapeBench: A new approach to benchmarking local 3D shape descriptors
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104052
Bart Iver van Blokland

The ShapeBench evaluation methodology is proposed as an extension to the popular Area Under Precision-Recall Curve (PRC/AUC) for measuring the matching performance of local 3D shape descriptors. It is observed that the PRC inadequately accounts for other similar surfaces in the same or different objects when determining whether a candidate match is a true positive. The novel Descriptor Distance Index (DDI) metric is introduced to address this limitation. In contrast to previous evaluation methodologies, which identify entire objects in a given scene, the DDI metric measures descriptor performance by analysing point-to-point distances. The ShapeBench methodology is also more scalable than previous approaches, by using procedural generation. The benchmark is used to evaluate both old and new descriptors. The results produced by the implementation of the benchmark are fully replicable, and are made publicly available.
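The abstract does not give the exact DDI formula, but the core idea of measuring descriptor performance via point-to-point distances can be sketched as ranking the true match's descriptor distance against distractor descriptors (the function name and ranking scheme below are assumptions for illustration):

```python
import numpy as np

def descriptor_distance_rank(query, match, distractors):
    """Rank of the true match's descriptor distance among distractors.
    Rank 0 means the correct match is closer to the query than every
    distractor descriptor; higher ranks mean similar surfaces elsewhere
    outcompete the true correspondence."""
    d_match = np.linalg.norm(query - match)
    d_all = np.linalg.norm(distractors - query, axis=1)
    return int(np.sum(d_all < d_match))
```

Unlike a PRC threshold sweep over whole objects, a per-correspondence rank like this directly exposes cases where other similar surfaces produce descriptors that are closer than the true match.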

Citations: 0
Graph Transformer for 3D point clouds classification and semantic segmentation
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104050
Wei Zhou , Qian Wang , Weiwei Jin , Xinzhe Shi , Ying He

Recently, graph-based and Transformer-based deep learning methods have demonstrated excellent performance on various point cloud tasks. Most existing graph-based methods rely on a static graph, taking a fixed input to establish graph relations. Moreover, many graph-based methods aggregate neighboring features by max- or average-pooling, so that either only a single neighboring point affects the centroid's feature or all neighboring points exert the same influence on it, ignoring the correlations and differences between points. Most Transformer-based approaches extract point cloud features with global attention and lack feature learning on local neighborhoods. To address these issues of graph-based and Transformer-based models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn point cloud features at both local and global scales. Graph Transformer integrates the advantages of graph-based and Transformer-based methods and consists of a Local Transformer that uses intra-domain cross-attention and a Global Transformer that uses global self-attention. Finally, we apply GTNet to shape classification, part segmentation, and semantic segmentation tasks. The experimental results show that our model achieves good learning and prediction ability on most tasks.
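The aggregation problem the abstract criticizes can be made concrete with a small sketch contrasting the fixed pooling schemes with a similarity-weighted alternative (this is an illustrative toy, not GTNet's actual attention block):

```python
import numpy as np

def aggregate(neighbor_feats, mode="attention", center=None):
    """Aggregate k neighbor features (k, C) into one (C,) vector.
    'max' lets a single neighbor dominate; 'mean' gives every neighbor
    identical influence; 'attention' weights neighbors by their
    similarity to the center point's feature."""
    if mode == "max":
        return neighbor_feats.max(axis=0)
    if mode == "mean":
        return neighbor_feats.mean(axis=0)
    scores = neighbor_feats @ center            # (k,) dot-product similarity
    scores = scores - scores.max()              # numerical stability
    w = np.exp(scores) / np.exp(scores).sum()   # softmax weights
    return w @ neighbor_feats                   # weighted sum
```

The attention variant is the behavior cross-attention provides: each neighbor's contribution depends on its relation to the centroid, rather than being all-or-nothing (max) or uniform (mean).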

Citations: 0
Analyzing the effect of undermining on suture forces during simulated skin flap surgeries with a three-dimensional finite element method
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104057
Wenzhangzhi Guo , Allison Tsz Kwan Lau , Joel C. Davies , Vito Forte , Eitan Grinspun , Lueder Alexander Kahrs

Skin flap procedures are commonly used by surgeons to cover an excised area during the reconstruction of a defect. Coming up with the optimal design for a patient is often a challenging task. In this paper, we set up a simulation system based on the finite element method for one of the most common flap types — the rhomboid flap. Instead of using the standard 2D planar patch, we constructed a 3D patch with multiple layers, which allowed us to investigate the impact of different undermining areas and depths. We compared the suture forces for each case and identified the vertices with the largest suture force. The shape of the final suture line is also visualized for each case, an important clue when deciding on the optimal skin flap orientation according to medical textbooks. We found that under the optimal undermining setup, the maximum suture force is around 0.7 N at the top of the undermined layer and 1.0 N at the bottom. When measuring differences in final suture line shape, the maximum normalized Hausdorff distance is 0.099, which suggests that different undermining regions can have a significant impact on the shape of the suture line, especially in the tail region.
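The normalized Hausdorff distance used to compare suture line shapes can be sketched as follows; the normalization by the joint bounding-box diagonal is an assumption here, since the abstract does not state how the distance is normalized:

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (n, 2), (m, 2)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (n, m) pairwise
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def normalized_hausdorff(a, b):
    """Hausdorff distance scaled by the joint bounding-box diagonal,
    making shape differences comparable across suture lines of
    different sizes (normalization choice is assumed)."""
    pts = np.vstack([a, b])
    diag = np.linalg.norm(pts.max(axis=0) - pts.min(axis=0))
    return hausdorff(a, b) / diag
```

Under this scaling, a value like the reported 0.099 would mean the worst-case deviation between two suture lines is roughly a tenth of their overall extent.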

Citations: 0
Foreword to the special section on Shape Modeling International 2024 (SMI2024)
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104047
Georges-Pierre Bonneau, Tao Ju, Zichun Zhong
Citations: 0
OpenECAD: An efficient visual language model for editable 3D-CAD design
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-22 | DOI: 10.1016/j.cag.2024.104048
Zhe Yuan , Jianqi Shi , Yanhong Huang

Computer-aided design (CAD) tools are utilized in the manufacturing industry for modeling everything from cups to spacecraft. These programs are complex to use and typically require years of training and experience to master. Structured and well-constrained 2D sketches and 3D constructions are crucial components of CAD modeling. A well-executed CAD model can be seamlessly integrated into the manufacturing process, thereby enhancing production efficiency. Deep generative models of 3D shapes and 3D object reconstruction models have garnered significant research interest. However, most of these models produce discrete forms of 3D objects that are not editable. Moreover, the few models based on CAD operations often have substantial input restrictions. In this work, we fine-tuned pre-trained models to create OpenECAD models (0.55B, 0.89B, 2.4B and 3.1B), leveraging the visual, logical, coding, and general capabilities of visual language models. OpenECAD models can process images of 3D designs as input and generate highly structured 2D sketches and 3D construction commands, ensuring that the designs are editable. These outputs can be directly used with existing CAD tools’ APIs to generate project files. To train our network, we created a series of OpenECAD datasets. These datasets are derived from existing public CAD datasets, adjusted and augmented to meet the specific requirements of vision language model (VLM) training. Additionally, we have introduced an approach that utilizes dependency relationships to define and generate sketches, further enriching the content and functionality of the datasets.
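The "highly structured 2D sketches and 3D construction commands" can be pictured as a small program of sketch primitives plus an extrude operation. The types and the example part below are hypothetical, invented to illustrate why such output stays editable (any primitive can be changed and the part regenerated), and do not reflect OpenECAD's actual command grammar:

```python
from dataclasses import dataclass, field

@dataclass
class Line:
    x1: float
    y1: float
    x2: float
    y2: float

@dataclass
class Circle:
    cx: float
    cy: float
    r: float

@dataclass
class Extrude:
    sketch: list = field(default_factory=list)  # closed 2D profile + inner loops
    depth: float = 0.0

# Hypothetical editable program: a 20x10 plate with a hole, extruded 5 units.
plate = Extrude(
    sketch=[
        Line(0, 0, 20, 0), Line(20, 0, 20, 10),
        Line(20, 10, 0, 10), Line(0, 10, 0, 0),
        Circle(10, 5, 2),
    ],
    depth=5.0,
)
```

A representation like this maps naturally onto the sketch-and-extrude APIs of existing CAD tools, which is what allows the model's output to be converted directly into project files.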

Citations: 0
Foreword to the Special Section on XR Technologies for Healthcare and Wellbeing
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-20 | DOI: 10.1016/j.cag.2024.104046
Anderson Maciel, Matias Volonte, Helena Mentis
Citations: 0
LSGRNet: Local Spatial Latent Geometric Relation Learning Network for 3D point cloud semantic segmentation
IF 2.5 | CAS Region 4 (Computer Science) | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-08-20 | DOI: 10.1016/j.cag.2024.104053
Liguo Luo, Jian Lu, Xiaogai Chen, Kaibing Zhang, Jian Zhou

In recent years, the Transformer model has demonstrated a remarkable ability to capture long-range dependencies and improve point cloud segmentation performance. However, the localized regions produced by conventional sampling architectures destroy the structural information of instances, and the potential geometric relationships between localized regions remain unexplored. To address this issue, this paper proposes a Local Spatial Latent Geometric Relation Learning Network (LSGRNet) that uses the geometric properties of point clouds as a reference. Specifically, spatial transformation and gradient computation are performed on the local point cloud to uncover potential geometric relationships within the local neighborhood. Furthermore, a local relationship aggregator based on semantic and geometric relationships is constructed to enable the interaction of spatial geometric structure and information within the local neighborhood. Simultaneously, a boundary interaction feature learning module is employed to learn the boundary information of the point cloud, aiming to better describe the local structure. The experimental results indicate that the proposed LSGRNet exhibits excellent segmentation performance in benchmark tests on the indoor datasets S3DIS and ScanNetV2, as well as the outdoor datasets SemanticKITTI and Semantic3D.
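The "gradient computation on the local point cloud" can be approximated, as a rough illustration only, by computing offset vectors from each point to its k nearest neighbors; the paper's actual spatial transformation is not specified in the abstract:

```python
import numpy as np

def local_gradients(points, k=4):
    """For each point (N, 3), return offset vectors to its k nearest
    neighbors (N, k, 3) — a simple stand-in for gradient-style features
    that encode local geometric relations within a neighborhood."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-matches
    idx = np.argsort(d, axis=1)[:, :k]          # (N, k) neighbor indices
    return points[idx] - points[:, None, :]     # (N, k, 3) offsets
```

Offsets like these are translation-invariant, so a network consuming them learns local shape rather than absolute position, which is the usual motivation for gradient-style local features.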

{"title":"LSGRNet: Local Spatial Latent Geometric Relation Learning Network for 3D point cloud semantic segmentation","authors":"Liguo Luo,&nbsp;Jian Lu,&nbsp;Xiaogai Chen,&nbsp;Kaibing Zhang,&nbsp;Jian Zhou","doi":"10.1016/j.cag.2024.104053","DOIUrl":"10.1016/j.cag.2024.104053","url":null,"abstract":"<div><p>In recent years, remarkable ability has been demonstrated by the Transformer model in capturing remote dependencies and improving point cloud segmentation performance. However, localized regions separated from conventional sampling architectures have resulted in the destruction of structural information of instances and a lack of exploration of potential geometric relationships between localized regions. To address this issue, a Local Spatial Latent Geometric Relation Learning Network (LSGRNet) is proposed in this paper, with the geometric properties of point clouds serving as a reference. Specifically, spatial transformation and gradient computation are performed on the local point cloud to uncover potential geometric relationships within the local neighborhood. Furthermore, a local relationship aggregator based on semantic and geometric relationships is constructed to enable the interaction of spatial geometric structure and information within the local neighborhood. Simultaneously, boundary interaction feature learning module is employed to learn the boundary information of the point cloud, aiming to better describe the local structure. 
The experimental results indicate that excellent segmentation performance is exhibited by the proposed LSGRNet in benchmark tests on the indoor datasets S3DIS and ScanNetV2, as well as the outdoor datasets SemanticKITTI and Semantic3D.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104053"},"PeriodicalIF":2.5,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142048642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An impartial framework to investigate demosaicking input embedding options
IF 2.5 CAS Tier 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2024-08-16 DOI: 10.1016/j.cag.2024.104044
Yan Niu , Xuanchen Li , Yang Tao , Bo Zhao

Convolutional Neural Networks (CNNs) have proven highly effective for demosaicking, transforming raw Color Filter Array (CFA) sensor samples into standard RGB images. Directly applying convolution to the CFA tensor can lead to misinterpretation of the color context, so existing demosaicking networks typically embed the CFA tensor into the Euclidean space before convolution. The most prevalent embedding options are Reordering and Pre-interpolation. However, it remains unclear which option is more advantageous for demosaicking. Moreover, no existing demosaicking network is suitable for conducting a fair comparison. As a result, in practice, the selection of these two embedding options is often based on intuition and heuristic approaches. This paper addresses the non-comparability between the two options and investigates whether pre-interpolation contributes additional knowledge to the demosaicking network. Based on rigorous mathematical derivation, we design pairs of end-to-end fully convolutional evaluation networks, ensuring that the performance difference between each pair of networks can be solely attributed to their differing CFA embedding strategies. Under strictly fair comparison conditions, we measure the performance contrast between the two embedding options across various scenarios. Our comprehensive evaluation reveals that the prior knowledge introduced by pre-interpolation benefits lightweight models. Additionally, pre-interpolation enhances the robustness to imaging artifacts for larger models. Our findings offer practical guidelines for designing imaging software or Image Signal Processors (ISPs) for RGB cameras.

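The two embedding options contrasted in the abstract can be sketched in a few lines of NumPy. This is an illustration, not the paper's code: Reordering packs the four phases of an RGGB Bayer mosaic into a half-resolution 4-channel tensor, while Pre-interpolation produces a full-resolution 3-channel RGB estimate. The block-replication and green-averaging fill used here is a deliberately crude stand-in for whatever interpolator a real pipeline would use, and both function names are ours:

```python
import numpy as np

def reorder_embedding(cfa):
    """Reordering: pack an RGGB Bayer mosaic (H, W) into a
    half-resolution tensor (H/2, W/2, 4), one channel per CFA phase."""
    r  = cfa[0::2, 0::2]   # red samples
    g1 = cfa[0::2, 1::2]   # green samples on red rows
    g2 = cfa[1::2, 0::2]   # green samples on blue rows
    b  = cfa[1::2, 1::2]   # blue samples
    return np.stack([r, g1, g2, b], axis=-1)

def preinterp_embedding(cfa):
    """Pre-interpolation: embed the mosaic as a full-resolution
    (H, W, 3) RGB estimate via naive block replication; the green
    plane averages the two green phases."""
    h, w = cfa.shape
    r  = cfa[0::2, 0::2].astype(float)
    g1 = cfa[0::2, 1::2].astype(float)
    g2 = cfa[1::2, 0::2].astype(float)
    b  = cfa[1::2, 1::2].astype(float)
    up = lambda p: np.repeat(np.repeat(p, 2, axis=0), 2, axis=1)[:h, :w]
    return np.stack([up(r), up((g1 + g2) / 2), up(b)], axis=-1)

mosaic = np.arange(16, dtype=float).reshape(4, 4)
packed = reorder_embedding(mosaic)    # shape (2, 2, 4)
rgb    = preinterp_embedding(mosaic)  # shape (4, 4, 3)
```

Note the trade-off the paper studies: Reordering preserves the raw samples but halves spatial resolution, whereas Pre-interpolation injects prior knowledge (the interpolator) into the input at full resolution.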
Computers & Graphics-Uk, Volume 123, Article 104044.
Citations: 0
Journal: Computers & Graphics-Uk