
arXiv - CS - Computer Vision and Pattern Recognition: Latest Publications

LPT++: Efficient Training on Mixture of Long-tailed Experts
Pub Date: 2024-09-17 DOI: arxiv-2409.11323
Bowen Dong, Pan Zhou, Wangmeng Zuo
We introduce LPT++, a comprehensive framework for long-tailed classification that combines parameter-efficient fine-tuning (PEFT) with a learnable model ensemble. LPT++ enhances frozen Vision Transformers (ViTs) through the integration of three core components. The first is a universal long-tailed adaptation module, which aggregates long-tailed prompts and visual adapters to adapt the pretrained model to the target domain while improving its discriminative ability. The second is a mixture of long-tailed experts framework with a mixture-of-experts (MoE) scorer, which adaptively calculates reweighting coefficients for the confidence scores of both visual-only and visual-language (VL) model experts to generate more accurate predictions. Finally, LPT++ employs a three-phase training framework in which each critical module is learned separately, resulting in a stable and effective long-tailed classification training paradigm. We also propose a simplified version of LPT++, named LPT, which integrates only a visual-only pretrained ViT and long-tailed prompts to form a single-model method. LPT clearly illustrates how long-tailed prompts work while achieving comparable performance without VL pretrained models. Experiments show that, with only ~1% extra trainable parameters, LPT++ achieves accuracy comparable to all counterparts.
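As a concrete illustration of the MoE scorer described above, the following minimal PyTorch sketch (an assumption-based reconstruction, not the authors' code) shows a small gate that takes the confidence scores of a visual-only expert and a VL expert and predicts reweighting coefficients for fusing them; the class name `MoEScorer` and the layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class MoEScorer(nn.Module):
    def __init__(self, num_classes: int, hidden: int = 64):
        super().__init__()
        # takes the concatenated per-class confidences of both experts
        self.gate = nn.Sequential(
            nn.Linear(2 * num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),  # one reweighting coefficient per expert
        )

    def forward(self, logits_vis: torch.Tensor, logits_vl: torch.Tensor) -> torch.Tensor:
        p_vis = logits_vis.softmax(dim=-1)
        p_vl = logits_vl.softmax(dim=-1)
        w = self.gate(torch.cat([p_vis, p_vl], dim=-1)).softmax(dim=-1)  # (B, 2)
        # weighted fusion of the two experts' confidence scores
        return w[:, :1] * p_vis + w[:, 1:] * p_vl

scorer = MoEScorer(num_classes=100)
fused = scorer(torch.randn(4, 100), torch.randn(4, 100))  # (4, 100) fused scores
```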
Citations: 0
TopoMaskV2: Enhanced Instance-Mask-Based Formulation for the Road Topology Problem
Pub Date: 2024-09-17 DOI: arxiv-2409.11325
M. Esat Kalfaoglu, Halil Ibrahim Ozturk, Ozsel Kilinc, Alptekin Temizel
Recently, the centerline has become a popular representation of lanes due to its advantages in solving the road topology problem. To enhance centerline prediction, we have developed a new approach called TopoMask. Unlike previous methods that rely on keypoints or parametric methods, TopoMask utilizes an instance-mask-based formulation coupled with a masked-attention-based transformer architecture. We introduce a quad-direction label representation to enrich the mask instances with flow information and design a corresponding post-processing technique for mask-to-centerline conversion. Additionally, we demonstrate that the instance-mask formulation provides complementary information to parametric Bezier regressions, and fusing both outputs leads to improved detection and topology performance. Moreover, we analyze the shortcomings of the pillar assumption in the Lift Splat technique and adapt a multi-height bin configuration. Experimental results show that TopoMask achieves state-of-the-art performance in the OpenLane-V2 dataset, increasing from 44.1 to 49.4 for Subset-A and from 44.7 to 51.8 for Subset-B on the V1.1 OLS baseline.
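To illustrate the interplay between the parametric and mask-based centerline heads, here is a toy sketch (hypothetical arrays, not the TopoMask implementation) that samples points from a cubic Bezier centerline and naively fuses them with points recovered from an instance mask; a real fusion would be learned or confidence-weighted.

```python
import numpy as np

def bezier_points(ctrl: np.ndarray, n: int = 20) -> np.ndarray:
    """Evaluate a cubic Bezier curve with 4 control points (4, 2) at n samples."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3) * ctrl[0] + 3 * ((1 - t) ** 2) * t * ctrl[1] \
        + 3 * (1 - t) * (t ** 2) * ctrl[2] + (t ** 3) * ctrl[3]   # (n, 2)

# hypothetical outputs of the two heads for one lane instance
bezier_ctrl = np.array([[0.0, 0.0], [5.0, 1.0], [10.0, 1.5], [15.0, 2.0]])
mask_centerline = bezier_points(bezier_ctrl) + np.random.normal(0, 0.1, (20, 2))

# naive late fusion: average the two point sets sampled at matching arc positions
fused_centerline = 0.5 * (bezier_points(bezier_ctrl) + mask_centerline)
```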
Citations: 0
fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction
Pub Date: 2024-09-17 DOI: arxiv-2409.11315
Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu
Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data, introduced as Recon3DMind in our conference work, is of significant interest to both cognitive neuroscience and computer vision. To advance this task, we present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4768 3D objects. The dataset comprises two components: fMRI-Shape, previously introduced and accessible at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Shape, and fMRI-Objaverse, proposed in this paper and available at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Objaverse. fMRI-Objaverse includes data from 5 subjects, 4 of whom are also part of the Core set in fMRI-Shape, with each subject viewing 3142 3D objects across 117 categories, all accompanied by text captions. This significantly enhances the diversity and potential applications of the dataset. Additionally, we propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals. The framework first extracts and aggregates features from fMRI data using a neuro-fusion encoder, then employs a feature-bridge diffusion model to generate visual features, and finally reconstructs the 3D object using a generative transformer decoder. We establish new benchmarks by designing metrics at both semantic and structural levels to evaluate model performance. Furthermore, we assess our model's effectiveness in an out-of-distribution setting and analyze the attribution of the extracted features and the visual ROIs in fMRI signals. Our experiments demonstrate that MinD-3D not only reconstructs 3D objects with high semantic and spatial accuracy but also deepens our understanding of how the human brain processes 3D visual information. Project page: https://jianxgao.github.io/MinD-3D.
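The three-stage MinD-3D pipeline can be summarized with placeholder modules; the sketch below is only a structural outline under assumed feature dimensions (the names `NeuroFusionEncoder`, `FeatureBridge`, and `ShapeDecoder` stand in for the paper's components and are not the released code).

```python
import torch
import torch.nn as nn

class NeuroFusionEncoder(nn.Module):
    """Stand-in for the fMRI feature extractor/aggregator."""
    def __init__(self, n_voxels: int, dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(n_voxels, dim), nn.GELU(), nn.Linear(dim, dim))
    def forward(self, fmri):                 # (B, n_voxels) -> (B, dim)
        return self.proj(fmri)

class FeatureBridge(nn.Module):
    """Stand-in for the feature-bridge diffusion model mapping fMRI features to visual features."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
    def forward(self, z):
        return self.net(z)

class ShapeDecoder(nn.Module):
    """Stand-in for the generative transformer decoder producing 3D shape tokens."""
    def __init__(self, dim: int = 512, n_tokens: int = 256):
        super().__init__()
        self.head = nn.Linear(dim, n_tokens)
    def forward(self, v):
        return self.head(v)

fmri = torch.randn(2, 4096)                                               # toy fMRI input
tokens = ShapeDecoder()(FeatureBridge()(NeuroFusionEncoder(4096)(fmri)))  # (2, 256)
```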
Citations: 0
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Pub Date: 2024-09-17 DOI: arxiv-2409.11406
Zhenwei Wang, Tengfei Wang, Zexin He, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau
In 3D modeling, designers often use an existing 3D model as a reference to create new ones. This practice has inspired the development of Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation. Given an image, our method leverages a retrieved or user-provided 3D reference model to guide the generation process, thereby enhancing the generation quality, generalization ability, and controllability. Our model integrates three key components: 1) meta-ControlNet, which dynamically modulates the conditioning strength, 2) dynamic reference routing, which mitigates misalignment between the input image and 3D reference, and 3) self-reference augmentations, which enable self-supervised training with a progressive curriculum. Collectively, these designs result in a clear improvement over existing methods. Phidias establishes a unified framework for 3D generation using text, image, and 3D conditions with versatile applications.
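A rough sketch of the "dynamically modulated conditioning strength" idea behind the meta-ControlNet component: a small gate predicts a per-sample weight that scales how strongly the 3D-reference features influence the generator. The module below (`MetaGate`) is a hypothetical simplification, not the Phidias architecture.

```python
import torch
import torch.nn as nn

class MetaGate(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # predicts a conditioning-strength weight in [0, 1] from both feature sets
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1), nn.Sigmoid()
        )

    def forward(self, img_feat: torch.Tensor, ref_feat: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([img_feat, ref_feat], dim=-1))  # (B, 1)
        # the 3D-reference features contribute in proportion to the predicted weight
        return img_feat + w * ref_feat

gate = MetaGate(dim=256)
out = gate(torch.randn(4, 256), torch.randn(4, 256))  # (4, 256)
```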
Citations: 0
RenderWorld: World Model with Self-Supervised 3D Label
Pub Date: 2024-09-17 DOI: arxiv-2409.11356
Ziyang Yan, Wenzhen Dong, Yihua Shao, Yuhang Lu, Liu Haiyang, Jingwen Liu, Haozhe Wang, Zhe Wang, Yan Wang, Fabio Remondino, Yuexin Ma
End-to-end autonomous driving with vision only is not only more cost-effective compared to LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose RenderWorld, a vision-only end-to-end autonomous driving framework, which generates 3D occupancy labels using a self-supervised Gaussian-based Img2Occ Module, then encodes the labels with AM-VAE, and uses a world model for forecasting and planning. RenderWorld employs Gaussian Splatting to represent 3D scenes and render 2D images, which greatly improves segmentation accuracy and reduces GPU memory consumption compared with NeRF-based methods. By applying AM-VAE to encode air and non-air separately, RenderWorld achieves more fine-grained scene element representation, leading to state-of-the-art performance in both 4D occupancy forecasting and motion planning from the autoregressive world model.
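The sketch below illustrates, under assumed tensor shapes, what encoding air and non-air occupancy separately could look like (class 0 is assumed to be air); it is a stand-in for the AM-VAE encoder, not the paper's model.

```python
import torch
import torch.nn as nn

class SplitOccEncoder(nn.Module):
    def __init__(self, n_classes: int, dim: int = 64):
        super().__init__()
        self.air_enc = nn.Conv3d(1, dim, 3, padding=1)                 # channel 0: air
        self.nonair_enc = nn.Conv3d(n_classes - 1, dim, 3, padding=1)  # remaining semantic classes

    def forward(self, occ: torch.Tensor) -> torch.Tensor:
        # occ: (B, n_classes, X, Y, Z) one-hot semantic occupancy, class 0 assumed to be air
        z_air = self.air_enc(occ[:, :1])
        z_obj = self.nonair_enc(occ[:, 1:])
        return torch.cat([z_air, z_obj], dim=1)  # latent passed on to the world model

enc = SplitOccEncoder(n_classes=17)
z = enc(torch.zeros(1, 17, 32, 32, 8))  # (1, 128, 32, 32, 8)
```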
Citations: 0
GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module
Pub Date: 2024-09-17 DOI: arxiv-2409.11307
Yichen Zhang, Zihan Wang, Jiali Han, Peilin Li, Jiaxun Zhang, Jianqiang Wang, Lei He, Keqiang Li
3D Gaussian Splatting (3DGS) integrates the strengths of primitive-based representations and volumetric rendering techniques, enabling real-time, high-quality rendering. However, 3DGS models typically overfit to single-scene training and are highly sensitive to the initialization of Gaussian ellipsoids, heuristically derived from Structure from Motion (SfM) point clouds, which limits both generalization and practicality. To address these limitations, we propose GS-Net, a generalizable, plug-and-play 3DGS module that densifies Gaussian ellipsoids from sparse SfM point clouds, enhancing geometric structure representation. To the best of our knowledge, GS-Net is the first plug-and-play 3DGS module with cross-scene generalization capabilities. Additionally, we introduce the CARLA-NVS dataset, which incorporates additional camera viewpoints to thoroughly evaluate reconstruction and rendering quality. Extensive experiments demonstrate that applying GS-Net to 3DGS yields a PSNR improvement of 2.08 dB for conventional viewpoints and 1.86 dB for novel viewpoints, confirming the method's effectiveness and robustness.
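To make "densifying Gaussian ellipsoids from sparse SfM point clouds" concrete, here is a toy, non-learned sketch that simply inserts midpoints between each point and its nearest neighbors; GS-Net itself is a learned, generalizable module, so this only illustrates the goal of densification.

```python
import numpy as np

def densify_midpoints(points: np.ndarray, k: int = 3) -> np.ndarray:
    """points: (N, 3) sparse SfM points -> original points plus midpoints to k nearest neighbors."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                        # ignore self-distances
    idx = np.argsort(d, axis=1)[:, :k]                 # k nearest neighbors of each point
    mids = 0.5 * (points[:, None, :] + points[idx])    # (N, k, 3) midpoints
    return np.concatenate([points, mids.reshape(-1, 3)], axis=0)

sparse = np.random.rand(100, 3)
dense = densify_midpoints(sparse)  # 100 original + 300 inserted points (duplicates possible)
```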
Citations: 0
MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping
Pub Date: 2024-09-17 DOI: arxiv-2409.11316
Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh
Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either have to discard intricate local semantic features or suffer from high computational complexity. To address these challenges, we propose a new Few-shot Semantic Segmentation framework based on the transformer architecture. Our approach introduces the spatial transformer decoder and the contextual mask generation module to improve the relational understanding between support and query images. Moreover, we introduce a multi-scale decoder to refine the segmentation mask by incorporating features from different resolutions in a hierarchical manner. Additionally, our approach integrates global features from intermediate encoder stages to improve contextual understanding, while maintaining a lightweight structure to reduce complexity. This balance between performance and efficiency enables our method to achieve state-of-the-art results on benchmark datasets such as $PASCAL-5^i$ and $COCO-20^i$ in both 1-shot and 5-shot settings. Notably, our model with only 1.5 million parameters demonstrates competitive performance while overcoming limitations of existing methodologies. Code: https://github.com/amirrezafateh/MSDNet
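A generic sketch of a hierarchical multi-scale decoder of the kind described above: coarse features are upsampled, concatenated with finer features, and fused stage by stage. The channel sizes and layer choices below are assumptions, not the MSDNet configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleDecoder(nn.Module):
    def __init__(self, channels=(256, 128, 64), num_classes: int = 2):
        super().__init__()
        # one fusion conv per transition from a coarser to a finer scale
        self.fuse = nn.ModuleList(
            nn.Conv2d(channels[i] + channels[i + 1], channels[i + 1], 3, padding=1)
            for i in range(len(channels) - 1)
        )
        self.head = nn.Conv2d(channels[-1], num_classes, 1)

    def forward(self, feats):
        # feats: list of feature maps ordered from coarse (low res) to fine (high res)
        x = feats[0]
        for fuse, skip in zip(self.fuse, feats[1:]):
            x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear", align_corners=False)
            x = fuse(torch.cat([x, skip], dim=1))
        return self.head(x)  # refined segmentation logits at the finest scale

dec = MultiScaleDecoder()
logits = dec([torch.randn(1, 256, 16, 16), torch.randn(1, 128, 32, 32), torch.randn(1, 64, 64, 64)])
```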
Citations: 0
Uncertainty and Prediction Quality Estimation for Semantic Segmentation via Graph Neural Networks
Pub Date: 2024-09-17 DOI: arxiv-2409.11373
Edgar Heinert, Stephan Tilgner, Timo Palm, Matthias Rottmann
When employing deep neural networks (DNNs) for semantic segmentation in safety-critical applications like automotive perception or medical imaging, it is important to estimate their performance at runtime, e.g., via uncertainty estimates or prediction quality estimates. Previous works mostly performed uncertainty estimation on the pixel level. In one line of research, a connected-component-wise (segment-wise) perspective was taken, approaching uncertainty estimation on an object level by performing so-called meta classification and regression to estimate uncertainty and prediction quality, respectively. In those works, each predicted segment is considered individually to estimate its uncertainty or prediction quality. However, the neighboring segments may provide additional hints on whether a given predicted segment is of high quality, which we study in the present work. On the basis of uncertainty-indicating metrics on the segment level, we use graph neural networks (GNNs) to model a given segment's quality as a function of its own metrics as well as those of its neighboring segments. We compare different GNN architectures and achieve a notable performance improvement.
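The segment-neighborhood idea can be sketched with a single round of message passing: per-segment metric vectors are node features, adjacency between neighboring segments defines the graph, and a small head regresses each segment's quality. This is a simplified stand-in, not one of the GNN architectures compared in the paper.

```python
import torch
import torch.nn as nn

class SegmentGNN(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 32):
        super().__init__()
        self.msg = nn.Linear(in_dim, hidden)           # message from each neighbor's metrics
        self.upd = nn.Linear(in_dim + hidden, hidden)  # update from own metrics + aggregated messages
        self.head = nn.Linear(hidden, 1)               # meta-regression: predicted segment quality

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) segment-wise metrics, adj: (N, N) 0/1 adjacency of neighboring segments
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = (adj @ self.msg(x)) / deg              # mean over neighboring segments
        h = torch.relu(self.upd(torch.cat([x, neigh], dim=-1)))
        return self.head(h).squeeze(-1)                # one quality estimate per segment

model = SegmentGNN(in_dim=8)
quality = model(torch.randn(5, 8), torch.ones(5, 5) - torch.eye(5))  # (5,)
```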
Citations: 0
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Pub Date: 2024-09-17 DOI: arxiv-2409.11355
Gonzalo Martin Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, Bastian Leibe
Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results, high computational demands due to multi-step inference limited its use in many scenarios. In this paper, we show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed. The fixed model performs comparably to the best previously reported configuration while being more than 200$\times$ faster. To optimize for downstream task performance, we perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models on common zero-shot benchmarks. We surprisingly find that this fine-tuning protocol also works directly on Stable Diffusion and achieves performance comparable to current state-of-the-art diffusion-based depth and normal estimation models, calling into question some of the conclusions drawn from prior works.
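As an example of a "task-specific loss" that end-to-end depth fine-tuning might use, the sketch below implements an affine-invariant L1 loss that aligns the prediction to the ground truth with a least-squares scale and shift; the exact losses used in the paper may differ.

```python
import torch

def affine_invariant_l1(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    # pred, gt: (B, H, W) depth maps; align pred to gt with a per-image scale s and shift t
    p = pred.flatten(1)
    g = gt.flatten(1)
    A = torch.stack([p, torch.ones_like(p)], dim=-1)          # (B, HW, 2) design matrix [pred, 1]
    sol = torch.linalg.lstsq(A, g.unsqueeze(-1)).solution     # (B, 2, 1): least-squares [s, t]
    aligned = (A @ sol).squeeze(-1)                           # s * pred + t
    return (aligned - g).abs().mean()

loss = affine_invariant_l1(torch.rand(2, 32, 32), torch.rand(2, 32, 32))
```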
Citations: 0
OmniGen: Unified Image Generation
Pub Date: 2024-09-17 DOI: arxiv-2409.11340
Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan, Xingrun Xing, Ruiran Yan, Shuting Wang, Tiejun Huang, Zheng Liu
In this work, we introduce OmniGen, a new diffusion model for unified image generation. Unlike popular diffusion models (e.g., Stable Diffusion), OmniGen no longer requires additional modules such as ControlNet or IP-Adapter to process diverse control conditions. OmniGen is characterized by the following features: 1) Unification: OmniGen not only demonstrates text-to-image generation capabilities but also inherently supports other downstream tasks, such as image editing, subject-driven generation, and visual-conditional generation. Additionally, OmniGen can handle classical computer vision tasks by transforming them into image generation tasks, such as edge detection and human pose recognition. 2) Simplicity: The architecture of OmniGen is highly simplified, eliminating the need for additional text encoders. Moreover, it is more user-friendly compared to existing diffusion models, enabling complex tasks to be accomplished through instructions without the need for extra preprocessing steps (e.g., human pose estimation), thereby significantly simplifying the workflow of image generation. 3) Knowledge Transfer: Through learning in a unified format, OmniGen effectively transfers knowledge across different tasks, manages unseen tasks and domains, and exhibits novel capabilities. We also explore the model's reasoning capabilities and potential applications of the chain-of-thought mechanism. This work represents the first attempt at a general-purpose image generation model, and there remain several unresolved issues. We will open-source the related resources at https://github.com/VectorSpaceLab/OmniGen to foster advancements in this field.
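A conceptual sketch of the "no separate text encoder" design: instruction tokens and image tokens are packed into a single sequence and processed by one transformer backbone. Every module and size below is an assumption for illustration, not the OmniGen implementation.

```python
import torch
import torch.nn as nn

vocab_size, dim = 32000, 256
text_embed = nn.Embedding(vocab_size, dim)   # instruction tokens embedded directly, no extra text encoder
image_proj = nn.Linear(16, dim)              # patchified image/latent tokens (toy patch size)
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2
)

instr = torch.randint(0, vocab_size, (1, 12))  # e.g. a tokenized editing instruction
img_tokens = torch.randn(1, 64, 16)            # e.g. 8x8 latent patches of the input image
seq = torch.cat([text_embed(instr), image_proj(img_tokens)], dim=1)
out = backbone(seq)                            # one model processes text and image jointly
```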
Citations: 0