
arXiv - CS - Multimedia: Latest Publications

PPVF: An Efficient Privacy-Preserving Online Video Fetching Framework with Correlated Differential Privacy
Pub Date : 2024-08-27 DOI: arxiv-2408.14735
Xianzhi Zhang, Yipeng Zhou, Di Wu, Quan Z. Sheng, Miao Hu, Linchang Xiao
Online video streaming has evolved into an integral component of the contemporary Internet landscape. Yet, the disclosure of user requests presents formidable privacy challenges. As users stream their preferred online videos, their requests are automatically seized by video content providers, potentially leaking users' privacy. Unfortunately, current protection methods are not well-suited to preserving user request privacy from content providers while maintaining high-quality online video services. To tackle this challenge, we introduce a novel Privacy-Preserving Video Fetching (PPVF) framework, which utilizes trusted edge devices to pre-fetch and cache videos, ensuring the privacy of users' requests while optimizing the efficiency of edge caching. More specifically, we design PPVF with three core components: (1) Online privacy budget scheduler, which employs a theoretically guaranteed online algorithm to select non-requested videos as candidates with assigned privacy budgets. Alternative videos are chosen by an online algorithm that is theoretically guaranteed to consider both video utilities and available privacy budgets. (2) Noisy video request generator, which generates redundant video requests (in addition to original ones) utilizing correlated differential privacy to obfuscate request privacy. (3) Online video utility predictor, which leverages federated learning to collaboratively evaluate video utility in an online fashion, aiding in video selection in (1) and noise generation in (2). Finally, we conduct extensive experiments using real-world video request traces from Tencent Video. The results demonstrate that PPVF effectively safeguards user request privacy while upholding high video caching performance.
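The core obfuscation idea — issuing extra, plausible decoy requests chosen under per-video privacy budgets and predicted utilities — can be conveyed with a minimal sketch. This is not the paper's algorithm: the exponential-mechanism-style weighting, the `select_decoys` name, and all parameter values are illustrative assumptions.

```python
import numpy as np

def select_decoys(utilities, budgets, epsilon, k, rng=None):
    """Pick k decoy (non-requested) videos to fetch alongside the real request.

    Candidates with higher predicted utility are favoured, in the spirit of an
    exponential mechanism; candidates whose privacy budget is exhausted are
    excluded.
    """
    if rng is None:
        rng = np.random.default_rng()
    scores = np.where(budgets > 0, utilities, -np.inf)   # mask exhausted budgets
    weights = np.exp(epsilon * (scores - scores.max()))  # epsilon controls peakedness
    probs = weights / weights.sum()
    return rng.choice(len(utilities), size=k, replace=False, p=probs)

# Toy usage: 6 candidate videos, fetch 2 decoys next to the real request.
utilities = np.array([0.9, 0.2, 0.7, 0.4, 0.8, 0.1])
budgets   = np.array([1.0, 1.0, 0.0, 1.0, 0.5, 1.0])  # video 2 has no budget left
print("decoy video ids:", select_decoys(utilities, budgets, epsilon=2.0, k=2))
```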
Citations: 0
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Pub Date : 2024-08-26 DOI: arxiv-2408.14547
Nicholas Moratelli, Davide Caffagni, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
The conventional training approach for image captioning involves pre-training a network using teacher forcing and subsequent fine-tuning with Self-Critical Sequence Training to maximize hand-crafted captioning metrics. However, when attempting to optimize modern and higher-quality metrics like CLIP-Score and PAC-Score, this training method often encounters instability and fails to acquire the genuine descriptive capabilities needed to produce fluent and informative captions. In this paper, we propose a new training paradigm termed Direct CLIP-Based Optimization (DiCO). Our approach jointly learns and optimizes a reward model that is distilled from a learnable captioning evaluator with high human correlation. This is done by solving a weighted classification problem directly inside the captioner. At the same time, DiCO prevents divergence from the original model, ensuring that fluency is maintained. DiCO not only exhibits improved stability and enhanced quality in the generated captions but also aligns more closely with human preferences compared to existing methods, especially in modern metrics. Additionally, it maintains competitive performance in traditional metrics. Our source code and trained models are publicly available at https://github.com/aimagelab/DiCO.
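As a rough illustration of the general idea of weighting a classification loss over sampled candidate captions by a CLIP-style reward, the sketch below uses random placeholder embeddings; `clip_style_reward`, `weighted_preference_loss`, and the temperature `beta` are assumptions for exposition, not the DiCO implementation.

```python
import numpy as np

def clip_style_reward(image_emb, caption_embs):
    """Cosine similarity between an image embedding and candidate caption
    embeddings, standing in for a learned CLIP-based evaluator."""
    img = image_emb / np.linalg.norm(image_emb)
    caps = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)
    return caps @ img

def weighted_preference_loss(log_probs, rewards, beta=1.0):
    """Treat candidate captions as classes and weight the cross-entropy by a
    softmax over their rewards, pushing probability toward higher-reward ones."""
    targets = np.exp(beta * rewards)
    targets /= targets.sum()
    return -(targets * log_probs).sum()

rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
caption_embs = rng.normal(size=(4, 512))      # 4 sampled candidate captions
log_probs = np.log(np.full(4, 0.25))          # captioner's current log-likelihoods
rewards = clip_style_reward(image_emb, caption_embs)
print("loss:", weighted_preference_loss(log_probs, rewards))
```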
Citations: 0
Digital Fingerprinting on Multimedia: A Survey
Pub Date : 2024-08-26 DOI: arxiv-2408.14155
Wendi Chen, Wensheng Gan, Philip S. Yu
The explosive growth of multimedia content in the digital economy era has brought challenges in content recognition, copyright protection, and data management. As an emerging content management technology, perceptual hash-based digital fingerprints, serving as compact summaries of multimedia content, have been widely adopted for efficient multimedia content identification and retrieval across different modalities (e.g., text, image, video, audio), attracting significant attention from both academia and industry. Despite the increasing applications of digital fingerprints, there is a lack of systematic and comprehensive literature review on multimedia digital fingerprints. This survey aims to fill this gap and provide an important resource for researchers studying the details and related advancements of multimedia digital fingerprints. The survey first introduces the definition, characteristics, and related concepts (including hash functions, granularity, similarity measures, etc.) of digital fingerprints. It then focuses on analyzing and summarizing the algorithms for extracting unimodal fingerprints of different types of digital content, including text fingerprints, image fingerprints, video fingerprints, and audio fingerprints. Particularly, it provides an in-depth review and summary of deep learning-based fingerprints. Additionally, the survey elaborates on the various practical applications of digital fingerprints and outlines the challenges and potential future research directions. The goal is to promote the continued development of multimedia digital fingerprint research.
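A minimal example of the perceptual-hashing idea the survey covers: an average hash serves as the compact fingerprint and Hamming distance as the similarity measure. The block-mean downsampling and the 8x8 hash size are illustrative choices, not tied to any specific method from the survey.

```python
import numpy as np

def average_hash(gray_img, hash_size=8):
    """Perceptual (average) hash: downsample by block means, threshold at the
    global mean, and keep the resulting bit pattern as a compact fingerprint."""
    h, w = gray_img.shape
    img = gray_img[:h - h % hash_size, :w - w % hash_size]
    blocks = img.reshape(hash_size, img.shape[0] // hash_size,
                         hash_size, img.shape[1] // hash_size).mean(axis=(1, 3))
    return (blocks > blocks.mean()).flatten()

def hamming_distance(fp_a, fp_b):
    """Similarity measure between two fingerprints: number of differing bits."""
    return int(np.count_nonzero(fp_a != fp_b))

rng = np.random.default_rng(1)
img = rng.random((64, 64))
near_duplicate = np.clip(img + rng.normal(scale=0.02, size=img.shape), 0, 1)
# A lightly perturbed copy should typically stay within a few bits of the original.
print(hamming_distance(average_hash(img), average_hash(near_duplicate)))
```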
Citations: 0
HABD: a houma alliance book ancient handwritten character recognition database
Pub Date : 2024-08-26 DOI: arxiv-2408.14084
Xiaoyu Yuan, Xiaohua Huang, Zibo Zhang, Yabo Sun
The Houma Alliance Book, one of history's earliest calligraphic examples, was unearthed in the 1970s. These artifacts were meticulously organized, reproduced, and copied by the Shanxi Provincial Institute of Cultural Relics. However, because of their ancient origins and severe ink erosion, identifying characters in the Houma Alliance Book is challenging, necessitating the use of digital technology. In this paper, we propose a new ancient handwritten character recognition database for the Houma Alliance Book, along with a novel benchmark based on deep learning architectures. More specifically, a collection of 26,732 character samples from the Houma Alliance Book was gathered, encompassing 327 different types of ancient characters through iterative annotation. Furthermore, benchmark algorithms were proposed by combining four deep neural network classifiers with two data augmentation methods. This research provides valuable resources and technical support for further studies on the Houma Alliance Book and other ancient characters. This contributes to our understanding of ancient culture and history, as well as the preservation and inheritance of humanity's cultural heritage.
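The benchmark protocol (four classifiers crossed with two augmentation methods) amounts to evaluating every classifier/augmentation pair; a toy sketch of that grid is shown below, with dummy stand-ins for the actual networks, transforms, and dataset.

```python
from itertools import product

# Hypothetical stand-ins for the four classifiers and two augmentations the
# benchmark combines; real experiments would plug in trained networks and
# actual image transforms over the HABD samples.
classifiers   = {"cnn_a": lambda x: 0, "cnn_b": lambda x: 0,
                 "cnn_c": lambda x: 1, "cnn_d": lambda x: 1}
augmentations = {"rotate": lambda x: x, "elastic": lambda x: x}

def evaluate(model, augment, samples):
    """Accuracy of one (classifier, augmentation) pair on labelled samples."""
    correct = sum(model(augment(img)) == label for img, label in samples)
    return correct / len(samples)

samples = [([0.0], 0), ([1.0], 1)]   # toy (image, label) pairs
for clf_name, aug_name in product(classifiers, augmentations):
    acc = evaluate(classifiers[clf_name], augmentations[aug_name], samples)
    print(f"{clf_name} + {aug_name}: accuracy {acc:.2f}")
```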
Citations: 0
Localization of Synthetic Manipulations in Western Blot Images
Pub Date : 2024-08-25 DOI: arxiv-2408.13786
Anmol Manjunath, Viola Negroni, Sara Mandelli, Daniel Moreira, Paolo Bestagini
Recent breakthroughs in deep learning and generative systems have significantly fostered the creation of synthetic media, as well as the local alteration of real content via the insertion of highly realistic synthetic manipulations. Local image manipulation, in particular, poses serious challenges to the integrity of digital content and societal trust. This problem is not only confined to multimedia data, but also extends to biological images included in scientific publications, like images depicting Western blots. In this work, we address the task of localizing synthetic manipulations in Western blot images. To discriminate between pristine and synthetic pixels of an analyzed image, we propose a synthetic detector that operates on small patches extracted from the image. We aggregate patch contributions to estimate a tampering heatmap, highlighting synthetic pixels out of pristine ones. Our methodology proves effective when tested over two manipulated Western blot image datasets, one altered automatically and the other manually by exploiting advanced AI-based image manipulation tools that are unknown at our training stage. We also explore the robustness of our method over an external dataset of other scientific images depicting different semantics, manipulated through unseen generation techniques.
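A simple sketch of the patch-scoring-and-aggregation step described above: overlapping patches are scored by a detector and averaged back into a pixel-level heatmap. The patch size, the stride, and the toy brightness-based detector are assumptions for illustration only.

```python
import numpy as np

def tampering_heatmap(img, detector, patch=32, stride=16):
    """Score overlapping patches with a synthetic-vs-pristine detector and
    aggregate the per-patch scores into a pixel-level tampering heatmap."""
    h, w = img.shape
    heat = np.zeros((h, w))
    hits = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            score = detector(img[y:y + patch, x:x + patch])  # prob. of synthetic
            heat[y:y + patch, x:x + patch] += score
            hits[y:y + patch, x:x + patch] += 1
    return heat / np.maximum(hits, 1)   # average over overlapping patches

# Toy detector standing in for the trained patch classifier: it flags bright
# patches, mimicking a locally inserted manipulation.
toy_detector = lambda p: float(p.mean() > 0.8)
img = np.zeros((128, 128))
img[40:70, 60:90] = 1.0                 # "manipulated" region
print(tampering_heatmap(img, toy_detector).max())
```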
Citations: 0
SpeechCraft: A Fine-grained Expressive Speech Dataset with Natural Language Description
Pub Date : 2024-08-24 DOI: arxiv-2408.13608
Zeyu Jin, Jia Jia, Qixin Wang, Kehan Li, Shuoyi Zhou, Songtao Zhou, Xiaoyu Qin, Zhiyong Wu
Speech-language multi-modal learning presents a significant challenge due to the fine nuanced information inherent in speech styles. Therefore, a large-scale dataset providing elaborate comprehension of speech style is urgently needed to facilitate insightful interplay between speech audio and natural language. However, constructing such datasets presents a major trade-off between large-scale data collection and high-quality annotation. To tackle this challenge, we propose an automatic speech annotation system for expressiveness interpretation that annotates in-the-wild speech clips with expressive and vivid human language descriptions. Initially, speech audios are processed by a series of expert classifiers and captioning models to capture diverse speech characteristics, followed by a fine-tuned LLaMA for customized annotation generation. Unlike previous tag/template-based annotation frameworks with limited information and diversity, our system provides in-depth understandings of speech style through tailored natural language descriptions, thereby enabling accurate and voluminous data generation for large model training. With this system, we create SpeechCraft, a fine-grained bilingual expressive speech dataset. It is distinguished by highly descriptive natural language style prompts, containing approximately 2,000 hours of audio data and encompassing over two million speech clips. Extensive experiments demonstrate that the proposed dataset significantly boosts speech-language task performance in stylist speech synthesis and speech style understanding.
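The annotation pipeline — expert attribute classifiers followed by a language model that phrases the description — could be wired together roughly as below. The attribute names, the prompt format, and the `describe` callable standing in for the fine-tuned LLaMA are all hypothetical.

```python
def annotate_clip(audio, classifiers, describe):
    """Run a bank of expert attribute classifiers over a speech clip, then hand
    their outputs to a language model that phrases a natural-language
    description of the speaking style.

    `classifiers` maps attribute names to callables; `describe` stands in for a
    fine-tuned LLM endpoint -- both are illustrative assumptions.
    """
    attributes = {name: clf(audio) for name, clf in classifiers.items()}
    prompt = ("Describe the speaking style of a clip with these attributes: "
              + ", ".join(f"{k}={v}" for k, v in attributes.items()))
    return describe(prompt)

# Toy usage with stand-in components.
classifiers = {"gender": lambda a: "female", "emotion": lambda a: "excited",
               "pitch": lambda a: "high", "speed": lambda a: "fast"}
describe = lambda prompt: f"[LLM output for: {prompt}]"
print(annotate_clip(audio=None, classifiers=classifiers, describe=describe))
```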
Citations: 0
Loc4Plan: Locating Before Planning for Outdoor Vision and Language Navigation
Pub Date : 2024-08-09 DOI: arxiv-2408.05090
Huilin Tian, Jingke Meng, Wei-Shi Zheng, Yuan-Ming Li, Junkai Yan, Yunong Zhang
Vision and Language Navigation (VLN) is a challenging task that requires agents to understand instructions and navigate to the destination in a visual environment. One of the key challenges in outdoor VLN is keeping track of which part of the instruction was completed. To alleviate this problem, previous works mainly focus on grounding the natural language to the visual input, while neglecting the crucial role of the agent's spatial position information in the grounding process. In this work, we first explore the substantial effect of spatial position locating on the grounding of outdoor VLN, drawing inspiration from human navigation. In real-world navigation scenarios, before planning a path to the destination, humans typically need to figure out their current location. This observation underscores the pivotal role of spatial localization in the navigation process. In this work, we introduce a novel framework, Locating before Planning (Loc4Plan), designed to incorporate spatial perception for action planning in outdoor VLN tasks. The main idea behind Loc4Plan is to perform the spatial localization before planning a decision action based on corresponding guidance, which comprises a block-aware spatial locating (BAL) module and a spatial-aware action planning (SAP) module. Specifically, to help the agent perceive its spatial location in the environment, we propose to learn a position predictor that measures how far the agent is from the next intersection for reflecting its position, which is achieved by the BAL module. After the locating process, we propose the SAP module to incorporate spatial information to ground the corresponding guidance and enhance the precision of action planning. Extensive experiments on the Touchdown and map2seq datasets show that the proposed Loc4Plan outperforms the SOTA methods.
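A toy control loop conveying the locate-before-plan idea: a position predictor (standing in for the BAL module) estimates the distance to the next intersection, and that estimate gates when the planner consumes the next instruction step. The distance threshold, the `predict_distance` stand-in, and the discrete action set are illustrative assumptions, not the Loc4Plan modules.

```python
def locate_then_plan(visual_feat, instruction_steps, step_idx, predict_distance):
    """Locate before planning: estimate how far the agent is from the next
    intersection, then let that spatial cue decide whether to keep moving
    forward or ground the current instruction step into a turn."""
    distance = predict_distance(visual_feat)
    if distance > 1.0:                      # not at an intersection yet
        return "forward", step_idx
    action = instruction_steps[step_idx]    # ground the current sub-instruction
    return action, min(step_idx + 1, len(instruction_steps) - 1)

# Toy rollout with a stand-in distance predictor reading a precomputed value.
predict_distance = lambda feat: feat["dist"]
steps = ["left", "right", "stop"]
idx = 0
for feat in [{"dist": 3.0}, {"dist": 0.5}, {"dist": 2.0}, {"dist": 0.2}]:
    action, idx = locate_then_plan(feat, steps, idx, predict_distance)
    print(action)
```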
Citations: 0
Deep joint source-channel coding for wireless point cloud transmission
Pub Date : 2024-08-09 DOI: arxiv-2408.04889
Cixiao Zhang, Mufan Liu, Wenjie Huang, Yin Xu, Yiling Xu, Dazhi He
The growing demand for high-quality point cloud transmission over wireless networks presents significant challenges, primarily due to the large data sizes and the need for efficient encoding techniques. In response to these challenges, we introduce a novel system named Deep Point Cloud Semantic Transmission (PCST), designed for end-to-end wireless point cloud transmission. Our approach employs a progressive resampling framework using sparse convolution to project point cloud data into a semantic latent space. These semantic features are subsequently encoded through a deep joint source-channel (JSCC) encoder, generating the channel-input sequence. To enhance transmission efficiency, we use an adaptive entropy-based approach to assess the importance of each semantic feature, allowing transmission lengths to vary according to their predicted entropy. PCST is robust across diverse Signal-to-Noise Ratio (SNR) levels and supports an adjustable rate-distortion (RD) trade-off, ensuring flexible and efficient transmission. Experimental results indicate that PCST significantly outperforms traditional separate source-channel coding (SSCC) schemes, delivering superior reconstruction quality while achieving over a 50% reduction in bandwidth usage.
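One plausible reading of an entropy-adaptive transmission length is a channel-symbol allocation proportional to each latent feature's predicted entropy; the sketch below illustrates that allocation idea under stated assumptions and is not the PCST scheme itself.

```python
import numpy as np

def allocate_symbols(entropies, total_symbols, min_symbols=1):
    """Give each semantic feature a channel-symbol budget roughly proportional
    to its predicted entropy, so more informative features are sent with
    longer transmissions."""
    weights = entropies / entropies.sum()
    alloc = np.maximum(min_symbols, np.floor(weights * total_symbols)).astype(int)
    # Hand any leftover symbols to the highest-entropy features first.
    leftover = total_symbols - alloc.sum()
    order = np.argsort(-entropies)
    for i in range(int(max(leftover, 0))):
        alloc[order[i % len(alloc)]] += 1
    return alloc

entropies = np.array([4.2, 0.8, 2.5, 1.1])   # predicted bits per latent feature
print(allocate_symbols(entropies, total_symbols=64))
```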
Citations: 0
Emotional Cues Extraction and Fusion for Multi-modal Emotion Prediction and Recognition in Conversation
Pub Date : 2024-08-08 DOI: arxiv-2408.04547
Haoxiang Shi, Ziqi Liang, Jun Yu
Emotion Prediction in Conversation (EPC) aims to forecast the emotions of forthcoming utterances by utilizing preceding dialogues. Previous EPC approaches relied on simple context modeling for emotion extraction, overlooking fine-grained emotion cues at the word level. Additionally, prior works failed to account for the intrinsic differences between modalities, resulting in redundant information. To overcome these limitations, we propose an emotional cues extraction and fusion network, which consists of two stages: a modality-specific learning stage that utilizes word-level labels and prosody learning to construct emotion embedding spaces for each modality, and a two-step fusion stage for integrating multi-modal features. Moreover, the emotion features extracted by our model are also applicable to the Emotion Recognition in Conversation (ERC) task. Experimental results validate the efficacy of the proposed method, demonstrating superior performance on both IEMOCAP and MELD datasets.
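A bare-bones sketch of a two-step fusion over modality-specific emotion embeddings (first align them into a shared space, then merge with a weighting); the truncation-based alignment and the fixed weight stand in for the learned components and are purely illustrative.

```python
import numpy as np

def two_step_fusion(text_emb, audio_emb, w_text=0.5):
    """Two-step fusion sketch: step 1 aligns the modality-specific emotion
    embeddings to a common dimensionality (toy truncation here, a learned
    projection in practice); step 2 merges them with a weighting that stands
    in for trained fusion parameters."""
    d = min(len(text_emb), len(audio_emb))
    text_shared, audio_shared = text_emb[:d], audio_emb[:d]
    return w_text * text_shared + (1 - w_text) * audio_shared

text_emb = np.array([0.2, 0.9, -0.1, 0.4])   # toy text-modality emotion embedding
audio_emb = np.array([0.5, 0.3, 0.0])        # toy prosody/audio embedding
print(two_step_fusion(text_emb, audio_emb))
```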
Citations: 0
MultiColor: Image Colorization by Learning from Multiple Color Spaces
Pub Date : 2024-08-08 DOI: arxiv-2408.04172
Xiangcheng Du, Zhao Zhou, Yanlong Wang, Zhuoyao Wang, Yingbin Zheng, Cheng Jin
Deep networks have shown impressive performance in image restoration tasks, such as image colorization. However, we find that previous approaches rely on the digital representation from a single color model with a specific mapping function, a.k.a. color space, during the colorization pipeline. In this paper, we first investigate the modeling of different color spaces, and find each of them exhibiting distinctive characteristics with a unique distribution of colors. The complementarity among multiple color spaces leads to benefits for the image colorization task. We present MultiColor, a new learning-based approach to automatically colorize grayscale images that combines clues from multiple color spaces. Specifically, we employ a set of dedicated colorization modules for individual color spaces. Within each module, a transformer decoder is first employed to refine color query embeddings and then a color mapper produces color channel predictions using the embeddings and semantic features. With these predicted color channels representing various color spaces, a complementary network is designed to exploit the complementarity and generate pleasing and reasonable colorized images. We conduct extensive experiments on real-world datasets, and the results demonstrate superior performance over the state-of-the-art.
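The fusion stage — per-color-space modules whose predictions are mapped back to a common space and blended — can be sketched as below, with dummy modules, identity converters, and fixed weights standing in for the real color-space conversions and the learned complementary network.

```python
import numpy as np

def fuse_color_spaces(gray, modules, to_rgb, weights):
    """Combine per-color-space colorization predictions: each dedicated module
    predicts the channels of its own space, everything is mapped back to RGB,
    and a weighting (stand-in for the learned complementary network) blends
    the results into one colorized image."""
    rgb_preds = [to_rgb[name](modules[name](gray)) for name in modules]
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    return sum(w * pred for w, pred in zip(weights, rgb_preds))

# Toy stand-ins: two "color spaces" whose modules and converters are dummies.
gray = np.random.default_rng(2).random((4, 4))
modules = {"lab": lambda g: np.stack([g, g * 0.5, g * 0.2], axis=-1),
           "yuv": lambda g: np.stack([g, g * 0.1, g * 0.8], axis=-1)}
to_rgb  = {"lab": lambda x: x, "yuv": lambda x: x}   # identity stand-ins
print(fuse_color_spaces(gray, modules, to_rgb, weights=[0.6, 0.4]).shape)
```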
Citations: 0