International Journal of Speech Technology: Latest Articles


An approach for speech enhancement with dysarthric speech recognition using optimization based machine learning frameworks

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-02-21 DOI: 10.1007/s10772-023-10019-y

Bhuvaneshwari Jolad, Rajashri Khanai

Citations: 2

An empirical study on analysis window functions for text-independent speaker recognition

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-02-19 DOI: 10.1007/s10772-023-10024-1

Bidhan Barai, N. Das, Subhadip Basu, M. Nasipuri

Citations: 0

A framework for quality assessment of synthesised speech using learning-based objective evaluation

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-02-02 DOI: 10.1007/s10772-023-10021-4

Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, U. Tiwary

Citations: 0

Weibull and Nakagami speech priors based regularized NMF with adaptive Wiener filter for speech enhancement

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-02-02 DOI: 10.1007/s10772-023-10020-5

Chaitanya Jannu, S. Vanambathina

Citations: 3

Speaker identification and localization using shuffled MFCC features and deep learning

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-01-29 DOI: 10.1007/s10772-023-10023-2

Mahdi Barhoush, Ahmed Hallawa, A. Schmeink

Citations: 2

A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-01-23 DOI: 10.1007/s10772-023-10017-0

Haihua Jiang, Bin Hu, Zhenyu Liu, G. Wang, Lan Zhang

Citations: 0

An automated speech analysis system for the detection of cognitive decline in elderly

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-01-19 DOI: 10.1007/s10772-023-10016-1

C. Loizou, M. Pantzaris

Citations: 0

Plain-to-clear speech video conversion for enhanced intelligibility.

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2023-01-01 DOI: 10.1007/s10772-023-10018-z

Shubam Sachdeva, Haoyao Ruan, Ghassan Hamarneh, Dawn M Behne, Allard Jongman, Joan A Sereno, Yue Wang

Clearly articulated speech, relative to plain-style speech, has been shown to improve intelligibility. We examine whether visible speech cues in video alone can be systematically modified to enhance clear-speech visual features and improve intelligibility. We extract clear-speech visual features of English words varying in vowels, produced by multiple male and female talkers. Via a frame-by-frame, image-warping-based video generation method with a controllable parameter (displacement factor), we apply the extracted clear-speech visual features to videos of plain speech to synthesize clear-speech videos. We evaluate the generated videos using a robust, state-of-the-art AI Lip Reader as well as human intelligibility testing. The contributions of this study are: (1) we successfully extract relevant visual cues for video modifications across speech styles and achieve enhanced intelligibility for AI; (2) this work suggests that universal, talker-independent clear-speech features may be utilized to modify any talker's visual speech style; (3) we introduce the "displacement factor" as a way of systematically scaling the magnitude of displacement modifications between speech styles; and (4) the high-definition generated videos are ideal candidates for human-centric intelligibility and perceptual training studies.


International Journal of Speech Technology, vol. 26, no. 1, pp. 163-184. DOI: https://doi.org/10.1007/s10772-023-10018-z. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10042924/pdf/

Citations: 0

Retraction Note: Nonlinear acoustic noise cancellation based automatic speech recognition system (NANC-ASR) with convolutional neural networks

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2022-12-01 DOI: 10.1007/s10772-022-09991-8

R. Ramadan, Kusum Yadav

Citations: 0

Retraction Note: Preserving learnability and intelligibility at the point of care with assimilation of different speech recognition techniques

Q1 Arts and Humanities

International Journal of Speech Technology

Pub Date : 2022-12-01 DOI: 10.1007/s10772-022-09996-3

Sukumar Rajendran, P. Jayagopal

Citations: 0
