Latest publications in International Journal of Speech Technology
Assessing American presidential candidates using principles of ontological engineering, word sense disambiguation, data envelope analysis and qualitative comparative analysis
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10043-y
James A. Rodger, Justin Piper
Citations: 0
Monaural speech separation using WT-Conv-TasNet for hearing aids
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10045-w
Jharna Agrawal, Manish Gupta, Hitendra Garg
Citations: 0
Time frequency domain deep CNN for automatic background classification in speech signals
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10042-z
Rakesh Reddy Yakkati, Sreenivasa Reddy Yeduri, Rajesh Kumar Tripathy, Linga Reddy Cenkeramaddi
Abstract: Automatic background classification from speech signals serves many application areas, such as background identification, predictive maintenance in industrial settings, smart home applications, assistive technology for deaf people in their daily activities, and content-based multimedia indexing and retrieval. Accurately predicting the background environment from speech signal information is challenging. This paper therefore proposes a novel synchrosqueezed wavelet transform (SWT)-based deep learning (DL) approach for automatically classifying background information embedded in speech signals. SWT is used to obtain a time-frequency representation of each speech signal, which is then fed to a deep convolutional neural network (DCNN) for classification. The proposed DCNN model consists of three convolution layers, one batch-normalization layer, three max-pooling layers, one dropout layer, and one fully connected layer. The method is tested on various background signals embedded in speech, including airport, airplane, drone, street, babble, car, helicopter, exhibition, station, restaurant, and train sounds. The proposed SWT-based DCNN approach achieves an overall classification accuracy of 97.96 (± 0.53)% in classifying background information embedded in speech signals. Finally, its performance is compared with that of existing methods.
Citations: 0
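The abstract above fully specifies the layer inventory of the proposed DCNN (three convolution layers, one batch-normalization layer, three max-pooling layers, one dropout layer, one fully connected layer) but not its hyperparameters. The following PyTorch sketch shows one plausible instantiation; the filter counts, kernel sizes, dropout rate, the 64×64 input resolution for the SWT time-frequency image, and the class count of 11 (the number of background types listed) are all assumptions, not details from the paper.

```python
import torch
import torch.nn as nn


class BackgroundDCNN(nn.Module):
    """Sketch of the abstract's DCNN: 3 conv, 1 batch-norm, 3 max-pool,
    1 dropout, 1 fully connected layer. Hyperparameters are assumed."""

    def __init__(self, n_classes: int = 11) -> None:  # 11 listed backgrounds
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # conv layer 1
            nn.BatchNorm2d(16),                           # the one batch-norm layer
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pool 1: 64 -> 32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # conv layer 2
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pool 2: 32 -> 16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # conv layer 3
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pool 3: 16 -> 8
        )
        self.dropout = nn.Dropout(0.5)                    # assumed rate
        self.fc = nn.Linear(64 * 8 * 8, n_classes)        # for 64x64 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 64, 64) SWT time-frequency images
        x = self.features(x)
        x = self.dropout(torch.flatten(x, 1))
        return self.fc(x)


model = BackgroundDCNN()
logits = model(torch.randn(4, 1, 64, 64))  # batch of 4 time-frequency images
print(logits.shape)  # torch.Size([4, 11])
```

In a pipeline matching the abstract, the input tensor would be the magnitude of the synchrosqueezed wavelet transform of a speech segment, resized to a fixed grid before being fed to the network.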
Voice user interfaces in manufacturing logistics: a literature review
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10036-x
Heiner Ludwig, Thorsten Schmidt, Mathias Kühn
Citations: 0
Robust automatic accent identification based on the acoustic evidence
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10031-2
Eiman Alsharhan, Allan Ramsay
Citations: 0
Using combined features to improve speaker verification in the face of limited reverberant data
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10048-7
Khamis A. Al-Karawi, Duraid Y. Mohammed
Citations: 0
Binary classifier for identification of stammering instances in Hindi speech data
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10046-9
Shivam Dwivedi, Sanjukta Ghosh, Satyam Dwivedi
Citations: 0
An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture
Q1 Arts and Humanities · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10054-9
Mohamed Daouad, Fadoua Ataa Allah, El Wardani Dadi
Citations: 0
Speaker and gender dependencies in within/cross linguistic Speech Emotion Recognition
Q1 Arts and Humanities · Pub Date: 2023-08-25 · DOI: 10.1007/s10772-023-10038-9
Adil Chakhtouna, Sara Sekkate, A. Adib
Citations: 0
An efficient speech emotion recognition based on a dual-stream CNN-transformer fusion network
Q1 Arts and Humanities · Pub Date: 2023-07-01 · DOI: 10.1007/s10772-023-10035-y · Pages: 541-557
Mohammed Tellai, L. Gao, Qi-rong Mao
Citations: 0