‘All possible sounds’: speech, music, and the emergence of machine listening

Sound Studies (Humanities, Multidisciplinary), pp. 253-281. IF 0.4. Pub Date: 2023-04-10. DOI: 10.1080/20551940.2023.2195057
James E K Parker, Sean Dockray

Abstract

“Machine listening” is one common term for a fast-growing interdisciplinary field of science and engineering that “uses signal processing and machine learning to extract useful information from sound”. This article contributes to the critical literature on machine listening by presenting some of its history as a field. From the 1940s to the 1990s, work on artificial intelligence and audio developed along two streams. There was work on speech recognition/understanding, and work in computer music. In the early 1990s, another stream began to emerge. At institutions such as MIT Media Lab and Stanford’s CCRMA, researchers started turning towards “more fundamental problems of audition”. Propelled by work being done by and alongside musicians, speech and music would increasingly be understood by computer scientists as particular sounds within a broader “auditory scene”. Researchers began to develop machine listening systems for a more diverse range of sounds and classification tasks: often in the service of speech recognition, but also increasingly for their own sake. The soundscape itself was becoming an object of computational concern. Today, the ambition is “to cover all possible sounds”. That is the aspiration with which we must now contend politically, and which this article sets out to historicise and understand.
Source journal: Sound Studies (Humanities, Multidisciplinary). CiteScore: 1.30. Self-citation rate: 0.00%. Articles published: 24.