Automated Speech Recognition: Spurring Artificial Intelligence Innovation [Circuits from a Systems Perspective]

Farhana Sheikh
IEEE Solid-State Circuits Magazine, vol. 16, no. 4, pp. 29-116. Published 2024-11-13. DOI: 10.1109/MSSC.2024.3473747
https://ieeexplore.ieee.org/document/10752805/

Abstract

Roughly 25 years ago, it was rare for an individual to interact with a voice-activated service on a daily basis. Since then, the number of devices that use the human voice as input has risen exponentially. A study published in Forbes magazine [1] estimates that the global voice recognition market will reach US$26.8 billion by 2025. Today, more than one in four people regularly use voice search, and the number of digital voice assistants in the world is estimated to reach 8.4 billion by the end of 2024 [1], slightly more than the world's total population. As the use of voice-activated assistants rises, commercial transactions made through such devices are also expected to increase, reaching US$164 billion by 2025 [1]. Interestingly, the technologies that enabled the recent rapid acceptance of voice rather than text as input to a computing machine are closely tied to the recent advances in natural language processing (NLP) behind artificial intelligence (AI) engines such as ChatGPT. In this installment of "Circuits From a Systems Perspective," we briefly review the history of automatic speech recognition (ASR) and show how closely it is intertwined with the NLP advances that led to large language models (LLMs), which have spurred the new age of AI. We then review a modern speech recognition system and some of the circuits that could enhance future ASR systems.