{"title":"Automated Speech Recognition: Spurring Artificial Intelligence Innovation [Circuits from a Systems Perspective]","authors":"Farhana Sheikh","doi":"10.1109/MSSC.2024.3473747","DOIUrl":null,"url":null,"abstract":"Roughly 25 years ago, it was rare for an individual to interact with a voice-activated service on a daily basis. Since then, there has been an exponential rise in devices that use the human voice as input. A study published in \n<italic>Forbes</i>\n magazine \n<xref>[1]</xref>\n estimates that by 2025 the global voice recognition market will reach US\n<inline-formula><tex-math>${\\$}$</tex-math></inline-formula>\n26.8 billion. Today, more than one in four people regularly use voice search, and by the end of 2024, it is estimated that the number of digital voice assistants in the world will reach 8.4 billion \n<xref>[1]</xref>\n, slightly greater than the world’s total population. As the use of voice-activated assistants exponentially rises, it is also expected that commercial transactions made through such devices will increase, reaching US\n<inline-formula><tex-math>${\\$}$</tex-math></inline-formula>\n164 billion by 2025 \n<xref>[1]</xref>\n. Interestingly enough, the technologies that enabled the recent skyrocketing acceptance of voice rather than text as input to a computing machine are closely tied to the recent natural language processing (NLP) phenomena responsible for artificial intelligence (AI) engines such as ChatGPT. In this installment of “Circuits From a Systems Perspective,” we briefly review the history of automatic speech recognition (ASR) and show how intertwined it is with NLP that has led to large language models (LLMs) which have spurred the new age of AI. We review a modern speech recognition system and some of the circuits that could possibly enhance ASR systems that we may see in the future.","PeriodicalId":100636,"journal":{"name":"IEEE Solid-State Circuits Magazine","volume":"16 4","pages":"29-116"},"PeriodicalIF":0.0000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Solid-State Circuits Magazine","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10752805/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Roughly 25 years ago, it was rare for an individual to interact with a voice-activated service on a daily basis. Since then, there has been an exponential rise in devices that use the human voice as input. A study published in Forbes magazine [1] estimates that by 2025 the global voice recognition market will reach US$26.8 billion. Today, more than one in four people regularly use voice search, and by the end of 2024, the number of digital voice assistants in the world is estimated to reach 8.4 billion [1], slightly greater than the world's total population. As the use of voice-activated assistants rises exponentially, commercial transactions made through such devices are also expected to grow, reaching US$164 billion by 2025 [1]. Interestingly, the technologies that enabled the recent skyrocketing acceptance of voice rather than text as input to a computing machine are closely tied to the natural language processing (NLP) advances behind artificial intelligence (AI) engines such as ChatGPT. In this installment of "Circuits From a Systems Perspective," we briefly review the history of automatic speech recognition (ASR) and show how intertwined it is with the NLP work that led to large language models (LLMs), which have spurred the new age of AI. We review a modern speech recognition system and some of the circuits that could enhance the ASR systems we may see in the future.