A Model For Recapitulating Audio Messages Using Machine Learning

R. Yadav, R. Bharti, R. Nagar, Sanchit Kumar
Published in: 2020 International Conference for Emerging Technology (INCET)
Publication date: 2020-06-01
DOI: https://doi.org/10.1109/incet49848.2020.9154064
Citations: 1

Abstract

This paper presents an efficient way to recapitulate (summarize) long audio messages or clips so that their key points can be extracted quickly. As the use of audio and visual data grows, audio files need to be handled more intelligently. A novel approach is presented that builds a summarized audio file from a given long audio file. The method consists of three modules: speech-to-text conversion, text summarization, and text-to-speech conversion. The modules form a chain: the first takes the given audio file as its input, and each later module consumes the output of the one before it. Speech-to-text conversion is done by sending asynchronous requests to the Google Cloud Speech API. The summarization module then extracts the important sentences from the transcript using the TextRank algorithm. The final module converts the summarized text into an audio file. The whole pipeline is given a user interface built with Flask, producing a web application through which users can interact with the model.
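The abstract names TextRank for the summarization step but gives no implementation details. The following is a minimal sketch of TextRank-style extractive summarization, not the authors' code. It assumes a naive regex sentence splitter and set-based word overlap as the sentence similarity; the function names (`split_sentences`, `similarity`, `textrank_summary`) and the parameter defaults are hypothetical.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a, b):
    """Word overlap normalized by log sentence lengths, TextRank style.

    Uses sets of lowercase words, a simplification chosen for this sketch.
    """
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    denom = math.log(len(words_a)) + math.log(len(words_b))
    if denom == 0:  # both sentences contain a single distinct word
        return 0.0
    return len(words_a & words_b) / denom


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences

    # Symmetric weighted graph: nodes are sentences, edges are similarities.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank iteration over the sentence graph.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

In a pipeline like the one described, the transcript returned by the speech-to-text module would be passed to `textrank_summary`, and the selected sentences would be joined and handed to the text-to-speech module.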