A Model For Recapitulating Audio Messages Using Machine Learning

2020 International Conference for Emerging Technology (INCET) Pub Date : 2020-06-01 DOI:10.1109/incet49848.2020.9154064

R. Yadav, R. Bharti, R. Nagar, Sanchit Kumar


Citations: 1

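Abstract

This paper presents an efficient way to recapitulate (summarize) long audio messages or clips so that listeners can extract the key insights. As the use of audio and visual data grows day by day, audio files need to be handled more intelligently. The paper presents a novel approach that builds a summarized audio file from a given long audio file. The method consists of three modules: speech-to-text conversion, text summarization, and text-to-speech conversion. Each module takes the output of the preceding module as its input, except the speech-to-text module, whose input is the audio file to be summarized. The first step converts the given audio to text by sending asynchronous requests to the Google Cloud Speech API. The next module extracts the important sentences from the transcript using the TextRank algorithm. The last module converts the summarized text produced by the text summarization module into an audio file. The whole method is wrapped in a user interface built with Flask, yielding a web application through which users can interact with the model.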
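The summarization step described in the abstract can be illustrated with a minimal sketch of TextRank-style extractive summarization. This is not the authors' implementation, which the abstract does not show. The function names (`split_sentences`, `similarity`, `textrank_summary`), the naive sentence splitter, and the parameter defaults are assumptions made for illustration. The sentence similarity follows the word-overlap measure from the original TextRank formulation, and the speech-to-text and text-to-speech stages are left out.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(words_a, words_b):
    """Shared-word count normalised by the log of both sentence lengths."""
    # Guard: very short sentences would make the log-based denominator zero.
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0
    overlap = len(set(words_a) & set(words_b))
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return sentences

    tokens = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    n = len(sentences)

    # Build a symmetric weighted graph: one node per sentence.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity(tokens[i], tokens[j])
            weights[i][j] = weights[j][i] = w

    # Total edge weight leaving each node, used to normalise its votes.
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank iteration over the sentence graph.
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

In a pipeline like the one described, the transcript returned by the speech-to-text module would be passed to `textrank_summary`, and the returned sentences joined into the text handed to the text-to-speech module. A sentence that shares no words with any other receives only the baseline score and is therefore ranked last.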


