The LIMSI QAst systems: Comparison between human and automatic rules generation for question-answering on speech transcriptions
S. Rosset, Olivier Galibert, G. Adda, Eric Bilinski
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), December 2007
DOI: 10.1109/ASRU.2007.4430188
Citations: 8
Abstract
In this paper, we present two different question-answering systems for speech transcripts. Both systems are based on a complete, multi-level analysis of both queries and documents. The first system uses handcrafted rules for the selection of small text fragments (snippets) and for answer extraction. The second replaces the handcrafted rules with an automatically generated research descriptor. A score based on these descriptors is used to select documents and snippets. The extraction and scoring of candidate answers is based on proximity measures within the research descriptor elements and on a number of secondary factors. The preliminary results obtained on QAst (QA on speech transcripts) development data are promising, ranging from 72% correct answers at rank 1 on manually transcribed meeting data to 94% on manually transcribed lecture data.
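To illustrate the kind of proximity-based scoring the abstract describes, the following is a minimal Python sketch. It is not the authors' implementation: the function name, the inverse-distance weighting, and the secondary-factor term are all assumptions made for illustration; the paper only summarizes the scoring at this level of detail.

# Hypothetical sketch of proximity-based candidate-answer scoring, in the
# spirit of the second (automatic-descriptor) system. Names and weights are
# illustrative assumptions, not the authors' actual method.

def proximity_score(tokens, candidate_idx, descriptor_terms, secondary_weight=0.1):
    """Score a candidate answer by its distance to research-descriptor
    elements found in the same snippet (closer elements score higher)."""
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok.lower() in descriptor_terms:
            # Inverse-distance contribution: nearby descriptor elements
            # dominate, distant ones contribute little.
            score += 1.0 / (1 + abs(i - candidate_idx))
    # Placeholder for the "secondary factors" mentioned in the abstract
    # (here, simple candidate-token frequency, folded in with a small weight).
    return score + secondary_weight * tokens.count(tokens[candidate_idx])

snippet = "the meeting on speech recognition was chaired by John Smith".split()
terms = {"meeting", "speech", "chaired"}
# Rank two hypothetical candidate answers by proximity to descriptor terms.
for idx in (8, 4):  # positions of "John" and "recognition"
    print(snippet[idx], round(proximity_score(snippet, idx, terms), 3))

In a complete system, such a score would be computed for every candidate extracted from the selected snippets, and the highest-scoring candidate returned as the rank-1 answer.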