Probabilistic unsupervised Chinese sentence compression
Jinguang Chen, Tingting He, Zhuoming Gui, Fang Li
2009 IEEE International Conference on Granular Computing, 2009-09-22
DOI: 10.1109/GRC.2009.5255158 (https://doi.org/10.1109/GRC.2009.5255158)
Citation count: 1
Abstract
Research on sentence compression has been under way for many years in other languages, especially English, but work on Chinese sentence compression is scarce. In this paper, we describe an efficient probabilistic and syntactic approach to Chinese sentence compression. We adapt the classical noisy-channel approach to Chinese sentence compression and improve it in several respects. Since no parallel training corpus exists for Chinese, we use an unsupervised learning method. This paper also presents a novel bottom-up optimizing algorithm that considers both bigram and syntactic probabilities when generating candidate compressed sentences. We evaluate the results against manual compressions and a simple baseline. The experiments show the effectiveness of the proposed approach.
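The noisy-channel idea mentioned in the abstract can be illustrated with a minimal sketch: a candidate compression s of a long sentence l is scored as log P(s) + log P(l | s), and the highest-scoring candidate is kept. The code below is a toy illustration under stated assumptions, not the authors' method. The source model is an add-one smoothed bigram model, the channel model is a hypothetical per-word "addition" probability (`p_add`, with an assumed default of 0.1), and candidates are enumerated exhaustively by word deletion. It does not reproduce the paper's bottom-up algorithm or its syntactic probabilities, and all function names are invented for this example.

```python
import math
from collections import Counter
from itertools import combinations


def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_logprob(sent, unigrams, bigrams):
    """Source model log P(s): add-one smoothed bigram log-probability."""
    vocab = len(unigrams)
    tokens = ["<s>"] + list(sent) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def channel_logprob(original, kept_idx, p_add, default=0.1):
    """Toy channel model log P(l | s): every word of the long sentence that is
    absent from the candidate is treated as independently 'added' noise.
    p_add and the default value are assumptions made for this sketch."""
    kept = set(kept_idx)
    return sum(
        math.log(p_add.get(word, default))
        for i, word in enumerate(original)
        if i not in kept
    )


def compress(original, unigrams, bigrams, p_add, min_len=2):
    """Return the best-scoring strictly shorter subsequence of `original`.
    Exhaustive enumeration is exponential in sentence length, so this is only
    suitable for short examples. Returns a copy of the input if no shorter
    candidate of at least min_len words exists."""
    n = len(original)
    best, best_score = list(original), -math.inf
    for k in range(min_len, n):
        for idx in combinations(range(n), k):
            candidate = [original[i] for i in idx]
            score = bigram_logprob(candidate, unigrams, bigrams)
            score += channel_logprob(original, idx, p_add)
            if score > best_score:
                best, best_score = candidate, score
    return best
```

Called as `compress(tokens, *train_bigram(corpus), p_add)`, the function expects a word-segmented sentence, which for Chinese means running a segmenter first. Because no parallel corpus is needed to estimate the bigram counts, the sketch is in the same unsupervised spirit as the abstract, though the values in `p_add` would have to come from some other source.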