Ridwan Ilyas, M. L. Khodra, R. Munir, Rila Mandala, D. H. Widyantoro
Informatics, p. 34, published 2023-03-30. DOI: 10.3390/informatics10020034 (https://doi.org/10.3390/informatics10020034)
Generating Paraphrase Using Simulated Annealing for Citation Sentences
A paraphrase generator for citation sentences produces several alternative sentences to help avoid plagiarism. The generated sentences must stay semantically similar to the source while diverging from it lexically. This study proposes StoPGEN, a model that generates citation paraphrase sentences with stochastic output. Generation is guided by a simulated annealing algorithm whose objective function maintains both semantic similarity and lexical divergence. The objective function combines METEOR and PINC scores in a linear weighting function whose weight can be adjusted to favor either of the two metrics. A dataset of citation sentences labeled with paraphrases was used to test StoPGEN and other models for comparison. On the citation sentence dataset, StoPGEN produced a BLEU score of 55.37, outperforming the bidirectional LSTM method, which scored 28.93. StoPGEN was also tested on Quora data by changing the language source in the architecture, resulting in a BLEU score of 22.37 and outperforming UPSA at 18.21. In addition, a qualitative evaluation of the generated citation sentences by respondents obtained an acceptance value of 50.80.
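The idea of a linearly weighted objective driving simulated annealing can be illustrated with a minimal sketch. It is not the StoPGEN implementation: the abstract does not specify the proposal mechanism, the cooling schedule, or the exact weighting, so everything below is an assumption made for illustration. In particular, `SYNONYMS` is a toy substitution table, `similarity` is a crude positional stand-in for METEOR (exact match scores 1.0, a listed synonym 0.8), `alpha` is a hypothetical name for the weight, and `pinc` is a set-based version of the PINC score (the mean fraction of candidate n-grams that do not occur in the source).

```python
import math
import random

# Toy substitution table (assumption, for illustration only).
SYNONYMS = {
    "proposed": ["introduced", "presented"],
    "method": ["approach", "technique"],
    "improves": ["enhances", "boosts"],
    "results": ["outcomes", "findings"],
}


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source, candidate, max_n=4):
    """Set-based PINC: mean share of candidate n-grams absent from the source."""
    scores = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        if not cand:
            continue  # candidate too short for this n
        overlap = len(cand & ngrams(source, n))
        scores.append(1.0 - overlap / len(cand))
    return sum(scores) / len(scores) if scores else 0.0


def similarity(source, candidate):
    """Crude stand-in for METEOR; assumes both sentences have equal length."""
    total = 0.0
    for src, cand in zip(source, candidate):
        if src == cand:
            total += 1.0
        elif cand in SYNONYMS.get(src, []):
            total += 0.8
    return total / len(source)


def objective(source, candidate, alpha):
    """Linear weighting: alpha favors similarity, (1 - alpha) favors divergence."""
    return alpha * similarity(source, candidate) + (1 - alpha) * pinc(source, candidate)


def anneal(source, alpha=0.5, steps=200, t0=1.0, cooling=0.97, seed=0):
    """Simulated annealing over single-word substitutions; returns (best, score)."""
    rng = random.Random(seed)
    current = list(source)
    cur_score = objective(source, current, alpha)
    best, best_score = current, cur_score
    slots = [i for i, tok in enumerate(source) if tok in SYNONYMS]
    if not slots:
        return best, best_score
    temp = t0
    for _ in range(steps):
        i = rng.choice(slots)
        proposal = list(current)
        proposal[i] = rng.choice([source[i]] + SYNONYMS[source[i]])
        new_score = objective(source, proposal, alpha)
        delta = new_score - cur_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_score = proposal, new_score
            if cur_score > best_score:
                best, best_score = current, cur_score
        temp *= cooling  # geometric cooling (assumed schedule)
    return best, best_score


if __name__ == "__main__":
    sentence = "the proposed method improves the results".split()
    paraphrase, score = anneal(sentence, alpha=0.4)
    print(" ".join(paraphrase), round(score, 3))
```

Raising `alpha` toward 1 keeps the output close to the source, while lowering it toward 0 rewards lexical divergence. Changing `seed` yields different accepted moves, which is one simple way to obtain stochastic output from the same input.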