Gutti Venkata Ranga Priyanka, A. T, Niktha Malladi
Published in: 2023 3rd International Conference on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), 2023-05-18
DOI: 10.1109/ACCESS57397.2023.10199873 (https://doi.org/10.1109/ACCESS57397.2023.10199873)
Duplicate Quora Questions Pair Detection using Siamese Bert and Ma-LSTM
Quora is one of the best-known online question-and-answer communities, with millions of users asking and answering questions on a wide range of topics. However, a major issue faced by the Quora community is the large number of duplicate questions posted on the platform. These duplicates not only clutter the platform but also degrade the quality of content, making it difficult for users to find relevant information. Hence, there is a need to automatically identify and remove duplicate question pairs in the Quora community. Duplicate question pair detection is difficult because of the considerable variability and complexity of natural language. Traditional rule-based approaches are often insufficient for capturing the nuanced meaning and context of questions. Therefore, machine learning-based methods have gained popularity in recent years for detecting duplicate question pairs. This paper proposes a framework for detecting duplicate question pairs on the Quora platform using Siamese Neural Network, BERT, MaLSTM, and BiLSTM models. Each model's effectiveness is evaluated on a dataset of Quora question pairs using accuracy, precision, recall, and F1-score. The experimental results show that the proposed framework detects duplicate question pairs with high accuracy, with the BERT model outperforming the other models in overall performance. This suggests that pretrained transformer networks can effectively capture the semantic meaning of questions and improve duplicate question pair detection.
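The abstract names MaLSTM and four evaluation metrics without giving formulas. As a minimal sketch, assuming the commonly used MaLSTM formulation in which two sentence encodings are compared with exp(-L1 distance), and assuming the standard binary definitions of the metrics, the scoring and evaluation steps might look as follows. The function names, the 0.5 threshold, and the toy vectors are hypothetical illustrations and are not taken from the paper.

```python
import math


def manhattan_similarity(h1, h2):
    """Similarity in (0, 1]: exp of the negative L1 distance between encodings.

    Assumption: this is the usual MaLSTM scoring function; the abstract
    does not state which variant the authors used.
    """
    l1_distance = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1_distance)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = duplicate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical encodings standing in for the outputs of a shared encoder.
    encoded_pairs = [
        ([0.2, 0.1], [0.2, 0.1]),   # identical encodings
        ([0.9, 0.0], [0.0, 0.9]),   # far apart
    ]
    threshold = 0.5  # hypothetical decision threshold
    predictions = [
        1 if manhattan_similarity(a, b) >= threshold else 0
        for a, b in encoded_pairs
    ]
    print(predictions)
    print(binary_metrics([1, 0], predictions))
```

In a Siamese setup both questions pass through the same encoder weights, so the similarity function only has to compare two vectors of equal length. Reporting precision and recall alongside accuracy matters here because a duplicate-detection dataset need not have balanced classes.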