Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu
{"title":"Tell Them Apart: Distilling Technology Differences from Crowd-Scale Comparison Discussions","authors":"Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu","doi":"10.1145/3238147.3238208","DOIUrl":null,"url":null,"abstract":"Developers can use different technologies for many software development tasks in their work. However, when faced with several technologies with comparable functionalities, it is not easy for developers to select the most appropriate one, as comparisons among technologies are time-consuming by trial and error. Instead, developers can resort to expert articles, read official documents or ask questions in Q&A sites for technology comparison, but it is opportunistic to get a comprehensive comparison as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system that exploits the crowdsourced discussions from Stack Overflow, and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We further mine prominent comparison aspects by clustering similar comparative sentences and represent each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model and we implement a practical website for public use.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"34 1","pages":"214-224"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3238147.3238208","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 26
Abstract
Developers can use different technologies for many software development tasks in their work. However, when faced with several technologies with comparable functionalities, it is not easy for developers to select the most appropriate one, as comparisons among technologies are time-consuming by trial and error. Instead, developers can resort to expert articles, read official documents or ask questions in Q&A sites for technology comparison, but it is opportunistic to get a comprehensive comparison as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system that exploits the crowdsourced discussions from Stack Overflow, and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We further mine prominent comparison aspects by clustering similar comparative sentences and represent each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model and we implement a practical website for public use.