Thomas Karanikiotis, Themistoklis G. Diamantopoulos, A. Symeonidis
DOI: 10.3390/data8090140 (https://doi.org/10.3390/data8090140). Published: 2023-08-31. Publication type: Journal Article.
Employing Source Code Quality Analytics for Enriching Code Snippets Data
The availability of code snippets in online repositories like GitHub has led to an uptick in code reuse, further supporting an open-source, component-based development paradigm. Code is more likely to be reused when the components or snippets are of high quality, especially in terms of readability, which makes them simpler to integrate and maintain. To this end, we have developed a dataset of code snippets that takes into account both the functional and the quality characteristics of the snippets. The dataset is based on the CodeSearchNet corpus and comprises additional information, including static analysis metrics, code violations, readability assessments, and source code similarity metrics. Using this dataset, both software researchers and practitioners can conveniently find and employ code snippets that satisfy diverse functional needs while also demonstrating excellent readability and maintainability.
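As a rough illustration of the retrieval use case the abstract describes, the sketch below filters snippet records by a functional query and by quality thresholds. It does not reflect the dataset's actual schema: the `Snippet` record, its field names (`docstring`, `readability`, `violations`), the assumed readability range, and the `find_snippets` helper are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    # Hypothetical record layout; the real dataset's fields may differ.
    code: str
    docstring: str
    readability: float  # assumed to be a score in [0, 1]
    violations: int     # assumed count of reported code violations


def find_snippets(snippets, query, min_readability=0.7, max_violations=0):
    """Return snippets whose docstring contains every query term and that
    meet the quality thresholds, most readable first."""
    terms = query.lower().split()
    matches = [
        s for s in snippets
        if all(t in s.docstring.lower() for t in terms)
        and s.readability >= min_readability
        and s.violations <= max_violations
    ]
    return sorted(matches, key=lambda s: s.readability, reverse=True)
```

The keyword match stands in for the functional side of the search and the two thresholds for the quality side; a real pipeline would likely use the dataset's own similarity metrics rather than substring matching.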
About the journal:
Atomic Data and Nuclear Data Tables presents compilations of experimental and theoretical information in atomic physics, nuclear physics, and closely related fields. The journal is devoted to the publication of tables and graphs of general usefulness to researchers in both basic and applied areas. Extensive and comprehensive compilations of experimental and theoretical results are featured.