ICodeNet - A Hierarchical Neural Network Approach For Source Code Author Identification

Pranali Bora, Tulika Awalgaonkar, Himanshu Palve, Raviraj Joshi, Purvi Goel
Published in: 2021 13th International Conference on Machine Learning and Computing
Publication date: 2021-01-30
DOI: https://doi.org/10.1145/3457682.3457709
Citations: 4

Abstract

With the open-source revolution, source code is now more accessible than ever. This has, however, made it easier for malicious users and institutions to copy code without regard for its license or credit to the original author. Source code author identification is therefore a critical task. In this paper, we propose ICodeNet, a hierarchical neural network that can be used for source code file-level tasks. ICodeNet processes source code in image format and is applied to per-file author identification. It consists of an ImageNet-trained VGG encoder followed by a shallow neural network based on either a CNN or an LSTM. Different variations of the model are evaluated on a source code author classification dataset. We also compare our image-based hierarchical model with a simple image-based CNN architecture and with text-based CNN and LSTM models to highlight its novelty and efficiency.
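The two-level structure described in the abstract (an encoder applied to source code images, followed by a shallow network that produces a per-file prediction) can be illustrated with a small plain-Python sketch. This is not the authors' implementation. The abstract does not say how a file is rendered as an image or how the encoder outputs are combined, so the sketch assumes one pixel per character, horizontal strips as patches, a two-number summary standing in for the VGG encoder, and mean pooling standing in for the CNN or LSTM. All function names and parameters are hypothetical.

```python
# Illustrative data flow only: file -> image -> patches -> patch features
# -> file vector -> author scores. Every component is a stand-in; the paper
# uses an ImageNet-trained VGG encoder and a shallow CNN or LSTM instead.
from typing import List

Image = List[List[float]]  # rows of pixel intensities in [0, 1]


def code_to_image(source: str, width: int = 16) -> Image:
    """Assumed rendering: one row per line, one pixel per character code."""
    rows = []
    for line in source.splitlines() or [""]:
        padded = line[:width].ljust(width, "\0")  # pad short lines with zeros
        rows.append([min(ord(ch), 255) / 255.0 for ch in padded])
    return rows


def split_into_patches(image: Image, patch_rows: int = 4) -> List[Image]:
    """Cut the image into horizontal strips, zero-padding the last strip."""
    width = len(image[0])
    patches = []
    for start in range(0, len(image), patch_rows):
        patch = image[start:start + patch_rows]
        while len(patch) < patch_rows:
            patch.append([0.0] * width)
        patches.append(patch)
    return patches


def encode_patch(patch: Image) -> List[float]:
    """Stand-in for the pretrained encoder: two summary numbers per patch."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    filled = sum(1 for p in pixels if p > 0) / len(pixels)  # non-padding share
    return [mean, filled]


def aggregate(features: List[List[float]]) -> List[float]:
    """Stand-in for the shallow CNN/LSTM: average over the patch sequence."""
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(len(features[0]))]


def author_scores(file_vector: List[float],
                  weights: List[List[float]]) -> List[float]:
    """Linear scoring head; `weights` holds one row per candidate author."""
    return [sum(w * x for w, x in zip(row, file_vector)) for row in weights]


def predict_author(source: str, weights: List[List[float]]) -> int:
    """Run the full pipeline and return the index of the best-scoring author."""
    patches = split_into_patches(code_to_image(source))
    file_vector = aggregate([encode_patch(p) for p in patches])
    scores = author_scores(file_vector, weights)
    return max(range(len(scores)), key=scores.__getitem__)
```

In a realistic setting the placeholder encoder would be replaced by a pretrained image model and the averaging step by a trained sequence or convolutional model; the sketch only shows where each of the two levels sits in the pipeline.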