On the feasibility of machine learning as a tool for automatic security classification: A position paper
Authors: A. Yazidi, H. Hammer, A. Bai, P. Engelstad
Published in: 2016 International Conference on Computing, Networking and Communications (ICNC)
Publication date: 2016-03-24
DOI: 10.1109/ICCNC.2016.7440588 (https://doi.org/10.1109/ICCNC.2016.7440588)
Citation count: 1
Abstract
With the growing threat of leakage of sensitive information, such as classified military documents, information guards have recently attracted increased interest. An information guard is a filter that controls the content of information exchanged between two domains, one of which has a higher confidentiality level than the other. Its main role is to block leakage of sensitive information from the higher confidentiality domain to the lower confidentiality domain. A military network is an example of a higher confidentiality domain, while a subcontractor network is an example of a lower confidentiality domain. The common practice is to use an automatic information guard based on a predefined list of words, called a "dirty word list", to decide the security level of a document and consequently either release it to the lower confidentiality domain or block it. Traditional information guards are configured manually, and their classification logic relies on the occurrence of words from these "dirty lists". In this paper, we advocate the use of machine learning as a cornerstone for building advanced information guards. Machine learning can also be used as a supplement to the decision obtained from "dirty list" classification. Machine learning has hardly been analysed for this problem, and the analysis of topical classification presented here provides new knowledge and a basis for further work in this area. Ten different machine learning algorithms were applied to real-life data from a military context. The presented results are promising and demonstrate that machine learning can become a useful tool to assist humans in determining the appropriate security label of an information object.
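To make the contrast between the two approaches concrete, the following is a minimal sketch, not the authors' implementation. It pairs a dirty-word-list check with a small learned classifier used as a supplement. The word list, the training documents, the label names ("restricted", "unclassified"), and the choice of multinomial naive Bayes are all assumptions made for illustration; the abstract does not name the ten algorithms evaluated or describe the data.

```python
import math
import re
from collections import Counter

# Hypothetical dirty word list; a real guard would use a curated list.
DIRTY_WORDS = {"secret", "classified", "deployment"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dirty_list_blocks(text, dirty_words=DIRTY_WORDS):
    """Return True if any dirty word occurs in the text."""
    return any(tok in dirty_words for tok in tokenize(text))


class NaiveBayesGuard:
    """Tiny multinomial naive Bayes with Laplace smoothing (illustrative)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        # Vocabulary is the union of all words seen during training.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[c][tok] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


def release_allowed(text, model):
    """Release only if neither the dirty list nor the classifier objects."""
    if dirty_list_blocks(text):
        return False
    return model.predict(text) == "unclassified"


# Made-up training data, purely for demonstration.
train_docs = [
    "troop movement schedule for the northern base",
    "radar frequencies and patrol routes",
    "cafeteria menu for next week",
    "parking rules and visitor registration",
]
train_labels = ["restricted", "restricted", "unclassified", "unclassified"]
model = NaiveBayesGuard().fit(train_docs, train_labels)
```

In this sketch a document is released only when both checks agree. This means the learned model can catch text such as "patrol routes near the base", which contains no listed word but resembles the restricted training examples. In an assistive setting, the classifier's output could instead be shown to a human reviewer as a suggested label.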