A Teacher-Student Knowledge Distillation Framework for Enhanced Detection of Anomalous User Activity
Chan Hsu, Chan-Tung Ku, Yuwen Wang, Minchen Hsieh, Jun-Ting Wu, Yunhsiang Hsieh, PoFeng Chang, Yimin Lu, Yihuang Kang
2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI), August 2023
DOI: 10.1109/iri58017.2023.00011
Citations: 0
Abstract
As information systems continuously produce high volumes of user event log data, efficient detection of anomalous activities indicative of insider threats becomes crucial. Typical supervised Machine Learning (ML) methods are often labor-intensive and constrained by costly labeled data and unknown dependencies among anomaly types. Here we introduce a knowledge distillation ML framework that uses multiple binary classifiers as teacher models and a multi-label model as the student. Leveraging the soft targets of the teacher models, we demonstrate that the student model significantly improves detection performance.
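To make the teacher-student setup concrete, the following is a minimal sketch (not the authors' implementation) of distilling several binary teacher classifiers into one multi-label student using soft targets, assuming a PyTorch-style training loop. The network sizes, feature dimension, number of anomalous-activity types, and the soft/hard loss-weighting factor `alpha` are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: multiple binary teachers -> one multi-label student.
import torch
import torch.nn as nn

NUM_FEATURES = 64      # assumed size of a user-event feature vector
NUM_ACTIVITIES = 5     # assumed number of anomalous-activity types

def make_binary_teacher():
    # One independent binary classifier per activity type (teacher).
    return nn.Sequential(nn.Linear(NUM_FEATURES, 32), nn.ReLU(), nn.Linear(32, 1))

teachers = [make_binary_teacher() for _ in range(NUM_ACTIVITIES)]

# Multi-label student: a single shared network with one logit per activity type.
student = nn.Sequential(nn.Linear(NUM_FEATURES, 32), nn.ReLU(),
                        nn.Linear(32, NUM_ACTIVITIES))

bce = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
alpha = 0.5  # assumed weight balancing soft (teacher) vs. hard (label) targets

def distillation_step(x, hard_labels):
    """x: (batch, NUM_FEATURES); hard_labels: (batch, NUM_ACTIVITIES) in {0, 1}."""
    with torch.no_grad():
        # Soft targets: each (pre-trained) teacher's probability for its own activity type.
        soft_targets = torch.cat([torch.sigmoid(t(x)) for t in teachers], dim=1)
    student_logits = student(x)
    # Combine distillation loss (soft targets) with the usual supervised loss.
    loss = alpha * bce(student_logits, soft_targets) + \
           (1 - alpha) * bce(student_logits, hard_labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random data:
x = torch.randn(8, NUM_FEATURES)
y = torch.randint(0, 2, (8, NUM_ACTIVITIES))
print(distillation_step(x, y))
```

In this reading of the abstract, the student sees the teachers' sigmoid probabilities as soft labels alongside the hard multi-label annotations, which is one standard way a multi-label student can pick up inter-anomaly information that independent binary teachers cannot share with each other.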