
2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2... Latest papers

Using Deep Learning To Assign Rheumatoid Arthritis Scores
S. Dang, L. Allison
In this work, we report the performance of a deep learning model in automatically assigning joint scores and overall patient scores to X-ray images of Rheumatoid Arthritis patients. The dataset is from the RA2 DREAM Challenge https://www.synapse.org/#!Synapse:syn20545111/wiki/594083. Overall, we achieve good predictive performance with an average accuracy of 0.908.
DOI: 10.1109/IRI49571.2020.00065 (published 2020-08-01)
Citations: 3
AD4ML: Axiomatic Design to Specify Machine Learning Solutions for Manufacturing
Alejandro Gabriel Villanueva Zacarias, Rachaa Ghabri, P. Reimann
Machine learning is increasingly adopted in manufacturing use cases, e.g., for fault detection in a production line. Each new use case requires developing its own machine learning (ML) solution. An ML solution integrates different software components to read, process, and analyze all use case data, as well as to finally generate the output that domain experts need for their decision-making. The process of designing a system specification for an ML solution is not straightforward. It entails two types of complexity: (1) the technical complexity of selecting combinations of ML algorithms and software components that suit a use case; (2) the organizational complexity of integrating different requirements from a multidisciplinary team of, e.g., domain experts, data scientists, and IT specialists. In this paper, we propose several adaptations to Axiomatic Design in order to design ML solution specifications that handle these complexities. We call this Axiomatic Design for Machine Learning (AD4ML). We apply AD4ML to specify an ML solution for a fault detection use case and discuss to what extent our approach conquers the above-mentioned complexities. We also discuss how AD4ML facilitates the agile design of ML solutions.
DOI: 10.1109/IRI49571.2020.00029 (published 2020-08-01)
Citations: 2
A Distribution-based Regression for Real-time COVID-19 Cases Detection from Chest X-ray and CT Images
Nuha Zamzami, Pantea Koochemeshkian, N. Bouguila
The novel coronavirus (COVID-19) that started last December in Wuhan, Hubei Province, China has become a serious healthcare threat, with over five million confirmed cases in 215 countries around the world as of May 20. The World Health Organization recommends rapid diagnosis and immediate isolation of suspected cases. Thus, there is an imminent need to develop an automatic real-time detection system as a quick alternative diagnosis option to control the spread of the virus. In this work, we propose a regression model based on a flexible distribution called the shifted-scaled Dirichlet for real-time detection of patients infected with coronavirus pneumonia using chest X-ray radiographs. To derive the parameters of our proposed model, we adopt the maximum likelihood method, where we update the parameters using stochastic gradient descent. The experimental results demonstrate that our approach is highly effective for detecting COVID-19 cases and understanding the infection in real time, with accuracy of up to 97%.
DOI: 10.1109/IRI49571.2020.00023 (published 2020-08-01)
Citations: 3
Background Subtraction with a Hierarchical Pitman-Yor Process Mixture Model of Generalized Gaussian Distributions
Srikanth Amudala, Samr Ali, N. Bouguila
This paper presents a hierarchical Pitman-Yor process mixture model of generalized Gaussian distributions for background subtraction. The motivation for choosing the generalized Gaussian distribution is its flexibility compared to the widely used Gaussian. We also integrate the Pitman-Yor process into our proposed model for an infinite extension that leads to better performance in the task of background subtraction. Our model is learned via a variational Bayes approach and is applied to the challenging Change Detection dataset. Experimental results on background subtraction show the effectiveness of the proposed algorithm.
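To illustrate the flexibility the abstract above attributes to the generalized Gaussian, here is a minimal sketch of its standard univariate density. The parameter names `mu`, `alpha`, and `beta` are assumptions chosen for illustration; the paper's own parameterization and its mixture model are not shown in this listing.

```python
import math


def generalized_gaussian_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Density with location mu, scale alpha > 0 and shape beta > 0."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # Normalizing constant: beta / (2 * alpha * Gamma(1 / beta)).
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))


# beta = 2 with alpha = sqrt(2) * sigma recovers the Gaussian (here sigma = 1).
gauss_at_zero = generalized_gaussian_pdf(0.0, alpha=math.sqrt(2.0), beta=2.0)
# beta = 1 gives the Laplace density with scale alpha.
laplace_at_zero = generalized_gaussian_pdf(0.0, alpha=1.0, beta=1.0)
```

The shape parameter controls tail weight: smaller values give heavier tails and a sharper peak, larger values approach a uniform density.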
DOI: 10.1109/IRI49571.2020.00024 (published 2020-08-01)
Citations: 0
An Adaptive and Dynamic Biosensor Epidemic Model for COVID-19
Salvador V. Balkus, Joshua Rumbut, Honggang Wang, Hua Fang
The impact of the COVID-19 global pandemic has required governments across the world to develop effective public health policies using epidemiological models. Unfortunately, as a result of limited testing ability, these models often rely on lagged rather than real-time data, and cannot be adapted to small geographies to provide localized forecasts. This study proposes ADBio, a multi-level adaptive and dynamic biosensor-based model that can be used to predict the risk of infection with COVID-19 from the individual level to the county level, providing more timely and accurate estimates of virus exposure at all levels. The model is evaluated using diagnosis simulation based on current COVID-19 cases as well as GPS movement data for Massachusetts and New York, where COVID-19 hotspots had previously been observed. Results demonstrate that lagged testing data is indeed a major detriment to current modeling efforts, and that unlike the standard SEIR model, ADBio is able to adapt to arbitrarily small geographic regions and provide reasonable forecasts of COVID-19 cases. The features of this model enable greater national pandemic preparedness and provide local town and county governments with a valuable tool for decision-making during a pandemic.
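For context on the baseline the abstract above compares against, the following is a minimal sketch of a standard SEIR compartmental model integrated with forward Euler steps. It is not ADBio; the function name and all parameter values below are illustrative assumptions.

```python
def simulate_seir(population, infected0, beta, sigma, gamma, days, dt=0.1):
    """Return daily (S, E, I, R) tuples for a closed population.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: recovery rate (all per day).
    """
    s = float(population - infected0)
    e, i, r = 0.0, float(infected0), 0.0
    history = [(s, e, i, r)]
    steps_per_day = int(round(1.0 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameters chosen only to exercise the function.
history = simulate_seir(population=1000, infected0=1, beta=0.5,
                        sigma=0.2, gamma=0.1, days=60)
```

The model treats the whole population as one well-mixed group, which is the limitation the abstract points to when it discusses small geographic regions.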
DOI: 10.1109/IRI49571.2020.00051 (published 2020-08-01)
Citations: 0
Attention-Guided Generative Adversarial Network to Address Atypical Anatomy in Synthetic CT Generation.
Hajar Emami, Ming Dong, Carri K Glide-Hurst

Recently, interest in MR-only treatment planning using synthetic CTs (synCTs) has grown rapidly in radiation therapy. However, developing class solutions for medical images that contain atypical anatomy remains a major limitation. In this paper, we propose a novel spatial attention-guided generative adversarial network (attention-GAN) model to generate accurate synCTs using T1-weighted MRI images as the input to address atypical anatomy. Experimental results on fifteen brain cancer patients show that attention-GAN outperformed existing synCT models and achieved an average MAE of 85.223±12.08, 232.41±60.86, and 246.38±42.67 Hounsfield units between synCT and CT-SIM across the entire head, bone, and air regions, respectively. Qualitative analysis shows that attention-GAN is able to use spatially focused areas to better handle outliers, areas with complex anatomy, or post-surgical regions, and thus offers strong potential for supporting near real-time MR-only treatment planning.
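The region-wise error reported in the abstract above can be sketched as a mean absolute error (MAE) restricted to a mask. The voxel values and the Hounsfield-unit thresholds used to define bone and air below are hypothetical, chosen only for illustration; the paper's own region definitions are not given in this listing.

```python
def masked_mae(predicted, reference, mask=None):
    """Mean absolute error over the positions where mask is true (all if None)."""
    pairs = [
        (p, r)
        for idx, (p, r) in enumerate(zip(predicted, reference))
        if mask is None or mask[idx]
    ]
    if not pairs:
        raise ValueError("mask selects no voxels")
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


# Hypothetical flattened voxel intensities in Hounsfield units.
ct = [-1000.0, -950.0, 20.0, 40.0, 700.0, 1200.0]
syn_ct = [-900.0, -980.0, 30.0, 25.0, 650.0, 1000.0]

# Assumed thresholds on the reference CT, not taken from the paper.
BONE_MIN_HU = 200.0
AIR_MAX_HU = -200.0
bone_mask = [v > BONE_MIN_HU for v in ct]
air_mask = [v < AIR_MAX_HU for v in ct]

overall_mae = masked_mae(syn_ct, ct)
bone_mae = masked_mae(syn_ct, ct, bone_mask)
air_mae = masked_mae(syn_ct, ct, air_mask)
```

Reporting bone and air separately matters because these regions are small relative to the whole head, so large errors there can be hidden in the overall average.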

DOI: 10.1109/iri49571.2020.00034 (published 2020-08-01)
Citations: 10
Detection Methods of Slow Read DoS Using Full Packet Capture Data
Clifford Kemp, Chad L. Calvert, T. Khoshgoftaar
Denial of Service (DoS) attacks on web servers have become extremely popular with cybercriminals and organized crime groups. A successful DoS attack on network resources reduces availability of service to a web site and backend resources, and could easily result in a loss of millions of dollars in revenue depending on company size. There are many DoS attack methods, each of which is critical to providing an understanding of the nature of the DoS attack class. Recent years have seen a rise in application-layer DoS attack methods that target web servers and are challenging to detect. An attack may be disguised to look like legitimate traffic, except that it targets specific application packets or functions. The Slow Read DoS attack is one type of slow HTTP attack targeting the application layer. Slow Read attacks are often used to exploit weaknesses in the HTTP protocol, as it is the most widely used protocol on the Internet. In this paper, we use Full Packet Capture (FPC) datasets for detecting Slow Read DoS attacks with machine learning methods. All data collected originates in a live network environment. Our approach produces FPC features taken from network packets at the IP and TCP layers. Experimental results show that the machine learners were quite successful in identifying the Slow Read attacks with high detection and low false alarm rates using FPC data. Our experiment evaluates FPC datasets to determine the accuracy and efficiency of several detection models for Slow Read attacks. The experiment demonstrates that FPC features are discriminative enough to detect such attacks.
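As a rough illustration of how TCP-layer fields can expose a Slow Read client, the sketch below groups packet records by flow and flags flows whose advertised receive window stays very small. This is a simple rule-based heuristic, not the machine learning models or the feature set from the paper; the record keys, the threshold, and the sample packets are all hypothetical.

```python
from collections import defaultdict


def flag_small_window_flows(packets, window_threshold=512, min_packets=3):
    """Return flows whose advertised TCP window never exceeds the threshold."""
    windows = defaultdict(list)
    for pkt in packets:
        flow = (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
        windows[flow].append(pkt["tcp_window"])
    return [
        flow
        for flow, sizes in windows.items()
        if len(sizes) >= min_packets and max(sizes) <= window_threshold
    ]


def make_packet(src_ip, src_port, window):
    # Helper for the hypothetical sample below; the server address is fixed.
    return {"src_ip": src_ip, "src_port": src_port,
            "dst_ip": "10.0.0.1", "dst_port": 80, "tcp_window": window}


sample = (
    [make_packet("10.0.0.7", 40001, w) for w in (64, 0, 32, 64)]
    + [make_packet("10.0.0.8", 40002, w) for w in (29200, 29312, 30000)]
)
suspicious = flag_small_window_flows(sample)
```

A learned classifier would typically combine many such per-packet or per-flow fields rather than rely on one fixed threshold.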
DOI: 10.1109/IRI49571.2020.00010 (published 2020-08-01)
Citations: 4
Towards Agile Integration: Specification-based Data Alignment
C. Giossi, D. Maier, K. Tufte, Elliot Gall, M. Barnes
Utilizing data sets from multiple domains is a common procedure in scientific research. For example, research on the performance of buildings may require data from multiple sources that lack a single standard for data reporting. The Building Management System might report data at regular 5-minute intervals, whereas an air-quality sensor might capture values only when there has been significant change from the previous value. Many systems exist to help integrate multiple data sources into a single system or interface. However, such systems do not necessarily make it easy to modify an integration plan, for example, to accommodate data exploration, new and changing data sets, or shifts in the questions of interest. We propose an agile data-integration system to enable quick and adaptive analysis across many data sets, concentrating initially on the data alignment step: combining data values from multiple time-series-based data sets whose time schedules differ. To this end, we adopt a Domain Specific Language approach where we construct a domain model for alignment, provide a specification language for describing alignments in the model, and implement an interpreter for specifications in that language. Our implementation exploits a rank-based join in SQL that produces faster alignment times than the commonly suggested method of aligning data sets in a database. We present experiments to demonstrate the advantage of our method and exploit data properties for optimization.
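The alignment problem described in the abstract above can be sketched with a small "as-of" alignment: for each regularly scheduled timestamp, take the most recent value from a change-triggered sensor. This swaps in a plain Python binary search for illustration and is not the paper's specification language or its rank-based SQL join; the function name and sample readings are hypothetical.

```python
from bisect import bisect_right


def align_asof(regular_times, event_times, event_values):
    """For each regular timestamp, return the latest event value at or before it.

    event_times must be sorted in ascending order. Timestamps with no
    earlier event are aligned to None.
    """
    aligned = []
    for t in regular_times:
        idx = bisect_right(event_times, t) - 1
        aligned.append(event_values[idx] if idx >= 0 else None)
    return aligned


# Regular schedule every 300 seconds (5 minutes); sensor reports on change only.
regular = [0, 300, 600, 900]
sensor_times = [120, 610]
sensor_values = [41.0, 47.5]
aligned = align_asof(regular, sensor_times, sensor_values)
```

Carrying the last value forward suits a sensor that reports only on change, since the absence of a new reading implies the previous value still holds.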
DOI: 10.1109/IRI49571.2020.00055 (published 2020-08-01)
Citations: 0
An I/O Request Packet (IRP) Driven Effective Ransomware Detection Scheme using Artificial Neural Network
Md. Ahsan Ayub, Andrea Continella, Ambareen Siraj
In recent times, there has been a global surge of ransomware attacks targeted at industries of various types and sizes, from retail to critical infrastructure. Ransomware researchers come across new kinds of ransomware samples every day and discover novel ransomware families in the wild. To mitigate this ever-growing menace, academic and industry-based security researchers have been using unique ways to defend against this type of cyber-attack. I/O Request Packet (IRP), a low-level file system I/O log, is a newly found research paradigm for defense against ransomware that is being explored frequently. As such, in this study, to gain granular, actionable insights into ransomware behavior, we analyze the IRP logs of 272 ransomware samples belonging to 18 different ransomware families, captured during individual execution. We extend our analysis by building an effective Artificial Neural Network (ANN) structure for successful ransomware detection by learning the underlying patterns of the IRP logs. We evaluate the ANN model with three different experimental settings to prove the effectiveness of our approach. The model demonstrates outstanding performance in terms of accuracy, precision score, recall score, and F1 score, i.e., in the range of 99.7%±0.2%.
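The four scores named in the abstract above follow directly from a binary confusion matrix. The sketch below computes them from raw counts; the counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: ransomware is the positive class.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```

Reporting precision and recall alongside accuracy matters when benign and malicious samples are imbalanced, since accuracy alone can hide a high miss rate.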
DOI: 10.1109/IRI49571.2020.00053 (published 2020-08-01)
Citations: 11
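The accuracy, precision, recall, and F1 figures quoted in the ransomware abstract all derive from the same four confusion-matrix counts. As a minimal sketch, the function below computes them for a binary ransomware/benign classifier. The function name, the `Metrics` type, and the counts in the usage line are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Metrics:
    """Compute standard binary-classification metrics from confusion counts.

    tp / fn: ransomware samples flagged / missed.
    fp / tn: benign samples flagged / passed.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical counts, for illustration only (not results from the paper).
m = binary_metrics(tp=90, fp=10, tn=95, fn=5)
```

Reporting all four metrics together matters for detection tasks, because accuracy alone can look high when benign samples far outnumber malicious ones.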
Relevance of Grapheme’s Shape Complexity in Writer Verification Task
A. Bensefia, Chawki Djeddi
Recognizing and identifying people based on their physical and behavioral characteristics has always had a wide range of applications, prompting researchers to propose dedicated recognition systems for each human characteristic. These systems operate in two modes. In identification mode, the task is to assign one of the identities preregistered in the system to the sample read as input. In verification (authentication) mode, the task is a decision: whether the sample read as input really belongs to the claimed identity. Handwriting has emerged as one of the behavioral features that attracted considerable interest during the last decade. Many more writer identification systems have been developed than writer verification (authentication) systems. In this paper we propose an original approach that uses shape complexity to authenticate writers’ identities. To this end, a local feature (the grapheme) is considered, with graphemes generated automatically by a dedicated segmentation module. The Fourier Elliptic Transform is used to measure the shape complexity of the resulting graphemes. Only the most complex graphemes (K-Graphemes) are used to measure the similarity between a pair of handwritten samples. The approach was evaluated on 3 sets of 50 different writers from the BFL dataset, obtaining almost 80% good acceptance at an 8% error rate. These results validate the relevance of shape complexity in writer recognition tasks.
DOI: https://doi.org/10.1109/IRI49571.2020.00016 (published 2020-08-01)
Citations: 2
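The writer verification abstract describes scoring each grapheme's shape complexity with a Fourier-based transform and keeping only the most complex ones. The abstract does not give the complexity formula, so the sketch below is an illustration under assumptions, not the authors' method. It uses a plain complex discrete Fourier transform of the contour (not the elliptic Fourier transform named in the abstract) and an assumed score: the share of non-DC spectral energy outside the dominant harmonic. All function names are hypothetical.

```python
import cmath
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def fourier_coefficients(contour: Sequence[Point]) -> List[complex]:
    """Plain DFT of a closed contour treated as complex samples x + iy."""
    n = len(contour)
    z = [complex(x, y) for x, y in contour]
    return [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]


def shape_complexity(contour: Sequence[Point]) -> float:
    """Assumed score: fraction of non-DC energy outside the dominant harmonic.

    A uniformly sampled circle scores about 0; shapes that need more
    harmonics to describe score higher.
    """
    energies = [abs(c) ** 2 for c in fourier_coefficients(contour)[1:]]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    return 1.0 - max(energies) / total


def top_k_complex(graphemes: Sequence[Sequence[Point]], k: int):
    """Keep the k contours with the highest assumed complexity score."""
    return sorted(graphemes, key=shape_complexity, reverse=True)[:k]


def polar_contour(radius_fn: Callable[[float], float],
                  samples: int = 64) -> List[Point]:
    """Sample a closed curve r(angle) at evenly spaced angles."""
    angles = (2 * math.pi * i / samples for i in range(samples))
    return [(radius_fn(a) * math.cos(a), radius_fn(a) * math.sin(a))
            for a in angles]


# Synthetic stand-ins for grapheme contours (not real handwriting data).
circle = polar_contour(lambda a: 1.0)
star = polar_contour(lambda a: 1.0 + 0.5 * math.cos(5 * a))
ranked = top_k_complex([circle, star], k=1)
```

In a verification setting, the selected graphemes from two handwriting samples would then be compared and the resulting similarity checked against a decision threshold; that step is left out here because the abstract does not specify the similarity measure.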