2019 International Conference on Computational Intelligence in Data Science (ICCIDS): Latest Papers

A Comparative Study of Different Features for Vehicle Classification
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862136
Anuja Prasad, L. Mary
This paper presents a comparative study of different features for vehicle classification. A real-time vehicle classification system using computer vision is relatively cheap and easy to install. As traffic is heterogeneous in India, road planning and traffic management are challenging, so an automated vehicle detection and classification system is useful for traffic surveys, planning, signal time optimization and surveillance. In this work, traffic video data is collected using a camera placed on top of a vehicle parked at the side of a road, at an angle of approximately 45°. Both audio and video are used for vehicle detection. The presence of a vehicle is detected from frames corresponding to the peaks in the short-time energy of the audio. Adaptive background subtraction is performed on the selected frames to separate the vehicle from the background. After background subtraction, morphological operations such as erosion, dilation and closing are applied to obtain the region of interest. Multiple frames containing the same vehicle may be detected at this stage. To reduce multiple occurrences of the same vehicle in the selected frames, the Speeded-Up Robust Features (SURF) matching algorithm is used. Different features, namely Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), KAZE and Binary Robust Invariant Scalable Keypoints (BRISK), are extracted from the selected frames and Support Vector Machine (SVM) models are developed. The vehicle classification accuracy of the various features is compared using a 20-minute traffic video. It is observed that HOG gives the best result compared to KAZE, LBP and BRISK, with an accuracy of 85.50%.
Citations: 6
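The audio-based detection step in the abstract above (picking frames at peaks of the short-time energy) can be illustrated with a minimal sketch. The frame length, hop size, threshold and synthetic signal below are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def short_time_energy(samples, frame_len, hop):
    """Sum of squared samples in each sliding frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def energy_peaks(energies, threshold):
    """Indices of local maxima whose energy exceeds the threshold."""
    peaks = []
    for i in range(1, len(energies) - 1):
        is_local_max = energies[i] > energies[i - 1] and energies[i] >= energies[i + 1]
        if energies[i] > threshold and is_local_max:
            peaks.append(i)
    return peaks

# Synthetic audio (assumption): a quiet noise floor with two loud bursts
# standing in for passing vehicles.
signal = [0.01 * math.sin(0.3 * n) for n in range(1000)]
for centre in (225, 725):
    for n in range(centre - 40, centre + 40):
        signal[n] += math.sin(0.5 * n)

energies = short_time_energy(signal, frame_len=50, hop=50)
peaks = energy_peaks(energies, threshold=5.0)
print(peaks)  # frame indices where a vehicle is assumed to be present
```

A frame index converts to a timestamp as `index * hop / sample_rate`, which is how the matching video frame could be looked up.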
An Elaborate Comprehensive Survey on Recent Developments in Behaviour Based Intrusion Detection Systems
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862119
A. M. V. Bharathy, N. Umapathi, S. Prabaharan
An intrusion detection system (IDS) monitors data and network activity to flag possible vulnerabilities and attacks in advance. One of the main limitations of present intrusion detection technology is the need to filter out false alarms so that the user is not confounded by the data. This paper deals with the different types of IDS, their behaviour, response time and other important factors. It also brings out the advantages and disadvantages of six recent intrusion detection techniques and gives a clear picture of recent advancements in the field of IDS based on detection rate, accuracy, average running time and false alarm rate.
Citations: 4
ICCIDS 2019 Author Index
Pub Date : 2019-02-01 DOI: 10.1109/iccids.2019.8862106
Citations: 0
Formation of SQL from Natural Language Query using NLP
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862080
M. Uma, V. Sneha, G. Sneha, J. Bhuvana, B. Bharathi
Today, everyone has their own personal devices that connect to the internet, and every user tries to get the information they require through it. Most of this information is stored in databases. A user who wants to access a database but has limited or no knowledge of database languages faces a difficult situation. Hence, there is a need for a system that enables such users to access the information in the database. This paper aims to develop such a system using NLP, taking a structured natural language question as input and producing an SQL query as output, to access the related information from a railway reservation database with ease. The steps involved in this process are tokenization, lemmatization, part-of-speech tagging, parsing and mapping. The dataset used for the proposed system has a set of 2880 structured natural language queries on train fare and seats available. We have achieved 98.89 per cent accuracy. The paper gives an overall view of the usage of Natural Language Processing (NLP) and of regular expressions to map a query in English to SQL.
Citations: 18
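The final mapping step mentioned in the abstract above (regular expressions that turn a structured English question into SQL) might look roughly like the sketch below. The question templates, the `trains` table and its column names are hypothetical, and the earlier stages (tokenization, lemmatization, part-of-speech tagging, parsing) are deliberately left out.

```python
import re

# Hypothetical schema (assumption): table `trains` with columns
# source, destination, fare, seats_available.
PATTERNS = [
    (re.compile(r"what is the fare from (\w+) to (\w+)", re.IGNORECASE),
     "SELECT fare FROM trains WHERE source = ? AND destination = ?"),
    (re.compile(r"how many seats are available from (\w+) to (\w+)", re.IGNORECASE),
     "SELECT seats_available FROM trains WHERE source = ? AND destination = ?"),
]

def to_sql(question):
    """Return (sql, params) for the first matching template, or None."""
    # Normalise whitespace and drop a trailing question mark.
    text = " ".join(question.strip().rstrip("?").split())
    for pattern, sql in PATTERNS:
        match = pattern.search(text)
        if match:
            return sql, tuple(group.lower() for group in match.groups())
    return None

result = to_sql("What is the fare from Chennai to Delhi?")
print(result)
```

Returning placeholders plus a parameter tuple, rather than splicing station names into the SQL string, keeps user-supplied text out of the query itself.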
Crowdsensing-based WiFi Indoor Localization using Feed-forward Multilayer Perceptron Regressor
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862117
Simran Barnwal, Wei-Jan Peng
Most RSS-based indoor localization algorithms require a priori knowledge of the location of Access Points, the time-wise variation of the user's location, and data from multiple sensors. The paper proposes an innovative approach that combines crowdsensing-based wireless indoor localization with Artificial Neural Networks to automatically predict a new user's location and to analyze the effect of device heterogeneity on RSS localization accuracy, using cell phone user data. The performance evaluation demonstrates that the trained MLP regression model obtains higher localization accuracy than the probabilistic localization algorithms, without an individual model for each device in the fingerprinting database. In contrast with existing systems proposed in the literature, the results show that the proposed approach efficiently handles a very large number of Access Points in indoor spaces 10 times larger.
Citations: 5
Grape Leaf Disease Identification using Machine Learning Techniques
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862084
S. M. Jaisakthi, P. Mirunalini, D. Thenmozhi, Vatsala
Diseases are quite natural in crops due to changing climatic and environmental conditions. They affect the growth and produce of the crops and are often difficult to control. To ensure good quality and high production, accurate disease diagnosis and timely control actions are necessary. Grape is a widely grown crop in India and may be affected by different types of diseases on the leaf, stem and fruit. Leaf diseases, caused by fungi, bacteria and viruses, are the early symptoms. So there is a need for an automatic system that can detect the type of disease so that appropriate action can be taken. We have proposed an automatic system for detecting diseases in grape vines using image processing and machine learning techniques. The system segments the leaf (Region of Interest) from the background image using the GrabCut segmentation method. From the segmented leaf, the diseased region is further segmented using two different methods: global thresholding and a semi-supervised technique. Features are extracted from the segmented diseased part, which is then classified as healthy, rot, esca or leaf blight using different machine learning techniques such as Support Vector Machine (SVM), AdaBoost and Random Forest. Using SVM we have obtained a better testing accuracy of 93%.
Citations: 39
Continuous learning mechanism of NLU-ML models boosted by human feedback
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862102
G. Abinaya, Gyan Ranjan, P. Aswin Karthik
In this paper, we propose a novel framework that enables a machine learning model to constantly learn over a period of time and hence improve the performance with time and more data. We have compared the performance of different models which were trained only on the actual data against models trained with the data aided by the feedback collected by the automated framework.
Citations: 3
Comparing the Wrapper Feature Selection Evaluators on Twitter Sentiment Classification
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862033
N. Suchetha, Anupama Nikhil, P. Hrudya
The application of machine learning algorithms to text data is challenging in several ways, the greatest being the presence of a sparse, high-dimensional feature set. Feature selection methods are effective in reducing the dimensionality of the data and help improve the computational efficiency and the performance of the learned model. Recently, evolutionary computation (EC) methods have shown success in solving the feature selection problem. However, because they require a large number of evaluations, EC-based feature selection methods are computationally expensive on text data. This paper examines the different evaluation classifiers used for EC-based wrapper feature selection methods. A two-stage feature selection method is applied to Twitter data for sentiment classification. In the first stage, a filter feature selection method based on Information Gain (IG) is applied. In the second stage, four EC feature selection methods, Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Cuckoo Search (CS) and Firefly Search, are compared, with different classifiers as subset evaluators. LibLinear, K Nearest Neighbours (KNN) and Naive Bayes (NB) are the classifiers used for wrapper feature subset evaluation. The time required to evaluate the feature subset for each chosen classifier is also computed. Finally, the effect of this combined feature selection approach is evaluated using six different learners. Results demonstrate that LibLinear is computationally efficient and achieves the best performance.
Citations: 25
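The first-stage filter in the abstract above ranks features by Information Gain. A minimal sketch of that ranking for binary term-presence features follows; the four toy documents and the top-2 cutoff are assumptions for illustration, not the paper's data or settings.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy corpus (assumption): each document is a set of terms plus a sentiment label.
docs = [
    ({"good", "movie"}, "pos"),
    ({"good", "day"}, "pos"),
    ({"bad", "movie"}, "neg"),
    ({"bad", "day"}, "neg"),
]
labels = [label for _, label in docs]
vocab = sorted(set().union(*(terms for terms, _ in docs)))

# Score every term by how much knowing its presence reduces label uncertainty.
scores = {t: information_gain([t in terms for terms, _ in docs], labels) for t in vocab}
top_terms = sorted(vocab, key=lambda t: scores[t], reverse=True)[:2]
print(top_terms)
```

Only the terms that survive this cheap filter would then be handed to the costlier wrapper search in the second stage.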
ICCIDS 2019 Photos
Pub Date : 2019-02-01 DOI: 10.1109/iccids.2019.8862086
Citations: 0
Swift Imbalance Data Classification using SMOTE and Extreme Learning Machine
Pub Date : 2019-02-01 DOI: 10.1109/ICCIDS.2019.8862112
Rishabh Rustogi, Ayush Prasad
Continuous expansion in the fields of science and technology has led to immense availability and attainability of data in every field. Understanding and analyzing this data is a critical job in the decision-making process. Although great success has been achieved by prevailing data engineering and mining techniques, the problem of swift classification of imbalanced data still exists in academia and industry. A potential solution to the problem of skewness in data is data upsampling or downsampling. A few techniques exist that first remove skewness and then perform classification; however, these methods suffer from hurdles such as poor precision or a slower learning rate. In this paper, a hybrid method to classify binary imbalanced data using the Synthetic Minority Over-sampling Technique (SMOTE) followed by an Extreme Learning Machine is proposed. Our method combines a swift learning rate with effective prediction of the desired class. We verified our model using five standard imbalanced datasets and obtained higher F-measure, G-mean and ROC scores for all the datasets.
Citations: 14
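The oversampling half of the method in the abstract above can be sketched as follows. This is a simplified illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the toy points, `k` and the seed are assumptions, and the Extreme Learning Machine stage is not shown.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy minority class (assumption): four 2-D points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies.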