
2010 Second International Conference on Machine Learning and Computing: latest papers

Hybrid Genetic Algorithm and Learning Vector Quantization Modeling for Cost-Sensitive Bankruptcy Prediction
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.29
Ning Chen, B. Ribeiro, Armando Vieira, João M. M. Duarte, J. C. Neves
Cost-sensitive classification algorithms, which enable effective prediction when the costs of different misclassifications differ widely, are crucial to creditors and auditors in credit risk analysis. Learning vector quantization (LVQ) is a powerful tool for solving the bankruptcy prediction problem as a classification task. The genetic algorithm (GA) is widely applied in conjunction with artificial intelligence methods, and its hybridization with existing classification algorithms is well illustrated in the field of bankruptcy prediction. In this paper, a hybrid GA and LVQ approach is proposed to minimize the expected misclassification cost under an asymmetric cost preference. Experiments on real-life data from French private companies show that the proposed approach helps to improve predictive performance in an asymmetric cost setup.
Citations: 11
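The abstract above names expected misclassification cost as the quantity to minimize. As a minimal sketch of how such a cost can be computed for a binary bankruptcy classifier, assuming labels of 1 for bankrupt and 0 for healthy, the hypothetical function below averages asymmetric penalties over a set of predictions. The function name, the cost values, and the toy labels are illustrative assumptions, not taken from the paper.

```python
def expected_misclassification_cost(y_true, y_pred, cost_fn, cost_fp):
    """Average cost per example for binary labels (1 = bankrupt, 0 = healthy)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 0:
            total += cost_fn  # missed bankruptcy (false negative)
        elif truth == 0 and pred == 1:
            total += cost_fp  # healthy firm flagged as bankrupt (false positive)
    return total / len(y_true)


# Toy data: one false negative and one false positive out of five firms.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0]
cost = expected_misclassification_cost(y_true, y_pred, cost_fn=10.0, cost_fp=1.0)
print(cost)
```

With a false negative weighted ten times a false positive, a classifier that misses bankruptcies is penalized far more than one that raises false alarms, which is the asymmetry a cost-sensitive search would exploit.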
Association Rule for Classification of Type-2 Diabetic Patients
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.67
B. Patil, R. C. Joshi, Durga Toshniwal
The discovery of knowledge from medical databases is important for making effective medical diagnoses. The aim of data mining is to extract information from a database and generate clear and understandable descriptions of patterns. In this study we introduce a new approach to generating association rules on numeric data. We propose a modified equal-width binning approach to discretizing continuous-valued attributes. The approximate width of the desired intervals is chosen based on the opinion of a medical expert and is provided as an input parameter to the model. First, we convert numeric attributes into categorical form using this technique. The Apriori algorithm, which is usually used for market basket analysis, is then used to generate rules on the Pima Indian diabetes data. The data set was taken from the UCI machine learning repository and contains 768 instances and 8 numeric attributes. We find that the often neglected pre-processing steps in knowledge discovery are the most critical elements in determining the success of a data mining application. Lastly, we generate association rules that are useful for identifying general associations in the data and for understanding the relationship between the measured fields and whether the patient goes on to develop diabetes. We present a step-by-step approach to help doctors explore their data and better understand the discovered rules.
Citations: 90
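The discretization step described in the abstract above can be illustrated with plain equal-width binning, where the interval width is supplied as a parameter. This is only a sketch of the unmodified textbook technique; the paper's specific modification is not described in the abstract, and the function name, the starting value, and the sample numbers are illustrative assumptions.

```python
def equal_width_bins(values, width, start=None):
    """Map each numeric value to the label of a fixed-width interval."""
    if width <= 0:
        raise ValueError("width must be positive")
    if start is None:
        start = min(values)  # default: first interval begins at the minimum
    labels = []
    for value in values:
        index = int((value - start) // width)  # which interval the value falls in
        low = start + index * width
        labels.append(f"[{low}, {low + width})")
    return labels


# Illustrative numeric readings, binned with an assumed expert-chosen width of 30.
readings = [85, 148, 183, 89, 137]
labels = equal_width_bins(readings, width=30, start=60)
print(labels)
```

Once every numeric attribute is replaced by such interval labels, each record becomes a set of categorical items, which is the input form an association rule miner such as Apriori expects.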
An Investigation on Linear SVM and its Variants for Text Categorization
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.64
M. A. Kumar, M. Gopal
Linear Support Vector Machines (SVMs) have been used successfully to classify text documents into a set of concepts. With the increasing number of publicly available linear SVM formulations and decomposition algorithms, this paper studies their efficiency and efficacy for text categorization tasks. Eight publicly available implementations are investigated in terms of Break Even Point (BEP), F1 measure, ROC plots, learning speed and sensitivity to the penalty parameter, based on experimental results on two benchmark text corpora. The results show that, of the eight implementations, SVMlin and Proximal SVM perform better in terms of consistent performance and reduced training time. Proximal SVM is particularly appealing, being an extremely simple algorithm whose training time is independent of the penalty parameter and of the category being trained. We further investigated fuzzy proximal SVM on both text corpora; it showed improved generalization over proximal SVM.
Citations: 13
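Two of the evaluation measures named in the abstract above, the F1 measure and the Break Even Point, can be sketched as follows. The F1 formula is the standard harmonic mean of precision and recall. For BEP, the sketch uses one common convention (precision among the top-k ranked documents, with k equal to the number of positives, where precision and recall coincide); the paper may compute it differently, for example by interpolation. All names and sample values are illustrative assumptions.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def break_even_point(scores, labels):
    """Precision (= recall) when exactly as many documents are retrieved as are relevant."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_pos])
    return hits / n_pos


p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
bep = break_even_point([0.9, 0.8, 0.7, 0.4, 0.2], [1, 0, 1, 1, 0])
print(p, r, f1, bep)
```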
Parallelism through dynamic instrumentation at runtime
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.58
Raj Yadav, Mankawal Deep Singh, Neha Mahajan
This paper presents a novel approach to achieving parallelism on multi-core systems from legacy software without recompilation. A profiler tool can be enhanced to go from identifying bottleneck areas to analyzing the instructions in those areas. Because the instructions and all their data dependencies are available in the running program, heuristics can be applied to detect candidates for instruction-level parallelism. The serial regions can be regenerated as parallel regions for multiple cores using predefined OpenMP calls and instrumented dynamically at runtime. We discuss two problems: 1) identifying parallelizable regions in serial code and 2) a detailed approach to generating code at runtime.
Citations: 1
Study of Energy Efficient, Power Aware Routing Algorithm and Their Applications
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.44
A. A., G. Sakthidharan, Kanchan M. Miskin
Routing is the process of moving packets through an internetwork, such as the Internet. Routing consists of two separate but related tasks: i) defining and selecting paths in the network, and ii) forwarding packets along the defined paths from a designated source node to a designated destination node. With the advance of wireless communication technology, small, high-performance computing and communication devices such as commercial laptops and personal computers are increasingly used in convention centers, conferences and electronic classrooms. In wireless ad hoc networks, a collection of nodes with wireless communication and networking capability communicate with each other without the aid of any centralized administrator. The nodes are powered by batteries with limited energy reserves. Because it is difficult to recharge or replace the batteries of the nodes, energy conservation is essential. An energy efficient routing protocol (EERP) balances node energy utilization to reduce energy consumption and extend the life of nodes, thereby increasing the network lifetime, reducing the routing delay and increasing the reliability with which packets reach the destination. Wireless networks do not have any fixed communication infrastructure. For an active connection, the end hosts as well as the intermediate nodes can be mobile, so routes are subject to frequent disconnection. In such an environment it is important to minimize the disruptions that a changing topology causes for applications using voice and video. Power Aware Routing enables the nodes to detect misbehavior, such as deviation from regular routing and forwarding, by observing the status of the node. By exploiting the non-random mobility patterns that mobile users exhibit, the state of the network topology can be predicted and route reconstruction can be performed proactively and in a timely manner. In this paper we propose an Energy Efficient, Power Aware routing algorithm in which we integrate energy efficiency with power awareness parameters for the routing of packets.
Citations: 12
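The abstract above does not spell out how the proposed algorithm balances node energy, so the following is only a sketch of one generic energy-aware heuristic, not the authors' method: among candidate routes, prefer the one whose weakest node has the most residual battery energy. The function name, node names, and energy values are all hypothetical.

```python
def pick_route(routes, residual_energy):
    """Choose the route whose lowest-energy node has the most energy left (max-min rule)."""
    if not routes:
        raise ValueError("at least one candidate route is required")

    def bottleneck(route):
        # A route is only as durable as its most depleted node.
        return min(residual_energy[node] for node in route)

    return max(routes, key=bottleneck)


# Hypothetical residual energy per node, in arbitrary units.
energy = {"s": 9, "a": 2, "b": 6, "c": 5, "d": 9}
candidates = [["s", "a", "d"], ["s", "b", "c", "d"]]
best = pick_route(candidates, energy)
print(best)
```

The shorter route passes through a nearly drained node, so this rule accepts an extra hop to spare it, which illustrates the trade-off between hop count and network lifetime that energy-aware routing has to manage.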
Detecting the Number of Clusters during Expectation-Maximization Clustering Using Information Criterion
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.47
U. Gupta, Vinay Menon, Uday Babbar
This paper presents an algorithm to automatically determine the number of clusters in a given input data set, under a mixture of Gaussians assumption. Our algorithm extends the Expectation-Maximization clustering approach by starting with a single cluster assumption for the data, and recursively splitting one of the clusters in order to find a tighter fit. An Information Criterion parameter is used to make a selection between the current and previous model after each split. We build this approach upon prior work done on both the K-Means and Expectation-Maximization algorithms. We also present a novel idea for intelligent cluster splitting which minimizes convergence time and substantially improves accuracy.
Citations: 17
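The model selection step in the abstract above, choosing between the model before and after a cluster split, can be sketched with the Bayesian Information Criterion. The abstract only says "Information Criterion", so the use of BIC here is an assumption, as are the function names and the log-likelihood values, which stand in for the output of an EM fit with full covariance matrices.

```python
import math


def gmm_free_parameters(n_components, n_dims):
    """Free parameters of a Gaussian mixture with full covariance matrices."""
    weights = n_components - 1                      # mixing weights sum to 1
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def keep_split(loglik_before, loglik_after, k_before, n_dims, n_samples):
    """Accept a split into k_before + 1 clusters only if it lowers the BIC."""
    before = bic(loglik_before, gmm_free_parameters(k_before, n_dims), n_samples)
    after = bic(loglik_after, gmm_free_parameters(k_before + 1, n_dims), n_samples)
    return after < before


# Placeholder log-likelihoods for 500 two-dimensional points.
large_gain = keep_split(-2100.0, -1900.0, k_before=1, n_dims=2, n_samples=500)
small_gain = keep_split(-2100.0, -2095.0, k_before=1, n_dims=2, n_samples=500)
print(large_gain, small_gain)
```

The penalty term grows with the parameter count, so a split is kept only when the likelihood gain outweighs the extra complexity, and the recursion stops once further splits no longer pay for themselves.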
Statistical Feature Extraction for Classification of Image Spam Using Artificial Neural Networks
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.72
M. Soranamageswari, C. Meena
As the use of electronic mail continues to grow, so does the volume of unsolicited bulk email. These unsolicited bulk emails occupy server storage space and consume a large amount of network bandwidth. To overcome this serious problem, anti-spam filters have become a common component of internet security. Image spamming is a recent method of email spamming in which the text is embedded in image or picture files. Identifying and preventing spam is one of the top challenges in the internet world. Many approaches for identifying image spam have been established in the literature. The artificial neural network is an effective classification method for solving feature extraction problems. In this paper we present an experimental system for the classification of image spam that considers two statistical image features: the histogram and the mean value of a block of the image. A comparative study of image classification based on the color histogram and the mean value is presented. The experimental results show the performance of the proposed system, which achieves its best results with minimal false positives.
Citations: 42
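The two statistical features mentioned in the abstract above, a histogram and per-block mean values, can be sketched on a small image. For brevity this sketch assumes a single grayscale channel stored as a list of rows, whereas the paper works with color histograms; the function names, bin count, and block size are illustrative assumptions.

```python
def gray_histogram(pixels, n_bins=4, max_value=255):
    """Normalized intensity histogram of a grayscale image given as a list of rows."""
    counts = [0] * n_bins
    total = 0
    for row in pixels:
        for value in row:
            index = min(value * n_bins // (max_value + 1), n_bins - 1)
            counts[index] += 1
            total += 1
    return [count / total for count in counts]


def block_means(pixels, block_size):
    """Mean intensity of each block_size x block_size tile, scanned row by row."""
    means = []
    for top in range(0, len(pixels), block_size):
        for left in range(0, len(pixels[0]), block_size):
            block = [value
                     for row in pixels[top:top + block_size]
                     for value in row[left:left + block_size]]
            means.append(sum(block) / len(block))
    return means


image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [64, 64, 128, 128],
    [64, 64, 128, 128],
]
features = gray_histogram(image) + block_means(image, block_size=2)
print(features)
```

Concatenating the two lists gives a fixed-length feature vector per image, which is the kind of input a neural network classifier would be trained on.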
A Novel Approach Using Active Contour Model for Semi-Automatic Road Extraction from High Resolution Satellite Imagery
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.36
Anil P.N., S. Natarajan
Road extraction from satellite imagery is of fundamental importance in the context of spatial data capturing and updating for GIS applications. Because fully automatic feature extraction is difficult owing to the increasing complexity of objects, this paper proposes a semi-automatic road extraction methodology for high resolution satellite imagery using the active contour model (Snakes). First, the image is preprocessed using a relaxed median filter. Next, the user inputs initial seed points on the road to be extracted. The road segment is then extracted using the active contour model. The method is tested on high resolution satellite imagery and the results are presented in the paper.
Citations: 33
Using Abstract Information and Community Alignment Information for Link Prediction
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.25
Mrinmaya Sachan, R. Ichise
Although there have been many recent studies of link prediction in co-authorship networks, few have tried to utilize the semantic information hidden in the abstracts of research documents. We propose to build a link predictor for a co-authorship network in which nodes represent researchers and links represent co-authorship. In this method, we use the structure of the constructed graph and propose adding a semantic approach that uses abstract information, research titles and event information to improve the accuracy of the predictor. Secondly, we make use of the fact that researchers tend to work in close-knit communities. The knowledge that a pair of researchers lies in the same dense community can be used to further improve the accuracy of our predictor. Finally, we test our hypothesis on the DBLP database in a reasonable time by under-sampling and balancing the data set, using decision trees and the SMOTE technique.
Citations: 7
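The abstract above combines graph structure with semantic information from abstracts. As a sketch of what one structural and one semantic feature for a candidate author pair might look like, the code below counts common co-authors and measures word overlap between the two authors' abstracts with the Jaccard index. These particular features, the function names, and the toy data are assumptions for illustration; the abstract does not list the exact features used.

```python
def common_neighbors(graph, a, b):
    """Number of co-authors that researchers a and b share."""
    return len(graph.get(a, set()) & graph.get(b, set()))


def jaccard(words_a, words_b):
    """Word-set overlap between two researchers' abstracts, in [0, 1]."""
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


# Toy co-authorship graph: each researcher maps to the set of their co-authors.
coauthors = {
    "ann": {"bob", "cid"},
    "bob": {"ann", "cid"},
    "cid": {"ann", "bob", "dee"},
    "dee": {"cid"},
}
# Toy bag of words drawn from each researcher's abstracts.
abstract_words = {
    "ann": {"link", "prediction", "graph"},
    "dee": {"graph", "community", "link"},
}

pair_features = (
    common_neighbors(coauthors, "ann", "dee"),
    jaccard(abstract_words["ann"], abstract_words["dee"]),
)
print(pair_features)
```

A feature tuple like this, computed for linked and unlinked pairs, is what a classifier such as a decision tree would take as input after the class imbalance between the two groups has been addressed.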
Premptive Job Scheduling with Priorities and Starvation cum Congestion Avoidance in Clusters
Pub Date : 2010-02-09 DOI: 10.1109/ICMLC.2010.60
M. Balajee, B. Suresh, M. Suneetha, V. Rani, G. Veerraju
This paper describes a new policy for scheduling parallel jobs on clusters that may be part of a computational grid. The algorithm uses three job queues, and in each cluster some number of resources is assigned to each queue. The first queue holds jobs with a low expected execution time (EET). The second queue holds jobs with a high expected execution time. The third queue holds jobs that are part of a meta-job from the computational grid. In the first queue there is no chance of starvation, but in the second queue there is. The algorithm therefore applies an aging technique to preempt jobs that have low priority. The third queue is fully dedicated to executing parts of meta-jobs. We thus maintain multiple job queues that effectively separate jobs according to their projected execution time, for local jobs and for parts of meta-jobs. We can also avoid unnecessary traffic congestion in the network by comparing the expected execution time with the total time for submitting job(s) to, and receiving result(s) from, the node(s).
Citations: 8
Journal
2010 Second International Conference on Machine Learning and Computing