2018 7th Brazilian Conference on Intelligent Systems (BRACIS): Latest Papers

Petroleum Reservoir Connectivity Patterns Reconstruction Using Deep Convolutional Generative Adversarial Networks
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00025
Rodrigo Exterkoetter, F. Bordignon, L. D. Figueiredo, M. Roisenberg, B. B. Rodrigues
In this paper, we propose a deep convolutional generative adversarial network model to reconstruct petroleum reservoir connectivity patterns. In the petroleum exploration industry, a critical issue is determining the internal reservoir structure and connectivity in order to find a flow channel for placing the injection and production wells. State-of-the-art methods combine seismic inversion with multipoint geostatistics, which imposes connectivity patterns during the optimization. However, this approach has a high computational cost, has no learning ability, and does not provide a probability for the connection. Results show that our approach is able to learn the petroleum reservoir connectivity patterns from the data and also reproduce them in facies images obtained by seismic inversion.
Citations: 1
Selecting Algorithms for the Quadratic Assignment Problem with a Multi-label Meta-Learning Approach
Pub Date : 2018-10-01 DOI: 10.1109/bracis.2018.00038
Augusto Lopez Dantas, Aurora Trinidad Ramirez Pozo
Citations: 3
A Novel Equidistant-Scattering-Based Cluster Index
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00099
Caio Flexa, Reginaldo Santos, W. Gomes, C. Sales
We propose a new non-parametric internal validity index based on mutual equidistant-scattering among within-cluster data for fine-tuning the number of clusters, i.e., the hyperparameter K. Most validity indexes found in the literature are considered to depend on the number of data objects in each cluster and often tend to ignore small and low-density groups. Moreover, they select suboptimal clustering solutions when the clusters overlap to some degree or are poorly separated. We analysed the performance of our index against four of the most popular validity indexes. Experiments on both synthetic and real-world data show the effectiveness and reliability of our approach for evaluating the hyperparameter K.
Citations: 0
Strategies for Selection of Positive and Negative Instances in the Hierarchical Classification of Transposable Elements
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00079
Bruna Zamith Santos, G. Pereira, F. Nakano, R. Cerri
Transposable Elements (TEs) are DNA sequences capable of changing a gene's activity through transposition within the cells of a host. Once TEs insert themselves into other genes, they can change or reduce the activity of certain proteins, which in some cases can make the survival of such organisms unfeasible or even provide genetic variability. A variety of methods have been proposed for the identification and classification of TEs, but most of them still involve a lot of manual work or are too class-specific, which restricts their applicability. Moreover, the classes involved in such problems are often hierarchically structured, a fact that most of these methods ignore. In this scenario, one problem that still needs further investigation is the use of strategies for selecting positive and negative instances during the induction of hierarchical models. Therefore, in this paper we explore four distinct strategies for selecting training instances, using several Machine Learning classifiers with different biases, applied to the Hierarchical Classification of TEs with a local approach. We then recommend the best strategy based on the experimental results.
Citations: 6
Influence of Reference Points on a Many-Objective Optimization Algorithm
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00014
Matheus Carvalho, André Britto
Many-Objective Optimization Problems (MaOPs) are problems that have more than three objective functions to be optimized. Most Multi-Objective Evolutionary Algorithms scale poorly as the number of objective functions increases. To address this limitation, new strategies have been proposed. One of them is the use of reference points to enhance the search performed by the algorithms. NSGA-III is a reference-point-based algorithm that has been successfully applied to solve MaOPs. NSGA-III uses a set of reference points placed on a normalized hyperplane that is equally inclined to all objective axes and has an intercept at 1 on each axis. Despite the good results of NSGA-III, the shape of the hyperplane has not been deeply explored in the literature. This work studies the influence of the set of reference points on Many-Objective Optimization. Three new transformations of the reference point set used by NSGA-III are proposed. In addition, the Vector Guided Adaptation procedure is applied to modify the original NSGA-III hyperplane. Furthermore, an adaptation of the NSGA-III algorithm is proposed, and a set of experiments is performed to evaluate the transformation procedures. The original and adapted versions of NSGA-III are compared on several benchmark problems, observing both convergence and diversity through the analysis of statistical tests.
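As an illustration of the normalized hyperplane described in this abstract, the sketch below enumerates evenly spaced points on the unit simplex, the structured construction commonly paired with NSGA-III. It is not the transformations proposed in the paper; the function name `simplex_reference_points` and the parameter `n_divisions` are assumptions made for this example.

```python
from itertools import combinations


def simplex_reference_points(n_objectives, n_divisions):
    """Return every point on the unit simplex whose coordinates are
    multiples of 1 / n_divisions (a hypothetical helper for illustration)."""
    points = []
    # Stars and bars: place (n_objectives - 1) bars among the slots; the
    # number of stars between consecutive bars gives each integer coordinate.
    n_slots = n_divisions + n_objectives - 1
    for bars in combinations(range(n_slots), n_objectives - 1):
        previous = -1
        parts = []
        for bar in bars:
            parts.append(bar - previous - 1)
            previous = bar
        parts.append(n_slots - previous - 1)
        points.append(tuple(part / n_divisions for part in parts))
    return points


# Three objectives, four divisions per axis.
reference_points = simplex_reference_points(3, 4)
```

Every generated point has coordinates that sum to 1, so the corner points such as (1, 0, 0) are the intercepts at 1 on each objective axis.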
Citations: 2
Exploring Evolutive Methods for Cloud Provider Selection Based on Performance Indicators
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00035
Lucas Borges de Moraes, Adriano Fiorese, R. S. Parpinelli
The cloud computing model has been spreading around the world and has become a basis for innovation and efficiency in provisioning computational services. This has inspired the emergence of a large number of new companies providing cloud computing services. In order to qualify such providers, performance indicators (PIs) are useful for systematic information collection. Selecting the providers that best suit each customer's needs with the desired quality of service has become a hard problem that requires robust search methods. Thus, the problem is to find the smallest set of providers that maximizes the attendance of a customer's request at the lowest price. In this paper, two evolutionary algorithms, Genetic Algorithms (GA) and Binary Differential Evolution (BDE), are modeled to address this problem. Instances with 10, 100, and 200 providers are employed. The results are compared with a deterministic method and show that the BDE approach outperforms both the GA and the deterministic method.
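The selection problem stated in this abstract can be pictured as scoring a binary vector with one bit per provider. The sketch below is only an assumed formulation: the provider data, the indicator names, and the lexicographic ordering (coverage first, then price, then set size) are hypothetical and are not taken from the paper. An exhaustive search stands in for the evolutionary search, which is feasible only for tiny instances.

```python
from itertools import product


def fitness(selection, providers, request):
    """Score a binary selection vector; larger tuples are better.
    The ordering (coverage, then price, then size) is an assumption."""
    chosen = [p for bit, p in zip(selection, providers) if bit]
    covered = set().union(*(p["indicators"] for p in chosen)) & request
    price = sum(p["price"] for p in chosen)
    return (len(covered), -price, -len(chosen))


# Hypothetical toy instance: three providers and three requested indicators.
providers = [
    {"name": "A", "indicators": {"uptime", "latency"}, "price": 5},
    {"name": "B", "indicators": {"storage"}, "price": 2},
    {"name": "C", "indicators": {"uptime", "latency", "storage"}, "price": 8},
]
request = {"uptime", "latency", "storage"}

# Exhaustive baseline over all 2^3 selections.
best = max(
    product([0, 1], repeat=len(providers)),
    key=lambda s: fitness(s, providers, request),
)
```

A GA or BDE would evolve such bit vectors instead of enumerating them, which matters once the instance grows to the 100 or 200 providers mentioned in the abstract.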
Citations: 1
Solving a Spatial Puzzle Using Answer Set Programming Integrated with Markov Decision Process
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00097
Thiago Freitas dos Santos, P. Santos, L. Ferreira, Reinaldo A. C. Bianchi, Pedro Cabalar
Spatial puzzles are interesting domains in which to investigate problem solving, since the processes involved in reasoning about spatial knowledge are essential for an agent to interact in the human environment. With this in mind, the goal of this work is to investigate the knowledge representation and reasoning process related to the solution of a spatial puzzle, the Fisherman's Folly, composed of a flexible string, rigid objects, and holes. To achieve this goal, the present paper uses heuristics (obtained after solving a relaxed version of the puzzle) to accelerate the learning process, while applying a method that combines Answer Set Programming (ASP) with Reinforcement Learning (RL), the oASP(MDP) algorithm, to find a solution to the puzzle. ASP is the logic language chosen to build the set of states and actions of a Markov Decision Process (MDP) representing the domain, and RL is used to learn the optimal policy for the problem.
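To make the RL half of this combination concrete, the sketch below runs plain tabular Q-learning on a toy corridor MDP. It is not the oASP(MDP) algorithm: the states and actions are hard-coded rather than generated by ASP, and every name and parameter value here (`q_learning`, `chain_step`, the learning rate, the step cap) is an assumption chosen for illustration.

```python
import random


def q_learning(n_states, actions, step, episodes=500, alpha=0.5,
               gamma=0.9, epsilon=0.2, max_steps=1000, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                # Greedy choice, breaking ties at random.
                best = max(q[(state, a)] for a in actions)
                action = rng.choice([a for a in actions if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def chain_step(state, action):
    """Toy corridor with states 0..4: +1 moves right, -1 moves left,
    and reaching state 4 ends the episode with reward 1."""
    next_state = min(max(state + action, 0), 4)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


actions = [-1, 1]
q_table = q_learning(5, actions, chain_step)
policy = {s: max(actions, key=lambda a: q_table[(s, a)]) for s in range(4)}
```

In the setting the abstract describes, the state and action sets would come from the ASP program instead of a hand-written `chain_step` function.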
Citations: 1
Process Mining Discovery Techniques in a Low-Structured Process Works?
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00042
Raphael J. D'Castro, Adriano Oliveira, Augusto H. Terra
Operational efficiency is a crucial element for organizations, and it requires greater knowledge about existing business processes. Process Mining is an emerging discipline that aims to provide knowledge about business processes by learning from the event logs of an organization's information systems. Over the last few years the field has progressed significantly; however, some important challenges remain. Among these, we highlight application in real-world environments, because organizations do not always have well-structured processes, that is, repeatable activities whose inputs and outputs are well defined. There are also more complex processes from which it is harder to extract knowledge. This paper presents a study, performed in a real environment, that evaluates the challenges and limitations of Process Mining tools for process discovery.
Citations: 1
Instance Selection and Class Balancing Techniques for Cross Project Defect Prediction
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00101
Alysson Bispo, R. Prudêncio, D. V. D. Silva
Various software metrics and statistical models have been developed to help companies predict software defects. Traditional software defect prediction approaches use historical data about previous bugs in a project to build predictive machine learning models. However, in many cases the historical testing data available in a project is scarce, i.e., very few or even no labeled training instances are available, which results in a low-quality defect prediction model. To overcome this limitation, Cross-Project Defect Prediction (CPDP) can be adopted to learn a defect prediction model for a project of interest (i.e., a target project) by reusing (transferring) data collected from several previous projects (i.e., source projects). In this paper, we focus on neighborhood-based instance selection techniques for CPDP, which select labeled instances in the source projects that are similar to the unlabeled instances available in the target project. Despite their simplicity, these techniques have limitations, which we address in our work. First, although they can select representative source instances, the quality of the selected instances is usually not addressed. Additionally, bug prediction datasets are normally unbalanced (i.e., there are more non-defect instances than defect ones), which can harm learning performance. We propose a new transfer learning approach for CPDP in which instances selected by a neighborhood-based technique are filtered by the Fuzzy-Rough Instance Selection (FRIS) technique to remove noisy instances from the training set. Then, to address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is adopted to oversample the minority (defect-prone) class, thus increasing the chance of finding bugs correctly. Experiments were performed on a benchmark set of Java projects, achieving promising results.
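The oversampling step mentioned in this abstract can be sketched as follows. This is a minimal, dependency-free variant of SMOTE that picks base samples at random, not the authors' pipeline; the function signature, the default `k=3`, and the toy data are assumptions for illustration, and the FRIS filtering step is not shown.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical defect-prone instances described by two software metrics.
defect_prone = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(defect_prone, n_synthetic=6)
```

Because each synthetic point lies on a segment between two real minority samples, the new samples stay inside the region the minority class already occupies.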
Citations: 3
Bio-Inspired and Heuristic Methods Applied to a Benchmark of the Task Scheduling Problem
Pub Date : 2018-10-01 DOI: 10.1109/BRACIS.2018.00095
T. I. D. Carvalho, Bruno Well Dantas Morais, G. Oliveira
Task scheduling seeks a time-efficient allocation of the tasks of a parallel program to a multiprocessor system. Because the problem is intractable, heuristic methods have been developed to solve it. Among the more traditional approaches, approximate techniques such as constructive list-based heuristics or simple random schedulers have been extensively employed. On the other hand, bio-inspired models, such as cellular automata (CA) and evolutionary-based schedulers, have recently been investigated as alternative approaches. However, the comparative analysis of experimental results is primarily limited by the capacity of benchmarks to represent the problem across a full range of difficulty. To investigate the use of a more comprehensive benchmark in comparative experiments, we have developed a set of scheduling instances based on real-world programs by varying their features, including the number of tasks, the number of available processors, and the communication costs. We applied two simple heuristics both as performance baselines and to evaluate the complexity of each problem instance as a basis for comparison. Moreover, we investigate three bio-inspired schedulers applied to the same instances. Two of them are genetic algorithm (GA) approaches, while the third employs a GA to find good CA rules able to schedule unseen instances of a parallel program very quickly. Our results show that the CA-based scheduler significantly outperforms the other methods on most instances, while on certain instances of the problem a good solution can be produced consistently by a heuristic based on random allocations. We conclude that these instances are unfit for benchmark purposes and that careful analysis and selection of problem instances is necessary for performance evaluation in this field of research.
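A constructive list-based heuristic of the kind mentioned in this abstract might look like the sketch below. It is a generic greedy scheduler under stated assumptions, not one of the paper's baselines: tasks are given in topological order, communication cost is paid only when a parent runs on a different processor, and all names and the toy instance are hypothetical.

```python
def list_schedule(durations, deps, comm, n_procs):
    """Greedy list scheduling: place each task, in the given (topological)
    order, on the processor where it finishes earliest.
    Returns (assignment, makespan)."""
    proc_free = [0.0] * n_procs  # time at which each processor becomes idle
    finish = {}
    assignment = {}
    for task in durations:  # dict order is assumed to be topological
        best = None
        for proc in range(n_procs):
            ready = 0.0
            for parent in deps.get(task, []):
                # Communication cost applies only across processors.
                same = assignment[parent] == proc
                delay = 0.0 if same else comm.get((parent, task), 0.0)
                ready = max(ready, finish[parent] + delay)
            end = max(ready, proc_free[proc]) + durations[task]
            if best is None or end < best[0]:
                best = (end, proc)
        finish[task], assignment[task] = best
        proc_free[best[1]] = best[0]
    return assignment, max(finish.values())


# Hypothetical four-task program: a fans out to b and c, which join at d.
durations = {"a": 2, "b": 3, "c": 3, "d": 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
comm = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}

assignment, makespan = list_schedule(durations, deps, comm, n_procs=2)
```

Varying the number of tasks, the number of processors, and the entries of `comm` corresponds to the instance features the abstract says were varied to build the benchmark.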
Citations: 0
Journal
2018 7th Brazilian Conference on Intelligent Systems (BRACIS)