Pub Date : 2020-07-29 DOI: 10.1007/978-3-030-61380-8_28
Lucas F. F. Cardoso, Vitor Santos, R. S. K. Francês, R. Prudêncio, Ronnie Alves
{"title":"Decoding machine learning benchmarks","authors":"Lucas F. F. Cardoso, Vitor Santos, R. S. K. Francês, R. Prudêncio, Ronnie Alves","doi":"10.1007/978-3-030-61380-8_28","DOIUrl":"https://doi.org/10.1007/978-3-030-61380-8_28","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2020-07-29"}
Pub Date : 2020-07-29 DOI: 10.1007/978-3-030-61377-8_39
A. D. Reys, Danilo Silva, Daniel de Souza Severo, S. Pedro, Marcia M. de Souza e Sá, Guilherme A. C. Salgado
{"title":"Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical Notes","authors":"A. D. Reys, Danilo Silva, Daniel de Souza Severo, S. Pedro, Marcia M. de Souza e Sá, Guilherme A. C. Salgado","doi":"10.1007/978-3-030-61377-8_39","DOIUrl":"https://doi.org/10.1007/978-3-030-61377-8_39","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2020-07-29"}
Pub Date : 2020-07-14 DOI: 10.1007/978-3-030-61377-8_29
Johannes V. Lochter, R. M. Silva, Tiago A. Almeida
{"title":"Deep learning models for representing out-of-vocabulary words","authors":"Johannes V. Lochter, R. M. Silva, Tiago A. Almeida","doi":"10.1007/978-3-030-61377-8_29","DOIUrl":"https://doi.org/10.1007/978-3-030-61377-8_29","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2020-07-14"}
Pub Date : 2020-02-25 DOI: 10.1007/978-3-030-91699-2_39
Edresson Casanova, Arnaldo Cândido Júnior, C. Shulby, F. S. Oliveira, L. Gris, Hamilton Pereira da Silva, S. Aluísio, M. Ponti
{"title":"Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models","authors":"Edresson Casanova, Arnaldo Cândido Júnior, C. Shulby, F. S. Oliveira, L. Gris, Hamilton Pereira da Silva, S. Aluísio, M. Ponti","doi":"10.1007/978-3-030-91699-2_39","DOIUrl":"https://doi.org/10.1007/978-3-030-91699-2_39","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2020-02-25"}
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00020
Aluísio Cardoso Silva, C. Borges
The NP-complete bin packing problem is a widely studied grouping problem that models several practical problems, e.g., batch-processing machine scheduling and industrial and transportation logistics. Because this class of problems is hard to solve, two main strategies are usually adopted: sub-optimal constructive heuristics and optimization models based on metaheuristic algorithms. Constructive heuristics are computationally efficient but usually yield non-optimal solutions or local minima. Metaheuristics adapted to this problem, in contrast, allow an effective global search that increases the chance of obtaining optimal or near-optimal solutions, but at a high computational cost. This work develops a heuristic-based genetic algorithm, a hybrid approach designed to exploit the best features of each strategy. A special encoding and specific operators are added to the final model to improve the behavior and performance of the hybrid. Numerical experiments on well-established benchmarks for the one-dimensional bin packing problem compare versions of the presented hybrid method with high-quality methods from the literature. The results indicate the potential of the presented strategy for solving one-dimensional bin packing problems.
{"title":"An Improved Heuristic Based Genetic Algorithm for Bin Packing Problem","authors":"Aluísio Cardoso Silva, C. Borges","doi":"10.1109/BRACIS.2019.00020","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00020","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
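To make the term "constructive heuristic" in the bin packing abstract concrete, here is a minimal sketch of first-fit decreasing, a classic heuristic for the one-dimensional problem. It is not the authors' hybrid genetic algorithm, and the abstract does not say which heuristics they combine; the function name and the sample item sizes are illustrative assumptions.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins using the first-fit decreasing heuristic.

    Illustrative only: a standard textbook heuristic, not the method
    of the paper listed above.
    """
    bins = []  # each bin is a list of the item sizes placed in it
    free = []  # remaining capacity of each bin, same order as `bins`
    for size in sorted(sizes, reverse=True):  # largest items first
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for i, room in enumerate(free):
            if size <= room:  # first open bin with enough room
                bins[i].append(size)
                free[i] -= size
                break
        else:  # no open bin fits, so open a new one
            bins.append([size])
            free.append(capacity - size)
    return bins


packing = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
print(len(packing), packing)  # 2 [[8, 2], [4, 4, 1, 1]]
```

The heuristic is fast because each item is placed once and never moved, which is also why it can stop at a non-optimal packing on harder instances.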
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00131
Luis Fernando Marin Sepulveda, A. Silva, J. O. Diniz
Large amounts of data are now available, and many methods have been created to extract useful information from them for specific tasks; identifying the most appropriate method, however, is often difficult. Meta-Learning offers an option: based on experience, it can recommend the most appropriate method for a particular task on new data. That experience relates the features of the data to the performance of the methods, and this relationship is known as Meta-Data. Given the continuous increase in breast cancer cases and the availability of datasets, images of breast tissue biopsy slides used to identify Ductal Carcinoma were selected as the object of study. The aim of this work is to construct Meta-Data that allows Meta-Learning to be applied to select the best Ductal Carcinoma identification method for the type of images under study. The proposed methodology achieves 99.6% accuracy, 99.9% AUC and 99.7% F-measure in Meta-Data validation.
{"title":"Meta-Data Construction for Selection of Breast Tissue Biopsy Slides Image Classifier to Identify Ductal Carcinoma","authors":"Luis Fernando Marin Sepulveda, A. Silva, J. O. Diniz","doi":"10.1109/BRACIS.2019.00131","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00131","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
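The Meta-Data described in the abstract above pairs features of a dataset with the performance of candidate methods. The sketch below shows one way such a record could be assembled. The meta-features chosen here (instance count, feature count, simple column statistics), the function names and the scores are all assumptions for illustration; the abstract does not state which image features or classifiers the authors used.

```python
from statistics import mean, pstdev


def meta_features(rows):
    """Summarise a numeric dataset (a list of equal-length rows)."""
    columns = list(zip(*rows))
    return {
        "n_instances": len(rows),
        "n_features": len(columns),
        "mean_of_means": mean(mean(col) for col in columns),
        "mean_of_stds": mean(pstdev(col) for col in columns),
    }


def meta_example(rows, scores):
    """Build one Meta-Data record: meta-features plus the best method.

    `scores` maps a (hypothetical) method name to its measured metric.
    """
    best = max(scores, key=scores.get)
    return {**meta_features(rows), "best_method": best}


record = meta_example(
    rows=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    scores={"method_a": 0.91, "method_b": 0.87},  # made-up scores
)
print(record["best_method"])  # method_a
```

A meta-learner would then be trained on many such records, with `best_method` as the target, so that it can recommend a method for an unseen dataset from its meta-features alone.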
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00122
J. F. Saran, L. C. Botega
Situation Awareness (SAW) refers to the level of consciousness that an individual or team holds about a situation. In risk management and criminal data analysis, SAW failures may lead human operators to errors in decision-making and jeopardize human life, heritage and the environment. In this scenario, critical situation assessment processes, which usually involve methods such as mining and fusion, present opportunities to deliver better information for human reasoning and to assist in the development of SAW. However, attempts to characterize complex scenarios can lead to poor information representation and expressiveness, which can induce misinterpretation of the data, mainly due to their quality, and produce uncertainty. State-of-the-art approaches to representing information about risk situations and related areas make limited use of information quality. In addition, these solutions are limited to syntactic mechanisms for characterizing relations between pieces of information, which limits the assertiveness of the results. This work therefore presents a new approach to the semantic representation of crime situations, specifically by modeling domain ontologies instantiated with qualified criminal data. In a case study, real crime information is processed, represented by the new semantic model and consumed by computational inference methods. The results validate the applicability of the produced ontologies for characterizing and inferring robbery and theft situations.
{"title":"Development of Criminal Ontologies to Enhance Situation Assessment","authors":"J. F. Saran, L. C. Botega","doi":"10.1109/BRACIS.2019.00122","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00122","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00037
Matheus Campos Fernandes, T. Covões, André Luiz Vizine Pereira
The high cost of labeling data for analysis has increased interest in semi-supervised learning. One of its most common types is constrained clustering, a type of learning that does not rely on class labels for a group of objects. Instead, the only information available is whether some pairs of objects must be in the same cluster or in different clusters. In some applications, identifying such constraints costs less, since a constraint carries less information than a class label. At the same time, Active Learning (AL) aims to minimize the cost of creating labeled datasets by identifying which unlabeled data are most relevant to use during learning, given the labels already available. This paper proposes three AL strategies for an evolutionary constrained clustering algorithm (FIECE-EM) based on Gaussian Mixture Models (GMM). Experiments were run on 10 well-known datasets to measure the impact of each strategy. We compare the results with baseline supervised algorithms as well as COBRAS, a state-of-the-art Active Learning algorithm for constrained clustering. Two of the proposed strategies obtained significantly better results than COBRAS in our empirical evaluation. Thus, the combinations of FIECE-EM with these strategies can be considered viable alternatives for AL in a constrained clustering setting.
{"title":"Active Learning for Evolutionary Constrained Clustering","authors":"Matheus Campos Fernandes, T. Covões, André Luiz Vizine Pereira","doi":"10.1109/BRACIS.2019.00037","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00037","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
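The pairwise constraints mentioned in the constrained clustering abstract are commonly called must-link and cannot-link constraints. As a minimal sketch, the function below counts how many of them a given cluster assignment violates. It illustrates the constraint format only; it is not FIECE-EM, COBRAS or any of the proposed AL strategies, and the names and sample data are assumptions.

```python
def count_violations(labels, must_link, cannot_link):
    """Count pairwise constraints broken by a cluster assignment.

    labels      -- cluster id per object, indexed by object position
    must_link   -- pairs (a, b) that should share a cluster
    cannot_link -- pairs (a, b) that should be in different clusters
    """
    broken_ml = sum(1 for a, b in must_link if labels[a] != labels[b])
    broken_cl = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return broken_ml + broken_cl


labels = [0, 0, 1, 1]  # objects 0 and 1 in cluster 0, objects 2 and 3 in cluster 1
violations = count_violations(
    labels,
    must_link=[(0, 1), (1, 2)],  # (1, 2) is broken: different clusters
    cannot_link=[(2, 3)],        # (2, 3) is broken: same cluster
)
print(violations)  # 2
```

An active learning strategy in this setting decides which object pairs to ask an oracle about next, so that each new constraint is as informative as possible.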
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00101
Raíssa Silva, K. Souza, F. Góes, Ronnie Alves
Metagenomics is the study of microbial genomes, known as metagenomes, describing them through their microorganisms' compositions, relationships and activities, thus allowing greater knowledge of the fundamentals of life and of the broad microbial diversity. One way to accomplish this task is to analyze information from the genes contained in metagenomes. The process of identifying genes in DNA sequences is usually called gene prediction. This work presents a new gene predictor using the Random Forest classifier. The proposed model obtains better classification results than state-of-the-art gene prediction tools widely used by the bioinformatics community. Random Forest presented more robust results, being 27% better than Prodigal and 20% better than FragGeneScan w.r.t. AUC values on the independent test set. Feature engineering has been revisited in the gene prediction problem, reinforcing the importance of carefully assembling a good feature set. K-mer counting features can be seen as the fundamental building blocks for developing robust gene predictors.
{"title":"A Random Forest Classifier for Prokaryotes Gene Prediction","authors":"Raíssa Silva, K. Souza, F. Góes, Ronnie Alves","doi":"10.1109/BRACIS.2019.00101","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00101","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
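The k-mer counting features named in the gene prediction abstract can be sketched as follows: each DNA sequence becomes a fixed-length vector with one count per possible k-mer. The function name, the choice of k and the sample sequence are assumptions; the abstract does not give the authors' exact feature set or value of k.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Return counts of overlapping k-mers in a fixed ACGT order.

    The vector has 4**k entries (AA, AC, AG, AT, CA, ... for k=2).
    K-mers containing other symbols, such as N, are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


vector = kmer_counts("ATGATG", k=2)  # 2-mers: AT, TG, GA, AT, TG
print(len(vector), sum(vector))  # 16 5
```

Vectors like this one can be fed directly to a classifier such as a Random Forest, since every sequence maps to the same number of features regardless of its length.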
Pub Date : 2019-10-01 DOI: 10.1109/BRACIS.2019.00087
João Pedro Figueirôa Nascimento, R. Neto, Lourinaldo Júnior Macário Amorim
This paper aims to answer the following research question: "How to build an efficient kick strategy for agents in the 2D Simulation League?". Robot soccer provides an opportunity for students and professionals to apply their concepts of intelligent agent development. One of the main challenges of this game is to decide when a player must kick the ball at the goal. The proposed solution is a data mining approach with three components: 1) use of the Random Forest technique as a classifier, 2) enrichment of the database through the construction of new variables and 3) feature selection. To validate the proposed solution, a comparative study between the original kick strategy of a base team and the proposed solution was conducted. Experiments showed that the proposed approach delivers superior performance: the proposed policy reached a winning rate of 65%, against 28% for the original.
{"title":"An Efficient Kick Strategy for Agents in the 2D Simulation League","authors":"João Pedro Figueirôa Nascimento, R. Neto, Lourinaldo Júnior Macário Amorim","doi":"10.1109/BRACIS.2019.00087","DOIUrl":"https://doi.org/10.1109/BRACIS.2019.00087","journal":{"name":"Brazilian Conference on Intelligent Systems"},"publicationDate":"2019-10-01"}
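The second component of the kick strategy abstract, constructing new variables, could look like the sketch below, which derives distance and angle to the goal from a ball position. Which variables the authors actually built is not stated, so these two features, the function name and the default goal coordinates (the centre of a goal line at x = 52.5, assuming a 105-unit-long pitch) are all hypothetical.

```python
import math


def kick_features(ball_x, ball_y, goal_x=52.5, goal_y=0.0):
    """Derive two illustrative variables from a ball position.

    The default goal coordinates are an assumption, not taken from
    the paper listed above.
    """
    dx, dy = goal_x - ball_x, goal_y - ball_y
    return {
        "distance_to_goal": math.hypot(dx, dy),
        "angle_to_goal_deg": math.degrees(math.atan2(dy, dx)),
    }


features = kick_features(ball_x=49.5, ball_y=4.0)
print(round(features["distance_to_goal"], 2))  # 5.0
```

Derived variables of this kind would be appended to each logged game state before feature selection and classifier training.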