Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00049
Églen Protas, José Douglas Bratti, P. Ribeiro, Paulo L. J. Drews-Jr, S. Botelho
Convolutional Neural Networks have become a state-of-the-art approach for many problems in computer vision, pattern recognition, and image processing. However, due to the large number of parameters in these architectures, researchers may find it difficult to explain what the networks use as discriminative patterns. Visualization techniques offer one way to better understand the behavior of the learned convolutional kernels. Currently, these techniques are applied most often to classification tasks. In this paper, we address visualization for image-to-image translation. One contribution of our work is the possibility of modifying a network based on the kernel visualization and achieving superior results.
Title: Visualization Techniques Applied to Image-to-Image Translation. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00015
Hilário Oliveira, R. Lins, Rinaldo Lima, F. Freitas, S. Simske
Multi-document summarization systems aim to generate a brief text containing the most relevant information from a collection of related documents. The fast and continually growing volume of text data has increasingly drawn the attention of users and researchers to such systems. Aspects such as sentence centrality and position have been extensively studied in multi-document summarization as indicators of content relevance. However, very few works have investigated their efficient integration using global-based optimization approaches. This paper proposes a concept-based integer linear programming approach for multi-document summarization of news articles that integrates centrality and position features to filter out the less relevant sentences and measure the importance of concepts (textual fragments) in composing the output summary. The presented approach relies on a centrality-based strategy to perform the sentence clustering process and also to support the sentence ordering step. Benchmarks conducted on four datasets of the Document Understanding Conferences from 2001 to 2004 demonstrate that the proposed approach is competitive with other state-of-the-art methods.
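The abstract does not state the optimization model itself. As a hedged illustration only, a concept-based ILP for summarization is commonly written in the following form. All symbols are assumptions introduced for this sketch: $c_i$ and $s_j$ are binary indicators for concept $i$ and sentence $j$, $w_i$ is the concept weight, $l_j$ the sentence length, $L$ the summary length budget, and $\mathrm{Occ}_{ij}=1$ when concept $i$ occurs in sentence $j$. How centrality and position features enter the weights $w_i$ is specific to the paper and is not reproduced here.

```latex
\begin{align}
\max_{c,\,s}\quad & \sum_i w_i\, c_i \\
\text{s.t.}\quad  & \sum_j l_j\, s_j \le L \\
                  & s_j\, \mathrm{Occ}_{ij} \le c_i \quad \forall i, j \\
                  & \sum_j s_j\, \mathrm{Occ}_{ij} \ge c_i \quad \forall i \\
                  & c_i \in \{0,1\},\quad s_j \in \{0,1\}
\end{align}
```

The two middle constraints tie the variables together: selecting a sentence selects every concept it contains, and a concept can only be counted if at least one sentence containing it is selected.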
Title: A Concept-Based ILP Approach for Multi-document Summarization Exploring Centrality and Position. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00021
R. Berri, F. Osório
In this work, a nonintrusive system has been developed that uses features from inertial sensors, car telemetry, and road lane data to recognize the driving style of a drunk driver. According to NHTSA, drunk drivers caused 10,497 deaths on USA roads in 2016. The Naturalistic Driver Behavior Dataset (NDBD) was created specifically for this work and was used to test the proposed system. The proposed system was designed to study drunk driving situations, but it can also be used to detect the consumption of any other psychoactive drug that causes abnormal driver behavior. The classifier's output is "no risk" (normal driving) or "risk" (drunk/abnormal driving). If the system is connected to an autonomous or semi-autonomous car control system, it can step in and act to avoid dangerous situations, activate an alarm, or ask for external help (e.g., contact the authorities). The best results in the experiments reached 98% accuracy on NDBD frames, and only 1.5% of the frames labeled "no risk" in NDBD were predicted incorrectly. The proposed system consists of an MLP neural classifier with a sigmoidal activation function, 14 neurons in the input layer, 18 neurons in the hidden layer, and 1 neuron in the output layer. It uses periods of 220 frames (22 seconds) for its predictions, and a buffer of the last 3 predictions is used to reduce the number of false "risk" outputs. This helps avoid false positives that would incorrectly trigger the alarms or the semi-autonomous car control system.
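The buffering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract says only that a buffer of the last 3 predictions is used, so the combination rule here (report "risk" only when all buffered predictions agree) and the names `RiskDebouncer` and `smooth` are assumptions.

```python
from collections import deque

WINDOW_FRAMES = 220  # frames per prediction period (22 s, per the abstract)
BUFFER_SIZE = 3      # number of recent window-level predictions kept


class RiskDebouncer:
    """Suppress isolated "risk" predictions.

    Assumed rule: emit "risk" only when the buffer is full and every
    buffered prediction is "risk"; otherwise emit "no risk".
    """

    def __init__(self, size=BUFFER_SIZE):
        self.recent = deque(maxlen=size)

    def update(self, prediction):
        # Oldest prediction is dropped automatically once the deque is full.
        self.recent.append(prediction)
        is_full = len(self.recent) == self.recent.maxlen
        if is_full and all(p == "risk" for p in self.recent):
            return "risk"
        return "no risk"


def smooth(predictions):
    """Apply the debouncer to a sequence of window-level predictions."""
    debouncer = RiskDebouncer()
    return [debouncer.update(p) for p in predictions]
```

With this rule, a single misclassified window never raises an alarm; the cost is a delay of up to three prediction periods before a genuine "risk" state is reported.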
Title: A Nonintrusive System for Detecting Drunk Drivers in Modern Vehicles. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00039
Raul Sena Ferreira, Bruno M. A. da Silva, W. Teixeira, Geraldo Zimbrão, L. Alvim
Machine learning solutions usually assume that the training and test data have the same probability distribution, that is, the data is stationary. However, in streaming scenarios, the data distribution generally changes over time, that is, the data is non-stationary. The main challenge in such online environments is adapting the model to the constant drifts in data distribution. In addition, another important restriction may arise in online scenarios: extreme latency in verifying the labels. It is worth mentioning that the incremental drift assumption is that class distributions overlap at subsequent time steps. Hence, the core region of the data distribution has significant overlap with incoming data. Therefore, selecting samples from these core regions helps to retain the most important instances that represent the new distribution. This selection is called core support extraction (CSE). We present a study of density-based algorithms applied in non-stationary environments. We compared KDE, GMM, and two variations of DBSCAN against single semi-supervised approaches. We validated these approaches on seventeen synthetic datasets and a real one, showing the strengths and weaknesses of these CSE methods through many metrics. We show that a semi-supervised classifier is improved by up to 68% on a real dataset when it is applied along with a density-based CSE algorithm. The results for KDE and GMM as CSE methods were close, but the approach using KDE is more practical because it has fewer parameters.
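To make the idea of density-based core support extraction concrete, here is a minimal sketch using a Gaussian kernel density estimate. It is an illustration under stated assumptions, not the paper's procedure: the function names, the fixed `bandwidth`, and the `keep_fraction` selection rule are all hypothetical choices made for this example.

```python
import math


def gaussian_kde_scores(points, bandwidth=1.0):
    """Return an unnormalized Gaussian KDE score for each point.

    The normalizing constant is omitted because only the ranking of
    the scores matters for selecting core supports.
    """
    scores = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores.append(total / len(points))
    return scores


def extract_core_supports(points, keep_fraction=0.5, bandwidth=1.0):
    """Return sorted indices of the densest `keep_fraction` of the points."""
    scores = gaussian_kde_scores(points, bandwidth)
    n_keep = max(1, round(len(points) * keep_fraction))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In a streaming setting, the retained indices would identify the samples carried forward to the next time step, while low-density points such as outliers are discarded.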
Title: Density-Based Core Support Extraction for Non-stationary Environments with Extreme Verification Latency. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00048
P. Freitas, W. Y. L. Akamine, Mylène C. Q. Farias
In many practical multimedia applications, the visual content is modified during transmission, enhancement, modification, and compression stages. These modifications often create visible distortions that may be perceived by humans. Therefore, the development of algorithms that are able to assess the visual quality as perceived by a human viewer can lead to significant progress in multimedia applications. Many researchers have developed algorithms that estimate visual quality. These algorithms can make use of the full pristine content (full-reference metrics), partial aspects of the pristine content (reduced-reference metrics), or only the assessed content (referenceless or no-reference metrics). Each of these three approaches has advantages and drawbacks. Although referenceless metrics are more challenging to design, they have greater applicability in different scenarios. This paper introduces a novel referenceless image quality assessment (RIQA) metric. The proposed metric uses statistics of the Binarized Statistical Image Features (BSIF) descriptor to analyze the textures of an image. These statistics are mapped into subjective quality scores using a Random Forest Regression approach. Results show that the proposed metric is robust and accurate, outperforming other state-of-the-art RIQA methods.
Title: Towards a Referenceless Visual Quality Assessment Model Using Binarized Statistical Image Features. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00086
J. C. Xavier, A. Freitas, Antonino Feitosa Neto, Teresa B Ludermir
Automated Machine Learning (Auto-ML) is an emerging area of ML which consists of automatically selecting the best ML algorithm and its best hyper-parameter settings for a given input dataset, by doing a search in a large space of candidate algorithms and settings. In this work we propose a new Evolutionary Algorithm (EA) for the Auto-ML task of automatically selecting the best ensemble of classifiers and their hyper-parameter settings for an input dataset. The proposed EA was compared against a version of the well-known Auto-WEKA method adapted to search in the same space of algorithms and hyper-parameter settings as the EA. In general, the EA obtained significantly smaller classification error rates than that Auto-WEKA version in experiments with 15 classification datasets.
Title: A Novel Evolutionary Algorithm for Automated Machine Learning Focusing on Classifier Ensembles. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00065
M. W. Rodrigues, Wladmir Cardoso Brandão, Luis E. Zárate
Scientific collaboration improves researchers' productivity by providing a way to share new ideas, learn new techniques, and find new research applications, increasing the chance of accessing funding. Beyond ethics and reciprocity, other aspects of scientific collaboration, such as research interests and expected productivity gain, are paramount to a successful partnership. However, achieving effective collaborations is hard work and can drain researchers' time. In this work, we propose a recommendation approach that uses different strategies to suggest scientific collaborations to researchers based on their research interests. In particular, our approach exploits ResearchGate, a well-known research social network, from which research interests and researchers' production are used to model the similarity between researchers. Experimental results show that the content-based strategy outperforms neighborhood-based collaborative filtering strategies for recommending scientific collaboration, with gains of up to 16.60% in precision, 37.19% in recall, and 21.16% in F1 for the top-20 recommendation lists.
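For readers unfamiliar with the reported metrics, the following sketch shows how precision, recall, and F1 are typically computed over a top-k recommendation list. The function name and the example data are hypothetical, and the abstract reports relative gains between strategies rather than the raw values this function returns.

```python
def precision_recall_f1_at_k(recommended, relevant, k=20):
    """Compute precision@k, recall@k, and F1@k for one user.

    recommended: ranked list of recommended collaborator ids
    relevant:    set of ids that are true collaborators
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with the ranked list `["a", "b", "c", "d"]`, the relevant set `{"a", "c", "e"}`, and `k=4`, two of the four recommendations are hits, so precision is 2/4 and recall is 2/3.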
Title: Recommending Scientific Collaboration from ResearchGate. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00090
Luis Guilherme Bergamini Mendes, A. Vendramin, Anelise Munaretto, M. Delgado
This paper presents a rule-based system for the Greedy Ant Protocol (GrAnt), named rGrAnt. GrAnt uses the Ant Colony Optimization (ACO) meta-heuristic to route traffic in complex and dynamic Delay Tolerant Networks. rGrAnt has been developed to give the protocol the ability to extract information online from nodes' social connectivity, which can range from disconnected and sparse to highly connected networking environments. With this information, the proposed protocol can use its fuzzy/crisp rules to guide the ACO routing module by deciding when to consider data from heuristic functions and/or pheromone concentration, which data can be incorporated in both heuristic and pheromone parameters, and whether the message forwarding phase must be less or more restrictive. In nodes with low connectivity, the rules of rGrAnt indicate that the protocol must be less restrictive when forwarding messages, in order to make better use of the few available contacts. In contrast, in nodes with high connectivity, it is necessary to restrict forwarding to avoid overloading the same sets of nodes and links. rGrAnt is compared with GrAnt in three different movement models. Results show that, in the three models, rGrAnt achieves a higher delivery ratio than GrAnt.
Title: A Rule-Based Greedy Ant (rGrAnt) Protocol for Networking Environments. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/bracis.2018.00066
Eduardo Pereira Fressato, Arthur Fortes da Costa, Marcelo Garcia Manzato
In recommender systems (RS), one of the most used approaches is collaborative filtering (CF), which recommends items according to the behavior of similar users. Among CF approaches, those based on matrix factorization are generally more effective because they allow the system to discover the underlying characteristics of interactions between users and items. However, this approach suffers from the cold-start problem, which occurs because of the system's inability to recommend new items and/or accurately predict new users' preferences. This paper proposes a novel matrix factorization approach that incorporates the similarity of items, computed from their metadata, in order to improve the rating prediction task in an item cold-start scenario. For this purpose, we explore semantic descriptions of items gathered from knowledge bases available online. Our approach is evaluated on two publicly available datasets and compared against content-based and collaborative algorithms. The experiments show the effectiveness of our approach in the item cold-start scenario.
Title: Similarity-Based Matrix Factorization for Item Cold-Start in Recommender Systems. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).
Pub Date: 2018-10-01; DOI: 10.1109/BRACIS.2018.00078
A. Rivolli, C. Soares, A. Carvalho
In multi-label classification tasks, instances are simultaneously associated with multiple labels, representing different and possibly related concepts from a domain. One characteristic of these tasks is a high class-label imbalance. In order to obtain improved predictive models, several algorithms have either explored the label dependencies or dealt with the problem of imbalanced labels. This work proposes a label expansion approach that combines both alternatives. To this end, some labels are expanded with data from a related class label, making the labels more balanced and representative. Preliminary experiments show the effectiveness of this approach in improving the Binary Relevance strategy. In particular, it reduced the number of labels that were never predicted in the test instances. Although the results are preliminary, they are potentially attractive, considering the scale and consistency of the improvement obtained, as well as the broad scope of the proposed approach.
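One possible reading of the label expansion step is sketched below. The abstract does not say how the related label is chosen or how the expansion is applied, so everything here is an assumption made for illustration: the related label is picked by raw co-occurrence count, and expansion simply treats the related label's positives as additional positives when building the binary target for one Binary Relevance classifier. The function names are hypothetical.

```python
def most_cooccurring(Y, target):
    """Return the index of the label that co-occurs most often with `target`.

    Y is a list of rows, each a list of 0/1 label indicators.
    Returns None if no other label ever co-occurs with `target`.
    """
    best, best_count = None, 0
    for j in range(len(Y[0])):
        if j == target:
            continue
        count = sum(1 for row in Y if row[target] and row[j])
        if count > best_count:
            best, best_count = j, count
    return best


def expand_label(Y, target, related):
    """Build the binary target vector for `target`, adding positives of `related`."""
    return [1 if (row[target] or row[related]) else 0 for row in Y]
```

Under these assumptions, a rare label gains training positives from its most related label, which reduces the imbalance seen by its binary classifier at the cost of some label noise.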
Title: Label Expansion for Multi-label Classification. Venue: 2018 7th Brazilian Conference on Intelligent Systems (BRACIS).