Inspiration from nature has been widely explored, from the macro to the micro scale. When looking into chemical phenomena, stability and organization are two properties that emerge. Recently, artificial hydrocarbon networks (AHN), a supervised learning method inspired by the inner structures and mechanisms of chemical compounds, have been proposed as a data-driven approach in artificial intelligence. AHN have been successfully applied to regression and classification models, control systems, signal processing, and robotics. In these applications, molecules (the basic units of information in AHN) play an important role in the stability, organization, and interpretability of the method. As with any other machine learning model, AHN have had to address interpretability, computational cost, and predictability. This short paper highlights the challenges, issues, and trends of artificial hydrocarbon networks as a data-driven method. It describes the main insights of AHN and the efforts to tackle interpretability and training acceleration, and it also discusses potential applications and future trends of AHN.
{"title":"Challenges and Issues on Artificial Hydrocarbon Networks: The Chemical Nature of Data-Driven Approaches","authors":"Hiram Ponce","doi":"10.52591/lxai201906157","DOIUrl":"https://doi.org/10.52591/lxai201906157","url":null,"abstract":"Inspiration in nature has been widely explored, from macro to micro-scale. When looking into chemical phenomena, stability and organization are two properties that emerge. Recently, artificial hydrocarbon networks (AHN), a supervised learning method inspired in the inner structures and mechanisms of chemical compounds, have been proposed as a data-driven approach in artificial intelligence. AHN have been successfully applied in data-driven approaches, such as: regression and classification models, control systems, signal processing, and robotics. To do so, molecules –the basic units of information in AHN– play an important role in the stability, organization and interpretability of this method. Interpretability, saving computing resources, and predictability have been handled by AHN, as any other machine learning model. This short paper aims to highlight the challenges, issues and trends of artificial hydrocarbon networks as a data-driven method. Throughout this document, it presents a description of the main insights of AHN and the efforts to tackle interpretability and training acceleration. Potential applications and future trends on AHN are also discussed.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134026308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SafePredict is a novel meta-algorithm that works with any base prediction algorithm for online data to guarantee an arbitrarily chosen correctness rate, 1−ϵ, by allowing refusals. Allowing refusals means that the meta-algorithm may refuse to emit a prediction produced by the base algorithm on occasion, so that the error rate on non-refused predictions does not exceed ϵ. The SafePredict error bound does not rely on any assumptions about the data distribution or the base predictor. When the base predictor happens not to exceed the target error rate ϵ, SafePredict refuses only a finite number of times. When the error rate of the base predictor changes through time, SafePredict uses a weight-shifting heuristic that adapts to these changes without knowing when they occur, yet still maintains the correctness guarantee. Empirical results show that (i) SafePredict compares favorably with state-of-the-art confidence-based refusal mechanisms, which fail to offer robust error guarantees; and (ii) combining SafePredict with such refusal mechanisms can in many cases further reduce the number of refusals. Our software (currently in Python) is included in the supplementary material.
{"title":"SafePredict: A Machine Learning Meta-Algorithm That Uses Refusals to Guarantee Correctness","authors":"David Ramirez","doi":"10.52591/lxai2019061513","DOIUrl":"https://doi.org/10.52591/lxai2019061513","url":null,"abstract":"SafePredict is a novel meta-algorithm that works with any base prediction algorithm for online data to guarantee an arbitrarily chosen correctness rate, 1−ϵ, by allowing refusals. Allowing refusals means that the meta-algorithm may refuse to emit a prediction produced by the base algorithm on occasion so that the error rate on non-refused predictions does not exceed ϵ. The SafePredict error bound does not rely on any assumptions on the data distribution or the base predictor. When the base predictor happens not to exceed the target error rate ϵ, SafePredict refuses only a finite number of times. When the error rate of the base predictor changes through time SafePredict makes use of a weight-shifting heuristic that adapts to these changes without knowing when the changes occur yet still maintains the correctness guarantee. Empirical results show that (i) SafePredict compares favorably with state-of-the art confidence based refusal mechanisms which fail to offer robust error guarantees; and (ii) combining SafePredict with such refusal mechanisms can in many cases further reduce the number of refusals. Our software (currently in Python) is included in the supplementary material.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133276342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Process mining is an emerging research area that combines data mining and machine learning, on the one hand, with business process modeling and analysis, on the other. Process mining aims at discovering, monitoring, and improving business processes by extracting real knowledge from the event logs produced by the information systems used by organizations. This work aims to assess the application of computational intelligence and machine learning techniques in the process mining context. The main focus of the study was to determine why computational intelligence and machine learning techniques are not widely used in the process mining field and to identify the main reasons for this.
{"title":"A study of the application of computational intelligence and machine learning techniques in business process mining - A brief","authors":"A. Cardenas","doi":"10.52591/lxai201906158","DOIUrl":"https://doi.org/10.52591/lxai201906158","url":null,"abstract":"Mining process is a emerging research area that combines data mining and machine learning, on one hand, and business process modeling and analysis, on the other hand. Mining process aims at discovering, monitoring and improving business processes by extracting real knowledge from event logs produced by the information systems used by organizations. This work aims to assess the application of computational intelligence and machine learning techniques in process mining context. The main focus of the study was to identify why the computational intelligence and machine learning techniques are not being widely used in process mining field and identify the main reasons for this phenomenon.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121201027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We studied the utility of machine learning algorithms for estimating feature importance and for visualizing their dependence on ethicality. Through our analysis and partial dependence plots, we found linear relationships among variables and gained insight into features that might cause certain types of ethical behaviour.
{"title":"ML-Based Feature Importance Estimation for Predicting Unethical Behaviour under Pressure","authors":"Pablo Rivas, P. Harper, John Cary, William Brown","doi":"10.52591/lxai201906155","DOIUrl":"https://doi.org/10.52591/lxai201906155","url":null,"abstract":"We studied the utility of using machine learning algorithms in the estimation of feature importance and to visualize their dependence on Ethicality. Through our analysis and partial dependence plot we found linear relationships among variables and gained insight into features that might cause certain types of ethical behaviour.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116565824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The lack of semantic information is a major challenge, even in context-driven areas such as healthcare, which are characterized by well-established terminologies. Here, semantic data integration is the solution for providing precise information and answering questions such as: What is the care pathway of newborns diagnosed with a congenital anomaly as a consequence of congenital syphilis in the city of Sao Paulo? This project will use a semantic data integration technique, ontology-based data integration, to integrate three health databases from the city of Sao Paulo, Brazil: mortality, live births, and the hospital information system. It is expected that the integration of public health databases will help to map patient care pathways, predict public resource needs, and minimize unnecessary spending.
{"title":"Semantic Data Integration for Public Health in Brazil","authors":"Debora Lina Ciriaco, Alexandre Pessoa, L. Salvador, Renata Wassermann","doi":"10.52591/lxai2019061514","DOIUrl":"https://doi.org/10.52591/lxai2019061514","url":null,"abstract":"The lack of semantic information is a big challenge, even in context-driven areas like Healthcare, characterized by established terminologies. Here, semantic data integration is the solution to provide precise information and answers to questions like: What is the care pathway of newborns diagnosed with a congenital anomaly in consequence of congenital syphilis in the city of Sao Paulo? This project will use a semantic data integration technique, ontology based data integration, to integrate three health databases from the city of Sao Paulo - Brazil: mortality, live births and hospital information system. It is expected that the integration of public health databases will help to map patient care pathways, predict public resource needs and minimize unnecessary spending.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"313 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133547366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose to develop a Deep Learning (DL) framework based on the paradigm of Genetic Programming (GP). The hypothesis is that GP's non-parametric and non-differentiable learning units (abstract syntax trees) have the same learning and representation capacity as Artificial Neural Networks (ANN). In analogy to the traditional ANN/Gradient Descent/Backpropagation DL approach, the proposed framework aims at building a DL-like model fully based on GP. Preliminary results across a number of application domains suggest that GP is able to deal with large amounts of training data, such as those required in DL tasks. However, extensive research is still required regarding the construction of a multi-layered learning architecture, another hallmark of DL.
{"title":"Deep Genetic Programming","authors":"Lino Rodríguez","doi":"10.52591/lxai2019061512","DOIUrl":"https://doi.org/10.52591/lxai2019061512","url":null,"abstract":"We propose to develop a Deep Learning (DL) framework based on the paradigm of Genetic Programming (GP). The hypothesis is that GP non-parametric and non-differentiable learning units (abstract syntax trees) have the same learning and representation capacity to Artificial Neural Networks (ANN). In an analogy to the traditional ANN/Gradient Descend/Backpropagation DL approach, the proposed framework aims at building a DL alike model fully based on GP. Preliminary results when approaching a number of application domains, suggest that GP is able to deal with large amounts of training data, such as those required in DL tasks. However, extensive research is still required regarding the construction of a multi-layered learning architecture, another hallmark of DL.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127757592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graffiti is a common phenomenon in urban scenarios. Unlike urban art, graffiti tagging is an act of vandalism, and many local governments are putting great effort into combating it. The graffiti map of a region can be a very useful resource because it may allow one to combat vandalism in locations with high levels of graffiti and also to clean up saturated regions to discourage future acts. There is currently no automatic way of obtaining the graffiti map of a region; it is obtained through manual inspection by the police or through popular participation. In this sense, we describe ongoing work in which we propose an automatic way of obtaining the graffiti map of a neighbourhood. It consists of the systematic collection of street-view images, followed by the identification of graffiti tags in the collected dataset and, finally, the calculation of the proposed graffiti level of that location. We validate the proposed method by evaluating the geographical distribution of graffiti in a city known to have a high concentration of graffiti: São Paulo, Brazil.
{"title":"Usage of street-level imagery for city-wide graffiti mapping","authors":"Eric K. Tokuda, Cláudio T. Silva, R. Cesar-Jr","doi":"10.52591/lxai2019061510","DOIUrl":"https://doi.org/10.52591/lxai2019061510","url":null,"abstract":"Graffiti is a common phenomenon in urban scenarios. Differently from urban art, graffiti tagging is a vandalism act and many local governments are putting great effort to combat it. The graffiti map of a region can be a very useful resource because it may allow one to potentially combat vandalism in locations with high level of graffiti and also to cleanup saturated regions to discourage future acts. There is currently no automatic way of obtaining a graffiti map of a region and it is obtained by manual inspection by the police or by popular participation. In this sense, we describe an ongoing work where we propose an automatic way of obtaining a graffiti map of a neighbourhood. It consists of the systematic collection of street view images followed by the identification of graffiti tags in the collected dataset and finally, in the calculation of the proposed graffiti level of that location. We validate the proposed method by evaluating the geographical distribution of graffiti in a city known to have high concentration of graffiti - São Paulo, Brazil.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123454492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classification of complex fractionated atrial electrograms (CFAE) is crucial for the study of atrial fibrillation and for the development of treatment strategies, because these electrophysiological phenomena represent a common target for radiofrequency ablation. Since the description of CFAEs in 2004, the scientific community has focused its efforts on their characterization and automatic classification, considering the degree of fractionation, a clinical scale used in ablation procedures. Endocardial sites associated with CFAEs are usual targets in ablation therapy, as it is thought that they play a role in the maintenance of the arrhythmia.
{"title":"Classification of atrial electrograms in atrial fibrillation using Information Theory-based measures","authors":"J. Nicolet, Juan F. Restrepo, G. Schlotthauer","doi":"10.52591/lxai201906151","DOIUrl":"https://doi.org/10.52591/lxai201906151","url":null,"abstract":"Classification of complex fractionated atrial electrograms (CFAE) is crucial for the study of atrial fibrillation and for the development of treatment strategies, because these electrophysiological phenomena represent a common target for radiofrequency ablation. Since the description of CFAEs in 2004, the scientific community have been focused their efforts into its characterization and automatic classification considering the degree of fractionation, a clinical scale used in ablation procedures. Endocardial sites associated to CFAEs are usual targets in ablation therapy, as is though that they play a role in maintenance of the arrhythmia.","PeriodicalId":402227,"journal":{"name":"LatinX in AI at International Conference on Machine Learning 2019","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127621067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}