Evaluation of Open-Source E-Voting Systems Using Helios Voting in Public University Elections
Cleovaldo José De Lima E Silva Junior, I. Vanderlei, Jean Carlos Teixeira De Araujo, Rodrigo Rocha
In the ideal model, universities are constituted by democratic decisions involving voting from different collegiate bodies, commissions, sectors, and the general community. Following the trend of digital democracy, electronic voting tools have been increasingly adopted by public universities in recent years, a trend accentuated by the Covid-19 pandemic. A defining characteristic of electronic voting software is that it relies on several layers of security and protocols to protect the integrity of a virtual election. This paper uses the "Attack Tree" and "Risk Assessment" methods to propose a heuristic security-assessment method that could serve as a model for future digital elections in public universities.
{"title":"Evaluation of Open-Source E-Voting Systems Using Helios Voting in Public University Elections","authors":"Cleovaldo José De Lima E Silva Junior, I. Vanderlei, Jean Carlos Teixeira De Araujo, Rodrigo Rocha","doi":"10.1145/3539637.3557061","DOIUrl":"https://doi.org/10.1145/3539637.3557061","url":null,"abstract":"In the ideal model, universities are naturally constituted by democratic decisions involving voting from different collegiate bodies, commissions, sectors, and the general community. Following the trend of digital democracy, electronic voting tools have been significantly applied in Public Universities in recent years, accentuated by the Covid-19 pandemic. One of the characteristics of electronic voting software is that they have several layers of security and protocols that protect the integrity of a virtual election. This paper used the “Attack Tree” and “Risk Assessment” methods to propose and present a heuristic method of security assessment, which could serve as a model for future digital elections applied in Public Universities.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126618044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting Inconsistencies in Public Bids: An Automated and Data-based Approach
Gabriel P. Oliveira, Arthur P. G. Reis, Felipe A. N. Freitas, Lucas L. Costa, Mariana O. Silva, P. Brum, Samuel E. L. Oliveira, Michele A. Brandão, A. Lacerda, G. Pappa
One application of government data is the detection of irregularities that may indicate fraud in the public sector. This paper presents an approach that analyzes public bidding data available on the Web to detect bidder inconsistencies. Specifically, we propose a hierarchical decision approach over public bidding data, in which each bidder is classified as Valid, Doubtful, or Invalid based on the compatibility between the bid items and the divisions of the CNAE codes (National Classification of Economic Activities). The results reveal that combining commonly available data on bidders with the extracted descriptions of bid items can help in fraud detection. Furthermore, the proposed approach reduces the number of bids a specialist must analyze to detect fraud, making it easier to identify inconsistencies.
{"title":"Detecting Inconsistencies in Public Bids: An Automated and Data-based Approach","authors":"Gabriel P. Oliveira, Arthur P. G. Reis, Felipe A. N. Freitas, Lucas L. Costa, Mariana O. Silva, P. Brum, Samuel E. L. Oliveira, Michele A. Brandão, A. Lacerda, G. Pappa","doi":"10.1145/3539637.3558230","DOIUrl":"https://doi.org/10.1145/3539637.3558230","url":null,"abstract":"One application for using government data is the detection of irregularities that may indicate fraud in the public sector. This paper presents an approach that analyzes public bidding data available on the Web to detect bidder inconsistencies. Specifically, we propose a hierarchical decision approach from public bidding data, where each bidder is classified as Valid, Doubtful, or Invalid, based on the compatibility between the bidding items and the divisions of the CNAE codes (National Classification of Economic activities). The results reveal that combining commonly available data on bidders and extracting the description of bid items can help in fraud detection. Furthermore, the proposed approach can reduce the number of bids a specialist must analyze to detect fraud, making it easier to identify inconsistencies.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125843863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video Summarization using Text Subjectivity Classification
L. Moraes, R. Marcacini, R. Goularte
Video summarization has attracted researchers' attention because it provides a compact and informative version of a video, helping users and systems save effort when searching for and understanding content of interest. Current techniques employ different strategies to select which video segments should be included in the final summary. The challenge is to process the multimodal data present in the video looking for relevance clues (such as redundant or complementary information) that help make that decision. A recent strategy is to use subjectivity detection: the presence or absence of subjectivity can be exploited as a relevance clue, helping to bring video summaries closer to the final user's expectations. Despite this potential, however, there is a gap in how to capture subjectivity information from videos. This paper investigates video summarization through subjectivity classification of video transcripts. This approach must deal with challenges that matter in video summarization tasks, such as detecting subjectivity in different languages and across multiple domains. We propose a multilingual machine learning model trained to handle subjectivity classification in multiple domains. An experimental evaluation on different benchmark datasets indicates that our multilingual, multi-domain method achieves competitive results, even compared with language-specific models. Furthermore, such a model can provide subjectivity as a content selection criterion in the video summarization task, filtering out segments that are not relevant to a video domain of interest.
{"title":"Video Summarization using Text Subjectivity Classification","authors":"L. Moraes, R. Marcacini, R. Goularte","doi":"10.1145/3539637.3556998","DOIUrl":"https://doi.org/10.1145/3539637.3556998","url":null,"abstract":"Video summarization has attracted researchers’ attention because it provides a compact and informative video version, supporting users and systems to save efforts in searching and understanding content of interest. Current techniques employ different strategies to select which video segments should be included in the final summary. The challenge is to process multimodal data present in the video looking for relevance clues (like redundant or complementary information) that help make a decision. A recent strategy is to use subjectivity detection. The presence or the absence of subjectivity can be explored as a relevance clue, helping to bring video summaries closer to the final user’s expectations. However, despite this potential, there is a gap on how to capture subjectivity information from videos. This paper investigates video summarization through subjectivity classification from video transcripts. This approach requires dealing with recent challenges that are important in video summarization tasks, such as detecting subjectivity in different languages and across multiple domains. We propose a multilingual machine learning model trained to deal with subjectivity classification in multiple domains. An experimental evaluation with different benchmark datasets indicates that our multilingual and multi-domain method achieves competitive results, even compared to language-specific models. Furthermore, such a model can be used to provide subjectivity as a content selection criterion in the video summarization task, filtering out segments that are not relevant to a video domain of interest.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125062965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using Machine Learning on Testing IoT Applications: a systematic mapping
L. M. Freitas, Valéria Lelli
Internet of Things (IoT) devices are increasingly present in people's daily lives. This has increased research interest in strategies that can ensure IoT applications work as expected with respect to their specific and vital characteristics, for example, security, performance, and interoperability. From a testing point of view, there is a need to define an efficient strategy, from planning to execution. Considering all the steps involved in testing an IoT application, this process, if performed manually, can demand great effort and time. Machine learning (ML) algorithms have been applied in several areas of computing to optimize and automate processes that involve large volumes of data. In this paper, we present a systematic mapping of 40 studies that highlights techniques and approaches using machine learning algorithms for diverse goals within the IoT application testing process, such as using neural networks to predict the time cost of preparing and executing tests, identifying security attacks, and automatically generating test cases from textual language. We also identified that the vast majority of testing techniques focus on a specific IoT characteristic (e.g., security or performance), especially security, and apply machine learning in two ways: directly in the algorithm (so-called predictive maintenance) or during the execution of planned tests; both bring difficulties related to extracting and defining the data needed to train ML algorithms.
{"title":"Using Machine Learning on Testing IoT Applications: a systematic mapping","authors":"L. M. Freitas, Valéria Lelli","doi":"10.1145/3539637.3558049","DOIUrl":"https://doi.org/10.1145/3539637.3558049","url":null,"abstract":"Internet of Things (IoT) devices are increasingly present in people’s daily lives. Thus has increased research interest in investigating strategies that can ensure that these applications work as expected considering specific and vital characteristics of IoT, for example, security, performance and interoperability. In a testing point of view, there is a need to optimize and define an efficient strategy, from its planning to its execution. Considering all the steps that can be taken to test an IoT application, this process, if performed manually, can demand great effort and time. Machine learning (ML) algorithms have been applied in several areas of computing in order to optimize and automate processes that involve large volumes of data. In this paper, we present a systematic mapping resulting in 40 studies that highlights techniques or approaches that use machine learning algorithms for the most diverse goals within the IoT application testing process, such as the use of neural networks for predicting the cost of time in the preparation and execution of tests; identification of security attacks; and automatic generation of test cases from textual language. We also identified that the vast majority of testing techniques are focused on a specific IoT characteristic (e.g., security, performance), specially security, and apply the machine learning algorithm in two ways: directly in the algorithm, called predictive maintenance, or during the execution of planned tests, both of them bring difficulties related to extracting and defining data to train ML algorithms.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121588587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aspect-Based Summarization: An Approach With Different Levels of Details to Explain Recommendations
Luan Soares de Souza, M. Manzato
Recommender systems have become crucial since they first appeared, helping users make decisions. Commonly, recommendation algorithms use historical interaction data between users and items to predict users' tastes and suggest new items. However, offering recommendations alone is sometimes insufficient for the user to make a decision, so explaining recommendations to support the decision-making process has come to be considered an essential property. Explanations of recommendations can be generated from different resources, such as unstructured data (e.g., users' reviews), and presented in many ways, such as summarization. However, offering static explanations may not be helpful in several situations. For example, users familiar with the content may want explanations with fewer details than those who are not acquainted with the domain. In this context, we propose an approach to generate summaries with different levels of detail as post-hoc explanations. We use an aspect-based extractive summarization approach with hierarchical clustering of aspects to select sentences from users' reviews. This hierarchical structure is then used to create explanations of recommended items with different lengths, depending on the user's preferences. Our dynamic explanation system was evaluated against two state-of-the-art baselines, and the results are promising.
{"title":"Aspect-Based Summarization: An Approach With Different Levels of Details to Explain Recommendations","authors":"Luan Soares de Souza, M. Manzato","doi":"10.1145/3539637.3557002","DOIUrl":"https://doi.org/10.1145/3539637.3557002","url":null,"abstract":"Recommender systems have become crucial since they appeared, helping users make decisions. Commonly, recommendation algorithms use the historical interaction data between users and items to predict the users’ tastes and suggest new items. However, offering recommendations sometimes is insufficient for the user to make a decision. In this way, the recommendations’ explanation to support the decision-making process has been considered an essential property. The explanations of recommendations can be generated from different resources, such as unstructured data (e.g., users’ reviews), and presented in many ways, such as summarization. However, offering static explanations may not be helpful in several situations. For example, some users familiar with the content may be willing to receive explanations with fewer details than others who are not acquainted with the domain. In this context, we an approach propose to generate summaries with different levels of detail as post-hoc explanations. We used an aspect-based extractive summarization approach with hierarchical clustering of aspects to select sentences from users’ reviews. Then, this hierarchical structure is used to create explanations of recommended items with different lengths, depending on the user’s preferences. Our dynamic explanation system was evaluated against two state-of-art baselines, and the results are promising.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"38 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121004375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Representation model and cloud-based orchestrator for pervasive storytelling
Pedro H V Almeida, Carlos Pernisa, M. Moreno
Internet of Things (IoT) devices are increasingly accessible and are part of people's daily lives. This opens up great possibilities for innovative storytelling experiences, allowing new forms of consumption that go beyond conventional multimedia. In this context, there is a clear need for advances in the representation and orchestration of pervasive storytelling. This work proposes a conceptual model called Pervasive Storytelling (PST) that aims to represent stories to be told in a pervasive way. By modeling the specific domain of pervasive storytelling at a high level of abstraction, the model supports the variability typical of pervasive environments, such as changes in location, device connectivity, and proximity between users, among others. This work also proposes a cloud presentation engine capable of interpreting and orchestrating storytelling instances represented through PST.
{"title":"Representation model and cloud-based orchestrator for pervasive storytelling","authors":"Pedro H V Almeida, Carlos Pernisa, M. Moreno","doi":"10.1145/3539637.3558047","DOIUrl":"https://doi.org/10.1145/3539637.3558047","url":null,"abstract":"Internet of Things (IoT) devices are increasingly accessible and are part of people’s daily lives. This opens up great possibilities for innovative storytelling experiences, allowing new forms of consumption, going beyond conventional multimedia. In this context, the need for advances in the representation and orchestration of pervasive storytelling is perceptible. This work proposes a conceptual model called Pervasive Storytelling (PST) that aims to represent stories to be told in a pervasive way. By modeling the specific domain of pervasive storytelling at a high level of abstraction, the model supports the variability typical of pervasive environments, such as changes in location, device connectivity, proximity between users, and others. This work also proposes a cloud presentation engine, capable of interpreting and orchestrating storytelling instances represented through PST.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130603208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Cascade Approach for Gender Prediction from Texts in Portuguese Language
João Pedro Moreira de Morais, L. Merschmann
Author profiling is a prominent research area in which computational approaches have been proposed to predict authors' characteristics from their texts. Gender, age, personality traits, and occupation are examples of commonly analyzed characteristics. It is a task of growing importance, with applications in areas such as forensics, marketing, and e-commerce. Although much research has been conducted on this task for some widely used languages (e.g., English), there is still much room for improvement in studies involving the Portuguese language. This work therefore contributes by proposing and evaluating a cascade approach, which combines a weighted lexical approach, a heuristic, and a classifier, for the gender prediction problem using only textual content written in Portuguese. The proposed approach considers both specificities of the Portuguese language and domain characteristics of the texts. The results show that exploring these specificities and domain characteristics can positively contribute to the performance of the gender prediction task.
{"title":"A Cascade Approach for Gender Prediction from Texts in Portuguese Language","authors":"João Pedro Moreira de Morais, L. Merschmann","doi":"10.1145/3539637.3557057","DOIUrl":"https://doi.org/10.1145/3539637.3557057","url":null,"abstract":"Author Profiling is a prominent research area in which computational approaches have been proposed to predict authors’ characteristics from their texts. Gender, age, personality traits, and occupation are examples of commonly analyzed characteristics. It is a task of growing importance, with applications in different areas such as forensics, marketing, and e-commerce. Although a lot of research has been conducted on this task for some widely used languages (e.g., English), there is still a lot of room for improvement in studies involving the Portuguese language. Thus, this work contributes by proposing and evaluating a cascading approach, which combines a weighted lexical approach, a heuristic, and a classifier, for the gender prediction problem using only textual content written in the Portuguese language. The proposed approach considers both specificities of the Portuguese language and domain characteristics of the texts. The results obtained from the proposed approach showed that exploring the specificities of the Portuguese language and domain characteristics of the texts can positively contribute to the performance of the gender prediction task.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129745142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sentiment Analysis on Twitter Repercussion of Police Operations
Marcos Fontes Feitosa, Saul Rocha, G. Gonçalves, C. H. G. Ferreira, Jussara M. Almeida
Violence and a sense of insecurity are among the main problems in urban centres. In Brazil, an average of 20 deaths per month for every 100,000 inhabitants is attributed to violence. Social networks are increasingly used as a means for users to express their opinions or indignation about this problem. In this article, we analyze the sentiment of users in comments shared on Twitter about police operations that had great repercussion in Brazilian news portals. We explore lexicon-based and machine learning models to understand the emotion with which users discuss public safety on social networks and their opinion about the work of government agencies to reduce violence in cities. Our experiments show how challenging this inference is, given peculiar characteristics of the context, such as mostly negative and sarcastic expressions. Nevertheless, our best classifiers achieved accuracy and specificity (macro F1) greater than 60% when identifying sentiment polarity, indicating a promising methodology for automatically inferring public opinion about police operations.
{"title":"Sentiment Analysis on Twitter Repercussion of Police Operations","authors":"Marcos Fontes Feitosa, Saul Rocha, G. Gonçalves, C. H. G. Ferreira, Jussara M. Almeida","doi":"10.1145/3539637.3558050","DOIUrl":"https://doi.org/10.1145/3539637.3558050","url":null,"abstract":"Violence and a sense of insecurity are among the main problems in urban centres. In Brazil, an average rate of 20 deaths per month is estimated for every 100,000 inhabitants due to violence. Virtual social networks are increasingly used as a means for users to express their opinions or indignation about this problem. In this article, we analyze the sentiment of users in comments shared on Twitter about police operations with great repercussions in news portals in Brazil. In this sense, we explore lexicon and machine learning models to understand the emotion in which users discuss public safety on social networks and their opinion about the work of government agencies to reduce violence in cities. Our experiments show how challenging this inference is given peculiar characteristics of the context, such as mostly negative and sarcastic expressions. Nevertheless, our best classifiers achieved accuracy and specificity (macro F1) greater than 60% for identifying sentiments polarity, indicating a promising methodology for automatically inferring public opinion about police operations.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117064885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Auto-ML Approach Applied to Text Classification
Douglas Nunes de Oliveira, L. Merschmann
Automated Machine Learning (AutoML) is a research area that aims to help humans solve Machine Learning (ML) problems by automatically discovering good model pipelines (algorithms and their hyperparameters for every stage of the machine learning process) for a given dataset. Since this is a combinatorial optimization problem for which it is impossible to evaluate all possible pipelines, most AutoML systems use an Evolutionary Algorithm (EA) or Bayesian Optimization (BO) to find a good solution. As these systems usually evaluate pipeline performance using k-fold cross-validation, the chance of finding an overfitted solution increases with the number of pipelines evaluated. To avoid this issue, we propose an AutoML system, named Auto-ML System for Text Classification (ASTeC), that uses the Bootstrap Bias Corrected CV (BBC-CV) to evaluate pipeline performance. More specifically, the proposed system combines EA, BO, and BBC-CV to find a good model pipeline for the text classification task. We evaluate our proposal by comparing it against two state-of-the-art systems: the Tree-based Pipeline Optimization Tool (TPOT) and the Google Cloud AutoML service. To do so, we use seven public datasets composed of written Brazilian Portuguese texts from the sentiment analysis domain. Statistical tests show that our system is equivalent to or better than both on all evaluated datasets.
{"title":"An Auto-ML Approach Applied to Text Classification","authors":"Douglas Nunes de Oliveira, L. Merschmann","doi":"10.1145/3539637.3557054","DOIUrl":"https://doi.org/10.1145/3539637.3557054","url":null,"abstract":"Automated Machine Learning (AutoML) is a research area that aims to help humans solve Machine Learning (ML) problems by automatically discovering good model pipelines (algorithms and their hyperparameters for every stage of a machine learning process) for a given dataset. Since we have a combinatorial optimization problem for which it is impossible to evaluate all possible pipelines, most AutoML systems use Evolutionary Algorithm (EA) or Bayesian Optimization (BO) to find a good solution. As these systems usually evaluate the pipelines’ performance using the k-fold cross-validation method, the chance of finding an overfitted solution increases with the number of pipelines evaluated. Therefore, to avoid the aforementioned issue, we propose an Auto-ML system, named Auto-ML System for Text Classification (ASTeC), that uses the Bootstrap Bias Corrected CV (BBC-CV) to evaluate the pipelines’ performance. More specifically, the proposed system combines EA, BO, and BBC-CV to find a good model pipeline for the text classification task. We evaluate our proposal by comparing it against two state-of-the-art systems, the Tree-based Pipeline Optimization Tool (TPOT) and Google Cloud AutoML service. To do so, we use seven public datasets composed of written Brazilian Portuguese texts from the sentiment analysis domain. Statistical tests show that our system is equivalent to or better than both of them in all evaluated datasets.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131651251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-Language Offloading Service: An Android Service Aimed at Mitigating the Network Consumption During Computation Offloading
Filipe Fernandes S. B. de Matos, Wellington Oliveira, F. C. Filho, P. Rego, Fernando A. M. Trinta
Computation offloading has been proposed as an efficient technique to mitigate the computational and energy restrictions of mobile devices. Previous work has shown that network latency is a challenge for offloading solutions. In recent years, mobile device hardware has improved continuously, while studies have pointed to Java's poor performance compared to other programming languages. This paper proposes a new Android service, called the Multi-Language Offloading Service, that exploits these two trends to reduce network consumption and indirectly mitigate the latency problem in an offloading scenario. The service scans the local network for binaries of server processes and executes them on the mobile device itself, handling the client application's requests locally without depending on the network. We performed tests with real devices and a Java benchmark application that communicates with Rust server processes via the Apache Thrift framework. The results indicate that, when processing tasks that handle large amounts of data, the service reduces network consumption by up to a factor of forty, task response time by 86%, and the mobile device's energy use by 25%.
{"title":"Multi-Language Offloading Service: An Android Service Aimed at Mitigating the Network Consumption During Computation Offloading","authors":"Filipe Fernandes S. B. de Matos, Wellington Oliveira, F. C. Filho, P. Rego, Fernando A. M. Trinta","doi":"10.1145/3539637.3557001","DOIUrl":"https://doi.org/10.1145/3539637.3557001","url":null,"abstract":"Computation offloading has been proposed as an efficient technique to mitigate the computational and energy restrictions associated with mobile devices. Previous work has shown that network latency is a challenge for offloading solutions. In the last years, we have seen continuous improvement in mobile device hardware and studies that have pointed to Java’s poor performance compared to other programming languages. This paper proposes a new Android service, called the Multi-Language Offloading Service, that exploits these two aspects to reduce network consumption and indirectly mitigate the latency problem in an offloading scenario. This service scans the local network searching for binaries of server processes, and executes them on the mobile device itself to handle the requests of the client application locally, without depending on the network. We perform tests with real devices and a Java benchmark application that communicates with Rust server processes via the Apache Thrift framework. The results indicate that, when processing tasks that handle large amounts of data, the service reduces up to forty times the network consumption, 86% the task response time, and 25% the energy use of the mobile device.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124386705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}