Proceedings of the Brazilian Symposium on Multimedia and the Web: Latest Articles

Evaluation of Open-Source E-Voting Systems Using Helios Voting in Public University Elections

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3557061

Cleovaldo José De Lima E Silva Junior, I. Vanderlei, Jean Carlos Teixeira De Araujo, Rodrigo Rocha

In the ideal model, universities are naturally constituted by democratic decisions involving voting by different collegiate bodies, commissions, sectors, and the general community. Following the trend of digital democracy, electronic voting tools have been widely applied in public universities in recent years, a trend accentuated by the Covid-19 pandemic. One characteristic of electronic voting software is that it has several layers of security and protocols that protect the integrity of a virtual election. This paper uses the “Attack Tree” and “Risk Assessment” methods to propose a heuristic method of security assessment, which could serve as a model for future digital elections in public universities.
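The abstract names attack trees and risk assessment but gives no details, so the following is only a generic sketch of how the two are commonly combined, not the authors' method. It assumes independent attack steps, AND/OR gates, and a simple likelihood-times-impact risk score; the node names, probabilities, and impact scale are all hypothetical.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    """One node of an attack tree (hypothetical structure for illustration)."""
    name: str
    gate: str = "LEAF"          # "LEAF", "AND", or "OR"
    probability: float = 0.0    # only used for leaves
    children: list = field(default_factory=list)


def likelihood(node: Node) -> float:
    """Probability that the attack goal at this node succeeds,
    assuming all leaf events are independent."""
    if node.gate == "LEAF":
        return node.probability
    child_probs = [likelihood(child) for child in node.children]
    if node.gate == "AND":      # every sub-step must succeed
        return prod(child_probs)
    if node.gate == "OR":       # at least one sub-step must succeed
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate: {node.gate}")


def risk(node: Node, impact: float) -> float:
    """Classic risk score: likelihood multiplied by impact."""
    return likelihood(node) * impact


# Hypothetical tree; the leaf probabilities are made up for the example.
root = Node("Alter election result", "OR", children=[
    Node("Compromise trustee key", "AND", children=[
        Node("Phish a trustee", probability=0.2),
        Node("Bypass second factor", probability=0.5),
    ]),
    Node("Tamper with ballot on voter device", probability=0.1),
])

print(round(likelihood(root), 4))       # 0.19
print(round(risk(root, impact=5), 4))   # 0.95
```

Scoring each subtree this way gives a ranked list of attack paths, which is one plausible basis for the kind of heuristic assessment the abstract describes.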

Citations: 0

Detecting Inconsistencies in Public Bids: An Automated and Data-based Approach

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3558230

Gabriel P. Oliveira, Arthur P. G. Reis, Felipe A. N. Freitas, Lucas L. Costa, Mariana O. Silva, P. Brum, Samuel E. L. Oliveira, Michele A. Brandão, A. Lacerda, G. Pappa

One application of government data is the detection of irregularities that may indicate fraud in the public sector. This paper presents an approach that analyzes public bidding data available on the Web to detect bidder inconsistencies. Specifically, we propose a hierarchical decision approach over public bidding data, where each bidder is classified as Valid, Doubtful, or Invalid, based on the compatibility between the bidding items and the divisions of the CNAE codes (National Classification of Economic Activities). The results reveal that combining commonly available data on bidders and extracting the description of bid items can help in fraud detection. Furthermore, the proposed approach can reduce the number of bids a specialist must analyze to detect fraud, making it easier to identify inconsistencies.

Citations: 5

Video Summarization using Text Subjectivity Classification

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3556998

L. Moraes, R. Marcacini, R. Goularte

Video summarization has attracted researchers’ attention because it provides a compact and informative version of a video, helping users and systems save effort when searching for and understanding content of interest. Current techniques employ different strategies to select which video segments should be included in the final summary. The challenge is to process the multimodal data present in the video, looking for relevance clues (such as redundant or complementary information) that help make a decision. A recent strategy is to use subjectivity detection. The presence or absence of subjectivity can be explored as a relevance clue, helping to bring video summaries closer to the final user’s expectations. However, despite this potential, there is a gap in how to capture subjectivity information from videos. This paper investigates video summarization through subjectivity classification of video transcripts. This approach requires dealing with recent challenges that are important in video summarization tasks, such as detecting subjectivity in different languages and across multiple domains. We propose a multilingual machine learning model trained to deal with subjectivity classification in multiple domains. An experimental evaluation with different benchmark datasets indicates that our multilingual and multi-domain method achieves competitive results, even compared to language-specific models. Furthermore, such a model can be used to provide subjectivity as a content selection criterion in the video summarization task, filtering out segments that are not relevant to a video domain of interest.

Citations: 3

Using Machine Learning on Testing IoT Applications: a systematic mapping

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3558049

L. M. Freitas, Valéria Lelli

Internet of Things (IoT) devices are increasingly present in people’s daily lives. This has increased research interest in strategies that can ensure that these applications work as expected, considering specific and vital characteristics of IoT such as security, performance, and interoperability. From a testing point of view, there is a need to define and optimize an efficient strategy, from planning to execution. Considering all the steps that can be taken to test an IoT application, this process can demand great effort and time if performed manually. Machine learning (ML) algorithms have been applied in several areas of computing to optimize and automate processes that involve large volumes of data. In this paper, we present a systematic mapping that resulted in 40 studies and highlights techniques or approaches that use machine learning algorithms for diverse goals within the IoT application testing process, such as the use of neural networks to predict the time cost of preparing and executing tests; identification of security attacks; and automatic generation of test cases from textual language. We also identified that the vast majority of testing techniques focus on a specific IoT characteristic (e.g., security, performance), especially security, and apply the machine learning algorithm in two ways: directly in the algorithm, called predictive maintenance, or during the execution of planned tests. Both bring difficulties related to extracting and defining data to train ML algorithms.

Citations: 0

Aspect-Based Summarization: An Approach With Different Levels of Details to Explain Recommendations

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3557002

Luan Soares de Souza, M. Manzato

Recommender systems have become crucial since they appeared, helping users make decisions. Commonly, recommendation algorithms use the historical interaction data between users and items to predict the users’ tastes and suggest new items. However, offering recommendations is sometimes insufficient for the user to make a decision. For this reason, explaining recommendations to support the decision-making process has been considered an essential property. The explanations of recommendations can be generated from different resources, such as unstructured data (e.g., users’ reviews), and presented in many ways, such as summarization. However, offering static explanations may not be helpful in several situations. For example, some users familiar with the content may be willing to receive explanations with fewer details than others who are not acquainted with the domain. In this context, we propose an approach to generate summaries with different levels of detail as post-hoc explanations. We used an aspect-based extractive summarization approach with hierarchical clustering of aspects to select sentences from users’ reviews. Then, this hierarchical structure is used to create explanations of recommended items with different lengths, depending on the user’s preferences. Our dynamic explanation system was evaluated against two state-of-the-art baselines, and the results are promising.

Citations: 2

Representation model and cloud-based orchestrator for pervasive storytelling

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3558047

Pedro H V Almeida, Carlos Pernisa, M. Moreno

Internet of Things (IoT) devices are increasingly accessible and are part of people’s daily lives. This opens up great possibilities for innovative storytelling experiences, allowing new forms of consumption, going beyond conventional multimedia. In this context, the need for advances in the representation and orchestration of pervasive storytelling is perceptible. This work proposes a conceptual model called Pervasive Storytelling (PST) that aims to represent stories to be told in a pervasive way. By modeling the specific domain of pervasive storytelling at a high level of abstraction, the model supports the variability typical of pervasive environments, such as changes in location, device connectivity, proximity between users, and others. This work also proposes a cloud presentation engine, capable of interpreting and orchestrating storytelling instances represented through PST.

Citations: 1

A Cascade Approach for Gender Prediction from Texts in Portuguese Language

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3557057

João Pedro Moreira de Morais, L. Merschmann

Author Profiling is a prominent research area in which computational approaches have been proposed to predict authors’ characteristics from their texts. Gender, age, personality traits, and occupation are examples of commonly analyzed characteristics. It is a task of growing importance, with applications in different areas such as forensics, marketing, and e-commerce. Although a lot of research has been conducted on this task for some widely used languages (e.g., English), there is still a lot of room for improvement in studies involving the Portuguese language. Thus, this work contributes by proposing and evaluating a cascading approach, which combines a weighted lexical approach, a heuristic, and a classifier, for the gender prediction problem using only textual content written in the Portuguese language. The proposed approach considers both specificities of the Portuguese language and domain characteristics of the texts. The results obtained from the proposed approach showed that exploring the specificities of the Portuguese language and domain characteristics of the texts can positively contribute to the performance of the gender prediction task.

Citations: 0

Sentiment Analysis on Twitter Repercussion of Police Operations

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3558050

Marcos Fontes Feitosa, Saul Rocha, G. Gonçalves, C. H. G. Ferreira, Jussara M. Almeida

Violence and a sense of insecurity are among the main problems in urban centres. In Brazil, an average rate of 20 deaths per month is estimated for every 100,000 inhabitants due to violence. Virtual social networks are increasingly used as a means for users to express their opinions or indignation about this problem. In this article, we analyze the sentiment of users in comments shared on Twitter about police operations that had great repercussions in news portals in Brazil. To this end, we explore lexicon and machine learning models to understand the emotion with which users discuss public safety on social networks and their opinion about the work of government agencies to reduce violence in cities. Our experiments show how challenging this inference is given peculiar characteristics of the context, such as mostly negative and sarcastic expressions. Nevertheless, our best classifiers achieved accuracy and specificity (macro F1) greater than 60% for identifying sentiment polarity, indicating a promising methodology for automatically inferring public opinion about police operations.
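Because the abstract reports macro F1 on data described as mostly negative, a small worked example may help show why that metric is informative alongside accuracy. The sketch below computes macro F1 from scratch; the labels and predictions are invented for illustration and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)  # class never predicted correctly
    return sum(f1_scores) / len(f1_scores)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented, skewed sample: a classifier that always answers "neg".
y_true = ["neg", "neg", "neg", "pos"]
y_pred = ["neg", "neg", "neg", "neg"]

print(accuracy(y_true, y_pred))            # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.4286
```

The always-negative classifier looks acceptable by accuracy but scores poorly on macro F1, because the positive class contributes an F1 of zero to the average.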

Citations: 1

An Auto-ML Approach Applied to Text Classification

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3557054

Douglas Nunes de Oliveira, L. Merschmann

Automated Machine Learning (AutoML) is a research area that aims to help humans solve Machine Learning (ML) problems by automatically discovering good model pipelines (algorithms and their hyperparameters for every stage of a machine learning process) for a given dataset. Since this is a combinatorial optimization problem for which it is impossible to evaluate all possible pipelines, most AutoML systems use an Evolutionary Algorithm (EA) or Bayesian Optimization (BO) to find a good solution. As these systems usually evaluate the pipelines’ performance using the k-fold cross-validation method, the chance of finding an overfitted solution increases with the number of pipelines evaluated. Therefore, to avoid this issue, we propose an Auto-ML system, named Auto-ML System for Text Classification (ASTeC), that uses the Bootstrap Bias Corrected CV (BBC-CV) to evaluate the pipelines’ performance. More specifically, the proposed system combines EA, BO, and BBC-CV to find a good model pipeline for the text classification task. We evaluate our proposal by comparing it against two state-of-the-art systems, the Tree-based Pipeline Optimization Tool (TPOT) and the Google Cloud AutoML service. To do so, we use seven public datasets composed of written Brazilian Portuguese texts from the sentiment analysis domain. Statistical tests show that our system is equivalent to or better than both of them in all evaluated datasets.
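The abstract does not describe how ASTeC applies BBC-CV, so the following is only a minimal sketch of the general bootstrap bias correction idea as it is usually presented: resample the pooled out-of-fold predictions, pick the winning configuration on the in-bag rows, and score it on the out-of-bag rows. The function names, the 0/1 correctness matrix, and the random example data are assumptions made for illustration.

```python
import random


def naive_best_accuracy(correct):
    """Best cross-validated accuracy over all configurations.
    Optimistically biased when many configurations are compared."""
    n = len(correct)
    n_configs = len(correct[0])
    return max(sum(row[c] for row in correct) / n for c in range(n_configs))


def bbc_cv_accuracy(correct, n_bootstraps=200, seed=0):
    """Bootstrap bias corrected estimate of the winner's accuracy.

    correct[i][c] is 1 if configuration c classified sample i correctly
    when i was in a held-out fold, else 0 (out-of-fold predictions).
    """
    rng = random.Random(seed)
    n = len(correct)
    n_configs = len(correct[0])
    scores = []
    for _ in range(n_bootstraps):
        in_bag = [rng.randrange(n) for _ in range(n)]
        in_bag_set = set(in_bag)
        out_of_bag = [i for i in range(n) if i not in in_bag_set]
        if not out_of_bag:
            continue  # rare: every sample was drawn, nothing left to test on
        # Select the configuration that looks best on the in-bag rows...
        best = max(range(n_configs),
                   key=lambda c: sum(correct[i][c] for i in in_bag))
        # ...and score that choice on rows it was not selected on.
        hits = sum(correct[i][best] for i in out_of_bag)
        scores.append(hits / len(out_of_bag))
    return sum(scores) / len(scores)


# Made-up example: 50 configurations that all guess at random.
data_rng = random.Random(42)
matrix = [[data_rng.randint(0, 1) for _ in range(50)] for _ in range(100)]
print("naive best:", naive_best_accuracy(matrix))
print("BBC-CV    :", bbc_cv_accuracy(matrix))
```

With purely random configurations the naive figure tends to sit above chance simply because the best of many noisy scores is reported, which is the selection effect the corrected estimate is meant to counter.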

Citations: 1

Multi-Language Offloading Service: An Android Service Aimed at Mitigating the Network Consumption During Computation Offloading

Proceedings of the Brazilian Symposium on Multimedia and the Web

Pub Date : 2022-11-07 DOI: 10.1145/3539637.3557001

Filipe Fernandes S. B. de Matos, Wellington Oliveira, F. C. Filho, P. Rego, Fernando A. M. Trinta

Computation offloading has been proposed as an efficient technique to mitigate the computational and energy restrictions associated with mobile devices. Previous work has shown that network latency is a challenge for offloading solutions. In recent years, we have seen continuous improvement in mobile device hardware, as well as studies that point to Java’s poor performance compared to other programming languages. This paper proposes a new Android service, called the Multi-Language Offloading Service, that exploits these two aspects to reduce network consumption and indirectly mitigate the latency problem in an offloading scenario. This service scans the local network for binaries of server processes and executes them on the mobile device itself to handle the requests of the client application locally, without depending on the network. We perform tests with real devices and a Java benchmark application that communicates with Rust server processes via the Apache Thrift framework. The results indicate that, when processing tasks that handle large amounts of data, the service reduces network consumption by up to forty times, task response time by up to 86%, and the mobile device’s energy use by up to 25%.

Citations: 3
