Journal of the Brazilian Computer Society: Latest Articles

Characterizing the hyperspecialists in the context of crowdsourcing software development
Pub Date : 2018-12-01 DOI: 10.1186/s13173-018-0082-2
Anderson Bergamini de Neira, Igor Steinmacher, Igor Scaliante Wiese
Companies around the world use crowdsourcing platforms to complete simple tasks, collect product ideas, and launch advertising campaigns. Recently, crowdsourcing has also been used in software development to run tests, fix small defects, or perform small coding tasks. Platform participants are among the pillars of the crowdsourcing business model, as they are responsible for accomplishing the requested tasks. Since successful crowdsourcing relies heavily on attracting and retaining participants, it is essential to understand how they behave. This exploratory study aims to understand a specific contributor profile: hyperspecialists. We analyzed developers’ participation in challenges in two ways. First, we analyzed the types of challenge that 664 Topcoder platform developers participated in during the first 18 months of their participation. Second, we focused on the profile of users who had more collaborations in the development challenges. After quantitative analysis, we observed that, in general, users who do not stop participating have behavioral traits that indicate hyperspecialization, since most of the challenges they join are of the same type. An interesting, though troubling, finding was the high dropout rate on the platform: 66% of participants discontinued their participation during the study period. The results also showed that hyperspecialization can be observed in terms of the technologies required in the development challenges. We found that 60% of the 2,086 developers analyzed participated in at least 75% of challenges that required the same technology. We found that hyperspecialists and non-specialists differ significantly in behavior and characteristics, including a lower winning rate for hyperspecialists than for non-specialists.
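The 75% threshold mentioned in this abstract can be illustrated with a minimal sketch. It reads the threshold as the share of a developer's challenges that required the same technology, and it assumes one technology label per challenge; both are simplifications made here, not the authors' stated procedure. The function names and the participation records are hypothetical and are not taken from the Topcoder data.

```python
from collections import Counter


def top_technology_share(challenge_technologies):
    """Return (technology, share) for the most frequent technology.

    challenge_technologies holds one technology label per challenge
    a developer took part in (an assumed, simplified data layout).
    """
    if not challenge_technologies:
        return None, 0.0
    counts = Counter(challenge_technologies)
    technology, count = counts.most_common(1)[0]
    return technology, count / len(challenge_technologies)


def is_hyperspecialist(challenge_technologies, threshold=0.75):
    """Flag a developer whose top technology covers at least `threshold` of their challenges."""
    _, share = top_technology_share(challenge_technologies)
    return share >= threshold


# Hypothetical participation records, for illustration only.
developers = {
    "dev_a": ["Java", "Java", "Java", "Python"],  # top share 3/4
    "dev_b": ["Java", "Python", "Go", "Python"],  # top share 2/4
}
flags = {name: is_hyperspecialist(techs) for name, techs in developers.items()}
print(flags)
```

With these made-up records, only dev_a meets the threshold. A real analysis would also need a rule for challenges that require several technologies at once.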
Citations: 1
Who drives company-owned OSS projects: internal or external members?
Pub Date : 2018-12-01 DOI: 10.1186/s13173-018-0079-x
Luis Felipe Dias, Igor Steinmacher, Gustavo Pinto
Citations: 12
An aspect-driven method for enriching product catalogs with user opinions
Pub Date : 2018-11-28 DOI: 10.1186/s13173-018-0080-4
Tiago de Melo, Altigran da Silva, Edleno S. de Moura
In this paper, we propose a method for enriching product catalogs, which traditionally include only objective data provided by manufacturers or retailers, with subjective information extracted from reviews written by customers. Our method was designed to associate opinions taken from reviews with the product attributes they refer to. This is done by matching aspect expressions identified in opinions with attributes of the product, which we model here as aspect classes. To verify the effectiveness of our method, we carried out an extensive experimental evaluation, which revealed that customers frequently mention aspects related to product attributes in their reviews. The attributes often receive more mentions than the product itself. Our method consistently reached an F1 measure of almost 0.7 in the task of associating the opinion with the correct attribute (or with the product as a whole), across four product categories, in two different scenarios. These results significantly improved on those achieved by a representative baseline.
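For readers unfamiliar with the F1 measure reported here, the following minimal sketch shows how it is computed from counts of correct and incorrect opinion-to-attribute associations. The counts are hypothetical and chosen only so that the score lands near 0.7; they are not results from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 70 opinions linked to the right attribute,
# 30 linked to a wrong one, and 30 correct links that were missed.
score = f1_score(true_positives=70, false_positives=30, false_negatives=30)
print(round(score, 2))
```

Because F1 is a harmonic mean, it falls quickly when either precision or recall is low, so a score near 0.7 implies that neither of the two is very poor.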
Citations: 1
Investigations into data published and consumed on the Web: a systematic mapping study
Pub Date : 2018-11-07 DOI: 10.1186/s13173-018-0077-z
Helton Douglas A. dos Santos, Marcelo Iury S. Oliveira, G. D. F. A. B. Lima, Karina Moura da Silva, Rayelle I. Vera Cruz S. Muniz, Bernadette Farias Lóscio
Citations: 0
Integrating Cartographic Knowledge Within a Geoportal: Interactions and Feedback in the User Interface
Pub Date : 2018-11-04 DOI: 10.14714/CP89.1402
Nadia Panchaud, L. Hurni
Custom user maps (also called map mashups) made on geoportals by novice users often lead to poor cartographic results, because cartographic expertise is not part of the mapmaking process. In order to integrate cartographic design functionality within a geoportal, we explored several strategies and design choices. These strategies aimed at integrating explanations about cartographic rules and functions within the mapmaking process. They are defined and implemented based on a review of human-centered design, usability best practices, and previous work on cartographic applications. Cartographic rules and functions were made part of a cartographic wizard, which was evaluated with the help of a usability study. The study results show that the overall user experience with the cartographic functions and the wizard workflow was positive, although implementing functionalities for a diverse target audience proved challenging. Additionally, the results show that offering different ways to access information is welcomed and that explanations pertaining directly to the specific user-generated map are both helpful and preferred. Finally, the results provide guidelines for user interaction design for cartographic functionality on geoportals and other online mapping platforms.
Citations: 5
On the identification of design problems in stinky code: experiences and tool support
Pub Date : 2018-10-22 DOI: 10.1186/s13173-018-0078-y
Willian Oizumi, Leonardo Sousa, Anderson Oliveira, Alessandro Garcia, Anne Benedicte Agbachi, Roberto Oliveira, Carlos Lucena
Background: Developers often have to locate design problems in the source code. Several types of design problems may manifest as code smells in the program. A code smell is a source code structure that may reveal a partial hint about the manifestation of a design problem. Recent studies suggest that developers should ignore smells occurring in isolation in a program location. Instead, they should focus on analyzing stinkier code, i.e., program locations (e.g., a class or a hierarchy) affected by multiple smells. There is evidence that the stinkier a program location is, the more likely it is to contain a design problem. However, there is no empirical evidence on whether developers can effectively identify a design problem in stinkier code. Developers may struggle to analyze inter-related smells affecting the same program location. Besides that, the analysis of stinkier code may require proper tool support because of its complexity. However, little is known about the requirements for a tool that helps developers reveal stinkier program locations. As a result, developers may not be able to identify design problems due to tool issues.
Method: To address this matter, we pursued three goals. First, we proposed Organic, a tool supporting the analysis of stinky code. Second, we applied a mixed-method approach to analyze if and how developers can effectively find design problems when reflecting upon stinky code, i.e., a program location affected by multiple smells. We conducted a study with 11 software professionals. Third, we aimed at understanding whether Organic could be used by developers to identify design problems. To achieve this goal, we used a method from Semiotic Engineering theory. This method enabled us to evaluate which tool issues may hinder the identification of design problems in stinky code.
Result: Our study revealed that only 36.36% of the developers found more design problems when explicitly reasoning about multiple smells than when reasoning about single smells. Moreover, 63.63% of the developers reported far fewer false positives when using the first approach than when using the latter. The second study, in turn, showed that most developers may be unable to identify design problems in stinky code without proper tool support.
Conclusion: Our experiences, in particular the second study, helped us to refine the features of Organic to better support developers in reflecting upon stinkier code. For example, analyzing stinky code scattered across class hierarchies or packages is often difficult and time-consuming, and it requires proper visualization support. Moreover, without effective support, it remains time-consuming to discard stinky program locations that do not represent design problems.
Citations: 21
Combining instance selection and self-training to improve data stream quantification
Pub Date : 2018-10-12 DOI: 10.1186/s13173-018-0076-0
André G. Maletzke, Denis M. dos Reis, Gustavo E. A. P. A. Batista
In recent years, learning from data streams has attracted the attention of researchers and practitioners due to its large number of applications. These applications have motivated the research community to propose a significant number of methods to solve problems in diverse tasks, most prominently classification, clustering, and anomaly detection. However, a relevant task known as quantification has remained mostly unexplored. The goal of quantification is to estimate the class prevalence in an unlabeled set. Recently, we proposed the SQSI algorithm to quantify data streams with concept drifts. SQSI uses a statistical test to identify concept drifts and retrain the classifiers. However, the retraining requires the labels of all newly arrived instances. In this paper, we extend the SQSI algorithm by exploring instance selection techniques combined with semi-supervised learning. The idea is to request the classes of a smaller subset of recent examples. Our experiments demonstrate that SQSI’s extension significantly reduces the dependency on actual labels while maintaining or improving the quantification accuracy.
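To make the quantification task concrete, the sketch below estimates class prevalence in an unlabeled batch using two standard baselines from the quantification literature: classify-and-count and adjusted count. This is not the SQSI algorithm, whose details the abstract does not give. The function names, the 0/1 label encoding, and the example numbers are assumptions for illustration.

```python
def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction predicted positive.

    predictions is a non-empty sequence of 0/1 classifier outputs.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct classify-and-count using the classifier's error rates.

    tpr and fpr are the true and false positive rates, assumed to have
    been estimated beforehand on labeled data.
    """
    raw = classify_and_count(predictions)
    if tpr == fpr:
        # The correction is undefined; fall back to the raw estimate.
        return raw
    estimate = (raw - fpr) / (tpr - fpr)
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, estimate))


# Hypothetical batch: 40 of 100 instances are predicted positive.
predictions = [1] * 40 + [0] * 60
print(classify_and_count(predictions))
print(round(adjusted_count(predictions, tpr=0.8, fpr=0.2), 3))
```

The adjusted estimate differs from the raw count because an imperfect classifier both misses some positives and mislabels some negatives. Under concept drift the error rates themselves change, which is why a drift-aware method needs fresh labels to retrain.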
Citations: 11
Using weaker consistency models with monitoring and recovery for improving performance of key-value stores
Pub Date : 2018-10-01 DOI: 10.1186/s13173-019-0091-9
Duong N. Nguyen, Aleksey Charapko, S. Kulkarni, M. Demirbas
Citations: 5
Update summarization: building from scratch for Portuguese and comparing to English
Pub Date : 2018-09-21 DOI: 10.1186/s13173-018-0075-1
Fernando Antônio Asevedo Nóbrega, Thiago Alexandre Salgueiro Pardo
Citations: 1
Influence of algorithmic abstraction and mathematical knowledge on rates of dropout from Computing degree courses
Pub Date : 2018-08-09 DOI: 10.1186/s13173-018-0074-2
Raphael Magalhães Hoed, Marcelo Ladeira, Leticia Lopes Leite
This paper presents a study of dropout rates in Brazilian degree courses, based on data provided by the National Institute for Educational Studies and Research “Anísio Teixeira” (INEP) and a case study carried out at the University of Brasilia (UnB). Dropout was calculated by tracking the status of each student between 2010 and 2014 in the eight major areas according to the classification of the Organisation for Economic Co-operation and Development (OECD), for the major area of Science, Mathematics, and Computing, and for the area of Computing. Data were analyzed to check for potential evidence of the influence on dropout of factors such as algorithmic abstraction, number of applicants per place, or the gender of students. A survey using online questionnaires was also conducted among students who had dropped out of the Bachelor of Computer Science, Degree in Computing, and Computer Engineering courses between 2005 and 2015. This survey revealed the influence of several factors on dropout, particularly institutional and vocational factors; it is clear that difficulties in algorithmic abstraction and mathematical knowledge influence rates of dropout from computing courses.
Citations: 1