
International Journal of Open Source Software and Processes: Latest Articles

Understanding User Engagement With Multi-Representational License Comprehension Interfaces
Q4 Computer Science Pub Date : 2020-01-01 DOI: 10.4018/IJOSSP.2020100102
Mahugnon Olivier Avande, R. Gandhi, Harvey P. Siy
License information for any non-trivial open-source software demonstrates the growing complexity of compliance management. Studies have shown that understanding open-source licenses is difficult. Prior research has not examined how developers would use interfaces that display license text alongside its graphical models when studying a license. Consequently, a repeatable eye-tracking-based methodology was developed to study user engagement when exploring open-source rights and obligations in a multi-modal fashion. The experiences of 10 participants in an exploratory case study indicate that eye tracking is a feasible way to observe, both quantitatively and qualitatively, distinct interaction patterns in the use of license comprehension interfaces. A low correlation was observed between self-reported usability survey data and eye-tracking data. Conversely, a high correlation between eye-tracker and mouse data suggests that either could be used in future studies. This paper provides a framework for conducting such studies as an alternative to surveys and offers hypotheses for future studies.
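The abstract compares data sources by correlation but does not say which coefficient was used. As a minimal sketch, assuming a plain Pearson coefficient and purely hypothetical per-region dwell times (the numbers below are illustrative, not data from the study), the comparison between eye-tracker and mouse measurements could look like this:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical seconds spent per interface region; NOT data from the paper.
gaze_dwell = [12.0, 3.5, 8.0, 1.0, 6.5]
mouse_hover = [10.5, 4.0, 7.0, 1.5, 6.0]
r = pearson(gaze_dwell, mouse_hover)
print(f"gaze vs. mouse correlation: {r:.3f}")
```

A value of `r` close to 1 for a pair of measurements is the kind of evidence the abstract describes as suggesting that either data source could be used.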
Pages: 27-45
Citations: 0
Adaptive Spider Bird Swarm Algorithm-Based Deep Recurrent Neural Network for Malicious JavaScript Detection Using Box-Cox Transformation
Q4 Computer Science Pub Date : 2020-01-01 DOI: 10.4018/IJOSSP.2020100103
Scaria Alex, T. Rajkumar
JavaScript is a scripting language commonly used in web pages to provide dynamic functionality and enhance the user experience. Malicious JavaScript in web pages is an important security issue because of its widespread and potentially severe impact. Detecting malicious JavaScript is usually a difficult and time-consuming task. Hence, an adaptive spider bird swarm algorithm-based deep recurrent neural network (adaptive SBSA-based deep RNN) is proposed for detecting malicious JavaScript code in web applications. The proposed adaptive SBSA is designed by integrating the adaptive concept with the bird swarm algorithm (BSA) and spider monkey optimization (SMO). With the deep RNN classifier, the complexity issues that arise in detecting malicious code are effectively resolved through hierarchical computation. Owing to its efficiency, the proposed approach can be evaluated on large real-life datasets.
Pages: 46-59
Citations: 0
Automatically Labelled Software Topic Model
Q4 Computer Science Pub Date : 2020-01-01 DOI: 10.4018/ijossp.2020010104
Youcef Bouziane, M. Abdi, Salah Sadou
Public software repositories (SR) maintain a massive amount of valuable data that offers opportunities to support software engineering (SE) tasks. Researchers have applied information retrieval techniques to mining software repositories, and topic models are one of these techniques. However, a topic model provides neither an interpretation nor labels for the extracted topics, so manual analysis is required to identify them. Some approaches have been proposed to label topics automatically using tags in SR, but they do not consider the existence of spam tags and have difficulty scaling to a large tag space. This article introduces a novel approach called the automatically labelled software topic model (AL-STM), which labels topics based on the tags observed in SR. It mitigates the shortcomings of manual and automatic labelling of topics in SE. AL-STM is implemented using 22K GitHub projects and evaluated on an SE task (tag recommendation) against currently used techniques. The empirical results suggest that AL-STM is more robust in terms of MAP and nDCG, and more scalable to a large tag space.
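The abstract reports results in terms of MAP and nDCG without defining them. The sketch below shows one common binary-relevance formulation of both metrics for a single project's tag recommendations; the tag names are hypothetical, and the exact variants used in the paper are not stated, so treat this as an illustration only. MAP is the mean of `average_precision` over all evaluated projects.

```python
from math import log2


def dcg(gains):
    """Discounted cumulative gain of a ranked list of gains."""
    return sum(gain / log2(rank + 2) for rank, gain in enumerate(gains))


def ndcg(recommended, relevant, k):
    """nDCG@k with binary relevance: 1 if a recommended tag is relevant."""
    gains = [1.0 if tag in relevant else 0.0 for tag in recommended[:k]]
    ideal = dcg([1.0] * min(len(relevant), k))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def average_precision(recommended, relevant):
    """Average precision of one ranked list (assumes no duplicate tags)."""
    hits, total = 0, 0.0
    for rank, tag in enumerate(recommended, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical recommendation for one repository.
recommended = ["python", "web", "cli", "testing"]
relevant = {"python", "testing"}
ap = average_precision(recommended, relevant)
score = ndcg(recommended, relevant, k=4)
print(f"AP={ap:.3f} nDCG@4={score:.3f}")
```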
Pages: 57-78
Citations: 2
Ripple Effect Identification in Software Applications
Q4 Computer Science Pub Date : 2020-01-01 DOI: 10.4018/ijossp.2020010103
Anushree Agrawal, R. K. Singh
Changes are made frequently in software to incorporate new requirements. The changes made to one class are not limited to that class; they also affect other entities. Early identification of these change-prone entities is essential for minimizing future faults in software applications. It is therefore important to develop quality models that identify the ripple effect of changed classes, so that limited resources are used effectively during the software development lifecycle. Association rule mining is a popular approach suggested in the literature, but a major limitation is its inability to generate recommendations for newly added classes. This article suggests developing prediction models using learning techniques to overcome this limitation. The authors evaluate the performance of thirteen statistical, machine learning (ML), and search-based techniques (SBT) using eight open-source software applications. The findings are promising and support the application of SBT and ML techniques for ripple effect identification.
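To make the stated limitation of association rule mining concrete, here is a minimal sketch that derives co-change rules from a hypothetical commit history (the class names and the `min_confidence` threshold are assumptions for illustration, not taken from the article). A class with no change history has no rules, so nothing can be recommended for it.

```python
from collections import Counter
from itertools import permutations


def co_change_rules(commits, min_confidence=0.5):
    """Mine rules 'A changed => B changed' with confidence = P(B | A)."""
    changed = Counter()   # number of commits touching each class
    together = Counter()  # number of commits touching an ordered pair
    for commit in commits:
        classes = set(commit)
        changed.update(classes)
        together.update(permutations(sorted(classes), 2))
    rules = {}
    for (a, b), count in together.items():
        confidence = count / changed[a]
        if confidence >= min_confidence:
            rules.setdefault(a, {})[b] = confidence
    return rules


# Hypothetical history: each set lists the classes changed in one commit.
commits = [
    {"Order", "Invoice"},
    {"Order", "Invoice", "Cart"},
    {"Order"},
    {"Cart"},
]
rules = co_change_rules(commits)
print(rules.get("Invoice", {}))  # classes likely affected when Invoice changes
print(rules.get("Payment", {}))  # newly added class: no history, no rules
```

A learned prediction model, by contrast, can score a new class from its static metrics, which is the gap the article aims to fill.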
Pages: 41-56
Citations: 6
A Study on Class Imbalancing Feature Selection and Ensembles on Software Reliability Prediction
Q4 Computer Science Pub Date : 2019-10-01 DOI: 10.4018/ijossp.2019100102
Jhansi Lakshmi Potharlanka, Maruthi Padmaja Turumella, R. Pote
Software quality can be improved by early software defect prediction models. However, class imbalance due to the underrepresentation of defects, and the irrelevant metrics used to predict them, are two major challenges that hinder model performance. This article presents a new two-stage framework that combines an Ensemble of Hybrid Feature selection (EHF) with Weighted Support Vector Machine Boosting (WSVMBoost) to further enhance model performance. The EHF is an ensemble feature ranking built from feature selection models, such as filters and embedded models, to select the relevant metrics. The classification ensembles (Random Forest, RUSBoost, and WSVMBoost) and the base learners (Decision Tree and SVM) are also explored using five software reliability datasets. In the statistical tests, EHF with WSVMBoost attained the best mean performance rank among the feature selection hybrids in predicting software defects. Additionally, this study shows that McCabe and Halstead method-level metrics are equally important in improving model performance.
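The abstract describes EHF as an ensemble ranking over several feature selection models but does not state how the individual rankings are combined. The sketch below assumes simple mean-rank aggregation, which is only one plausible choice, and uses hypothetical metric names and rankings.

```python
def aggregate_rankings(rankings):
    """Combine best-first feature rankings by mean rank position.

    Mean-rank aggregation is an assumption made for illustration;
    ties are broken alphabetically so the result is deterministic.
    """
    features = set(rankings[0])
    if any(set(ranking) != features for ranking in rankings):
        raise ValueError("all rankings must order the same features")
    mean_rank = {
        f: sum(ranking.index(f) for ranking in rankings) / len(rankings)
        for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical best-first rankings from three selectors over four metrics.
rankings = [
    ["loc", "v(g)", "n", "d"],  # e.g. a filter method
    ["v(g)", "loc", "d", "n"],  # e.g. an embedded method
    ["loc", "n", "v(g)", "d"],  # e.g. a second filter method
]
consensus = aggregate_rankings(rankings)
print(consensus)
```

The top entries of `consensus` would then be the metrics passed on to the second-stage classifier.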
Pages: 20-43
Citations: 0
Software Fault Prediction Using Deep Learning Algorithms
Q4 Computer Science Pub Date : 2019-10-01 DOI: 10.4018/ijossp.2019100101
Osama Al Qasem, Mohammed Akour
Software fault prediction (SFP) can be used to detect faulty constructs at early stages of the development lifecycle, as well as in several later phases of the development process. Machine learning (ML) is widely used in this area. One of the most promising subfields of ML is deep learning, which achieves remarkable performance in various areas. Two deep learning algorithms are used in this paper: multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs). To evaluate the studied algorithms, four commonly used NASA datasets are used: PC1, KC1, KC2, and CM1. The experimental results show that the CNN outperforms the MLP. With the CNN, accuracy and detection rate were, respectively: PC1 97.7% and 73.9%; KC1 100% and 100%; KC2 99.3% and 99.2%; and CM1 97.3% and 82.3%. This study provides promising results for the use of deep learning in software fault prediction research.
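The abstract reports accuracy and detection rate without defining them. A minimal sketch, assuming detection rate means recall on the faulty class (the share of truly faulty modules that are flagged), is shown below; the labels are hypothetical and unrelated to the NASA datasets.

```python
def accuracy_and_detection_rate(y_true, y_pred):
    """Return (accuracy, detection rate) for binary labels, 1 = faulty.

    Detection rate is computed as TP / (TP + FN), which is an assumed
    reading of the term; the abstract does not define it.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, detection_rate


# Hypothetical module labels: 3 faulty modules out of 10.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 1]
acc, det = accuracy_and_detection_rate(y_true, y_pred)
print(f"accuracy={acc:.2f} detection rate={det:.2f}")
```

On imbalanced fault data the two numbers can diverge sharply, which is consistent with the gap the abstract reports for PC1.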
Pages: 1-19
Citations: 17
A Novel Approach to Optimize the Performance of Hadoop Frameworks for Sentiment Analysis
Q4 Computer Science Pub Date : 2019-10-01 DOI: 10.4018/ijossp.2019100103
G. Srinivasa, Amith K. Jain, Prithviraj Jain, R. NageshH.
Twitter is one of the most popular microblogging services, with millions of active users. Users most often express their views, opinions, thoughts, emotions, or feelings about a particular topic, product, or service of interest or concern to them. This makes Twitter a hub for an enormous amount of data arriving from various sources, and a useful platform for understanding the sentiment behind a particular product or anything else expressed in tweets. This data is not merely redundant; it contains useful information, which makes sentiment analysis of Twitter data highly important. Sentiment analysis is a good approach for classifying the opinions expressed by individuals (tweeters) into sentiments such as positive, negative, or neutral. Implementing sentiment analysis algorithms with conventional tools leads to high computation time and is therefore less effective. Hence, state-of-the-art tools and techniques are needed to make sentiment analysis faster. The Apache Hadoop framework is one such option: it supports distributed data computing and has been commonly adopted for a variety of use cases. In this article, the authors identify factors affecting the performance of sentiment analysis algorithms based on the Hadoop framework and propose an approach for optimizing that performance. The experimental results demonstrate the potential of the proposed approach.
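The abstract does not describe the classifier or the optimization itself. The sketch below only illustrates why the task suits a map/reduce framework: each tweet is labelled independently (map) and the per-label counts are merged (reduce). It is plain Python rather than Hadoop code, and the tiny word lists and tweets are hypothetical.

```python
from collections import Counter
from functools import reduce

# Toy lexicon, assumed for illustration only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def map_tweet(text):
    """Map step: label one tweet and emit a single-entry count."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return Counter({label: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial label counts."""
    return left + right


tweets = [
    "I love this phone",
    "bad battery and poor screen",
    "it arrived today",
]
totals = reduce(reduce_counts, map(map_tweet, tweets), Counter())
print(dict(totals))
```

Because the map step needs no shared state, the tweets can be split across workers, which is the property a distributed framework exploits.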
Pages: 44-59
Citations: 5
Optimization Driven Constraints Handling in Combinatorial Interaction Testing
Q4 Computer Science Pub Date : 2019-07-01 DOI: 10.4018/ijossp.2019070102
P. Ramgouda, V. Chandraprakash
The combinatorial strategy reduces a system's input parameter space to a compact set of parameter combinations. It can be used to test the behaviour that occurs when events are executed in an appropriate order. In highly configurable software systems, the input configurations are subject to constraints, which makes constructing such test sets difficult. The proposed Jaya-Bat optimization algorithm generates combinatorial interaction test cases effectively in the presence of constraints. It integrates the Jaya optimization algorithm (JOA) and the Bat optimization algorithm (BA). Experiments measure average size and average time to demonstrate the effectiveness of the proposed algorithm. The results indicate that the proposed algorithm selects test cases optimally, with better performance.
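To show what constraint handling means in combinatorial interaction testing, the sketch below builds a pairwise test suite with a simple greedy search. This is a baseline illustration, not the Jaya-Bat algorithm from the article; the parameters and the constraint are hypothetical.

```python
from itertools import combinations, product


def pairs_of(config):
    """All ((index, value), (index, value)) pairs covered by one test."""
    return set(combinations(enumerate(config), 2))


def greedy_pairwise(parameters, is_valid):
    """Greedily pick valid configurations until every feasible pair is covered."""
    valid = [c for c in product(*parameters) if is_valid(c)]
    # Only pairs that occur in some valid configuration must be covered.
    uncovered = set().union(*(pairs_of(c) for c in valid))
    suite = []
    while uncovered:
        best = max(valid, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


def safari_needs_macos(config):
    """Hypothetical constraint: Safari is only tested on macOS."""
    browser, os_name, _network = config
    return browser != "safari" or os_name == "macos"


parameters = [
    ("chrome", "firefox", "safari"),
    ("linux", "macos", "windows"),
    ("ipv4", "ipv6"),
]
suite = greedy_pairwise(parameters, safari_needs_macos)
for test in suite:
    print(test)
```

The suite is smaller than the set of all valid configurations while still covering every feasible pair; metaheuristics such as the one proposed in the article aim to shrink it further or find it faster.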
Pages: 19-37
Citations: 4
A Topic Modeling Based Approach for Enhancing Corpus Querying
Q4 Computer Science Pub Date : 2019-07-01 DOI: 10.4018/ijossp.2019070103
N. Alhindawi, B. A. Ata, Lana Obeidat, M. Al-Batah, M. Abu-Ata
In information retrieval, the accuracy of the retrieval process depends mainly on the selection of query terms; the user must therefore choose the needed terms carefully and selectively. Traditionally, query terms are selected manually. However, in the last two decades, much research has been directed towards automating the process of choosing and enhancing query terms. This article presents a novel approach that relies on topic modeling for query building and expansion. Two open-source systems were selected for the experiments. The results show that adding topic terms to the user's query clearly improves its quality and thus the ranking results.
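The abstract says topic terms are added to the user's query but gives no selection rule. A minimal sketch, assuming the topics have already been extracted as ordered term lists and that the topic sharing the most terms with the query is the one used, might look like this (the topics and the `extra` parameter are hypothetical):

```python
def expand_query(query_terms, topics, extra=2):
    """Append the leading terms of the best-matching topic to the query.

    The matching rule (largest term overlap) is an assumption made for
    illustration; the article's actual rule is not given in the abstract.
    """
    query = [term.lower() for term in query_terms]
    best = max(topics, key=lambda topic: len(set(query) & set(topic)))
    if not set(query) & set(best):
        return query  # no topic matches, so leave the query unchanged
    additions = [term for term in best if term not in query][:extra]
    return query + additions


# Hypothetical topics, each listing its terms from most to least probable.
topics = [
    ["file", "open", "save", "path", "dialog"],
    ["print", "page", "margin", "preview"],
]
expanded = expand_query(["Save", "file"], topics)
print(expanded)
```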
在信息检索中,检索过程的准确性主要取决于查询词的选择;因此,用户必须谨慎而有选择性地选择所需的术语。传统上,选择查询词的过程是手动完成的。然而,在过去的二十年里,大量的研究都是针对自动选择和增强查询词的过程。本文提出了一种基于主题建模的查询构建和扩展方法。选择两个开源系统进行实验,结果表明,在用户的查询中添加主题术语明显提高了查询的质量,从而提高了排名结果。
{"title":"A Topic Modeling Based Approach for Enhancing Corpus Querying","authors":"N. Alhindawi, B. A. Ata, Lana Obeidat, M. Al-Batah, M. Abu-Ata","doi":"10.4018/ijossp.2019070103","DOIUrl":"https://doi.org/10.4018/ijossp.2019070103","url":null,"abstract":"In information retrieval, the accuracy of the retrieval process is mainly dependent on query terms selection; therefore, the user must choose the needed terms carefully and selectively. Traditionally, the process of selecting query terms is done manually. However, in the last two decades, a lot of research has been directed towards automating the process of choosing and enhancing query terms. In this article, a new novel approach is presented, which relies on topic modeling in query building and expansion. Two open source systems were selected to perform the experiments, results show that adding the topic's term to the user's query clearly improves its quality and thus, improves the ranking results.","PeriodicalId":53605,"journal":{"name":"International Journal of Open Source Software and Processes","volume":"294 1","pages":"38-50"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74987294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
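The abstract above describes query expansion only at a high level. As a minimal sketch of the idea, assuming topics have already been extracted by some topic model and are available as ranked term lists, the query can be extended with the top terms of the topic it overlaps most. The function name `expand_query`, the `topics` mapping, and the sample terms are hypothetical and are not taken from the article.

```python
def expand_query(query, topics, top_n=3):
    """Append the top terms of the best-matching topic to the query.

    topics: mapping of topic id -> list of terms ordered by weight
    (hypothetical output of a topic model; not the authors' data).
    """
    query_terms = query.lower().split()
    # Score each topic by how many query terms it contains.
    overlap = {
        topic_id: len(set(query_terms) & set(terms))
        for topic_id, terms in topics.items()
    }
    best_topic = max(overlap, key=overlap.get)
    if overlap[best_topic] == 0:
        return query_terms  # no topic matches; leave the query unchanged
    # Add the highest-ranked topic terms that the query does not already have.
    extra = [t for t in topics[best_topic] if t not in query_terms][:top_n]
    return query_terms + extra


# Illustrative topics only; real topics would come from a fitted model.
topics = {
    "io": ["file", "stream", "read", "write", "buffer"],
    "ui": ["button", "window", "click", "menu", "dialog"],
}
expanded = expand_query("read file error", topics, top_n=2)
print(expanded)
```

Matching by simple term overlap is a simplification chosen for brevity; the article may select and weight topic terms differently.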
Assessing Quality of Mobile Applications Based on a Hybrid MCDM Approach
Q4 Computer Science Pub Date : 2019-07-01 DOI: 10.4018/ijossp.2019070104
P. Aggarwal, P. S. Grover, Laxmi Ahuja
With the growth in mobile phone usage, mobile applications are developing significantly in today's high-tech environment. With this high demand, the quality of mobile applications is becoming a major issue, and organizations are still looking for ways to develop quality applications. A number of quality models have already been proposed for assessing the quality of a mobile application, but none of them provides a holistic view of quality assurance. The present research work proposes an empirical evaluation of the SQM-MApp quality model using a hybrid multi-criteria decision-making approach that combines the ELimination Et Choix Traduisant la REalité (ELimination and Choice Expressing REality) (ELECTRE-TRI) method for ranking the chosen quality factors with the step-wise weight assessment ratio analysis (SWARA) method for determining their weights. The proposed approach targets mobile applications from the gaming domain, and it is validated by assessing the quality of gaming applications.
Published in International Journal of Open Source Software and Processes, pp. 51-63.
Citations: 5
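The abstract above names SWARA as the weighting step without spelling out the computation. A minimal sketch of the standard SWARA procedure follows, assuming the criteria are already sorted from most to least important and that experts have supplied a comparative importance value s_j for each criterion relative to the one before it. The function name `swara_weights` and the sample values are hypothetical and are not taken from the article.

```python
def swara_weights(comparative_importance):
    """Compute SWARA weights for criteria sorted from most to least important.

    comparative_importance[j] is s_j: how much less important criterion j is
    than criterion j-1. The first entry must be 0 (top-ranked criterion).
    """
    if not comparative_importance or comparative_importance[0] != 0:
        raise ValueError("first s_j must be 0 for the top-ranked criterion")
    recalculated = []
    for j, s in enumerate(comparative_importance):
        k = 1.0 + s  # coefficient k_j = s_j + 1
        q = 1.0 if j == 0 else recalculated[-1] / k  # q_j = q_{j-1} / k_j
        recalculated.append(q)
    total = sum(recalculated)
    # Final weights w_j = q_j / sum(q), so they add up to 1.
    return [q / total for q in recalculated]


# Illustrative s_j values for three hypothetical quality factors.
weights = swara_weights([0.0, 0.25, 0.5])
print(weights)
```

The resulting weights decrease with rank and sum to 1, so they can feed directly into a ranking or sorting method such as ELECTRE-TRI.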