Impacts of Size and History Length on Energetic Community Load Forecasting: A Case Study
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.00-61
M. Tits, Benjamin Bernaud, Amel Achour, Maher Badri, L. Guedria
In recent years, many European distribution systems (DS) have come under strain from the coupled growth of decentralized production and volatile residential appliance usage. To cope with this issue, new solutions are emerging, such as local energy storage and energetic community management. The latter aims to maximize collective self-consumption of locally produced energy through optimal planning of flexible appliances, reducing DS maintenance costs and energy loss. The quality of short-term load forecasting is key in this process; however, it depends on various factors, foremost among them the characteristics of the energetic community concerned. In this paper, we propose a methodology and a use case based on randomized sampling for the simulation of virtual energetic communities (VEC). From the numerous simulated VECs, statistical analysis allows us to assess the impact of VEC characteristics (such as size, resident type, and availability of historical data) on predictability. Using a 2-year dataset of 52 households recorded in a Belgian city, we quantify the impact of these characteristics and show that, for this specific case study, a trade-off for efficient forecasting can be reached with a community of about 10-30 households and 2-12 months of history.
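The randomized-sampling idea can be illustrated with a minimal sketch: draw random subsets of households to form virtual communities, aggregate their loads, and score a naive forecaster across community sizes and history lengths. The synthetic gamma-distributed loads and the daily-profile forecaster below are illustrative stand-ins, not the paper's data or model.

```python
import numpy as np

rng = np.random.default_rng(0)
HOURS_PER_DAY, DAYS = 24, 730          # 2-year hourly records
households = rng.gamma(2.0, 0.5, (52, DAYS * HOURS_PER_DAY))  # synthetic loads (kW)

def forecast_error(members, history_days):
    """MAPE of a naive daily-profile forecast for one virtual community."""
    load = households[members].sum(axis=0)          # aggregate community load
    train = load[:history_days * HOURS_PER_DAY].reshape(-1, HOURS_PER_DAY)
    profile = train.mean(axis=0)                    # average daily profile
    test = load[history_days * HOURS_PER_DAY:(history_days + 30) * HOURS_PER_DAY]
    pred = np.tile(profile, 30)                     # forecast the next 30 days
    return np.mean(np.abs(test - pred) / test)

for size in (5, 10, 30, 52):
    for hist in (60, 180, 365):
        errs = [forecast_error(rng.choice(52, size, replace=False), hist)
                for _ in range(20)]                 # 20 random VECs per setting
        print(f"size={size:2d} history={hist:3d}d MAPE={np.mean(errs):.2%}")
```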
Effects of Social Media Use on Health and Academic Performance Among Students at the University of Sharjah
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.0-176
S. Rahman, A. Marzouqi, Swetha Variyath, Shristee Rahman, Masud Rabbani, Sheikh Iqbal Ahamed
Statistics indicate that by 2020 almost 5 billion people will be connected to social media (SM). Studies have drawn attention to the harms of SM to the health of students: it affects their attention span, memory, sleep, vision, and overall physical, mental, and social health. In this paper, we investigate the effects of SM use on the health and academic performance of students at the University of Sharjah. This study shows that students with more self-regulation have better control over their social media use. A cross-sectional mixed approach (CSMA) combining quantitative and qualitative data was used to conduct the research. Of the 300 student participants in our study, the majority used Instagram, followed by WhatsApp and Twitter. Students reported spending an average of 3-4 hours per day on social media; however, qualitative data showed that many students spent all day on it. A majority of the students used social media to chat with friends and make new connections. They agreed that their use of social media has reduced their reading of paper-based resources and affected their grammar and writing skills. SM use delayed their bedtime, left fewer hours for sleep, and caused eyestrain, neck and shoulder pain, fatigue, poor posture, and declining physical activity. This study concludes that social media use does affect academic performance and health among students of the University of Sharjah. Considering the negative consequences of extensive social media use, universities need to create awareness programs and can incorporate this topic into health education and awareness courses. Our study also generated new information and insights about the effects of heavy SM use on health and academic performance among university students, creating opportunities for further research.
Predicting the Political Polarity of Tweets Using Supervised Machine Learning
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.000-9
Michelle Voong, Keerthana Gunda, S. Gokhale
With the advent of social media, politicians, media outlets, and ordinary citizens alike routinely turn to Twitter to share their thoughts and feelings. Discerning politically biased tweets from neutral ones can help determine the propensity of an elected official or media outlet to engage in political rhetoric. This paper presents a supervised machine learning approach to predicting whether a tweet is politically biased or neutral. The approach uses a labeled data set available at Crowdflower, where each tweet is tagged with a partisan/neutral label plus its message type and audience. It considers a combination of linguistic features, including Term Frequency-Inverse Document Frequency (TF-IDF), bigrams, and trigrams, along with metadata features such as mentions, retweets, and URLs, as well as the additional message-type and audience labels. It trains both simple and ensemble classifiers and assesses their performance using precision, recall, and F1-score. The results demonstrate that the classifiers can accurately predict the polarity of a tweet when trained on a combination of TF-IDF and metadata features that can be extracted automatically from the tweets, eliminating the need for additional manual tagging, which is cumbersome and error-prone.
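A minimal sketch of this kind of pipeline, assuming a hypothetical political_tweets.csv with text and bias columns (the actual Crowdflower schema may differ): combine TF-IDF n-grams with metadata features extracted automatically from the tweet text, then train an ensemble classifier.

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("political_tweets.csv")           # hypothetical file and columns

# Linguistic features: TF-IDF over unigrams, bigrams, and trigrams.
tfidf = TfidfVectorizer(ngram_range=(1, 3), max_features=20000)
X_text = tfidf.fit_transform(df["text"])

# Metadata features extractable directly from each tweet: mentions, retweets, URLs.
meta = csr_matrix(df["text"].str.count("@").to_frame()
                  .assign(rt=df["text"].str.startswith("RT").astype(int),
                          url=df["text"].str.contains("http").astype(int)).values)

X = hstack([X_text, meta]).tocsr()
y = df["bias"]                                     # "partisan" vs. "neutral"
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```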
How Much Support Can API Recommendation Methods Provide for Component-Based Synthesis?
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.0-155
Jiaxin Liu, Binbin Liu, Wei Dong, Yating Zhang, Daiyan Wang
Program synthesis is one of the key research areas in software engineering. Many approaches design domain-specific languages to constrain the program space and make the problem tractable. Although these approaches can be effective in certain domains, synthesizing programs in general-purpose programming languages remains a challenge. Component-based synthesis offers a promising way to generate such programs from a component library of application programming interfaces (APIs). However, the program space constituted by all the APIs in the library is still very large, so only small programs can be synthesized in practice. In recent years, many API recommendation approaches have been proposed that recommend relevant APIs given some specification. We argue that applying this technique to component-based synthesis is a feasible way to reduce the program space, and that the degree of support an API recommendation method provides to component-based synthesis is itself an important criterion for measuring its effectiveness. In this paper, we investigate five state-of-the-art API recommendation methods to study their effectiveness in supporting component-based synthesis. In addition, we propose API Recommendation via General Search (ARGS). We collect a set of programming tasks and compare our approach with the five API recommendation methods on synthesizing these tasks. The experimental results show that the capability of these API recommendation methods to support component-based synthesis is limited. ARGS, by contrast, supports component-based synthesis well, effectively narrowing the program space and improving the efficiency of program synthesis: it reduces synthesis time by 86.1% compared to the original SyPet.
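How API recommendation can narrow a component-based synthesizer's search space can be sketched as follows; the toy library, the TF-IDF relevance ranking, and the recommend helper are illustrative assumptions, not the paper's ARGS method or SyPet's actual component format.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A toy component library: API name mapped to its documentation text.
library = {
    "java.awt.geom.Area.intersect": "computes the intersection of two shapes",
    "java.awt.geom.AffineTransform.rotate": "rotates coordinates by an angle",
    "java.util.Collections.sort": "sorts a list into ascending order",
}

def recommend(query, k=2):
    """Rank library APIs by textual relevance to the synthesis task description."""
    docs = list(library.values())
    vec = TfidfVectorizer().fit(docs + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(docs))[0]
    ranked = sorted(zip(library, sims), key=lambda p: -p[1])
    return [name for name, _ in ranked[:k]]

# The synthesizer would then enumerate programs over this reduced set only,
# instead of the full library.
print(recommend("rotate a shape and intersect it with another"))
```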
A Hybrid Algorithms Construction of Hyper-Heuristic for Test Case Prioritization
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.000-2
Zheng Li, Yanzhao Xi, Ruilian Zhao
By scheduling algorithms from a low-level algorithm library, a hyper-heuristic can effectively select an appropriate method for hard computational search problems. A hyper-heuristic usually comprises a high-level scheduling layer and a low-level algorithm layer: the high-level strategy layer selects the algorithm for the next scheduling step by evaluating the execution results of the different low-level algorithms, while the low-level layer contains a variety of heuristic algorithms, collectively called the algorithm library. A concrete hyper-heuristic framework for multi-objective test case prioritization has been presented in which 18 multi-objective algorithms form the low-level library. It has gradually been recognized that a hybrid library combining single-objective and multi-objective optimization algorithms outperforms either type alone. This paper explores how the construction pattern of the algorithm library influences the hyper-heuristic by constructing fusion patterns of different types of algorithms.
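A minimal sketch of the two-layer structure, with an epsilon-greedy high-level layer choosing among low-level heuristics by average past reward; the heuristic names, the toy evaluate function, and the epsilon-greedy policy are assumptions for illustration, not the paper's actual scheduling strategy.

```python
import random

def hyper_heuristic(heuristics, evaluate, iterations=100, eps=0.2):
    """High-level layer: pick a low-level heuristic by past reward (epsilon-greedy)."""
    score = {h: 0.0 for h in heuristics}
    count = {h: 1 for h in heuristics}
    best, best_fit = None, float("-inf")
    for _ in range(iterations):
        if random.random() < eps:                 # explore a random heuristic
            h = random.choice(list(heuristics))
        else:                                     # exploit the best average reward
            h = max(score, key=lambda k: score[k] / count[k])
        fitness = evaluate(h)                     # run heuristic h on the test order
        score[h] += fitness
        count[h] += 1
        if fitness > best_fit:
            best, best_fit = h, fitness
    return best, best_fit

# Toy low-level library mixing single- and multi-objective prioritizers;
# evaluate would return e.g. APFD of the ordering the heuristic produces.
heuristics = ["greedy_coverage", "nsga2", "random_order"]
demo = lambda h: {"greedy_coverage": 0.8, "nsga2": 0.9,
                  "random_order": 0.5}[h] + random.gauss(0, 0.05)
print(hyper_heuristic(heuristics, demo))
```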
On the Effectiveness of Random Node Sampling in Influence Maximization on Unknown Graph
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.0-188
Yuki Wakisaka, Kazuyuki Yamashita, Sho Tsugawa, H. Ohsaki
Influence maximization in social networks has been intensively studied, motivated by its application to so-called viral marketing. The influence maximization problem is formulated as a combinatorial optimization problem on a graph that aims to identify a small set of influential nodes (i.e., seed nodes) such that the expected size of the influence cascade triggered by the seed nodes is maximized. In practice, it is generally difficult to obtain complete knowledge of a large-scale network. Therefore, the problem of identifying influential seed nodes from only a partial structure of the network, obtained through network sampling strategies, has also been studied in recent years. To achieve efficient influence propagation in unknown networks, the number of sampled nodes must be chosen appropriately. In this paper, we clarify the relation between the sample size and the expected size of the influence cascade triggered by the seed nodes through mathematical analysis. Specifically, we derive the expected size of the influence cascade under random node sampling and degree-based seed node selection. Through several numerical examples using datasets of real social networks, we also investigate the implications of our analytical results for influence maximization on unknown social networks.
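The sampling-then-seeding procedure can be sketched as follows, assuming the independent cascade model and a synthetic Barabasi-Albert graph; the paper's contribution is a mathematical derivation, so this Monte Carlo simulation is only an illustration of the setup it analyzes.

```python
import random
import networkx as nx

def ic_cascade(G, seeds, p=0.1):
    """One Monte Carlo run of the independent cascade model."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v in G.neighbors(u):
            if v not in active and random.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)

G = nx.barabasi_albert_graph(1000, 3, seed=1)
sample = random.sample(list(G.nodes), 100)               # random node sampling
seeds = sorted(sample, key=G.degree, reverse=True)[:5]   # degree-based selection
spread = sum(ic_cascade(G, seeds) for _ in range(200)) / 200
print(f"expected cascade size with a 10% sample: {spread:.1f}")
```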
Detection of Change of Users in SNS by Two Dimensional CNN
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.0-159
H. Matsushita, R. Uda
In this paper, we propose a method for detecting hacked accounts in social networking services (SNS) without predetermined features, since trends in topics and slang expressions constantly change and hackers can craft messages that match any predetermined features. Prior research has detected hacked accounts and impersonation in SNS, but it either relied on predetermined features or used inappropriate evaluation procedures. In our method, by contrast, a feature named 'category' is automatically extracted from recent tweets by machine learning. We evaluated the categories with 1,000 test accounts. As a result, 74.4% of the test accounts can be detected at a rate of up to 96.0% when they are hacked and only one new message is posted. Moreover, 73.4% of the test accounts can be detected at a rate of up to 99.2% from one new posted message. Other hacked accounts can also be detected at the same rate when several messages are posted in sequence.
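A minimal sketch of a two-dimensional CNN over an embedded message matrix, in the spirit of the title; the layer sizes, the (sequence length x embedding dimension) input shape, and the two-class output are assumptions for illustration, not the paper's actual architecture or 'category' extraction.

```python
import torch
import torch.nn as nn

class ChangeDetector(nn.Module):
    """2D CNN over a (seq_len x emb_dim) matrix of embedded message tokens."""
    def __init__(self, seq_len=50, emb_dim=64, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(3, emb_dim)),  # n-gram-like filters
            nn.ReLU(),
            nn.AdaptiveMaxPool2d((1, 1)),                # pool over positions
        )
        self.fc = nn.Linear(16, n_classes)               # same-user vs. changed

    def forward(self, x):            # x: (batch, 1, seq_len, emb_dim)
        return self.fc(self.conv(x).flatten(1))

model = ChangeDetector()
dummy = torch.randn(8, 1, 50, 64)    # a batch of 8 embedded messages
print(model(dummy).shape)            # torch.Size([8, 2])
```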
Ensemble Random Forests Classifier for Detecting Coincidentally Correct Test Cases
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.00-72
Shuvalaxmi Dass, Xiaozhen Xue, A. Namin
The performance of coverage-based fault localization depends heavily on the quality of the test cases being executed. These test cases execute some lines of the given program and are recorded as passed or failed. In particular, some test cases may pass even while executing faulty statements. Such test cases, known as coincidentally correct test cases, can degrade the performance of spectrum-based fault localization and make it less helpful as a tool for automated debugging. In other words, the involvement of these coincidentally correct test cases introduces noise into the fault localization computation and hinders effective localization of possible bugs in the given code. In this paper, we propose a hybrid approach combining ensemble learning with a supervised learning algorithm, namely Random Forests (RF), to correctly identify test cases that are mislabeled as passing. A cost-effectiveness analysis of flipping the test status versus trimming (i.e., eliminating from the computation) the coincidentally correct test cases is also reported.
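A minimal sketch of the relabeling idea on toy data: train a Random Forest on coverage spectra labeled by test status, flag passing tests the model scores as fail-like, flip them, and recompute a suspiciousness metric (Ochiai here; the abstract does not specify the paper's exact features, threshold, or metric, and real use would cross-validate rather than score the training set).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
coverage = rng.integers(0, 2, (200, 50))     # 200 tests x 50 statements (toy data)
status = rng.integers(0, 2, 200)             # 1 = failed, 0 = passed

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(coverage, status)
p_fail = clf.predict_proba(coverage)[:, 1]

# Passing tests that "look like" failing ones are coincidentally-correct suspects.
suspects = np.where((status == 0) & (p_fail > 0.7))[0]
flipped = status.copy()
flipped[suspects] = 1                        # flip strategy (trimming drops them)

def ochiai(cov, st):
    """Per-statement Ochiai suspiciousness from a coverage matrix and statuses."""
    ef = cov[st == 1].sum(axis=0)            # failed tests covering each statement
    ep = cov[st == 0].sum(axis=0)            # passed tests covering each statement
    total_f = (st == 1).sum()
    return ef / np.sqrt(total_f * (ef + ep) + 1e-9)

print("most suspicious statement:", np.argmax(ochiai(coverage, flipped)))
```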
Adaptive OS Switching for Improving Availability During Web Traffic Surges: A Feasibility Study
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.00-97
Katsuya Matsubara, Yuhei Takagawa
Web services have seen early adoption and rapid growth with the introduction of various web system frameworks, and not only large companies but also smaller businesses and individuals can provide their own web services offering access to their products, entertainment, and information. As the number of Internet users increases, especially with the spread of smartphones, even relatively small web service infrastructures need both high access performance and high availability, although the cost of additional computational resources and redundant servers may be hard to bear, depending on the load. We focus on the fact that performance characteristics may differ depending on the internal implementation of the operating system (OS), even when the available computing resources are the same. This paper investigates the possibility of a system that achieves both high access performance and availability for a web server by dynamically switching the OS on which the server is running, without requiring additional computing resources or redundant servers. The paper identifies the differences between Linux and FreeBSD in terms of network processing and describes a mechanism for process migration among heterogeneous OSes that performs the switch. It then demonstrates the feasibility of our approach with experimental results on the performance characteristics and load tolerance of a web server in operation while the OSes are dynamically switched.
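The high-level control loop implied by this approach might look like the following sketch; the request-rate threshold, the metric source, and the switch trigger are all illustrative assumptions, and the actual checkpoint/restore migration mechanism between kernels is the subject of the paper itself.

```python
import random
import time

THRESHOLD_RPS = 5000   # illustrative surge threshold, not a figure from the paper

def current_rps():
    """Stand-in for reading real web-server request-rate metrics."""
    return random.randint(1000, 9000)

def switch_os(target):
    """Stand-in for the paper's process-migration mechanism between kernels."""
    print(f"checkpointing the web server and restoring it on {target}")

running = "Linux"
for _ in range(5):     # bounded loop so the sketch terminates
    rps = current_rps()
    if rps > THRESHOLD_RPS and running != "FreeBSD":
        switch_os("FreeBSD")   # assume FreeBSD handles this load profile better
        running = "FreeBSD"
    elif rps <= THRESHOLD_RPS and running != "Linux":
        switch_os("Linux")
        running = "Linux"
    time.sleep(0.1)
```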
Transtracer: Socket-Based Tracing of Network Dependencies Among Processes in Distributed Applications
Yuuki Tsubouchi, Masahiro Furukawa, Ryosuke Matsumoto
Pub Date: 2020-07-01 | DOI: 10.1109/COMPSAC48688.2020.00-92
Distributed applications in web services have become increasingly complex in response to various user demands. Consequently, system administrators have difficulty understanding inter-process dependencies in distributed applications: when parts of the system are changed or augmented, they cannot identify the area affected by the change, which might lead to a more damaging outage than expected. Administrators therefore need to trace dependencies among unknown processes automatically. An earlier method discovered dependencies by detecting transport connections with the Linux packet filter on the hosts at either end of a network connection. However, this adds extra delay to application traffic because of the additional packet processing in the Linux kernel. In this paper, we propose an architecture that traces dependencies by monitoring network sockets, the endpoints of TCP connections. As long as applications use the TCP protocol stack in the Linux kernel, our architecture discovers their dependencies. The monitoring only reads connection information from network sockets and is independent of application communication, so it does not affect the network delay of the applications. Our experiments confirmed that our architecture reduces delay overhead by 13-20% and resource load by 43.5% compared to earlier methods.
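The socket-monitoring idea can be sketched on Linux by reading established TCP connections from /proc/net/tcp and mapping socket inodes back to their owning processes; this is an illustrative approximation of the approach (the paper's Transtracer implementation may differ), works only on Linux, and needs root to inspect other users' processes.

```python
import glob
import os

def parse_addr(hexaddr):
    """Decode a little-endian 'IP:port' hex pair from /proc/net/tcp."""
    ip_hex, port_hex = hexaddr.split(":")
    octets = [str(int(ip_hex[i:i + 2], 16)) for i in range(6, -2, -2)]
    return ".".join(octets), int(port_hex, 16)

def established_tcp():
    """Yield (local, remote, inode) for ESTABLISHED TCP sockets."""
    with open("/proc/net/tcp") as f:
        for line in f.readlines()[1:]:           # skip the header row
            cols = line.split()
            if cols[3] == "01":                  # state 01 = ESTABLISHED
                yield parse_addr(cols[1]), parse_addr(cols[2]), int(cols[9])

def inode_to_pid(inode):
    """Map a socket inode back to the owning process via /proc/<pid>/fd."""
    target = f"socket:[{inode}]"
    for fd in glob.glob("/proc/[0-9]*/fd/*"):
        try:
            if os.readlink(fd) == target:
                return int(fd.split("/")[2])
        except OSError:                          # process exited or no permission
            continue

for local, remote, inode in established_tcp():
    print(f"pid {inode_to_pid(inode)}: {local} -> {remote}")
```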