Ricardo P. Arciniega-Rocha, Vanessa C. Erazo-Chamorro, P. Rosero-Montalvo, G. Szabó
An erroneous squat movement might cause various injuries in amateur athletes who are not experts in workout exercises. Even when personal trainers watch athletes' workout performance, slight variations in ankle, knee, and lower-back movements might go unrecognized. Therefore, we present a smart wearable that alerts athletes to whether their squat performance is correct. We collect data from people experienced with workout exercises and from learners, with personal trainers supervising the data annotation. Then, we use data preprocessing techniques to reduce noisy samples and train Machine Learning models with a small memory footprint that can be exported to microcontrollers to classify squat movements. As a result, the k-Nearest Neighbors algorithm with k = 5 achieves 85% classification performance while occupying only 40 KB of RAM.
{"title":"Smart Wearable to Prevent Injuries in Amateur Athletes in Squats Exercise by Using Lightweight Machine Learning Model","authors":"Ricardo P. Arciniega-Rocha, Vanessa C. Erazo-Chamorro, P. Rosero-Montalvo, G. Szabó","doi":"10.3390/info14070402","DOIUrl":"https://doi.org/10.3390/info14070402","url":null,"abstract":"An erroneous squat movement might cause different injuries in amateur athletes who are not experts in workout exercises. Even when personal trainers watch out for the athletes’ workout performance, light variations in ankles, knees, and lower back movements might not be recognized. Therefore, we present a smart wearable to alert athletes whether their squats performance is correct. We collect data from people experienced with workout exercises and from learners, supervising personal trainers in annotation of data. Then, we use data preprocessing techniques to reduce noisy samples and train Machine Learning models with a small memory footprint to be exported to microcontrollers to classify squats’ movements. As a result, the k-Nearest Neighbors algorithm with k = 5 achieves an 85% performance and weight of 40 KB of RAM.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77774374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hossein Shahverdi, M. Nabati, P. Moshiri, R. Asvadi, S. Ghorashi
Human Activity Recognition (HAR) has been a popular area of research in the Internet of Things (IoT) and Human–Computer Interaction (HCI) over the past decade. The objective of this field is to detect human activities through numeric or visual representations, and its applications include smart homes and buildings, action prediction, crowd counting, patient rehabilitation, and elderly monitoring. Traditionally, HAR has been performed through vision-based, sensor-based, or radar-based approaches. However, vision-based and sensor-based methods can be intrusive and raise privacy concerns, while radar-based methods require special hardware, making them more expensive. WiFi-based HAR is a cost-effective alternative, where WiFi access points serve as transmitters and users’ smartphones serve as receivers. HAR in this setting is mainly performed using two wireless-channel metrics: Received Signal Strength Indicator (RSSI) and Channel State Information (CSI). CSI provides more stable and comprehensive information about the channel compared to RSSI. In this research, we used a convolutional neural network (CNN) as a classifier and applied edge-detection techniques as a preprocessing phase to improve the quality of activity detection. We used CSI data converted into RGB images and tested our methodology on three available CSI datasets. The results showed that the proposed method achieved better accuracy and faster training times than using the plain RGB-represented data alone. In order to justify the effectiveness of our approach, we repeated the experiment by applying raw CSI data to long short-term memory (LSTM) and Bidirectional LSTM classifiers.
{"title":"Enhancing CSI-Based Human Activity Recognition by Edge Detection Techniques","authors":"Hossein Shahverdi, M. Nabati, P. Moshiri, R. Asvadi, S. Ghorashi","doi":"10.3390/info14070404","DOIUrl":"https://doi.org/10.3390/info14070404","url":null,"abstract":"Human Activity Recognition (HAR) has been a popular area of research in the Internet of Things (IoT) and Human–Computer Interaction (HCI) over the past decade. The objective of this field is to detect human activities through numeric or visual representations, and its applications include smart homes and buildings, action prediction, crowd counting, patient rehabilitation, and elderly monitoring. Traditionally, HAR has been performed through vision-based, sensor-based, or radar-based approaches. However, vision-based and sensor-based methods can be intrusive and raise privacy concerns, while radar-based methods require special hardware, making them more expensive. WiFi-based HAR is a cost-effective alternative, where WiFi access points serve as transmitters and users’ smartphones serve as receivers. The HAR in this method is mainly performed using two wireless-channel metrics: Received Signal Strength Indicator (RSSI) and Channel State Information (CSI). CSI provides more stable and comprehensive information about the channel compared to RSSI. In this research, we used a convolutional neural network (CNN) as a classifier and applied edge-detection techniques as a preprocessing phase to improve the quality of activity detection. We used CSI data converted into RGB images and tested our methodology on three available CSI datasets. The results showed that the proposed method achieved better accuracy and faster training times than the simple RGB-represented data. In order to justify the effectiveness of our approach, we repeated the experiment by applying raw CSI data to long short-term memory (LSTM) and Bidirectional LSTM classifiers.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85185591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Security vulnerabilities constitute one of the most important weaknesses of hardware and software systems and can cause severe damage to systems, applications, and users. As a result, software vendors should prioritize the most dangerous and impactful security vulnerabilities by developing appropriate countermeasures. Acknowledging the importance of vulnerability prioritization, in the present study we propose a framework that maps newly disclosed vulnerabilities to topic distributions, via word clustering, and further predicts whether a new entry will be associated with a potential exploit Proof of Concept (PoC). We also provide insights into the currently most exploitable weaknesses and products through a Generalized Linear Model (GLM) that links the topic memberships of vulnerabilities with exploit indicators, distinguishing five topics that are associated with relatively frequent recent exploits. Our experiments show that the proposed framework can outperform two baseline topic modeling algorithms in terms of topic coherence, improving LDA models by up to 55%. In terms of classification performance, the conducted experiments, on a fairly balanced dataset (57% negative observations, 43% positive observations), indicate that vulnerability descriptions can be used as exclusive features in assessing the exploitability of vulnerabilities, as the "best" model achieves accuracy close to 87%. Overall, our study contributes to the prioritization of vulnerabilities by providing guidelines on the relations between the textual details of a weakness and potential application/system exploits.
{"title":"Exploitation of Vulnerabilities: A Topic-Based Machine Learning Framework for Explaining and Predicting Exploitation","authors":"Konstantinos Charmanas, N. Mittas, L. Angelis","doi":"10.3390/info14070403","DOIUrl":"https://doi.org/10.3390/info14070403","url":null,"abstract":"Security vulnerabilities constitute one of the most important weaknesses of hardware and software security that can cause severe damage to systems, applications, and users. As a result, software vendors should prioritize the most dangerous and impactful security vulnerabilities by developing appropriate countermeasures. As we acknowledge the importance of vulnerability prioritization, in the present study, we propose a framework that maps newly disclosed vulnerabilities with topic distributions, via word clustering, and further predicts whether this new entry will be associated with a potential exploit Proof Of Concept (POC). We also provide insights on the current most exploitable weaknesses and products through a Generalized Linear Model (GLM) that links the topic memberships of vulnerabilities with exploit indicators, thus distinguishing five topics that are associated with relatively frequent recent exploits. Our experiments show that the proposed framework can outperform two baseline topic modeling algorithms in terms of topic coherence by improving LDA models by up to 55%. In terms of classification performance, the conducted experiments—on a quite balanced dataset (57% negative observations, 43% positive observations)—indicate that the vulnerability descriptions can be used as exclusive features in assessing the exploitability of vulnerabilities, as the “best” model achieves accuracy close to 87%. Overall, our study contributes to enabling the prioritization of vulnerabilities by providing guidelines on the relations between the textual details of a weakness and the potential application/system exploits.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87459400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Z-number is a powerful construct for describing imperfect information: two fuzzy numbers are paired so that partially reliable information can be properly processed. During a decision-making process, human beings typically use natural language to describe their preferences, and the decision information is usually imprecise and only partially reliable. The nature of the Z-number, which is composed of a restriction component and a reliability component, has made it a powerful tool for depicting such decision information. Its strengths and advantages have attracted many researchers worldwide to further study and extend its theory and applications. The current research trend on Z-numbers shows increasing interest among researchers in fuzzy set theory, especially in its application to decision making. This paper reviews the application of Z-numbers in decision making, analyzing previous Z-number-based decision-making models to identify their strengths and contributions. Decision making based on Z-numbers improves the reliability of the decision information and makes it more meaningful. Another scope closely related to decision making, namely the ranking of Z-numbers, is also reviewed. An evaluative analysis of Z-numbers is then conducted to assess their performance in decision making. Future directions and recommendations on the applications of Z-numbers in decision making are provided at the end of this review.
{"title":"The Application of Z-Numbers in Fuzzy Decision Making: The State of the Art","authors":"N. Alam, K. Khalif, N. Jaini, A. Gegov","doi":"10.3390/info14070400","DOIUrl":"https://doi.org/10.3390/info14070400","url":null,"abstract":"A Z-number is very powerful in describing imperfect information, in which fuzzy numbers are paired such that the partially reliable information is properly processed. During a decision-making process, human beings always use natural language to describe their preferences, and the decision information is usually imprecise and partially reliable. The nature of the Z-number, which is composed of the restriction and reliability components, has made it a powerful tool for depicting certain decision information. Its strengths and advantages have attracted many researchers worldwide to further study and extend its theory and applications. The current research trend on Z-numbers has shown an increasing interest among researchers in the fuzzy set theory, especially its application to decision making. This paper reviews the application of Z-numbers in decision making, in which previous decision-making models based on Z-numbers are analyzed to identify their strengths and contributions. The decision making based on Z-numbers improves the reliability of the decision information and makes it more meaningful. Another scope that is closely related to decision making, namely, the ranking of Z-numbers, is also reviewed. Then, the evaluative analysis of the Z-numbers is conducted to evaluate the performance of Z-numbers in decision making. Future directions and recommendations on the applications of Z-numbers in decision making are provided at the end of this review.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76344453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gacha games are the most dominant games on the mobile market. These are free-to-play games with a lottery-like system, in which the user pays with in-game currency to enter a draw in order to obtain the character or item they want. If players do not obtain what they hoped for, they have the option of paying with their own money for more draws, and this is the main way Gacha games are monetized. The purpose of this study is to show the playing and spending habits of Gacha players: the reasons they like such games, the reasons for spending, how much they spend, what they spend on, how long they have been spending, and whether they are aware of their spending. The paper also reviews studies by other researchers on various aspects of Gacha games. The aim of the paper is to conduct a study testing the hypothesis that players who have played the same game for a while and have developed a habit of playing it are willing to spend more of their money on draws. Accordingly, two research questions and two hypotheses were analyzed. A total of 713 participants took part in the study.
{"title":"Addiction and Spending in Gacha Games","authors":"N. Lakic, A. Bernik, Andrej Čep","doi":"10.3390/info14070399","DOIUrl":"https://doi.org/10.3390/info14070399","url":null,"abstract":"Gacha games are the most dominant games on the mobile market. These are free-to-play games with a lottery-like system, where the user pays with in-game currency to enter a draw in order to obtain the character or item they want. If a player does not obtain what he hoped for, there is the option of paying with his own money for more draws, and this is the main way to monetize the Gacha game. The purpose of this study is to show the playing and spending habits of Gacha players: the reasons they like such games, the reasons for spending, how much they spend, what they spend on, how long they have been spending, and whether they are aware of their spending. The paper includes studies by other researchers on various aspects of Gacha games as well. The aim of the paper is to conduct a study with the hypothesis that players who play the same game for a while and have a habit of playing it are willing to give more of their money to enter a draw. Therefore, two research questions and two hypotheses were analyzed. A total of 713 participants took part in the study.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74977278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The trademark similarity analysis problem originates in the legal domain, specifically the protection of intellectual property. One possible technical solution for this issue is a trademark similarity evaluation pipeline based on the content-based image retrieval approach. Off-the-shelf CNN-based features have proven to be a good baseline for trademark retrieval. However, in recent years, the computer vision field has been transitioning from CNNs to a new architecture, namely the Vision Transformer (ViT). In this paper, we investigate the performance of off-the-shelf features extracted with Vision Transformers and explore the effects of pre-processing, post-processing, and pre-training on large datasets. We propose enhancing the trademark similarity evaluation pipeline through the joint use of global and local features, which leverages the best aspects of both approaches. Experimental results on the METU Trademark Dataset show that off-the-shelf features extracted with ViT-based models outperform off-the-shelf features from CNN-based models. The proposed method achieves a mAP value of 31.23, surpassing previous state-of-the-art results. We believe that the enhanced trademark similarity evaluation pipeline can improve the protection of intellectual property with the help of artificial intelligence methods. Moreover, this approach makes it possible to identify cases of unfair use of such data and to form an evidence base for litigation.
{"title":"Trademark Similarity Evaluation Using a Combination of ViT and Local Features","authors":"Dmitry Vesnin, D. Levshun, A. Chechulin","doi":"10.3390/info14070398","DOIUrl":"https://doi.org/10.3390/info14070398","url":null,"abstract":"The origin of the trademark similarity analysis problem lies within the legal area, specifically the protection of intellectual property. One of the possible technical solutions for this issue is the trademark similarity evaluation pipeline based on the content-based image retrieval approach. CNN-based off-the-shelf features have shown themselves as a good baseline for trademark retrieval. However, in recent years, the computer vision area has been transitioning from CNNs to a new architecture, namely, Vision Transformer. In this paper, we investigate the performance of off-the-shelf features extracted with vision transformers and explore the effects of pre-, post-processing, and pre-training on big datasets. We propose the enhancement of the trademark similarity evaluation pipeline by joint usage of global and local features, which leverages the best aspects of both approaches. Experimental results on the METU Trademark Dataset show that off-the-shelf features extracted with ViT-based models outperform off-the-shelf features from CNN-based models. The proposed method achieves a mAP value of 31.23, surpassing previous state-of-the-art results. We assume that the usage of an enhanced trademark similarity evaluation pipeline allows for the improvement of the protection of intellectual property with the help of artificial intelligence methods. Moreover, this approach enables one to identify cases of unfair use of such data and form an evidence base for litigation.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80848616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interest in machine learning and neural networks has increased significantly in recent years. However, their application in safety-critical domains is limited by the lack of formal guarantees on their reliability and behavior. This paper presents recent advances in satisfiability modulo theory solvers used in the context of verifying neural networks with piece-wise linear and transcendental activation functions. An experimental analysis is conducted using neural networks trained on a real-world predictive maintenance dataset. This study contributes to the research on enhancing the safety and reliability of neural networks through formal verification, enabling their deployment in safety-critical domains.
{"title":"Leveraging Satisfiability Modulo Theory Solvers for Verification of Neural Networks in Predictive Maintenance Applications","authors":"Dario Guidotti, L. Pandolfo, Luca Pulina","doi":"10.3390/info14070397","DOIUrl":"https://doi.org/10.3390/info14070397","url":null,"abstract":"Interest in machine learning and neural networks has increased significantly in recent years. However, their applications are limited in safety-critical domains due to the lack of formal guarantees on their reliability and behavior. This paper shows recent advances in satisfiability modulo theory solvers used in the context of the verification of neural networks with piece-wise linear and transcendental activation functions. An experimental analysis is conducted using neural networks trained on a real-world predictive maintenance dataset. This study contributes to the research on enhancing the safety and reliability of neural networks through formal verification, enabling their deployment in safety-critical domains.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86374815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qi-Qi Hu, Xiaomei Zhang, Fangqi Li, Zhushou Tang, Shilin Wang
Application marketplaces collect ratings and reviews from users to provide references for other consumers. Many crowdturfing activities abuse user reviews to manipulate the reputation of an app and mislead other consumers. To understand and improve the ecosystem of reviews in the app market, we investigate the existence of crowdturfing in the App Store. This paper reports a measurement study of crowdturfing and its reviews in the App Store. We use a sliding window to obtain the relationship graph between users and apply a community detection method, then binary-classify the detected communities. We then measure and analyze the crowdturfing accounts obtained from the classification and compare them with genuine users. We analyze several features of crowdturfing, such as ratings, sentiment scores, text similarity, and common words. We also investigate which apps crowdturfing most often appears in and reveal its role in app ranking. These insights are used as features in machine learning models, and the results show that they can effectively train classifiers and detect crowdturfing reviews with an accuracy of up to 98.13%. This study reveals malicious crowdturfing practices in the App Store and helps to strengthen the review security of app marketplaces.
{"title":"Measuring and Understanding Crowdturfing in the App Store","authors":"Qi-Qi Hu, Xiaomei Zhang, Fangqi Li, Zhushou Tang, Shilin Wang","doi":"10.3390/info14070393","DOIUrl":"https://doi.org/10.3390/info14070393","url":null,"abstract":"Application marketplaces collect ratings and reviews from users to provide references for other consumers. Many crowdturfing activities abuse user reviews to manipulate the reputation of an app and mislead other consumers. To understand and improve the ecosystem of reviews in the app market, we investigate the existence of crowdturfing based on the App Store. This paper reports a measurement study of crowdturfing and its reviews in the App Store. We use a sliding window to obtain the relationship graph between users and the community detection method to binary classify the detected communities. Then, we measure and analyze the crowdturfing obtained from the classification and compare them with genuine users. We analyze several features of crowdturfing, such as ratings, sentiment scores, text similarity, and common words. We also investigate which apps crowdturfing often appears in and reveal their role in app ranking. These insights are used as features in machine learning models, and the results show that they can effectively train classifiers and detect crowdturfing reviews with an accuracy of up to 98.13%. This study reveals malicious crowdfunding practices in the App Store and helps to strengthen the review security of app marketplaces.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74220166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Zupan, Adam Vinković, Rexhep Nikçi, Bernarda Pinjatela
This research is primarily focused on utilizing available airborne LiDAR data and spatial data from the OpenStreetMap (OSM) database to generate 3D models of buildings for a large-scale urban area. The city center of Ljubljana, Slovenia, was selected as the study area due to data availability and the diversity of building shapes, heights, and functions, which presented a challenge for the automated generation of 3D models. To extract building heights, a range of data sources were utilized, including OSM attribute data as well as georeferenced and classified point clouds and a digital elevation model (DEM) obtained from openly available LiDAR survey data of the Slovenian Environment Agency. A digital surface model (DSM) and a digital terrain model (DTM) were derived from the processed LiDAR data. Building outlines and attributes were extracted from OSM and processed using QGIS. Spatial coverage of OSM data for buildings in the study area is excellent, whereas only 18% of buildings have attributes describing their external appearance and 6% their roof type. LAStools software (rapidlasso GmbH, Gilching, Germany) was used to derive and assign building heights from the 3D coordinates of the segmented point clouds. Various software options for procedural modeling were compared, and Blender was selected due to its ability to process OSM data, the availability of documentation, and its low computing requirements. Using procedural modeling, a 3D model at level of detail (LOD) 1 was generated fully automatically. After analyzing roof types, an LOD2 model was generated fully automatically for 87.64% of the buildings. For the remaining buildings, a comparison of procedural roof modeling and manual roof editing was performed. Finally, a visual comparison between the resulting 3D model and Google Earth's model was performed. The main objective of this study is to demonstrate an efficient modeling process using open data and free software that results in enhanced accuracy of the 3D building models compared to previous LOD2 iterations.
{"title":"Automatic 3D Building Model Generation from Airborne LiDAR Data and OpenStreetMap Using Procedural Modeling","authors":"R. Zupan, Adam Vinković, Rexhep Nikçi, Bernarda Pinjatela","doi":"10.3390/info14070394","DOIUrl":"https://doi.org/10.3390/info14070394","url":null,"abstract":"This research is primarily focused on utilizing available airborne LiDAR data and spatial data from the OpenStreetMap (OSM) database to generate 3D models of buildings for a large-scale urban area. The city center of Ljubljana, Slovenia, was selected for the study area due to data availability and diversity of building shapes, heights, and functions, which presented a challenge for the automated generation of 3D models. To extract building heights, a range of data sources were utilized, including OSM attribute data, as well as georeferenced and classified point clouds and a digital elevation model (DEM) obtained from openly available LiDAR survey data of the Slovenian Environment Agency. A digital surface model (DSM) and digital terrain model (DTM) were derived from the processed LiDAR data. Building outlines and attributes were extracted from OSM and processed using QGIS. Spatial coverage of OSM data for buildings in the study area is excellent, whereas only 18% have attributes describing external appearance of the building and 6% describing roof type. LASTools software (rapidlasso GmbH, Friedrichshafener Straße 1, 82205 Gilching, GERMANY) was used to derive and assign building heights from 3D coordinates of the segmented point clouds. Various software options for procedural modeling were compared and Blender was selected due to the ability to process OSM data, availability of documentation, and low computing requirements. Using procedural modeling, a 3D model with level of detail (LOD) 1 was created fully automated. After analyzing roof types, a 3D model with LOD2 was created fully automated for 87.64% of buildings. For the remaining buildings, a comparison of procedural roof modeling and manual roof editing was performed. Finally, a visual comparison between the resulting 3D model and Google Earth’s model was performed. The main objective of this study is to demonstrate the efficient modeling process using open data and free software and resulting in an enhanced accuracy of the 3D building models compared to previous LOD2 iterations.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89759239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study examined the Securities and Exchange Commission (SEC) annual reports of selected logistics firms over the period from 2006 through 2021 for risk management terms. The purpose was to identify which risks are considered most important in supply chain logistics operations. Section 1A of the SEC reports lists risk factors. The COVID-19 pandemic has had a heavy impact on global supply chains. We also know that trucking firms have long had difficulties recruiting drivers. Fuel price has always been a major risk for airlines but can also impact shipping, trucking, and railroads. We were especially interested in pandemic, personnel, and fuel risks. We applied topic modeling, enabling us to demonstrate some of the capabilities of unsupervised text mining as applied to SEC reports. We demonstrate the identification of terms, the time dimension, and correlation across topics by the topic model. Our analysis confirmed expectations about COVID-19's impact, personnel shortages, and fuel. It also revealed common themes regarding the risks involved in international trade and perceived regulatory risks. We conclude with the supply chain management risks identified and discuss means of mitigation.
{"title":"Incorporating an Unsupervised Text Mining Approach into Studying Logistics Risk Management: Insights from Corporate Annual Reports and Topic Modeling","authors":"David L. Olson, Bongsug Chae","doi":"10.3390/info14070395","DOIUrl":"https://doi.org/10.3390/info14070395","url":null,"abstract":"This study examined the Security and Exchange Commission (SEC) annual reports of selected logistics firms over the period from 2006 through 2021 for risk management terms. The purpose was to identify which risks are considered most important in supply chain logistics operations. Section 1A of the SEC reports includes risk factors. The COVID-19 pandemic has had a heavy impact on global supply chains. We also know that trucking firms have long had difficulties recruiting drivers. Fuel price has always been a major risk for airlines but also can impact shipping, trucking, and railroads. We were especially interested in pandemic, personnel, and fuel risks. We applied topic modeling, enabling us to identify some of the capabilities of unsupervised text mining as applied to SEC reports. We demonstrate the identification of terms, the time dimension, and correlation across topics by the topic model. Our analysis confirmed expectations about COVID-19’s impact, personnel shortages, and fuel. It also revealed common themes regarding the risks involved in international trade and perceived regulatory risks. We conclude with the supply chain management risks identified and discuss means of mitigation.","PeriodicalId":13622,"journal":{"name":"Inf. Comput.","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90314817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}