The credibility assessment of Twitter/X users based organization objectives by heterogeneous resources in big data life cycle

Computers in Human Behavior (IF 8.9; Zone 1, Psychology; JCR Q1, PSYCHOLOGY, EXPERIMENTAL). Pub Date: 2025-01-01. Epub Date: 2024-09-03. DOI: 10.1016/j.chb.2024.108428

Sogand Dehghan, Rojiar Pir Mohammadiani, Shahriar Mohammadi

Computers in Human Behavior, Volume 162, Article 108428. https://www.sciencedirect.com/science/article/pii/S0747563224002966

Citations: 0

The credibility assessment of Twitter/X users based organization objectives by heterogeneous resources in big data life cycle

Social network data, such as data from Twitter/X, is a form of big social data: it describes people's social behaviors and interactions and has high business value for decision-making in organizations. However, because social network users are anonymous, their credibility is ambiguous. Credibility expresses the accuracy and value of big social data. Despite extensive research on the credibility of big social data, most methods have paid insufficient attention to important dimensions of its assessment, including topic-based user expertise, the selection of social network features, and their labeling. Furthermore, these methods cannot manage the time, high volume, and speed of big social data. To address these issues, this paper presents a novel model for assessing the credibility of Twitter/X users by integrating Twitter/X with Google Scholar. The model automatically defines users' credibility labels using Google Scholar. Machine learning feature selection methods select the features that affect the credibility of Twitter/X users for a given topic. The study uses Google Scholar and the BERTopic algorithm for topic modeling on Twitter/X. The model accounts for the management of unrelated data, dynamic user credibility, and the organization of activities according to the Big Data lifecycle. Finally, the model predicts the credibility of Twitter/X users using Linear Regression, Support Vector Regression, K-Nearest Neighbor, Random Forest, and Classification and Regression Trees, and it reports better performance than similar models when using Classification and Regression Trees. In addition, the model is generalizable to all organizational purposes because it integrates heterogeneous resources and feature selection methods.
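The abstract describes a pipeline of labeling, topic-based feature selection, and regression, but gives no implementation details. The following is a minimal sketch of the last two steps only, under loud assumptions: the data is synthetic, the feature names are hypothetical, the label is a made-up score standing in for a Google Scholar derived credibility label, and the correlation filter is one simple feature selection method chosen for illustration (the abstract does not name the methods used). Of the five regressors listed, only K-Nearest Neighbor is shown, compared against a mean baseline.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, k):
    """Filter-style selection: keep the k columns most correlated with the label."""
    ranked = sorted(
        range(len(rows[0])),
        key=lambda j: abs(pearson([r[j] for r in rows], labels)),
        reverse=True,
    )
    return sorted(ranked[:k])


def project(row, columns):
    """Keep only the selected columns of one feature row."""
    return [row[j] for j in columns]


def knn_predict(train_rows, train_labels, query, k=5):
    """k-nearest-neighbor regression: average the labels of the k closest rows."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return sum(label for _, label in nearest) / k


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic users. Feature names are hypothetical and not taken from the paper.
FEATURES = ["topic_tweet_ratio", "follower_score", "random_noise"]
rng = random.Random(0)
rows = [[rng.random() for _ in FEATURES] for _ in range(200)]
# Assumed stand-in for a credibility label; the third feature is irrelevant.
labels = [0.6 * r[0] + 0.4 * r[1] + rng.gauss(0, 0.02) for r in rows]

train_rows, test_rows = rows[:150], rows[150:]
train_labels, test_labels = labels[:150], labels[150:]

# Select features on the training split only, then fit and evaluate k-NN.
selected = select_features(train_rows, train_labels, k=2)
train_proj = [project(r, selected) for r in train_rows]
knn_preds = [knn_predict(train_proj, train_labels, project(r, selected))
             for r in test_rows]

# Baseline: always predict the mean training label.
baseline = sum(train_labels) / len(train_labels)
knn_mae = mean_absolute_error(test_labels, knn_preds)
baseline_mae = mean_absolute_error(test_labels, [baseline] * len(test_labels))

print("selected features:", [FEATURES[j] for j in selected])
print(f"k-NN MAE: {knn_mae:.3f}  baseline MAE: {baseline_mae:.3f}")
```

Selecting features on the training split alone keeps the held-out error honest. Extending the sketch to the other four regressors would amount to adding more predict functions and comparing their held-out errors in the same way.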

Source journal: Computers in Human Behavior

CiteScore: 19.10

Self-citation rate: 4.00%

Articles published: 381

Review time: 40 days

Journal description: Computers in Human Behavior is a scholarly journal that explores the psychological aspects of computer use. It covers original theoretical works, research reports, literature reviews, and software and book reviews. The journal examines both the use of computers in psychology, psychiatry, and related fields, and the psychological impact of computer use on individuals, groups, and society. Articles discuss topics such as professional practice, training, research, human development, learning, cognition, personality, and social interactions. It focuses on human interactions with computers, considering the computer as a medium through which human behaviors are shaped and expressed. Professionals interested in the psychological aspects of computer use will find this journal valuable, even with limited knowledge of computers.