ACM Transactions on the Web: Latest Articles, Page 2

Introduction to the Special Issue on Advanced Graph Mining on the Web: Theory, Algorithms, and Applications: Part 2

IF 3.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2024-01-08 DOI: 10.1145/3631941

Hao Peng, Jian Yang, Jia Wu, Philip S. Yu

No abstract available.

Citations: 0

BNoteHelper: A Note-Based Outline Generation Tool for Structured Learning on Video Sharing Platforms

IF 3.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-12-27 DOI: 10.1145/3638775

Fangyu Yu, Peng Zhang, Xianghua Ding, Tun Lu, Ning Gu

Usually generated by ordinary users and often not particularly designed for learning, the videos on video sharing platforms are mostly not structured enough to support learning purposes, although they are increasingly leveraged for that. Most existing studies attempt to structure the video using video summarization techniques. However, these methods focus on extracting information from within the video and aim at consuming the video itself. In this paper, we design and implement BNoteHelper, a note-based video outline prototype which generates outline titles by extracting user-generated notes on Bilibili, using the BART model fine-tuned on a built dataset. As a browser plugin, BNoteHelper provides users with a video overview and navigation as well as a note-taking template, via two main features: an outline table and a navigation marker. The model and prototype are evaluated through automatic and human evaluations. The automatic evaluation reveals that, both before and after fine-tuning, the BART model outperforms T5-Pegasus on the BLEU and perplexity metrics. Also, user feedback reveals that users prefer the generated outline sourced from notes to the one sourced from video captions because it is more concise, clear, and accurate, although it is sometimes too general, with less detail and diversity. The two features of the video outline are also found to have respective advantages, especially in holistic and fine-grained aspects. Based on these results, we propose insights into designing a video summary from the user-generated creation perspective, customizing it based on video types, and strengthening the advantages of its different visual styles on video sharing platforms.

Citations: 0

Nudges to Mitigate Confirmation Bias during Web Search on Debated Topics: Support vs. Manipulation

IF 3.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-11-30 DOI: 10.1145/3635034

Alisa Rieger, Tim Draws, Mariët Theune, Nava Tintarev

When people use web search engines to find information on debated topics, the search results they encounter can influence opinion formation and practical decision-making with potentially far-reaching consequences for the individual and society. However, current web search engines lack support for information-seeking strategies that enable responsible opinion formation, e.g., by mitigating confirmation bias and motivating engagement with diverse viewpoints. We conducted two preregistered user studies to test the benefits and risks of an intervention aimed at confirmation bias mitigation. In the first study, we tested the effect of warning labels, warning of the risk of confirmation bias, combined with obfuscations, hiding selected search results per default. We observed that obfuscations with warning labels effectively reduce engagement with search results. These initial findings did not allow conclusions about the extent to which the reduced engagement was caused by the warning label (reflective nudging element) versus the obfuscation (automatic nudging element). If obfuscation was the primary cause, this would raise concerns about harming user autonomy. We thus conducted a follow-up study to test the effect of warning labels and obfuscations separately.

According to our findings, obfuscations run the risk of manipulating behavior instead of guiding it, while warning labels without obfuscations (purely reflective) do not exhaust processing capacities but encourage users to actively choose to decrease engagement with attitude-confirming search results. Therefore, given the risks and unclear benefits of obfuscations and potentially other automatic nudging elements to guide engagement with information, we call for prioritizing interventions that aim to enhance human cognitive skills and agency instead.

Citations: 0

Bridging Performance of Twitter Users: A Predictor of Subjective Well-Being during the Pandemic

IF 3.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-11-30 DOI: 10.1145/3635033

Ninghan Chen, Xihui Chen, Zhiqiang Zhong, Jun Pang

The outbreak of the COVID-19 pandemic triggered the perils of misinformation over social media. By amplifying the spreading speed and popularity of trustworthy information, influential social media users have been helping overcome the negative impacts of such flooding misinformation. In this paper, we use the COVID-19 pandemic as a representative global health crisis and examine its impact on these influential users’ subjective well-being (SWB), one of the most important indicators of mental health. We leverage Twitter as a representative social media platform and conduct the analysis with our collection of 37,281,824 tweets spanning almost two years. To identify influential Twitter users, we propose a new measurement called user bridging performance (UBM) to evaluate the gain in speed and breadth of information transmission due to their sharing. With our tweet collection, we reveal that influential users suffered more significantly in terms of mental health during the COVID-19 pandemic. Following this observation, through comprehensive hierarchical multiple regression analysis, we are the first to discover the strong relationship between individual social users’ subjective well-being and their bridging performance. We proceed to extend bridging performance from individuals to user subgroups. The new measurement allows us to conduct a subgroup analysis according to users’ multilingualism and confirm the bridging role of multilingual users in COVID-19 information propagation. We also find that multilingual users not only suffer from a much lower SWB in the pandemic, but also experience a more significant SWB drop.

Citations: 0

Multiresolution Local Spectral Attributed Community Search

Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-11-03 DOI: 10.1145/3624580

Qingqing Li, Huifang Ma, Zhixin Li, Liang Chang

Community search, which aims to identify latent members of a particular community from a few given nodes, has become especially important in graph analysis tasks. Most of the existing efforts in community search focus on exploring the community structure with a single scale in which the given nodes are located. Despite promising results, the following two insights are often neglected. First, node attributes provide rich and highly related auxiliary information apart from network interactions for characterizing the node properties. Attributes may indicate the community assignment of a node with very few links, which would be difficult to determine from the network structure alone. Second, the multiresolution community affords latent information to depict the hierarchical relation of the network and ensure that one of them is closest to the real one. It is essential for users to understand the underlying structure of the network and explore the community with strong structure and attribute cohesiveness at disparate scales. These aspects motivate us to develop a new community search framework called Multiresolution Local Spectral Attributed Community Search (MLSACS). Specifically, inspired by the local modularity, graph wavelets, and scaling functions, we propose a new Multiresolution Local modularity (MLQ) based on a reconstructed node attribute graph. Furthermore, to detect local communities with cohesive structures and attributes at different scales, a sparse indicator vector is developed based on MLQ by solving a linear programming problem. Extensive experimental results on both synthetic and real-world attributed graphs demonstrate that the detected communities are meaningful and that the scale can be changed reasonably.

Citations: 0

Triangle-oriented Community Detection considering Node Features and Network Topology

Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-11-03 DOI: 10.1145/3626190

Guangliang Gao, Weichao Liang, Ming Yuan, Hanwei Qian, Qun Wang, Jie Cao

The joint use of node features and network topology to detect communities is called community detection in attributed networks. Most of the existing work along this line has been carried out through objective function optimization and has proposed numerous approaches. However, they tend to focus only on lower-order details, i.e., capture node features and network topology from node and edge views, and purely seek a higher degree of optimization to guarantee the quality of the found communities, which exacerbates unbalanced communities and free-rider effect. To further clarify and reveal the intrinsic nature of networks, we conduct triangle-oriented community detection considering node features and network topology. Specifically, we first introduce a triangle-based quality metric to preserve higher-order details of node features and network topology, and then formulate so-called two-level constraints to encode lower-order details of node features and network topology. Finally, we develop a local search framework based on optimizing our objective function consisting of the proposed quality metric and two-level constraints to achieve both non-overlapping and overlapping community detection in attributed networks. Extensive experiments demonstrate the effectiveness and efficiency of our framework and its potential in alleviating unbalanced communities and free-rider effect.
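The abstract contrasts node/edge ("lower-order") views with a triangle ("higher-order") view but does not define its quality metric. As a rough illustration only, and not the paper's metric, the sketch below counts triangles that lie fully inside a candidate community versus triangles cut by its boundary. The function name `triangle_stats` and the adjacency-dict input format are assumptions made for this example.

```python
from itertools import combinations


def triangle_stats(adj, community):
    """Count triangles fully inside `community` and triangles cut by its boundary.

    adj:       dict mapping node -> set of neighbours (undirected graph)
    community: iterable of nodes forming the candidate community
    Returns (inside, cut). This is a hypothetical illustration of a
    triangle-level view, not the metric proposed in the paper.
    """
    community = set(community)
    seen = set()
    inside = cut = 0
    for u in community:
        # Every pair of neighbours of u that are themselves adjacent closes a triangle.
        for v, w in combinations(adj.get(u, set()), 2):
            if w in adj.get(v, set()):
                tri = frozenset((u, v, w))
                if tri in seen:
                    continue
                seen.add(tri)
                if tri <= community:
                    inside += 1
                else:
                    cut += 1
    return inside, cut
```

A community with many internal triangles and few cut triangles is cohesive at the triangle level; how such counts are combined with node features is left to the paper.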

Citations: 0

A Graph-Based Context-Aware Model to Understand Online Conversations

Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-11-03 DOI: 10.1145/3624579

Vibhor Agarwal, Anthony P. Young, Sagar Joglekar, Nishanth Sastry

Online forums that allow for participatory engagement between users have been transformative for the public discussion of many important issues. However, such conversations can sometimes escalate into full-blown exchanges of hate and misinformation. Existing approaches in natural language processing (NLP), such as deep learning models for classification tasks, use as inputs only a single comment or a pair of comments depending upon whether the task concerns the inference of properties of the individual comments or the replies between pairs of comments, respectively. However, in online conversations, comments and replies may be based on external context beyond the immediately relevant information that is input to the model. Therefore, being aware of the conversations’ surrounding contexts should improve the model’s performance for the inference task at hand. We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walks to incorporate the wider context of a conversation in a principled manner. Specifically, a graph walk starts from a given comment and samples “nearby” comments in the same or parallel conversation threads, which results in additional embeddings that are aggregated together with the initial comment’s embedding. We then use these enriched embeddings for downstream NLP prediction tasks that are important for online conversations. We evaluate GraphNLI on two such tasks - polarity prediction and misogynistic hate speech detection - and find that our model consistently outperforms all relevant baselines for both tasks. Specifically, GraphNLI with a biased root-seeking random walk achieves a macro-F1 score 3 and 6 percentage points higher than the best-performing BERT-based baselines for the polarity prediction and hate speech detection tasks, respectively. We also perform extensive ablative experiments and hyperparameter searches to understand the efficacy of GraphNLI. This demonstrates the potential of context-aware models to capture the global context along with the local context of online conversations for these two tasks.
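The walk-and-aggregate idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the reply thread is assumed to be a tree given as `parent` and `children` dicts, the bias parameter `p_up` and the plain averaging in `aggregate` are hypothetical choices, and embeddings are plain lists of floats.

```python
import random


def root_seeking_walk(parent, children, start, length, p_up=0.75, rng=None):
    """Sample up to `length` further comments near `start` in a reply tree.

    parent:   dict mapping comment id -> parent id (None for the root post)
    children: dict mapping comment id -> list of reply ids
    p_up:     probability of stepping toward the root (hypothetical bias value)
    """
    rng = rng or random.Random(0)
    walk, node = [start], start
    for _ in range(length):
        up = parent.get(node)
        down = children.get(node, [])
        if up is not None and (not down or rng.random() < p_up):
            node = up                 # biased step toward the root
        elif down:
            node = rng.choice(down)   # otherwise descend into a reply
        else:
            break                     # isolated comment: nowhere to go
        walk.append(node)
    return walk


def aggregate(embeddings, walk):
    """Average the embeddings of all visited comments, start included.

    Comments visited more than once are counted once per visit.
    """
    dim = len(embeddings[walk[0]])
    return [sum(embeddings[n][i] for n in walk) / len(walk) for i in range(dim)]
```

With `p_up=1.0` the walk climbs straight to the root post, so the enriched embedding mixes a comment with its ancestors; lower values let the walk drift into neighbouring replies.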

Citations: 4

UNDERSTANDING RUG PULLS: AN IN-DEPTH BEHAVIORAL ANALYSIS OF FRAUDULENT NFT CREATORS

Zone 4 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-10-11 DOI: 10.1145/3623376

Trishie Sharma, Rachit Agarwal, Sandeep Kumar Shukla

The explosive growth of non-fungible tokens (NFTs) on Web3 has created a new frontier for digital art and collectibles and an emerging space for fraudulent activities. This study provides an in-depth analysis of NFT rug pulls, the fraudulent schemes that steal investors’ funds. From a curated dataset of 760 rug pulls across 10 NFT marketplaces, we examine these schemes’ structural and behavioral properties, identify the characteristics and motivations of rug pullers, and classify NFT projects into 20 groups based on creators’ association with their accounts. Our findings reveal that repeated rug pulls account for a significant proportion of the rise in NFT-related cryptocurrency crimes, with one NFT creator attempting 37 rug pulls within 3 months. Additionally, we identify the largest group of creators influencing the majority of rug pulls and demonstrate the connection between rug pullers of different NFT projects using the same wallets to store and move money. Our study contributes to understanding NFT market risks and provides insights for designing preventative strategies to mitigate future losses.
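The abstract mentions grouping NFT projects by creators' account associations and linking rug pullers through shared wallets, without giving the procedure. One plausible way to form such groups, shown here purely as an assumption-laden sketch, is a union-find pass that merges any projects touching the same wallet. The function name `group_projects_by_wallet` and the input format are hypothetical.

```python
def group_projects_by_wallet(project_wallets):
    """Group projects that are linked, directly or transitively, by shared wallets.

    project_wallets: dict mapping project name -> iterable of wallet addresses
    Returns a sorted list of sorted project-name lists.
    Hypothetical illustration; the paper's actual grouping procedure may differ.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Tag keys so a project name can never collide with a wallet address.
    for project, wallets in project_wallets.items():
        find(("project", project))
        for wallet in wallets:
            union(("project", project), ("wallet", wallet))

    groups = {}
    for project in project_wallets:
        groups.setdefault(find(("project", project)), set()).add(project)
    return sorted(sorted(g) for g in groups.values())
```

For example, two projects that never share a wallet directly still land in one group if a third project uses a wallet from each.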

Citations: 2

Adoption of Recurrent Innovations: A Large-Scale Case Study on Mobile App Updates

Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-10-10 DOI: 10.1145/3626189

Fuqi Lin, Xuan Lu, Wei Ai, Huoran Li, Yun Ma, Yulian Yang, Hongfei Deng, Qingxiang Wang, Qiaozhu Mei, Xuanzhe Liu

Diffusion of innovations theory has been studied for years. Previous research has mainly focused on key elements, adopter categories, and the process of innovation diffusion, but most of it considers only single innovations. With the development of modern technology, recurrent innovations have gradually come into vogue. To reveal their characteristics, we present the first large-scale analysis of the adoption of recurrent innovations in the context of mobile app updates. Our analysis reveals the adoption behavior and new adopter categories of recurrent innovations, as well as the features that affect the adoption process.

Citations: 0

CLHHN: Category-aware Lossless Heterogeneous Hypergraph Neural Network for Session-based Recommendation

Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on the Web

Pub Date : 2023-10-04 DOI: 10.1145/3626569

Yutao Ma, Zesheng Wang, Liwei Huang, Jian Wang

In recent years, session-based recommendation (SBR), which seeks to predict the target user’s next click based on anonymous interaction sequences, has drawn increasing interest for its practicality. The key to completing the SBR task is modeling user intent accurately. Due to the popularity of graph neural networks (GNNs), most state-of-the-art (SOTA) SBR approaches attempt to model user intent from the transitions among items in a session with GNNs. Despite their accomplishments, there are still two limitations. Firstly, most existing SBR approaches utilize limited information from short user-item interaction sequences and suffer from the data sparsity problem of session data. Secondly, most GNN-based SBR approaches describe pairwise relations between items while neglecting complex and high-order data relations. Although some recent studies based on hypergraph neural networks (HGNNs) have been proposed to model complex and high-order relations, they usually output unsatisfactory results due to insufficient relation modeling and information loss. To this end, we propose a category-aware lossless heterogeneous hypergraph neural network (CLHHN) in this article to recommend possible items to the target users by leveraging the category of items. More specifically, we convert each category-aware session sequence with repeated user clicks into a lossless heterogeneous hypergraph consisting of item and category nodes as well as three types of hyperedges, each of which can capture specific relations to reflect various user intents. Then, we design an attention-based lossless hypergraph convolutional network to generate session-wise and multi-granularity intent-aware item representations. Experiments on three real-world datasets indicate that CLHHN can outperform the SOTA models in making a better trade-off between prediction performance and training efficiency. An ablation study also demonstrates the necessity of CLHHN’s key components.

Citations: 0