
Latest publications from Proceedings of the Web Conference 2021

A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix Learning
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3450142
Yaqing Wang, Quanming Yao, J. Kwok
Matrix learning is at the core of many machine learning problems. A number of real-world applications, such as collaborative filtering and text mining, can be formulated as low-rank matrix completion problems, which recover an incomplete matrix under low-rank assumptions. To ensure that the matrix solution has a low rank, a recent trend is to use nonconvex regularizers that adaptively penalize singular values. They offer good recovery performance and have nice theoretical properties, but are computationally expensive due to repeated access to individual singular values. In this paper, based on the key insight that adaptive shrinkage on singular values improves empirical performance, we propose a new nonconvex low-rank regularizer, the "nuclear norm minus Frobenius norm" regularizer, which is scalable, adaptive and sound. We first show that it provably possesses the adaptive shrinkage property. Further, we derive its factored form, which bypasses the computation of singular values and allows fast optimization by general-purpose optimization algorithms. Stable recovery and convergence are guaranteed. Extensive low-rank matrix completion experiments on a number of synthetic and real-world data sets show that the proposed method obtains state-of-the-art recovery performance while being the fastest among existing low-rank matrix learning methods.
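By definition, the "nuclear norm minus Frobenius norm" regularizer evaluates to the sum of a matrix's singular values minus their Euclidean norm. A minimal sketch of that definition follows; note the paper's contribution is a factored form that avoids the SVD shown here, so this direct computation (and the function name `nn_minus_f`, plus the weight `mu`) is purely illustrative:

```python
import numpy as np

def nn_minus_f(X, mu=1.0):
    # Regularizer value mu * (||X||_* - ||X||_F), computed directly from
    # the singular values. The actual method optimizes a factored form
    # instead of calling an SVD; this is only the defining quantity.
    sv = np.linalg.svd(X, compute_uv=False)
    return mu * (sv.sum() - np.linalg.norm(X, "fro"))

X = np.diag([3.0, 4.0])
# Singular values {3, 4}: nuclear norm 7, Frobenius norm 5, difference 2.
print(nn_minus_f(X))  # 2.0
```

Since the Frobenius norm never exceeds the nuclear norm, the value is always nonnegative, and it vanishes exactly for rank-one matrices, which is what makes it a sensible low-rank surrogate.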
Citations: 6
GNEM: A Generic One-to-Set Neural Entity Matching Framework
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3450119
Runjin Chen, Yanyan Shen, Dongxiang Zhang
Entity matching is a classic research problem in any data analytics pipeline, aiming to identify records that refer to the same real-world entity. It plays an important role in data cleansing and integration. Advanced entity matching techniques focus on extracting syntactic or semantic features from record pairs via complex neural architectures or pre-trained language models. However, their performance often suffers from noisy or missing attribute values in the records. We observe that comparing one record with several relevant records in a collective manner allows each pairwise matching decision to borrow valuable insights from other pairs, which benefits overall matching performance. In this paper, we propose a generic one-to-set neural framework named GNEM for entity matching. GNEM predicts matching labels between one record and a set of relevant records simultaneously. It constructs a record-pair graph with weighted edges and adopts a graph neural network to propagate information among pairs. We further show that GNEM can be interpreted as an extension and generalization of existing pairwise matching techniques. Extensive experiments on real-world data sets demonstrate that GNEM consistently outperforms existing pairwise entity matching techniques and achieves up to 8.4% improvement in F1 score over state-of-the-art neural methods.
Citations: 6
Twin Peaks, a Model for Recurring Cascades
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449807
Matteo Almanza, Silvio Lattanzi, A. Panconesi, G. Re
Understanding information dynamics and the cascades they produce is a central topic in social network analysis. In a recent seminal work, Cheng et al. analyzed multiple cascades on Facebook over several months and noticed that many of them exhibit recurring behaviour: they tend to have multiple peaks of popularity, with periods of quiescence in between. In this paper, we propose the first mathematical model that provably explains this interesting phenomenon, besides exhibiting other fundamental properties of information cascades. Our model is simple and shows that a good clustering structure is enough for a standard information diffusion model to display this recurring behaviour. Furthermore, we complement our theoretical analysis with an experimental evaluation in which we show that our model is able to reproduce the observed phenomenon on several social networks.
Citations: 3
ATJ-Net: Auto-Table-Join Network for Automatic Learning on Relational Databases
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449980
Jinze Bai, Jialin Wang, Zhao Li, Donghui Ding, Ji Zhang, Jun Gao
A relational database, consisting of multiple tables, provides heterogeneous information across various entities and is widely used in real-world services. This paper studies the supervised learning task on multiple tables, aiming to predict one label column with the help of multi-tabular data. However, classical ML techniques mainly focus on single-tabular data. Multi-tabular data involves many-to-many mappings among joinable attributes and n-ary relations, which classical ML techniques cannot utilize directly. Besides, current graph techniques, like heterogeneous information networks (HIN) and graph neural networks (GNN), cannot be deployed directly and automatically in a multi-table environment, which limits learning on databases. For automatic learning on relational databases, we propose an auto-table-join network (ATJ-Net). Multiple tables with relationships are treated as a hypergraph, where vertices are joinable attributes and hyperedges are tuples of tables. ATJ-Net then builds a graph neural network on the heterogeneous hypergraph, which samples and aggregates the vertices and hyperedges of n-hop subgraphs as its receptive field. To enable ATJ-Net to be deployed automatically on different datasets and to avoid the "no free lunch" dilemma, we use random architecture search to select optimal aggregators and prune redundant paths in the network. To verify the effectiveness of our method across various tasks and schemas, we conduct extensive experiments on 4 tasks, 8 schemas, and 19 sub-datasets covering citation prediction, review classification, recommendation, and a task-blind challenge. ATJ-Net achieves the best performance over state-of-the-art approaches on three tasks and is competitive with the KDD Cup winner's solution on the task-blind challenge.
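The hypergraph view in the abstract (vertices are joinable attribute values, hyperedges are table rows) can be made concrete with a few lines of plain Python. This is a hypothetical sketch of the data model only, not the ATJ-Net code; the function name, the `joinable` mapping, and the toy tables are all invented for illustration:

```python
from collections import defaultdict

def tables_to_hypergraph(tables, joinable):
    # tables: table name -> list of row dicts
    # joinable: table name -> list of joinable columns
    # Each row becomes a hyperedge (table, row index); each joinable
    # value becomes a vertex keyed by (column, value).
    incident = defaultdict(list)  # vertex -> hyperedges containing it
    hyperedges = []
    for tname, rows in tables.items():
        for i, row in enumerate(rows):
            edge = (tname, i)
            hyperedges.append(edge)
            for col in joinable[tname]:
                incident[(col, row[col])].append(edge)
    return incident, hyperedges

tables = {"review": [{"user": "u1", "item": "i1", "stars": 5}],
          "user":   [{"user": "u1", "age": 30}]}
incident, edges = tables_to_hypergraph(
    tables, {"review": ["user", "item"], "user": ["user"]})
# Rows from both tables meet at the shared vertex ("user", "u1"),
# which is exactly where a GNN would propagate information across tables.
print(incident[("user", "u1")])
```

Vertices shared by hyperedges from different tables play the role of implicit joins, which is what lets a GNN on this structure aggregate information across the whole database without materializing joined tables.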
Citations: 4
DAPter: Preventing User Data Abuse in Deep Learning Inference Services
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449907
Hao Wu, Xuejin Tian, Yuhang Gong, Xing Su, Minghao Li, Fengyuan Xu
The data abuse issue has risen along with the widespread deployment of deep learning inference services (DLIS). Specifically, mobile users worry about their input data being labeled to secretly train new deep learning models unrelated to the DLIS they subscribe to. Unlike the privacy problem, this unique issue concerns the rights of data owners in the context of deep learning. However, preventing data abuse is demanding given the usability and generality requirements of mobile scenarios. In this work, we propose, to the best of our knowledge, the first data abuse prevention mechanism, called DAPter. DAPter is a user-side DLIS-input converter that removes information unnecessary for the targeted DLIS. The input data converted by DAPter maintains good inference accuracy yet is difficult to label, manually or automatically, for new model training. DAPter's conversion is empowered by our lightweight generative model, trained with a novel loss function to minimize abusable information in the input data. Furthermore, adopting DAPter requires no change to the existing DLIS backend and models. We conduct comprehensive experiments with our DAPter prototype on mobile devices and demonstrate that DAPter can substantially raise the bar of data abuse difficulty with little impact on service quality and overhead.
Citations: 5
WiseTrans: Adaptive Transport Protocol Selection for Mobile Web Service
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449958
Jia Zhang, Enhuan Dong, Zili Meng, Yuan Yang, Mingwei Xu, Sijie Yang, Miao Zhang, Yang Yue
To improve the performance of mobile web service, a new transport protocol, QUIC, has recently been proposed. However, for large-scale real-world deployments, deciding whether and when to use QUIC for mobile web service is challenging. The complex temporal correlation of network conditions, the high spatial heterogeneity of users in a nationwide deployment, and the limited resources on mobile devices all affect the selection of transport protocols. In this paper, we present WiseTrans, which adaptively switches transport protocols for mobile web service online and improves the completion time of web requests. WiseTrans introduces machine learning techniques to deal with temporal heterogeneity, makes decisions with historical information to handle spatial heterogeneity, and switches transport protocols at the request level to achieve both high performance and acceptable overhead. We implement WiseTrans on two platforms (Android and iOS) in a popular mobile web service application at Baidu. Comprehensive experiments demonstrate that WiseTrans can reduce request completion time by up to 26.5% on average compared with the use of a single protocol.
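To see what "request-level protocol selection from historical information" means mechanically, here is a deliberately simplified selector. The fixed thresholds and the rule itself are invented for illustration; WiseTrans uses learned models rather than a hand-written rule like this:

```python
def choose_protocol(rtt_history_ms, loss_rate, rtt_threshold_ms=100):
    # Toy per-request decision: QUIC tends to help on lossy or
    # high-latency paths (0-RTT resumption, no HOL blocking), so prefer
    # it there and fall back to TCP otherwise. Thresholds are arbitrary.
    avg_rtt = sum(rtt_history_ms) / len(rtt_history_ms)
    return "QUIC" if loss_rate > 0.01 or avg_rtt > rtt_threshold_ms else "TCP"

print(choose_protocol([120, 140, 110], loss_rate=0.0))  # QUIC (high RTT)
print(choose_protocol([30, 40], loss_rate=0.0))         # TCP (good path)
```

The point of deciding per request, as in the paper, is that network conditions drift over a session; a one-time choice at connection setup cannot track that drift.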
Citations: 6
Mixup for Node and Graph Classification
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449796
Yiwei Wang, Wei Wang, Yuxuan Liang, Yujun Cai, Bryan Hooi
Mixup is an advanced data augmentation method for training neural network based image classifiers, which interpolates both the features and the labels of a pair of images to produce synthetic samples. However, devising Mixup methods for graph learning is challenging due to the irregularity and connectivity of graph data. In this paper, we propose Mixup methods for two fundamental tasks in graph learning: node and graph classification. To interpolate the irregular graph topology, we propose a two-branch graph convolution that mixes the receptive-field subgraphs of the paired nodes. Because of the connectivity between nodes, Mixup on different node pairs can interfere with each other's mixed features. To block this interference, we propose a two-stage Mixup framework, which uses each node's neighbors' representations before Mixup for graph convolutions. For graph classification, we interpolate complex and diverse graphs in the semantic space. Qualitatively, our Mixup methods enable GNNs to learn more discriminative features and reduce over-fitting. Quantitative results show that our method yields consistent gains in test accuracy and F1-micro scores on standard datasets, for both node and graph classification. Overall, our method effectively regularizes popular graph neural networks for better generalization without increasing their time complexity.
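The original image-domain Mixup that the abstract starts from is a one-liner: draw a mixing coefficient from a Beta distribution and convex-combine both features and one-hot labels with it. The graph variants in the paper build on this primitive; the sketch below shows only the basic interpolation, with the helper name chosen here for illustration:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    # Standard Mixup: the same lambda ~ Beta(alpha, alpha) weights both
    # the feature vectors and the one-hot label vectors of the pair.
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x_mix, y_mix = mixup(np.ones(4), np.array([1.0, 0.0]),
                     np.zeros(4), np.array([0.0, 1.0]))
print(x_mix, y_mix)  # features and labels share the same mixing weight
```

The graph-specific difficulty the paper addresses is that, unlike two independent images, two nodes sit in one shared topology, so naive mixing of their receptive fields leaks across pairs; hence the two-branch convolution and two-stage framework.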
Citations: 106
CoopEdge: A Decentralized Blockchain-based Platform for Cooperative Edge Computing
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449994
Liang Yuan, Qiang He, Siyu Tan, Bo Li, Jiangshan Yu, Feifei Chen, Hai Jin, Yun Yang
Edge computing (EC) has recently emerged as a novel computing paradigm that offers users low-latency services. Constrained in computing resources by their limited physical sizes, edge servers cannot always handle all incoming computation tasks in a timely manner when they operate independently; they often need to cooperate through peer-offloading. Deployed and managed by different stakeholders, edge servers operate in a distrusted environment, where trust and incentive are the two main issues that challenge cooperative computing between them. Another unique challenge in the EC environment is to facilitate trust and incentive in a decentralized manner. To tackle these challenges systematically, this paper proposes CoopEdge, a novel blockchain-based decentralized platform that drives and supports cooperative edge computing. On CoopEdge, an edge server can publish a computation task for other edge servers to contend for. A winner is selected from the candidate edge servers based on their reputations. After that, a consensus is reached among edge servers to record the task-execution performance on the blockchain. We implement CoopEdge based on Hyperledger Sawtooth and evaluate it experimentally against a baseline and two state-of-the-art implementations in a simulated EC environment. The results validate the usefulness of CoopEdge and demonstrate its performance.
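The contend-and-select step described above reduces, at its simplest, to picking the contender with the best recorded reputation. The sketch below is a hypothetical reduction of that step (function name, tie-breaking rule, and scores are all invented); the real platform reaches this decision through an on-chain consensus rather than a local computation:

```python
def select_winner(candidates):
    # candidates: edge-server id -> reputation score recorded on-chain.
    # Pick the highest-reputation contender; sorting first makes ties
    # resolve deterministically by server id, which every honest node
    # can reproduce independently.
    return max(sorted(candidates), key=lambda c: candidates[c])

reps = {"edge-a": 0.71, "edge-b": 0.93, "edge-c": 0.88}
print(select_winner(reps))  # edge-b
```

Determinism matters here: since every edge server must agree on the winner without a coordinator, the selection rule has to yield the same answer from the same ledger state on every node.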
Citations: 79
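The reputation-based winner selection that CoopEdge describes can be sketched in miniature. The abstract does not specify the platform's reputation update rule or its on-chain representation, so the exponential-moving-average update, the default score of 1.0, and the `ReputationLedger` name below are illustrative assumptions, not CoopEdge's actual logic:

```python
class ReputationLedger:
    """Toy, centralized stand-in for CoopEdge's on-chain reputation records."""

    def __init__(self):
        self.scores = {}  # server id -> reputation in [0, 1]

    def reputation(self, server):
        # Unseen servers start with an assumed neutral reputation of 1.0.
        return self.scores.get(server, 1.0)

    def select_winner(self, candidates):
        # The task goes to the candidate with the highest reputation.
        return max(candidates, key=self.reputation)

    def record(self, server, success, alpha=0.2):
        # Assumed update rule: exponential moving average of task outcomes,
        # so repeated failures steadily lower a server's standing.
        outcome = 1.0 if success else 0.0
        self.scores[server] = (1 - alpha) * self.reputation(server) + alpha * outcome
```

In the real system this state would be agreed on by consensus among the edge servers rather than held in one process; the sketch only shows how reputation can drive winner selection.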
One Detector to Rule Them All: Towards a General Deepfake Attack Detection Framework
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3449809
Shahroz Tariq, Sangyup Lee, Simon S. Woo
Deep learning-based video manipulation methods have become widely accessible to the masses. With little to no effort, people can quickly learn how to generate deepfake (DF) videos. While deep learning-based detection methods have been proposed to identify specific types of DFs, their performance suffers on other types of deepfake methods, including real-world deepfakes, on which they are not sufficiently trained. In other words, most of the proposed deep learning-based detection methods lack transferability and generalizability. Beyond detecting a single type of DF from benchmark deepfake datasets, we focus on developing a generalized approach to detect multiple types of DFs, including deepfakes from unknown generation methods such as DeepFake-in-the-Wild (DFW) videos. To better cope with unknown and unseen deepfakes, we introduce a Convolutional LSTM-based Residual Network (CLRNet), which adopts a unique model training strategy and exploits spatial as well as temporal information in deepfakes. Through extensive experiments, we show that existing defense methods are not ready for real-world deployment, whereas our defense method (CLRNet) achieves far better generalization when detecting various benchmark deepfake methods (97.57% on average). Furthermore, we evaluate our approach with a high-quality DeepFake-in-the-Wild dataset collected from the Internet, containing numerous videos and more than 150,000 frames. Our CLRNet model generalizes well to high-quality DFW videos, achieving 93.86% detection accuracy and outperforming existing state-of-the-art defense methods by a considerable margin.
Citations: 51
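To make the "spatial as well as temporal" claim concrete, here is a minimal ConvLSTM update over a one-dimensional "frame" in plain Python: the convolution mixes neighboring pixels (spatial), while the recurrent hidden and cell states carry information across frames (temporal). This is a structural sketch only — CLRNet's real cell operates on 2-D frames inside a residual network with separately learned kernels per gate, whereas this toy shares one input kernel and one hidden-state kernel across all four gates, an assumption made purely for brevity:

```python
import math

def conv1d(signal, kernel):
    """'Same' 1-D convolution with zero padding; kernel length must be odd."""
    k = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - k
            if 0 <= idx < len(signal):
                s += w * signal[idx]
        out.append(s)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def convlstm_step(x, h, c, kx, kh):
    """One ConvLSTM update: x is the current frame, (h, c) the recurrent state."""
    zx, zh = conv1d(x, kx), conv1d(h, kh)
    i = [sigmoid(a + b) for a, b in zip(zx, zh)]    # input gate
    f = [sigmoid(a + b) for a, b in zip(zx, zh)]    # forget gate
    o = [sigmoid(a + b) for a, b in zip(zx, zh)]    # output gate
    g = [math.tanh(a + b) for a, b in zip(zx, zh)]  # candidate cell state
    c_new = [ff * cc + ii * gg for ff, cc, ii, gg in zip(f, c, i, g)]
    h_new = [oo * math.tanh(cc) for oo, cc in zip(o, c_new)]
    return h_new, c_new
```

Running `convlstm_step` over a sequence of frames, feeding each output state into the next call, is what lets the detector accumulate temporal evidence that a single-frame CNN cannot see.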
Sketch-based Algorithms for Approximate Shortest Paths in Road Networks
Pub Date : 2021-04-19 DOI: 10.1145/3442381.3450083
Gaurav Aggarwal, Sreenivas Gollapudi, Raghavender, A. Sinop
Constructing efficient data structures (distance oracles) for fast computation of shortest paths and other connectivity measures in graphs has been a promising area of study in computer science [23, 24, 28]. In this paper, we propose very efficient algorithms, based on a distance oracle, for computing approximate shortest paths and alternate paths in road networks. Specifically, we adopt a distance oracle construction that exploits the existence of small separators in such networks. In other words, the existence of a small cut in a graph admits a partitioning of the graph into balanced components with a small number of inter-component edges. We demonstrate the efficacy of our algorithm by using it to find near-optimal shortest paths and show that it also has the desired properties of well-studied goal-oriented path search algorithms such as ALT [12]. We further demonstrate the use of our distance oracle to produce multiple alternative routes in addition to the shortest path. Finally, we empirically demonstrate that our method, while exploring few edges, produces high-quality alternates with respect to metrics such as optimality loss and diversity of paths.
Citations: 3
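The separator idea admits a compact sketch: precompute shortest-path distances from each separator vertex, then answer dist(u, v) as the minimum over separators s of d(s, u) + d(s, v) — exact whenever some shortest u-v path crosses the separator, and an upper bound otherwise. The sketch below assumes an undirected graph (so d(s, u) = d(u, s)) and a separator set supplied by the caller; the paper's actual construction also recurses on the balanced components, which is omitted here:

```python
import heapq

def dijkstra(graph, src):
    """Single-source shortest paths; graph maps node -> [(neighbor, weight)]."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

class SeparatorOracle:
    """Distance oracle keyed on a caller-supplied separator set."""

    def __init__(self, graph, separators):
        # One Dijkstra per separator vertex at preprocessing time.
        self.tables = {s: dijkstra(graph, s) for s in separators}

    def query(self, u, v):
        # Upper bound on dist(u, v); exact if a shortest u-v path
        # passes through some separator vertex.
        best = float("inf")
        for table in self.tables.values():
            if u in table and v in table:
                best = min(best, table[u] + table[v])
        return best
```

Queries cost O(|separators|) lookups instead of a full graph search, which is the source of the speedup when the separator set is small relative to the network.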