One-shot automated essay scoring (AES) aims to assign scores to a set of essays written for a specific prompt, with only one manually scored essay per distinct score. Compared to the previously studied prompt-specific AES, which usually requires a large number of manually scored essays for model training (e.g., about 600 manually scored essays out of a total of 1,000 essays), one-shot AES can greatly reduce the workload of manual scoring. In this paper, we propose a Transductive Graph-based Ordinal Distillation (TGOD) framework to tackle the task of one-shot AES. Specifically, we design a transductive graph-based model as a teacher model to generate pseudo labels for unlabeled essays based on the one-shot labeled essays. Then, we distill the knowledge in the teacher model into a neural student model by learning from the high-confidence pseudo labels. Unlike general knowledge distillation, we propose an ordinal-aware unimodal distillation that imposes a unimodal distribution constraint on the output of the student model, so that it tolerates the minor errors that exist in the pseudo labels. Experimental results on the public dataset ASAP show that TGOD can improve the performance of existing neural AES models under the one-shot AES setting and achieve an acceptable average QWK of 0.69.
{"title":"Learning from Graph Propagation via Ordinal Distillation for One-Shot Automated Essay Scoring","authors":"Zhiwei Jiang, Meng Liu, Yafeng Yin, Hua Yu, Zifeng Cheng, Qing Gu","doi":"10.1145/3442381.3450017","DOIUrl":"https://doi.org/10.1145/3442381.3450017","url":null,"abstract":"One-shot automated essay scoring (AES) aims to assign scores to a set of essays written specific to a certain prompt, with only one manually scored essay per distinct score. Compared to the previous-studied prompt-specific AES which usually requires a large number of manually scored essays for model training (e.g., about 600 manually scored essays out of totally 1000 essays), one-shot AES can greatly reduce the workload of manual scoring. In this paper, we propose a Transductive Graph-based Ordinal Distillation (TGOD) framework to tackle the task of one-shot AES. Specifically, we design a transductive graph-based model as a teacher model to generate pseudo labels of unlabeled essays based on the one-shot labeled essays. Then, we distill the knowledge in the teacher model into a neural student model by learning from the high confidence pseudo labels. Different from the general knowledge distillation, we propose an ordinal-aware unimodal distillation which makes a unimodal distribution constraint on the output of student model, to tolerate the minor errors existed in pseudo labels. 
Experimental results on the public dataset ASAP show that TGOD can improve the performance of existing neural AES models under the one-shot AES setting and achieve an acceptable average QWK of 0.69.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132171808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
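The TGOD abstract above reports its result as quadratic weighted kappa (QWK), the agreement measure commonly used for essay scoring. As a rough illustration of how such a figure is computed, the following is a minimal sketch of the standard QWK formula in plain Python. The function name, its arguments, and the toy scores are assumptions made for this example; they are not taken from the paper.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Quadratic weighted kappa between two lists of integer scores.

    Illustrative sketch of the standard formula; the name and signature
    are hypothetical, not the paper's evaluation code.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    k = max_score - min_score + 1  # number of distinct score levels
    if k == 1:
        return 1.0  # only one possible score, so agreement is trivial
    n = len(rater_a)

    # observed[i][j]: number of essays scored i by rater A and j by rater B
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1

    # Marginal score histograms of each rater
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement penalty
            expected = hist_a[i] * hist_b[j] / n  # count expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        return 1.0  # both raters used one and the same score throughout
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    human = [1, 2, 3, 4, 2, 3]
    model = [1, 2, 3, 3, 2, 4]
    print(quadratic_weighted_kappa(human, model, 1, 4))
```

Because the penalty grows with the squared distance between scores, predictions that miss by one level cost far less than predictions that miss by several, which is why the metric suits ordinal labels.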
Recently, a considerable literature has grown up around the theme of Graph Convolutional Networks (GCNs). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focus on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.
{"title":"Knowledge Embedding Based Graph Convolutional Network","authors":"Donghan Yu, Yiming Yang, Ruohong Zhang, Yuexin Wu","doi":"10.1145/3442381.3449925","DOIUrl":"https://doi.org/10.1145/3442381.3449925","url":null,"abstract":"Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. 
Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification1.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114667904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziwei Gu, Jing Nathan Yan, Jeffrey M. Rzeszotarski
A variety of systems have been proposed to assist users in detecting machine learning (ML) fairness issues. These systems approach bias reduction from a number of perspectives, including recommender systems, exploratory tools, and dashboards. In this paper, we seek to inform the design of these systems by examining how individuals make sense of fairness issues as they use different de-biasing affordances. In particular, we consider the tension between de-biasing recommendations, which are quick but may lack nuance, and "what-if" style exploration, which is time-consuming but may lead to deeper understanding and transferable insights. Using logs, think-aloud data, and semi-structured interviews, we find that exploratory systems promote a rich pattern of hypothesis generation and testing, while recommendations deliver quick answers which satisfy participants at the cost of reduced information exposure. We highlight design requirements and trade-offs in the design of ML fairness systems to promote accurate and explainable assessments.
{"title":"Understanding User Sensemaking in Machine Learning Fairness Assessment Systems","authors":"Ziwei Gu, Jing Nathan Yan, Jeffrey M. Rzeszotarski","doi":"10.1145/3442381.3450092","DOIUrl":"https://doi.org/10.1145/3442381.3450092","url":null,"abstract":"A variety of systems have been proposed to assist users in detecting machine learning (ML) fairness issues. These systems approach bias reduction from a number of perspectives, including recommender systems, exploratory tools, and dashboards. In this paper, we seek to inform the design of these systems by examining how individuals make sense of fairness issues as they use different de-biasing affordances. In particular, we consider the tension between de-biasing recommendations which are quick but may lack nuance and ”what-if” style exploration which is time consuming but may lead to deeper understanding and transferable insights. Using logs, think-aloud data, and semi-structured interviews we find that exploratory systems promote a rich pattern of hypothesis generation and testing, while recommendations deliver quick answers which satisfy participants at the cost of reduced information exposure. We highlight design requirements and trade-offs in the design of ML fairness systems to promote accurate and explainable assessments.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114872641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Query and Point-of-Interest (POI) matching, aiming at recommending the most relevant POIs from partial query keywords, has become one of the most essential functions in online navigation and ride-hailing applications. Existing methods for query-POI matching, such as Google Maps and Uber, have a natural focus on measuring the static semantic similarity between contextual information of queries and geographical information of POIs. However, it remains challenging for dynamic and personalized online query-POI matching because of the non-stationary and situational context-dependent query-POI relevance. Moreover, the large volume of online queries requires an adaptive and incremental model training strategy that is efficient and scalable in the online scenario. To this end, in this paper, we propose an Incremental Spatio-Temporal Graph Learning (IncreSTGL) framework for intelligent online query-POI matching. Specifically, we first model dynamic query-POI interactions as microscopic and macroscopic graphs. Then, we propose an incremental graph representation learning module to refine and update query-POI interaction graphs in an online incremental fashion, which includes: (i) a contextual graph attention operation quantifying query-POI correlation based on historical queries under dynamic situational context, (ii) a graph discrimination operation capturing the sequential query-POI relevance drift from a holistic view of personalized preference and social homophily, and (iii) a multi-level temporal attention operation summarizing the temporal variations of query-POI interaction graphs for subsequent query-POI matching. Finally, we introduce a lightweight semantic matching module for online query-POI similarity measurement. To demonstrate the effectiveness and efficiency of the proposed algorithm, we conduct extensive experiments on two real-world datasets collected from a leading online navigation and map service provider in China.
{"title":"Incremental Spatio-Temporal Graph Learning for Online Query-POI Matching","authors":"Zixuan Yuan, Hao Liu, Junming Liu, Yanchi Liu, Yang Yang, Renjun Hu, Hui Xiong","doi":"10.1145/3442381.3449810","DOIUrl":"https://doi.org/10.1145/3442381.3449810","url":null,"abstract":"Query and Point-of-Interest (POI) matching, aiming at recommending the most relevant POIs from partial query keywords, has become one of the most essential functions in online navigation and ride-hailing applications. Existing methods for query-POI matching, such as Google Maps and Uber, have a natural focus on measuring the static semantic similarity between contextual information of queries and geographical information of POIs. However, it remains challenging for dynamic and personalized online query-POI matching because of the non-stationary and situational context-dependent query-POI relevance. Moreover, the large volume of online queries requires an adaptive and incremental model training strategy that is efficient and scalable in the online scenario. To this end, in this paper, we propose an Incremental Spatio-Temporal Graph Learning (IncreSTGL) framework for intelligent online query-POI matching. Specifically, we first model dynamic query-POI interactions as microscopic and macroscopic graphs. Then, we propose an incremental graph representation learning module to refine and update query-POI interaction graphs in an online incremental fashion, which includes: (i) a contextual graph attention operation quantifying query-POI correlation based on historical queries under dynamic situational context, (ii) a graph discrimination operation capturing the sequential query-POI relevance drift from a holistic view of personalized preference and social homophily, and (iii) a multi-level temporal attention operation summarizing the temporal variations of query-POI interaction graphs for subsequent query-POI matching. 
Finally, we introduce a lightweight semantic matching module for online query-POI similarity measurement. To demonstrate the effectiveness and efficiency of the proposed algorithm, we conduct extensive experiments on two real-world datasets collected from a leading online navigation and map service provider in China.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114908790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ruohan Zhan, Konstantina Christakopoulou, Ya Le, Jayden Ooi, Martin Mladenov, Alex Beutel, Craig Boutilier, Ed H. Chi, Minmin Chen
Most existing recommender systems focus primarily on matching users (content consumers) to content which maximizes user satisfaction on the platform. It is increasingly obvious, however, that content providers have a critical influence on user satisfaction through content creation, largely determining the content pool available for recommendation. A natural question thus arises: can we design recommenders taking into account the long-term utility of both users and content providers? By doing so, we hope to sustain more content providers and a more diverse content pool for long-term user satisfaction. Understanding the full impact of recommendations on both user and content provider groups is challenging. This paper aims to serve as a research investigation of one approach toward building a content provider aware recommender, and evaluating its impact in a simulated setup. To characterize the user-recommender-provider interdependence, we complement user modeling by formalizing provider dynamics as well. The resulting joint dynamical system gives rise to a weakly-coupled partially observable Markov decision process driven by recommender actions and user feedback to providers. We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the content provider associated with the recommended content, which we show to be equivalent to maximizing overall user utility and the utilities of all content providers on the platform under some mild assumptions. To evaluate our approach, we introduce a simulation environment capturing the key interactions among users, providers, and the recommender. We offer a number of simulated experiments that shed light on both the benefits and the limitations of our approach. These results help understand how and when a content provider aware recommender agent is of benefit in building multi-stakeholder recommender systems.
{"title":"Towards Content Provider Aware Recommender Systems: A Simulation Study on the Interplay between User and Provider Utilities","authors":"Ruohan Zhan, Konstantina Christakopoulou, Ya Le, Jayden Ooi, Martin Mladenov, Alex Beutel, Craig Boutilier, Ed H. Chi, Minmin Chen","doi":"10.1145/3442381.3449889","DOIUrl":"https://doi.org/10.1145/3442381.3449889","url":null,"abstract":"Most existing recommender systems focus primarily on matching users (content consumers) to content which maximizes user satisfaction on the platform. It is increasingly obvious, however, that content providers have a critical influence on user satisfaction through content creation, largely determining the content pool available for recommendation. A natural question thus arises: can we design recommenders taking into account the long-term utility of both users and content providers? By doing so, we hope to sustain more content providers and a more diverse content pool for long-term user satisfaction. Understanding the full impact of recommendations on both user and content provider groups is challenging. This paper aims to serve as a research investigation of one approach toward building a content provider aware recommender, and evaluating its impact in a simulated setup. To characterize the user-recommender-provider interdependence, we complement user modeling by formalizing provider dynamics as well. The resulting joint dynamical system gives rise to a weakly-coupled partially observable Markov decision process driven by recommender actions and user feedback to providers. We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the content provider associated with the recommended content, which we show to be equivalent to maximizing overall user utility and the utilities of all content providers on the platform under some mild assumptions. 
To evaluate our approach, we introduce a simulation environment capturing the key interactions among users, providers, and the recommender. We offer a number of simulated experiments that shed light on both the benefits and the limitations of our approach. These results help understand how and when a content provider aware recommender agent is of benefit in building multi-stakeholder recommender systems.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123453194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
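The abstract above describes a joint objective that combines user utility with the counterfactual utility lift of the content provider whose content is recommended. One reading of that sentence can be sketched as a per-recommendation reward. The function name, argument names, and the trade-off weight `lam` are assumptions introduced for this example; the paper's actual formulation and the REINFORCE training loop are not reproduced here.

```python
def joint_reward(user_utility,
                 provider_utility_if_recommended,
                 provider_utility_if_not_recommended,
                 lam=1.0):
    """Hypothetical per-recommendation reward for a provider-aware agent.

    The counterfactual lift is how much better off the provider is when
    its content is recommended than when it is not. `lam` is an assumed
    weight trading off user utility against that lift.
    """
    lift = provider_utility_if_recommended - provider_utility_if_not_recommended
    return user_utility + lam * lift


if __name__ == "__main__":
    # Toy numbers only: a user utility of 1.0 and a provider lift of 0.5
    print(joint_reward(1.0, 3.0, 2.5, lam=2.0))
```

Setting `lam` to zero recovers a purely user-focused reward, which gives a natural baseline when comparing the two kinds of agent in simulation.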
Recommender systems (RS) have started to employ knowledge distillation, a model compression technique that trains a compact model (student) with the knowledge transferred from a cumbersome model (teacher). The state-of-the-art methods rely on unidirectional distillation, transferring the knowledge only from the teacher to the student, with an underlying assumption that the teacher is always superior to the student. However, we demonstrate that the student performs better than the teacher on a significant proportion of the test set, especially for RS. Based on this observation, we propose the Bidirectional Distillation (BD) framework, whereby both the teacher and the student collaboratively improve with each other. Specifically, each model is trained with a distillation loss that makes it follow the other's prediction, along with its original loss function. For effective bidirectional distillation, we propose a rank discrepancy-aware sampling scheme to distill only the informative knowledge that can fully enhance each other. The proposed scheme is designed to effectively cope with a large performance gap between the teacher and the student. Trained in the bidirectional way, both the teacher and the student turn out to be significantly improved compared to when they are trained separately. Our extensive experiments on real-world datasets show that our proposed framework consistently outperforms the state-of-the-art competitors. We also provide analyses for an in-depth understanding of BD and ablation studies to verify the effectiveness of each proposed component.
{"title":"Bidirectional Distillation for Top-K Recommender System","authors":"Wonbin Kweon, SeongKu Kang, Hwanjo Yu","doi":"10.1145/3442381.3449878","DOIUrl":"https://doi.org/10.1145/3442381.3449878","url":null,"abstract":"Recommender systems (RS) have started to employ knowledge distillation, which is a model compression technique training a compact model (student) with the knowledge transferred from a cumbersome model (teacher). The state-of-the-art methods rely on unidirectional distillation transferring the knowledge only from the teacher to the student, with an underlying assumption that the teacher is always superior to the student. However, we demonstrate that the student performs better than the teacher on a significant proportion of the test set, especially for RS. Based on this observation, we propose Bidirectional Distillation (BD) framework whereby both the teacher and the student collaboratively improve with each other. Specifically, each model is trained with the distillation loss that makes to follow the other’s prediction along with its original loss function. For effective bidirectional distillation, we propose rank discrepancy-aware sampling scheme to distill only the informative knowledge that can fully enhance each other. The proposed scheme is designed to effectively cope with a large performance gap between the teacher and the student. Trained in the bidirectional way, it turns out that both the teacher and the student are significantly improved compared to when being trained separately. Our extensive experiments on real-world datasets show that our proposed framework consistently outperforms the state-of-the-art competitors. 
We also provide analyses for an in-depth understanding of BD and ablation studies to verify the effectiveness of each proposed component.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121979899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
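The Bidirectional Distillation abstract above says that each model is trained with its original loss plus a distillation loss that pulls it toward the other model's prediction. The following minimal sketch shows that structure for a single user-item pair. The use of binary cross-entropy, the weights `lam_t` and `lam_s`, and all function names are assumptions for illustration; the rank discrepancy-aware sampling scheme from the abstract is not modeled here.

```python
import math


def bce(pred, target):
    """Binary cross-entropy of a predicted probability against a (soft) target."""
    eps = 1e-12
    pred = min(max(pred, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def bidirectional_losses(teacher_pred, student_pred, label,
                         lam_t=0.5, lam_s=0.5):
    """Hypothetical per-example losses for the teacher and the student.

    Each model keeps its original loss against the ground-truth label and
    adds a distillation term toward the other model's prediction, which is
    treated as a fixed soft target.
    """
    teacher_loss = bce(teacher_pred, label) + lam_t * bce(teacher_pred, student_pred)
    student_loss = bce(student_pred, label) + lam_s * bce(student_pred, teacher_pred)
    return teacher_loss, student_loss


if __name__ == "__main__":
    t_loss, s_loss = bidirectional_losses(teacher_pred=0.9, student_pred=0.6, label=1.0)
    print(t_loss, s_loss)
```

Setting `lam_t` to zero reduces this to ordinary unidirectional distillation, where only the student follows the teacher.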
Qiannan Cheng, Z. Ren, Yujie Lin, Pengjie Ren, Zhumin Chen, Xiangyuan Liu, M. de Rijke
Document reranking (DR) and next query prediction (NQP) are two core tasks in session search. They are often driven by the same search intent and, hence, it is natural to jointly optimize both tasks. So far, most models proposed for jointly optimizing DR and NQP have focused on users' short-term intent in an ongoing search session. Because of this limitation, these models fail to account for users' long-term intent as captured in their historical search sessions. In contrast, we consider a personalized mechanism for learning a user's profile from their long-term and short-term behavior to simultaneously enhance the performance of DR and NQP in an ongoing search session. We propose a personalized session search model, called Long short-term session search Network (LostNet), that jointly learns to rerank documents for the current query and predict the next query. LostNet consists of three modules: (i) a hierarchical session-based attention mechanism that tracks the fine-grained short-term intent in an ongoing session; (ii) a personalized multi-hop memory network that tracks a user's dynamic profile information from their prior search sessions so as to infer their personal search intent; and (iii) a joint learning module for DR and NQP that simultaneously reranks documents and predicts the next query based on the outputs of the first two modules. We conduct experiments on two large-scale session search benchmark datasets. The results show that LostNet achieves significant improvements over state-of-the-art baselines.
{"title":"Long Short-Term Session Search: Joint Personalized Reranking and Next Query Prediction","authors":"Qiannan Cheng, Z. Ren, Yujie Lin, Pengjie Ren, Zhumin Chen, Xiangyuan Liu, M. de Rijke","doi":"10.1145/3442381.3449941","DOIUrl":"https://doi.org/10.1145/3442381.3449941","url":null,"abstract":"DR and next query prediction (NQP) are two core tasks in session search. They are often driven by the same search intent and, hence, it is natural to jointly optimize both tasks. So far, most models proposed for jointly optimizing document reranking (DR) and NQP have focused on users’ short-term intent in an ongoing search session. Because of this limitation, these models fail to account for users’ long-term intent as captured in their historical search sessions. In contrast, we consider a personalized mechanism for learning a user’s profile from their long-term and short-term behavior to simultaneously enhance the performance of DR and NQP in an ongoing search session. We propose a personalized session search model, called Long short-term session search, Network (LostNet), that jointly learns to rerank documents for the current query and predict the next query. LostNet consists of three modules: The hierarchical session-based attention mechanism tracks the fine-grained short-term intent in an ongoing session. The personalized multi-hop memory network tracks a user’s dynamic profile information from their prior search sessions so as to infer their personal search intent. Jointly learning of DR and NQP is aimed at simultaneously reranking documents and predicting the next query based on outputs from the above two modules. We conduct experiments on two large-scale session search benchmark datasets. 
The results show that LostNet achieves significant improvements over state-of-the-art baselines.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125765528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amr Azzam, Christian Aebeloe, Gabriela Montoya, Ilkcan Keles, A. Polleres, K. Hose
SPARQL query services that balance processing between clients and servers become more and more essential to handle the increasing load for open and decentralized knowledge graphs on the Web. To this end, Linked Data Fragments (LDF) have introduced a foundational framework that has sparked research exploring a spectrum of potential Web querying interfaces in between server-side query processing via SPARQL endpoints and client-side query processing of data dumps. Current proposals in between typically suffer from imbalanced load on either the client or the server. In this paper, to the best of our knowledge, we present the first work that combines both client-side and server-side query optimization techniques in a truly dynamic fashion: we introduce WiseKG, a system that employs a cost model that dynamically delegates the load between servers and clients by combining client-side processing of shipped partitions with efficient server-side processing of star-shaped sub-queries, based on current server workload and client capabilities. Our experiments show that WiseKG significantly outperforms state-of-the-art solutions in terms of average total query execution time per client, while at the same time decreasing network traffic and increasing server-side availability.
{"title":"WiseKG: Balanced Access to Web Knowledge Graphs","authors":"Amr Azzam, Christian Aebeloe, Gabriela Montoya, Ilkcan Keles, A. Polleres, K. Hose","doi":"10.1145/3442381.3449911","DOIUrl":"https://doi.org/10.1145/3442381.3449911","url":null,"abstract":"SPARQL query services that balance processing between clients and servers become more and more essential to handle the increasing load for open and decentralized knowledge graphs on the Web. To this end, Linked Data Fragments (LDF) have introduced a foundational framework that has sparked research exploring a spectrum of potential Web querying interfaces in between server-side query processing via SPARQL endpoints and client-side query processing of data dumps. Current proposals in between typically suffer from imbalanced load on either the client or the server. In this paper, to the best of our knowledge, we present the first work that combines both client-side and server-side query optimization techniques in a truly dynamic fashion: we introduce WiseKG, a system that employs a cost model that dynamically delegates the load between servers and clients by combining client-side processing of shipped partitions with efficient server-side processing of star-shaped sub-queries, based on current server workload and client capabilities. 
Our experiments show that WiseKG significantly outperforms state-of-the-art solutions in terms of average total query execution time per client, while at the same time decreasing network traffic and increasing server-side availability.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127272691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
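The WiseKG abstract above refers to server-side processing of star-shaped sub-queries. Under the common definition, a star-shaped sub-query is a group of triple patterns that share the same subject. The sketch below shows how a basic graph pattern could be split into such stars. The function name and the tuple representation of triple patterns are assumptions for illustration, and WiseKG's cost model for delegating each star to client or server is not modeled.

```python
from collections import defaultdict


def star_subqueries(triple_patterns):
    """Group (subject, predicate, object) triple patterns by subject.

    Each group is one star-shaped sub-query under the assumed definition
    (patterns sharing a subject). Illustrative only, not WiseKG's code.
    """
    stars = defaultdict(list)
    for subject, predicate, obj in triple_patterns:
        stars[subject].append((subject, predicate, obj))
    return dict(stars)


if __name__ == "__main__":
    # Hypothetical query: films, their directors, and the directors' birthplaces
    patterns = [
        ("?film", "dbo:director", "?director"),
        ("?film", "rdfs:label", "?title"),
        ("?director", "dbo:birthPlace", "?place"),
    ]
    for subject, star in star_subqueries(patterns).items():
        print(subject, star)
```

In this toy query the two patterns on `?film` form one star and the single pattern on `?director` forms another; a system in the spirit of the abstract would then decide, per star, where it is evaluated.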
Fashion style analysis is of the utmost importance for fashion professionals. However, style classification criteria differ and rely heavily on professionals' subjective experiences, with no quantitative criteria. We present FANCY (Fashion Attributes detectioN for Clustering stYle), a human-centered, deep learning-based framework to support fashion professionals' analytic tasks using a computational method integrated with their insights. We work closely with fashion professionals throughout the study process to reflect their domain knowledge and experience as much as possible. We redefine fashion attributes, demonstrate a strong association between fashion attributes and styles, and develop a deep learning model that detects attributes in a given fashion image and reflects fashion professionals' insight. Based on 302,772 attribute-annotated runway fashion images, we developed 25 new fashion styles (FANCY dataset). We summarize quantitative standards of the fashion style groups and present fashion trends based on time, location, and brand.
{"title":"FANCY: Human-centered, Deep Learning-based Framework for Fashion Style Analysis","authors":"Youngseung Jeon, Seungwan Jin, Kyungsik Han","doi":"10.1145/3442381.3449833","DOIUrl":"https://doi.org/10.1145/3442381.3449833","url":null,"abstract":"Fashion style analysis is of the utmost importance for fashion professionals. However, it has an issue of having different style classification criteria that rely heavily on professionals’ subjective experiences with no quantitative criteria. We present FANCY (Fashion Attributes detectioN for Clustering stYle), a human-centered, deep learning-based framework to support fashion professionals’ analytic tasks using a computational method integrated with their insights. We work closely with fashion professionals in the whole study process to reflect their domain knowledge and experience as much as possible. We redefine fashion attributes, demonstrate a strong association with fashion attributes and styles, and develop a deep learning model that detects attributes in a given fashion image and reflects fashion professionals’ insight. Based on attribute-annotated 302,772 runway fashion images, we developed 25 new fashion styles (FANCY dataset 1). We summarize quantitative standards of the fashion style groups and present fashion trends based on time, location, and brand.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131944200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online reviews, which contain quality information and user experience about products, always affect the consumption decisions of customers. Unfortunately, many spammers attempt to mislead consumers by writing fake reviews for various purposes. Existing methods for detecting spam reviews mainly focus on constructing discriminative features, which depend heavily on experts and may miss some complex but effective features. Recently, some models have attempted to learn latent representations of reviews, users, and items. However, the learned embeddings usually lack interpretability. Moreover, most existing methods are based on a single classification model and ignore the complementarity of different classification models. To solve these problems, we propose IFSpard, a novel information fusion-based framework that explores and exploits useful information from various aspects for spam review detection. First, we design a graph-based feature extraction method and an interaction-mining-based feature crossing method to automatically extract basic and complex features from different data sources. Then, we propose a mutual-information-based feature selection and representation learning method to remove irrelevant and redundant information from the automatically constructed features. Finally, we devise an adaptive ensemble model that exploits both the constructed features and the abilities of different classifiers for spam review detection. Experimental results on several public datasets show that the proposed model performs better than state-of-the-art methods.
{"title":"IFSpard: An Information Fusion-based Framework for Spam Review Detection","authors":"Yao Zhu, Hongzhi Liu, Yingpeng Du, Zhonghai Wu","doi":"10.1145/3442381.3449920","DOIUrl":"https://doi.org/10.1145/3442381.3449920","url":null,"abstract":"Online reviews, which contain the quality information and user experience about products, always affect the consumption decisions of customers. Unfortunately, quite a number of spammers attempt to mislead consumers by writing fake reviews for some intents. Existing methods for detecting spam reviews mainly focus on constructing discriminative features, which heavily depend on experts and may miss some complex but effective features. Recently, some models attempt to learn the latent representations of reviews, users, and items. However, the learned embeddings usually lack interpretability. Moreover, most of existing methods are based on single classification model while ignoring the complementarity of different classification models. To solve these problems, we propose IFSpard, a novel information fusion-based framework that aims at exploring and exploiting useful information from various aspects for spam review detection. First, we design a graph-based feature extraction method and an interaction-mining-based feature crossing method to automatically extract basic and complex features with consideration of different sources of data. Then, we propose a mutual-information-based feature selection and representation learning method to remove the irrelevant and redundant information contained in the automatically constructed features. Finally, we devise an adaptive ensemble model to make use of the information of constructed features and the abilities of different classifiers for spam review detection. 
Experimental results on several public datasets show that the proposed model performs better than state-of-the-art methods.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130235801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}