In this article, we propose an approach to improve quality in crowdsourcing (CS) tasks using Task Completion Time (TCT) as a source of information about the reliability of workers in a game-theoretical competitive scenario. Our approach is based on the hypothesis that some workers are more risk-inclined and tend to gamble with their use of time when put to compete with other workers. This hypothesis is supported by our previous simulation study. We test our approach with 35 topics from experiments on the TREC-8 collection being assessed as relevant or non-relevant by crowdsourced workers both in a competitive (referred to as “Game”) and non-competitive (referred to as “Base”) scenario. We find that competition changes the distributions of TCT, making them sensitive to the quality (i.e., wrong or right) and outcome (i.e., relevant or non-relevant) of the assessments. We also test an optimal function of TCT as weights in a weighted majority voting scheme. From probabilistic considerations, we derive a theoretical upper bound for the weighted majority performance of cohorts of 2, 3, 4, and 5 workers, which we use as a criterion to evaluate the performance of our weighting scheme. We find our approach achieves a remarkable performance, significantly closing the gap between the accuracy of the obtained relevance judgements and the upper bound. Since our approach takes advantage of TCT, which is an available quantity in any CS tasks, we believe it is cost-effective and, therefore, can be applied for quality assurance in crowdsourcing for micro-tasks.
{"title":"A Game Theory Approach for Estimating Reliability of Crowdsourced Relevance Assessments","authors":"Yashar Moshfeghi, Alvaro Francisco Huertas-Rosero","doi":"10.1145/3480965","DOIUrl":"https://doi.org/10.1145/3480965","url":null,"abstract":"In this article, we propose an approach to improve quality in crowdsourcing (CS) tasks using Task Completion Time (TCT) as a source of information about the reliability of workers in a game-theoretical competitive scenario. Our approach is based on the hypothesis that some workers are more risk-inclined and tend to gamble with their use of time when put to compete with other workers. This hypothesis is supported by our previous simulation study. We test our approach with 35 topics from experiments on the TREC-8 collection being assessed as relevant or non-relevant by crowdsourced workers both in a competitive (referred to as “Game”) and non-competitive (referred to as “Base”) scenario. We find that competition changes the distributions of TCT, making them sensitive to the quality (i.e., wrong or right) and outcome (i.e., relevant or non-relevant) of the assessments. We also test an optimal function of TCT as weights in a weighted majority voting scheme. From probabilistic considerations, we derive a theoretical upper bound for the weighted majority performance of cohorts of 2, 3, 4, and 5 workers, which we use as a criterion to evaluate the performance of our weighting scheme. We find our approach achieves a remarkable performance, significantly closing the gap between the accuracy of the obtained relevance judgements and the upper bound. Since our approach takes advantage of TCT, which is an available quantity in any CS tasks, we believe it is cost-effective and, therefore, can be applied for quality assurance in crowdsourcing for micro-tasks.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"24 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74230161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile advertising has undoubtedly become one of the fastest-growing industries in the world. The influx of capital attracts increasing fraudsters to defraud money from advertisers. Fraudsters can leverage many techniques, where bots install fraud is the most difficult to detect due to its ability to emulate normal users by implementing sophisticated behavioral patterns to evade from detection rules defined by human experts. Therefore, we proposed BotSpot1 for bots install fraud detection previously. However, there are some drawbacks in BotSpot, such as the sparsity of the devices’ neighbors, weak interactive information of leaf nodes, and noisy labels. In this work, we propose BotSpot++ to improve these drawbacks: (1) for the sparsity of the devices’ neighbors, we propose to construct a super device node to enrich the graph structure and information flow utilizing domain knowledge and a clustering algorithm; (2) for the weak interactive information, we propose to incorporate a self-attention mechanism to enhance the interaction of various leaf nodes; and (3) for the noisy labels, we apply a label smoothing mechanism to alleviate it. Comprehensive experimental results show that BotSpot++ yields the best performance compared with six state-of-the-art baselines. Furthermore, we deploy our model to the advertising platform of Mobvista,2 a leading global mobile advertising company. The online experiments also demonstrate the effectiveness of our proposed method.
{"title":"BotSpot++: A Hierarchical Deep Ensemble Model for Bots Install Fraud Detection in Mobile Advertising","authors":"Yadong Zhu, Xiliang Wang, Qing Li, Tianjun Yao, Shangsong Liang","doi":"10.1145/3476107","DOIUrl":"https://doi.org/10.1145/3476107","url":null,"abstract":"Mobile advertising has undoubtedly become one of the fastest-growing industries in the world. The influx of capital attracts increasing fraudsters to defraud money from advertisers. Fraudsters can leverage many techniques, where bots install fraud is the most difficult to detect due to its ability to emulate normal users by implementing sophisticated behavioral patterns to evade from detection rules defined by human experts. Therefore, we proposed BotSpot1 for bots install fraud detection previously. However, there are some drawbacks in BotSpot, such as the sparsity of the devices’ neighbors, weak interactive information of leaf nodes, and noisy labels. In this work, we propose BotSpot++ to improve these drawbacks: (1) for the sparsity of the devices’ neighbors, we propose to construct a super device node to enrich the graph structure and information flow utilizing domain knowledge and a clustering algorithm; (2) for the weak interactive information, we propose to incorporate a self-attention mechanism to enhance the interaction of various leaf nodes; and (3) for the noisy labels, we apply a label smoothing mechanism to alleviate it. Comprehensive experimental results show that BotSpot++ yields the best performance compared with six state-of-the-art baselines. Furthermore, we deploy our model to the advertising platform of Mobvista,2 a leading global mobile advertising company. The online experiments also demonstrate the effectiveness of our proposed method.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"60 1","pages":"1 - 28"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81448674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-lingual entity alignment has attracted considerable attention in recent years. Past studies using conventional approaches to match entities share the common problem of missing important structural information beyond entities in the modeling process. This allows graph neural network models to step in. Most existing graph neural network approaches model individual knowledge graphs (KGs) separately with a small amount of pre-aligned entities served as anchors to connect different KG embedding spaces. However, this characteristic can cause several major problems, including performance restraint due to the insufficiency of available seed alignments and ignorance of pre-aligned links that are useful in contextual information in-between nodes. In this article, we propose DuGa-DIT, a dual gated graph attention network with dynamic iterative training, to address these problems in a unified model. The DuGa-DIT model captures neighborhood and cross-KG alignment features by using intra-KG attention and cross-KG attention layers. With the dynamic iterative process, we can dynamically update the cross-KG attention score matrices, which enables our model to capture more cross-KG information. We conduct extensive experiments on two benchmark datasets and a case study in cross-lingual personalized search. Our experimental results demonstrate that DuGa-DIT outperforms state-of-the-art methods.
{"title":"Dual Gated Graph Attention Networks with Dynamic Iterative Training for Cross-Lingual Entity Alignment","authors":"Zhiwen Xie, Runjie Zhu, Kunsong Zhao, Jin Liu, Guangyou Zhou, Xiangji Huang","doi":"10.1145/3471165","DOIUrl":"https://doi.org/10.1145/3471165","url":null,"abstract":"Cross-lingual entity alignment has attracted considerable attention in recent years. Past studies using conventional approaches to match entities share the common problem of missing important structural information beyond entities in the modeling process. This allows graph neural network models to step in. Most existing graph neural network approaches model individual knowledge graphs (KGs) separately with a small amount of pre-aligned entities served as anchors to connect different KG embedding spaces. However, this characteristic can cause several major problems, including performance restraint due to the insufficiency of available seed alignments and ignorance of pre-aligned links that are useful in contextual information in-between nodes. In this article, we propose DuGa-DIT, a dual gated graph attention network with dynamic iterative training, to address these problems in a unified model. The DuGa-DIT model captures neighborhood and cross-KG alignment features by using intra-KG attention and cross-KG attention layers. With the dynamic iterative process, we can dynamically update the cross-KG attention score matrices, which enables our model to capture more cross-KG information. We conduct extensive experiments on two benchmark datasets and a case study in cross-lingual personalized search. Our experimental results demonstrate that DuGa-DIT outperforms state-of-the-art methods.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"27 1 1","pages":"1 - 30"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82707947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xingshan Zeng, Jing Li, Lingzhi Wang, Kam-Fai Wong
The popularity of social media platforms results in a huge volume of online conversations produced every day. To help users better engage in online conversations, this article presents a novel framework to automatically recommend conversations to users based on what they said and how they behaved in their chatting histories. While prior work mostly focuses on post-level recommendation, we aim to explore conversation context and model the interaction patterns therein. Furthermore, to characterize personal interests from interleaving user interactions, we learn (1) global interactions, represented by topic and discourse word clusters to reflect users’ content and pragmatic preferences, and (2) local interactions, encoding replying relations and chronological order of conversation turns to characterize users’ prior behavior. Built on collaborative filtering, our model captures global interactions via discovering word distributions to represent users’ topical interests and discourse behaviors, while local interactions are explored with graph-structured networks exploiting both reply structure and temporal features. Extensive experiments on three datasets from Twitter and Reddit show that our model coupling global and local interactions significantly outperforms the state-of-the-art model. Further analyses show that our model is able to capture meaningful features from global and local interactions, which results in its superior performance in conversation recommendation.
{"title":"Modeling Global and Local Interactions for Online Conversation Recommendation","authors":"Xingshan Zeng, Jing Li, Lingzhi Wang, Kam-Fai Wong","doi":"10.1145/3473970","DOIUrl":"https://doi.org/10.1145/3473970","url":null,"abstract":"The popularity of social media platforms results in a huge volume of online conversations produced every day. To help users better engage in online conversations, this article presents a novel framework to automatically recommend conversations to users based on what they said and how they behaved in their chatting histories. While prior work mostly focuses on post-level recommendation, we aim to explore conversation context and model the interaction patterns therein. Furthermore, to characterize personal interests from interleaving user interactions, we learn (1) global interactions, represented by topic and discourse word clusters to reflect users’ content and pragmatic preferences, and (2) local interactions, encoding replying relations and chronological order of conversation turns to characterize users’ prior behavior. Built on collaborative filtering, our model captures global interactions via discovering word distributions to represent users’ topical interests and discourse behaviors, while local interactions are explored with graph-structured networks exploiting both reply structure and temporal features. Extensive experiments on three datasets from Twitter and Reddit show that our model coupling global and local interactions significantly outperforms the state-of-the-art model. Further analyses show that our model is able to capture meaningful features from global and local interactions, which results in its superior performance in conversation recommendation.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"55 1","pages":"1 - 33"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78674752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reducing user effort in finding relevant information is one of the key objectives of search systems. Existing approaches have been shown to effectively exploit the context from the current search session of users for automatically suggesting queries to reduce their search efforts. However, these approaches do not accomplish the end goal of a search system—that of retrieving a set of potentially relevant documents for the evolving information need during a search session. This article takes the problem of query prediction one step further by investigating the problem of contextual recommendation within a search session. More specifically, given the partial context information of a session in the form of a small number of queries, we investigate how a search system can effectively predict the documents that a user would have been presented with had he continued the search session by submitting subsequent queries. To address the problem, we propose a model of contextual recommendation that seeks to capture the underlying semantics of information need transitions of a current user’s search context. This model leverages information from a number of past interactions of other users with similar interactions from an existing search log. To identify similar interactions, as a novel contribution, we propose an embedding approach that jointly learns representations of both individual query terms and also those of queries (in their entirety) from a search log data by leveraging session-level containment relationships. Our experiments conducted on a large query log, namely the AOL, demonstrate that using a joint embedding of queries and their terms within our proposed framework of document retrieval outperforms a number of text-only and sequence modeling based baselines.
{"title":"I Know What You Need: Investigating Document Retrieval Effectiveness with Partial Session Contexts","authors":"Procheta Sen, Debasis Ganguly, G. Jones","doi":"10.1145/3488667","DOIUrl":"https://doi.org/10.1145/3488667","url":null,"abstract":"Reducing user effort in finding relevant information is one of the key objectives of search systems. Existing approaches have been shown to effectively exploit the context from the current search session of users for automatically suggesting queries to reduce their search efforts. However, these approaches do not accomplish the end goal of a search system—that of retrieving a set of potentially relevant documents for the evolving information need during a search session. This article takes the problem of query prediction one step further by investigating the problem of contextual recommendation within a search session. More specifically, given the partial context information of a session in the form of a small number of queries, we investigate how a search system can effectively predict the documents that a user would have been presented with had he continued the search session by submitting subsequent queries. To address the problem, we propose a model of contextual recommendation that seeks to capture the underlying semantics of information need transitions of a current user’s search context. This model leverages information from a number of past interactions of other users with similar interactions from an existing search log. To identify similar interactions, as a novel contribution, we propose an embedding approach that jointly learns representations of both individual query terms and also those of queries (in their entirety) from a search log data by leveraging session-level containment relationships. Our experiments conducted on a large query log, namely the AOL, demonstrate that using a joint embedding of queries and their terms within our proposed framework of document retrieval outperforms a number of text-only and sequence modeling based baselines.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"5 1","pages":"1 - 30"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87045932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qianqian Xie, Yutao Zhu, Jimin Huang, Pan Du, J. Nie
Due to the overload of published scientific articles, citation recommendation has long been a critical research problem for automatically recommending the most relevant citations of given articles. Relational topic models (RTMs) have shown promise on citation prediction via joint modeling of document contents and citations. However, existing RTMs can only capture pairwise or direct (first-order) citation relationships among documents. The indirect (high-order) citation links have been explored in graph neural network–based methods, but these methods suffer from the well-known explainability problem. In this article, we propose a model called Graph Neural Collaborative Topic Model that takes advantage of both relational topic models and graph neural networks to capture high-order citation relationships and to have higher explainability due to the latent topic semantic structure. Experiments on three real-world citation datasets show that our model outperforms several competitive baseline methods on citation recommendation. In addition, we show that our approach can learn better topics than the existing approaches. The recommendation results can be well explained by the underlying topics.
{"title":"Graph Neural Collaborative Topic Model for Citation Recommendation","authors":"Qianqian Xie, Yutao Zhu, Jimin Huang, Pan Du, J. Nie","doi":"10.1145/3473973","DOIUrl":"https://doi.org/10.1145/3473973","url":null,"abstract":"Due to the overload of published scientific articles, citation recommendation has long been a critical research problem for automatically recommending the most relevant citations of given articles. Relational topic models (RTMs) have shown promise on citation prediction via joint modeling of document contents and citations. However, existing RTMs can only capture pairwise or direct (first-order) citation relationships among documents. The indirect (high-order) citation links have been explored in graph neural network–based methods, but these methods suffer from the well-known explainability problem. In this article, we propose a model called Graph Neural Collaborative Topic Model that takes advantage of both relational topic models and graph neural networks to capture high-order citation relationships and to have higher explainability due to the latent topic semantic structure. Experiments on three real-world citation datasets show that our model outperforms several competitive baseline methods on citation recommendation. In addition, we show that our approach can learn better topics than the existing approaches. The recommendation results can be well explained by the underlying topics.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"19 1","pages":"1 - 30"},"PeriodicalIF":0.0,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89657008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning to rank from logged user feedback, such as clicks or purchases, is a central component of many real-world information systems. Different from human-annotated relevance labels, the user feedback is always noisy and biased. Many existing learning to rank methods infer the underlying relevance of query–item pairs based on different assumptions of examination, and still optimize a relevance based objective. Such methods rely heavily on the correct estimation of examination, which is often difficult to achieve in practice. In this work, we propose a general framework U-rank+ for learning to rank with logged user feedback from the perspective of graph matching. We systematically analyze the biases in user feedback, including examination bias and selection bias. Then, we take both biases into consideration for unbiased utility estimation that directly based on user feedback, instead of relevance. In order to maximize the estimated utility in an efficient manner, we design two different solvers based on Sinkhorn and LambdaLoss for U-rank+. The former is based on a standard graph matching algorithm, and the latter is inspired by the traditional method of learning to rank. Both of the algorithms have good theoretical properties to optimize the unbiased utility objective while the latter is proved to be empirically more effective and efficient in practice. Our framework U-rank+ can deal with a general utility function and can be used in a widespread of applications including web search, recommendation, and online advertising. Semi-synthetic experiments on three benchmark learning to rank datasets demonstrate the effectiveness of U-rank+. Furthermore, our proposed framework has been deployed on two different scenarios of a mainstream App store, where the online A/B testing shows that U-rank+ achieves an average improvement of 19.2% on click-through rate and 20.8% improvement on conversion rate in recommendation scenario, and 5.12% on platform revenue in online advertising scenario over the production baselines.
{"title":"Beyond Relevance Ranking: A General Graph Matching Framework for Utility-Oriented Learning to Rank","authors":"Xinyi Dai, Yunjia Xi, Weinan Zhang, Qing Liu, Ruiming Tang, Xiuqiang He, Jiawei Hou, Jun Wang, Yong Yu","doi":"10.1145/3464303","DOIUrl":"https://doi.org/10.1145/3464303","url":null,"abstract":"Learning to rank from logged user feedback, such as clicks or purchases, is a central component of many real-world information systems. Different from human-annotated relevance labels, the user feedback is always noisy and biased. Many existing learning to rank methods infer the underlying relevance of query–item pairs based on different assumptions of examination, and still optimize a relevance based objective. Such methods rely heavily on the correct estimation of examination, which is often difficult to achieve in practice. In this work, we propose a general framework U-rank+ for learning to rank with logged user feedback from the perspective of graph matching. We systematically analyze the biases in user feedback, including examination bias and selection bias. Then, we take both biases into consideration for unbiased utility estimation that directly based on user feedback, instead of relevance. In order to maximize the estimated utility in an efficient manner, we design two different solvers based on Sinkhorn and LambdaLoss for U-rank+. The former is based on a standard graph matching algorithm, and the latter is inspired by the traditional method of learning to rank. Both of the algorithms have good theoretical properties to optimize the unbiased utility objective while the latter is proved to be empirically more effective and efficient in practice. Our framework U-rank+ can deal with a general utility function and can be used in a widespread of applications including web search, recommendation, and online advertising. Semi-synthetic experiments on three benchmark learning to rank datasets demonstrate the effectiveness of U-rank+. Furthermore, our proposed framework has been deployed on two different scenarios of a mainstream App store, where the online A/B testing shows that U-rank+ achieves an average improvement of 19.2% on click-through rate and 20.8% improvement on conversion rate in recommendation scenario, and 5.12% on platform revenue in online advertising scenario over the production baselines.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"47 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80567894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sheng Zhou, Xin Wang, M. Ester, Bolang Li, Chen Ye, Zhen Zhang, Can Wang, Jiajun Bu
User recommendation aims at recommending users with potential interests in the social network. Previous works have mainly focused on the undirected social networks with symmetric relationship such as friendship, whereas recent advances have been made on the asymmetric relationship such as the following and followed by relationship. Among the few existing direction-aware user recommendation methods, the random walk strategy has been widely adopted to extract the asymmetric proximity between users. However, according to our analysis on real-world directed social networks, we argue that the asymmetric proximity captured by existing random walk based methods are insufficient due to the inbalance in-degree and out-degree of nodes. To tackle this challenge, we propose InfoWalk, a novel informative walk strategy to efficiently capture the asymmetric proximity solely based on random walks. By transferring the direction information into the weights of each step, InfoWalk is able to overcome the limitation of edges while simultaneously maintain both the direction and proximity. Based on the asymmetric proximity captured by InfoWalk, we further propose the qualitative (DNE-L) and quantitative (DNE-T) directed network embedding methods, capable of preserving the two properties in the embedding space. Extensive experiments conducted on six real-world benchmark datasets demonstrate the superiority of the proposed DNE model over several state-of-the-art approaches in various tasks.
{"title":"Direction-Aware User Recommendation Based on Asymmetric Network Embedding","authors":"Sheng Zhou, Xin Wang, M. Ester, Bolang Li, Chen Ye, Zhen Zhang, Can Wang, Jiajun Bu","doi":"10.1145/3466754","DOIUrl":"https://doi.org/10.1145/3466754","url":null,"abstract":"User recommendation aims at recommending users with potential interests in the social network. Previous works have mainly focused on the undirected social networks with symmetric relationship such as friendship, whereas recent advances have been made on the asymmetric relationship such as the following and followed by relationship. Among the few existing direction-aware user recommendation methods, the random walk strategy has been widely adopted to extract the asymmetric proximity between users. However, according to our analysis on real-world directed social networks, we argue that the asymmetric proximity captured by existing random walk based methods are insufficient due to the inbalance in-degree and out-degree of nodes. To tackle this challenge, we propose InfoWalk, a novel informative walk strategy to efficiently capture the asymmetric proximity solely based on random walks. By transferring the direction information into the weights of each step, InfoWalk is able to overcome the limitation of edges while simultaneously maintain both the direction and proximity. Based on the asymmetric proximity captured by InfoWalk, we further propose the qualitative (DNE-L) and quantitative (DNE-T) directed network embedding methods, capable of preserving the two properties in the embedding space. Extensive experiments conducted on six real-world benchmark datasets demonstrate the superiority of the proposed DNE model over several state-of-the-art approaches in various tasks.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"190 1","pages":"1 - 23"},"PeriodicalIF":0.0,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82530989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seyed Ali Bahrainian, George Zerveas, F. Crestani, Carsten Eickhoff
Neural sequence-to-sequence models are the state-of-the-art approach used in abstractive summarization of textual documents, useful for producing condensed versions of source text narratives without being restricted to using only words from the original text. Despite the advances in abstractive summarization, custom generation of summaries (e.g., towards a user’s preference) remains unexplored. In this article, we present CATS, an abstractive neural summarization model that summarizes content in a sequence-to-sequence fashion while also introducing a new mechanism to control the underlying latent topic distribution of the produced summaries. We empirically illustrate the efficacy of our model in producing customized summaries and present findings that facilitate the design of such systems. We use the well-known CNN/DailyMail dataset to evaluate our model. Furthermore, we present a transfer-learning method and demonstrate the effectiveness of our approach in a low resource setting, i.e., abstractive summarization of meetings minutes, where combining the main available meetings’ transcripts datasets, AMI and International Computer Science Institute(ICSI), results in merely a few hundred training documents.
{"title":"CATS: Customizable Abstractive Topic-based Summarization","authors":"Seyed Ali Bahrainian, George Zerveas, F. Crestani, Carsten Eickhoff","doi":"10.1145/3464299","DOIUrl":"https://doi.org/10.1145/3464299","url":null,"abstract":"Neural sequence-to-sequence models are the state-of-the-art approach used in abstractive summarization of textual documents, useful for producing condensed versions of source text narratives without being restricted to using only words from the original text. Despite the advances in abstractive summarization, custom generation of summaries (e.g., towards a user’s preference) remains unexplored. In this article, we present CATS, an abstractive neural summarization model that summarizes content in a sequence-to-sequence fashion while also introducing a new mechanism to control the underlying latent topic distribution of the produced summaries. We empirically illustrate the efficacy of our model in producing customized summaries and present findings that facilitate the design of such systems. We use the well-known CNN/DailyMail dataset to evaluate our model. Furthermore, we present a transfer-learning method and demonstrate the effectiveness of our approach in a low resource setting, i.e., abstractive summarization of meetings minutes, where combining the main available meetings’ transcripts datasets, AMI and International Computer Science Institute(ICSI), results in merely a few hundred training documents.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"449 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2021-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77351566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daricia Wilkinson, Öznur Alkan, Q. Liao, Massimiliano Mattetti, Inge Vejsbjerg, Bart P. Knijnenburg, Elizabeth M. Daly
Chatbots or conversational recommenders have gained increasing popularity as a new paradigm for Recommender Systems (RS). Prior work on RS showed that providing explanations can improve transparency and trust, which are critical for the adoption of RS. Their interactive and engaging nature makes conversational recommenders a natural platform to not only provide recommendations but also justify the recommendations through explanations. The recent surge of interest inexplainable AI enables diverse styles of justification, and also invites questions on how styles of justification impact user perception. In this article, we explore the effect of “why” justifications and “why not” justifications on users’ perceptions of explainability and trust. We developed and tested a movie-recommendation chatbot that provides users with different types of justifications for the recommended items. Our online experiment (n = 310) demonstrates that the “why” justifications (but not the “why not” justifications) have a significant impact on users’ perception of the conversational recommender. Particularly, “why” justifications increase users’ perception of system transparency, which impacts perceived control, trusting beliefs and in turn influences users’ willingness to depend on the system’s advice. Finally, we discuss the design implications for decision-assisting chatbots.
{"title":"Why or Why Not? The Effect of Justification Styles on Chatbot Recommendations","authors":"Daricia Wilkinson, Öznur Alkan, Q. Liao, Massimiliano Mattetti, Inge Vejsbjerg, Bart P. Knijnenburg, Elizabeth M. Daly","doi":"10.1145/3441715","DOIUrl":"https://doi.org/10.1145/3441715","url":null,"abstract":"Chatbots or conversational recommenders have gained increasing popularity as a new paradigm for Recommender Systems (RS). Prior work on RS showed that providing explanations can improve transparency and trust, which are critical for the adoption of RS. Their interactive and engaging nature makes conversational recommenders a natural platform to not only provide recommendations but also justify the recommendations through explanations. The recent surge of interest inexplainable AI enables diverse styles of justification, and also invites questions on how styles of justification impact user perception. In this article, we explore the effect of “why” justifications and “why not” justifications on users’ perceptions of explainability and trust. We developed and tested a movie-recommendation chatbot that provides users with different types of justifications for the recommended items. Our online experiment (n = 310) demonstrates that the “why” justifications (but not the “why not” justifications) have a significant impact on users’ perception of the conversational recommender. Particularly, “why” justifications increase users’ perception of system transparency, which impacts perceived control, trusting beliefs and in turn influences users’ willingness to depend on the system’s advice. Finally, we discuss the design implications for decision-assisting chatbots.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"72 1","pages":"1 - 21"},"PeriodicalIF":0.0,"publicationDate":"2021-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80535333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}