Explainable recommendation has recently attracted increasing attention from both academic and industry communities. Among different explainable strategies, generating natural language explanations is an important method, which can deliver more informative, flexible and readable explanations to facilitate better user decisions. Despite the effectiveness, existing models are mostly optimized based on the observed datasets, which can be skewed due to the selection or exposure bias. To alleviate this problem, in this paper, we formulate the task of explainable recommendation with a causal graph, and design a causality enhanced framework to generate unbiased explanations. More specifically, we firstly define an ideal unbiased learning objective, and then derive a tractable loss for the observational data based on the inverse propensity score (IPS), where the key is a sample re-weighting strategy for equalizing the loss and ideal objective in expectation. Considering that the IPS estimated from the sparse and noisy recommendation datasets can be inaccurate, we introduce a fault tolerant mechanism by minimizing the maximum loss induced by the sample weights near the IPS. For more comprehensive modeling, we further analyze and infer the potential latent confounders induced by the complex and diverse user personalities. We conduct extensive experiments by comparing with the state-of-the-art methods based on three real-world datasets to demonstrate the effectiveness of our method.
{"title":"Recommendation with Causality enhanced Natural Language Explanations","authors":"Jingsen Zhang, Xu Chen, Jiakai Tang, Weiqi Shao, Quanyu Dai, Zhenhua Dong, Rui Zhang","doi":"10.1145/3543507.3583260","DOIUrl":"https://doi.org/10.1145/3543507.3583260","url":null,"abstract":"Explainable recommendation has recently attracted increasing attention from both academic and industry communities. Among different explainable strategies, generating natural language explanations is an important method, which can deliver more informative, flexible and readable explanations to facilitate better user decisions. Despite the effectiveness, existing models are mostly optimized based on the observed datasets, which can be skewed due to the selection or exposure bias. To alleviate this problem, in this paper, we formulate the task of explainable recommendation with a causal graph, and design a causality enhanced framework to generate unbiased explanations. More specifically, we firstly define an ideal unbiased learning objective, and then derive a tractable loss for the observational data based on the inverse propensity score (IPS), where the key is a sample re-weighting strategy for equalizing the loss and ideal objective in expectation. Considering that the IPS estimated from the sparse and noisy recommendation datasets can be inaccurate, we introduce a fault tolerant mechanism by minimizing the maximum loss induced by the sample weights near the IPS. For more comprehensive modeling, we further analyze and infer the potential latent confounders induced by the complex and diverse user personalities. We conduct extensive experiments by comparing with the state-of-the-art methods based on three real-world datasets to demonstrate the effectiveness of our method.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134298381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile extended reality (XR) has developed rapidly in recent years. Compared with the app-based XR, XR in web browsers has the advantages of being lightweight and cross-platform, providing users with a pervasive experience. Therefore, many frameworks are emerging to support the development of XR in web browsers. However, little has been known about how well these frameworks perform and how complex XR apps modern web browsers can support on mobile devices. To fill the knowledge gap, in this paper, we conduct an empirical study of mobile XR in web browsers. We select seven most popular web-based XR frameworks and investigate their runtime performance, including 3D rendering, camera capturing, and real-world understanding. We find that current frameworks have the potential to further enhance their performance by increasing GPU utilization or improving computing parallelism. Besides, for 3D scenes with good rendering performance, developers can feel free to add camera capturing with little influence on performance to support augmented reality (AR) and mixed reality (MR) applications. Based on our findings, we draw several practical implications to provide better XR support in web browsers.
{"title":"Demystifying Mobile Extended Reality in Web Browsers: How Far Can We Go?","authors":"Weichen Bi, Yun Ma, Deyu Tian, Qi Yang, Mingtao Zhang, Xiang Jing","doi":"10.1145/3543507.3583329","DOIUrl":"https://doi.org/10.1145/3543507.3583329","url":null,"abstract":"Mobile extended reality (XR) has developed rapidly in recent years. Compared with the app-based XR, XR in web browsers has the advantages of being lightweight and cross-platform, providing users with a pervasive experience. Therefore, many frameworks are emerging to support the development of XR in web browsers. However, little has been known about how well these frameworks perform and how complex XR apps modern web browsers can support on mobile devices. To fill the knowledge gap, in this paper, we conduct an empirical study of mobile XR in web browsers. We select seven most popular web-based XR frameworks and investigate their runtime performance, including 3D rendering, camera capturing, and real-world understanding. We find that current frameworks have the potential to further enhance their performance by increasing GPU utilization or improving computing parallelism. Besides, for 3D scenes with good rendering performance, developers can feel free to add camera capturing with little influence on performance to support augmented reality (AR) and mixed reality (MR) applications. Based on our findings, we draw several practical implications to provide better XR support in web browsers.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133900868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software as a Service (SaaS) e-commerce platforms for merchants allow individual business owners to set up their online stores almost instantly. Prior work has shown that the checkout flows and payment integration of some e-commerce applications are vulnerable to logic bugs with serious financial consequences, e.g., allowing “shopping for free”. Apart from checkout and payment integration, vulnerabilities in other e-commerce operations have remained largely unexplored, even though they can have far more serious consequences, e.g., enabling “store takeover”. In this work, we design and implement a security evaluation framework to uncover security vulnerabilities in e-commerce operations beyond checkout/payment integration. We use this framework to analyze 32 representative e-commerce platforms, including web services of 24 commercial SaaS platforms and 15 associated Android apps, and 8 open source platforms; these platforms host over 10 million stores as approximated through Google dorks. We uncover several new vulnerabilities with serious consequences, e.g., allowing an attacker to take over all stores under a platform, and listing illegal products at a victim’s store—in addition to “shopping for free” bugs, without exploiting the checkout/payment process. We found 12 platforms vulnerable to store takeover (affecting 41000+ stores) and 6 platforms vulnerable to shopping for free (affecting 19000+ stores, approximated via Google dorks on Oct. 8, 2022). We have responsibly disclosed the vulnerabilities to all affected parties, and requested four CVEs (three assigned, and one is pending review).
{"title":"All Your Shops Are Belong to Us: Security Weaknesses in E-commerce Platforms","authors":"Rohan Pagey, Mohammad Mannan, Amr M. Youssef","doi":"10.1145/3543507.3583319","DOIUrl":"https://doi.org/10.1145/3543507.3583319","url":null,"abstract":"Software as a Service (SaaS) e-commerce platforms for merchants allow individual business owners to set up their online stores almost instantly. Prior work has shown that the checkout flows and payment integration of some e-commerce applications are vulnerable to logic bugs with serious financial consequences, e.g., allowing “shopping for free”. Apart from checkout and payment integration, vulnerabilities in other e-commerce operations have remained largely unexplored, even though they can have far more serious consequences, e.g., enabling “store takeover”. In this work, we design and implement a security evaluation framework to uncover security vulnerabilities in e-commerce operations beyond checkout/payment integration. We use this framework to analyze 32 representative e-commerce platforms, including web services of 24 commercial SaaS platforms and 15 associated Android apps, and 8 open source platforms; these platforms host over 10 million stores as approximated through Google dorks. We uncover several new vulnerabilities with serious consequences, e.g., allowing an attacker to take over all stores under a platform, and listing illegal products at a victim’s store—in addition to “shopping for free” bugs, without exploiting the checkout/payment process. We found 12 platforms vulnerable to store takeover (affecting 41000+ stores) and 6 platforms vulnerable to shopping for free (affecting 19000+ stores, approximated via Google dorks on Oct. 8, 2022). We have responsibly disclosed the vulnerabilities to all affected parties, and requested four CVEs (three assigned, and one is pending review).","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123848384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Riting Xia, Yan Zhang, Chunxu Zhang, Xueyan Liu, Bo Yang
Variational graph autoencoder (VGAE) is a promising deep probabilistic model in graph representation learning. However, most existing VGAEs adopt the mean-field assumption, and cannot characterize the graphs with noise well. In this paper, we propose a novel deep probabilistic model for graph analysis, termed Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks (named SPN-MVGAE), which helps to relax the mean-field assumption and learns better latent representation with fault tolerance. Our proposed model SPN-MVGAE uses conditional sum-product networks as constraints to learn the dependencies between latent factors in an end-to-end manner. Furthermore, we introduce the superposition of the latent representations learned by multiple variational networks to represent the final latent representations of nodes. Our model is the first use sum-product networks for graph representation learning, extending the scope of sum-product networks applications. Experimental results show that compared with other baseline methods, our model has competitive advantages in link prediction, fault tolerance, node classification, and graph visualization on real datasets.
{"title":"Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks","authors":"Riting Xia, Yan Zhang, Chunxu Zhang, Xueyan Liu, Bo Yang","doi":"10.1145/3543507.3583517","DOIUrl":"https://doi.org/10.1145/3543507.3583517","url":null,"abstract":"Variational graph autoencoder (VGAE) is a promising deep probabilistic model in graph representation learning. However, most existing VGAEs adopt the mean-field assumption, and cannot characterize the graphs with noise well. In this paper, we propose a novel deep probabilistic model for graph analysis, termed Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks (named SPN-MVGAE), which helps to relax the mean-field assumption and learns better latent representation with fault tolerance. Our proposed model SPN-MVGAE uses conditional sum-product networks as constraints to learn the dependencies between latent factors in an end-to-end manner. Furthermore, we introduce the superposition of the latent representations learned by multiple variational networks to represent the final latent representations of nodes. Our model is the first use sum-product networks for graph representation learning, extending the scope of sum-product networks applications. Experimental results show that compared with other baseline methods, our model has competitive advantages in link prediction, fault tolerance, node classification, and graph visualization on real datasets.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127903408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shiguang Wu, Yaqing Wang, Qinghe Jing, Daxiang Dong, D. Dou, Quanming Yao
Making personalized recommendation for cold-start users, who only have a few interaction histories, is a challenging problem in recommendation systems. Recent works leverage hypernetworks to directly map user interaction histories to user-specific parameters, which are then used to modulate predictor by feature-wise linear modulation function. These works obtain the state-of-the-art performance. However, the physical meaning of scaling and shifting in recommendation data is unclear. Instead of using a fixed modulation function and deciding modulation position by expertise, we propose a modulation framework called ColdNAS for user cold-start problem, where we look for proper modulation structure, including function and position, via neural architecture search. We design a search space which covers broad models and theoretically prove that this search space can be transformed to a much smaller space, enabling an efficient and robust one-shot search algorithm. Extensive experimental results on benchmark datasets show that ColdNAS consistently performs the best. We observe that different modulation functions lead to the best performance on different datasets, which validates the necessity of designing a searching-based method. Codes are available at https://github.com/LARS-research/ColdNAS.
{"title":"ColdNAS: Search to Modulate for User Cold-Start Recommendation","authors":"Shiguang Wu, Yaqing Wang, Qinghe Jing, Daxiang Dong, D. Dou, Quanming Yao","doi":"10.1145/3543507.3583344","DOIUrl":"https://doi.org/10.1145/3543507.3583344","url":null,"abstract":"Making personalized recommendation for cold-start users, who only have a few interaction histories, is a challenging problem in recommendation systems. Recent works leverage hypernetworks to directly map user interaction histories to user-specific parameters, which are then used to modulate predictor by feature-wise linear modulation function. These works obtain the state-of-the-art performance. However, the physical meaning of scaling and shifting in recommendation data is unclear. Instead of using a fixed modulation function and deciding modulation position by expertise, we propose a modulation framework called ColdNAS for user cold-start problem, where we look for proper modulation structure, including function and position, via neural architecture search. We design a search space which covers broad models and theoretically prove that this search space can be transformed to a much smaller space, enabling an efficient and robust one-shot search algorithm. Extensive experimental results on benchmark datasets show that ColdNAS consistently performs the best. We observe that different modulation functions lead to the best performance on different datasets, which validates the necessity of designing a searching-based method. Codes are available at https://github.com/LARS-research/ColdNAS.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121442178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Auto-bidding is now widely adopted as an interface between advertisers and internet advertising as it allows advertisers to specify high-level goals, such as maximizing value subject to a value-per-spend constraint. Prior research has mainly focused on auctions that are truthful (such as a second-price auction) because these auctions admit simple (uniform) bidding strategies and are thus simpler to analyze. The main contribution of this paper is to characterize the efficiency across the spectrum of all auctions, including non-truthful auctions for which optimal bidding may be complex. For deterministic auctions, we show a dominance result: any uniform bidding equilibrium of a second-price auction (SPA) can be mapped to an equilibrium of any other auction – for example, first price auction (FPA) – with identical outcomes. In this sense, SPA with uniform bidding is an instance-wise optimal deterministic auction. Consequently, the price of anarchy (PoA) of any deterministic auction is at least the PoA of SPA with uniform bidding, which is known to be 2. We complement this by showing that the PoA of FPA without uniform bidding is 2. Next, we show, surprisingly, that truthful pricing is not dominant in the randomized setting. There is a randomized version of FPA that achieves a strictly smaller price of anarchy than its truthful counterpart when there are two bidders per query. Furthermore, this randomized FPA achieves the best-known PoA for two bidders, thus showing the power of non-truthfulness when combined with randomization. Finally, we show that no prior-free auction (even randomized, non-truthful) can improve on a PoA bound of 2 when there are a large number of advertisers per auction. These results should be interpreted qualitatively as follows. When the auction pressure is low, randomization and non-truthfulness is beneficial. On the other hand, if the auction pressure is intense, the benefits diminishes and it is optimal to implement a second-price auction.
{"title":"Efficiency of Non-Truthful Auctions in Auto-bidding: The Power of Randomization","authors":"Christopher Liaw, Aranyak Mehta, Andres Perlroth","doi":"10.1145/3543507.3583492","DOIUrl":"https://doi.org/10.1145/3543507.3583492","url":null,"abstract":"Auto-bidding is now widely adopted as an interface between advertisers and internet advertising as it allows advertisers to specify high-level goals, such as maximizing value subject to a value-per-spend constraint. Prior research has mainly focused on auctions that are truthful (such as a second-price auction) because these auctions admit simple (uniform) bidding strategies and are thus simpler to analyze. The main contribution of this paper is to characterize the efficiency across the spectrum of all auctions, including non-truthful auctions for which optimal bidding may be complex. For deterministic auctions, we show a dominance result: any uniform bidding equilibrium of a second-price auction (SPA) can be mapped to an equilibrium of any other auction – for example, first price auction (FPA) – with identical outcomes. In this sense, SPA with uniform bidding is an instance-wise optimal deterministic auction. Consequently, the price of anarchy (PoA) of any deterministic auction is at least the PoA of SPA with uniform bidding, which is known to be 2. We complement this by showing that the PoA of FPA without uniform bidding is 2. Next, we show, surprisingly, that truthful pricing is not dominant in the randomized setting. There is a randomized version of FPA that achieves a strictly smaller price of anarchy than its truthful counterpart when there are two bidders per query. Furthermore, this randomized FPA achieves the best-known PoA for two bidders, thus showing the power of non-truthfulness when combined with randomization. Finally, we show that no prior-free auction (even randomized, non-truthful) can improve on a PoA bound of 2 when there are a large number of advertisers per auction. These results should be interpreted qualitatively as follows. When the auction pressure is low, randomization and non-truthfulness is beneficial. On the other hand, if the auction pressure is intense, the benefits diminishes and it is optimal to implement a second-price auction.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"121 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129406046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent studies have demonstrated the vulnerability of recommender systems to membership inference attacks, which determine whether a user’s historical data was utilized for model training, posing serious privacy leakage issues. Existing works assumed that member and non-member users follow different recommendation modes, and then infer membership based on the difference vector between the user’s historical behaviors and the recommendation list. The previous frameworks are invalid against inductive recommendations, such as sequential recommendations, since the disparities of difference vectors constructed by the recommendations between members and non-members become imperceptible. This motivates us to dig deeper into the target model. In addition, most MIA frameworks assume that they can obtain some in-distribution data from the same distribution of the target data, which is hard to gain in recommender system. To address these difficulties, we propose a Membership Inference Attack framework against sequential recommenders based on Model Extraction(ME-MIA). Specifically, we train a surrogate model to simulate the target model based on two universal loss functions. For a given behavior sequence, the loss functions ensure the recommended items and corresponding rank of the surrogate model are consistent with the target model’s recommendation. Due to the special training mode of the surrogate model, it is hard to judge which user is its member(non-member). Therefore, we establish a shadow model and use shadow model’s members(non-members) to train the attack model later. Next, we build a user feature generator to construct representative feature vectors from the shadow(surrogate) model. The crafting feature vectors are finally input into the attack model to identify users’ membership. Furthermore, to tackle the high cost of obtaining in-distribution data, we develop two variants of ME-MIA, realizing data-efficient and even data-free MIA by fabricating authentic in-distribution data. Notably, the latter is impossible in the previous works. Finally, we evaluate ME-MIA against multiple sequential recommendation models on three real-world datasets. Experimental results show that ME-MIA and its variants can achieve efficient extraction and outperform state-of-the-art algorithms in terms of attack performance.
{"title":"Membership Inference Attacks Against Sequential Recommender Systems","authors":"Zhihao Zhu, Chenwang Wu, Rui Fan, Defu Lian, Enhong Chen","doi":"10.1145/3543507.3583447","DOIUrl":"https://doi.org/10.1145/3543507.3583447","url":null,"abstract":"Recent studies have demonstrated the vulnerability of recommender systems to membership inference attacks, which determine whether a user’s historical data was utilized for model training, posing serious privacy leakage issues. Existing works assumed that member and non-member users follow different recommendation modes, and then infer membership based on the difference vector between the user’s historical behaviors and the recommendation list. The previous frameworks are invalid against inductive recommendations, such as sequential recommendations, since the disparities of difference vectors constructed by the recommendations between members and non-members become imperceptible. This motivates us to dig deeper into the target model. In addition, most MIA frameworks assume that they can obtain some in-distribution data from the same distribution of the target data, which is hard to gain in recommender system. To address these difficulties, we propose a Membership Inference Attack framework against sequential recommenders based on Model Extraction(ME-MIA). Specifically, we train a surrogate model to simulate the target model based on two universal loss functions. For a given behavior sequence, the loss functions ensure the recommended items and corresponding rank of the surrogate model are consistent with the target model’s recommendation. Due to the special training mode of the surrogate model, it is hard to judge which user is its member(non-member). Therefore, we establish a shadow model and use shadow model’s members(non-members) to train the attack model later. Next, we build a user feature generator to construct representative feature vectors from the shadow(surrogate) model. The crafting feature vectors are finally input into the attack model to identify users’ membership. Furthermore, to tackle the high cost of obtaining in-distribution data, we develop two variants of ME-MIA, realizing data-efficient and even data-free MIA by fabricating authentic in-distribution data. Notably, the latter is impossible in the previous works. Finally, we evaluate ME-MIA against multiple sequential recommendation models on three real-world datasets. Experimental results show that ME-MIA and its variants can achieve efficient extraction and outperform state-of-the-art algorithms in terms of attack performance.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"409 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116134220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combinatorial drug recommendation involves recommending a personalized combination of medication (drugs) to a patient over his/her longitudinal history, which essentially aims at solving a combinatorial optimization problem that pursues high accuracy under the safety constraint. Among existing learning-based approaches, the association between drug substructures (i.e., a sub-graph of the molecule that contributes to certain chemical effect) and the target disease is largely overlooked, though the function of drugs in fact exhibits strong relevance with particular substructures. To address this issue, we propose a molecular substructure-aware encoding method entitled MoleRec that entails a hierarchical architecture aimed at modeling inter-substructure interactions and individual substructures’ impact on patient’s health condition, in order to identify those substructures that really contribute to healing patients. Specifically, MoleRec learns to attentively pooling over substructure representations which will be element-wisely re-scaled by the model’s inferred relevancy with a patient’s health condition to obtain a prior-knowledge-informed drug representation. We further design a weight annealing strategy for drug-drug-interaction (DDI) objective to adaptively control the balance between accuracy and safety criteria throughout training. Experiments on the MIMIC-III dataset demonstrate that our approach achieves new state-of-the-art performance w.r.t. four accuracy and safety metrics. Our source code is publicly available at https://github.com/yangnianzu0515/MoleRec.
{"title":"MoleRec: Combinatorial Drug Recommendation with Substructure-Aware Molecular Representation Learning","authors":"Nianzu Yang, Kaipeng Zeng, Qitian Wu, Junchi Yan","doi":"10.1145/3543507.3583872","DOIUrl":"https://doi.org/10.1145/3543507.3583872","url":null,"abstract":"Combinatorial drug recommendation involves recommending a personalized combination of medication (drugs) to a patient over his/her longitudinal history, which essentially aims at solving a combinatorial optimization problem that pursues high accuracy under the safety constraint. Among existing learning-based approaches, the association between drug substructures (i.e., a sub-graph of the molecule that contributes to certain chemical effect) and the target disease is largely overlooked, though the function of drugs in fact exhibits strong relevance with particular substructures. To address this issue, we propose a molecular substructure-aware encoding method entitled MoleRec that entails a hierarchical architecture aimed at modeling inter-substructure interactions and individual substructures’ impact on patient’s health condition, in order to identify those substructures that really contribute to healing patients. Specifically, MoleRec learns to attentively pooling over substructure representations which will be element-wisely re-scaled by the model’s inferred relevancy with a patient’s health condition to obtain a prior-knowledge-informed drug representation. We further design a weight annealing strategy for drug-drug-interaction (DDI) objective to adaptively control the balance between accuracy and safety criteria throughout training. Experiments on the MIMIC-III dataset demonstrate that our approach achieves new state-of-the-art performance w.r.t. four accuracy and safety metrics. Our source code is publicly available at https://github.com/yangnianzu0515/MoleRec.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117067693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikita Gautam, David Shumway, Megan Kowalcyk, Sarthak Khanal, Doina Caragea, Cornelia Caragea, H. Mcginty, S. Dorevitch
A knowledge graph focusing on water quality in relation to health risks posed by water activities (such as diving or swimming) is not currently available. To address this limitation, we first use existing resources to construct a knowledge graph relevant to water quality and health risks using KNowledge Acquisition and Representation Methodology (KNARM). Subsequently, we explore knowledge graph completion approaches for maintaining and updating the graph. Specifically, we manually identify a set of domain-specific UMLS concepts and use them to extract a graph of approximately 75,000 semantic triples from the Semantic MEDLINE database (which contains head-relation-tail triples extracted from PubMed). Using the resulting knowledge graph, we experiment with the KG-BERT approach for graph completion by employing pre-trained BERT/RoBERTa models and also models fine-tuned on a collection of water quality and health risks abstracts retrieved from the Web of Science. Experimental results show that KG-BERT with BERT/RoBERTa models fine-tuned on a domain-specific corpus improves the performance of KG-BERT with pre-trained models. Furthermore, KG-BERT gives better results than several translational distance or semantic matching baseline models.
目前还没有关于水质与水上活动(如潜水或游泳)造成的健康风险的知识图谱。为了解决这一限制,我们首先使用知识获取和表示方法(KNARM)利用现有资源构建了与水质和健康风险相关的知识图谱。随后,我们探索了维护和更新知识图的知识图补全方法。具体来说,我们手动识别一组特定于领域的UMLS概念,并使用它们从semantic MEDLINE数据库(其中包含从PubMed提取的头-尾三元组)中提取大约75,000个语义三元组的图。使用得到的知识图,我们通过使用预训练的BERT/RoBERTa模型和对从Web of Science检索的水质和健康风险摘要集合进行微调的模型,对KG-BERT方法进行图补全实验。实验结果表明,在特定领域语料库上对BERT/RoBERTa模型进行微调的KG-BERT可以提高预训练模型的KG-BERT的性能。此外,KG-BERT给出了比几种翻译距离或语义匹配基线模型更好的结果。
{"title":"Leveraging Existing Literature on the Web and Deep Neural Models to Build a Knowledge Graph Focused on Water Quality and Health Risks","authors":"Nikita Gautam, David Shumway, Megan Kowalcyk, Sarthak Khanal, Doina Caragea, Cornelia Caragea, H. Mcginty, S. Dorevitch","doi":"10.1145/3543507.3584185","DOIUrl":"https://doi.org/10.1145/3543507.3584185","url":null,"abstract":"A knowledge graph focusing on water quality in relation to health risks posed by water activities (such as diving or swimming) is not currently available. To address this limitation, we first use existing resources to construct a knowledge graph relevant to water quality and health risks using KNowledge Acquisition and Representation Methodology (KNARM). Subsequently, we explore knowledge graph completion approaches for maintaining and updating the graph. Specifically, we manually identify a set of domain-specific UMLS concepts and use them to extract a graph of approximately 75,000 semantic triples from the Semantic MEDLINE database (which contains head-relation-tail triples extracted from PubMed). Using the resulting knowledge graph, we experiment with the KG-BERT approach for graph completion by employing pre-trained BERT/RoBERTa models and also models fine-tuned on a collection of water quality and health risks abstracts retrieved from the Web of Science. Experimental results show that KG-BERT with BERT/RoBERTa models fine-tuned on a domain-specific corpus improves the performance of KG-BERT with pre-trained models. Furthermore, KG-BERT gives better results than several translational distance or semantic matching baseline models.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117190703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuefeng Zhang, Richong Zhang, Xiaoyang Li, Fanshuang Kong, J. Chen, Samuel Mensah, Yongyi Mao
Word Sense Disambiguation (WSD) which aims to identify the correct sense of a target word appearing in a specific context is essential for web text analysis. The use of glosses has been explored as a means for WSD. However, only a few works model the correlation between the target context and gloss. We add to the body of literature by presenting a model that employs a multi-head attention mechanism on deep contextual features of the target word and candidate glosses to refine the target word embedding. Furthermore, to encourage the model to learn the relevant part of target features that align with the correct gloss, we recursively alternate attention on target word features and that of candidate glosses to gradually extract the relevant contextual features of the target word, refining its representation and strengthening the final disambiguation results. Empirical studies on the five most commonly used benchmark datasets show that our proposed model is effective and achieves state-of-the-art results.
{"title":"Word Sense Disambiguation by Refining Target Word Embedding","authors":"Xuefeng Zhang, Richong Zhang, Xiaoyang Li, Fanshuang Kong, J. Chen, Samuel Mensah, Yongyi Mao","doi":"10.1145/3543507.3583191","DOIUrl":"https://doi.org/10.1145/3543507.3583191","url":null,"abstract":"Word Sense Disambiguation (WSD) which aims to identify the correct sense of a target word appearing in a specific context is essential for web text analysis. The use of glosses has been explored as a means for WSD. However, only a few works model the correlation between the target context and gloss. We add to the body of literature by presenting a model that employs a multi-head attention mechanism on deep contextual features of the target word and candidate glosses to refine the target word embedding. Furthermore, to encourage the model to learn the relevant part of target features that align with the correct gloss, we recursively alternate attention on target word features and that of candidate glosses to gradually extract the relevant contextual features of the target word, refining its representation and strengthening the final disambiguation results. Empirical studies on the five most commonly used benchmark datasets show that our proposed model is effective and achieves state-of-the-art results.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"220 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114339309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}