Oren Barkan, Avi Caciularu, Idan Rejwan, Ori Katz, Jonathan Weill, Itzik Malkiel, Noam Koenigstein
We present the Variational Bayesian Network (VBN), a novel Bayesian entity representation learning model that utilizes hierarchical and relational side information and is particularly useful for modeling entities in the "long tail", where data is scarce. VBN provides better modeling for long-tail entities via two complementary mechanisms. First, VBN employs informative hierarchical priors that enable information propagation between entities sharing common ancestors. Additionally, VBN models explicit relations between entities that enforce complementary structure and consistency, guiding the learned representations towards a more meaningful arrangement in space. Second, VBN represents entities by densities (rather than vectors), hence modeling uncertainty that plays a complementary role in coping with data scarcity. Finally, we propose a scalable Variational Bayes optimization algorithm that enables fast approximate Bayesian inference. We evaluate the effectiveness of VBN on linguistic, recommendation, and medical inference tasks. Our findings show that VBN outperforms other existing methods across multiple datasets, especially in the long tail.
{"title":"Representation Learning via Variational Bayesian Networks","authors":"Oren Barkan, Avi Caciularu, Idan Rejwan, Ori Katz, Jonathan Weill, Itzik Malkiel, Noam Koenigstein","doi":"10.1145/3459637.3482363","DOIUrl":"https://doi.org/10.1145/3459637.3482363","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
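The interplay between an informative prior inherited from an ancestor and a density-valued representation can be illustrated with a one-dimensional conjugate Gaussian update. This is only a sketch under stated assumptions, not VBN's variational algorithm: the function name, the scalar setting, and the fixed noise variance are all hypothetical choices made for illustration.

```python
def gaussian_posterior(parent_mean, prior_var, observations, noise_var):
    """Posterior N(mean, var) of a scalar entity representation.

    Assumed model (illustrative, not taken from the paper):
      prior:       entity ~ N(parent_mean, prior_var)
      likelihood:  each observation ~ N(entity, noise_var)
    """
    prior_precision = 1.0 / prior_var
    data_precision = len(observations) / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * parent_mean + sum(observations) / noise_var
    )
    return post_mean, post_var


# A long-tail entity with one observation stays closer to its parent and
# keeps a wider density than an entity with three observations.
rare_mean, rare_var = gaussian_posterior(0.0, 1.0, [2.0], 1.0)
common_mean, common_var = gaussian_posterior(0.0, 1.0, [2.0, 2.0, 2.0], 1.0)
```

With few observations the posterior is dominated by the parent's mean and retains a large variance, which is the intuition behind using priors and uncertainty to cope with scarce data.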
Peiyang Liu, Xi Wang, Lin Wang, Wei Ye, Xiangyu Xi, Shikun Zhang
Distilled BERT models are more suitable than BERT for efficient vertical retrieval in online sponsored vertical search with low-latency requirements, owing to their fewer parameters and faster inference. Unfortunately, most of these models still fall far short of the ideal inference speed. This paper presents a novel and effective method to distill knowledge from BERT into simple fully connected neural networks (FNN). Results of extensive experiments on English and Chinese datasets demonstrate that our method achieves results comparable to existing distilled BERT models while accelerating inference by more than ten times. We have successfully applied our method to our online sponsored vertical search engine and obtained remarkable improvements.
{"title":"Distilling Knowledge from BERT into Simple Fully Connected Neural Networks for Efficient Vertical Retrieval","authors":"Peiyang Liu, Xi Wang, Lin Wang, Wei Ye, Xiangyu Xi, Shikun Zhang","doi":"10.1145/3459637.3481909","DOIUrl":"https://doi.org/10.1145/3459637.3481909","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
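The abstract does not specify the distillation objective, so the following is a minimal sketch of the generic soft-label formulation (a temperature-scaled KL divergence between teacher and student outputs), not the paper's method. The function names and the default temperature are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In such a setup the teacher logits would come from BERT and the student logits from the small fully connected network; the loss is zero only when the two softened distributions coincide.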
Yad Fatah, Mark Nourallah, Lynn Wahab, Fatima K. Abu Salem, Shady Elbassuoni
In a world embroiled in armed conflicts, documenting conflict casualties is an important goal for many NGOs. Most such records, however, are managed through internal databases, spreadsheets, or Web forms, which makes exploring and querying the data extremely chaotic. In this paper, we demonstrate CasualtIS, an RDF data management system for conflict casualties. Our system models conflict casualty data as RDF graphs and allows users to query the data through a SPARQL endpoint. It also includes a template-based natural-language querying interface to support non-expert users. End users can use the system for various purposes, such as fact-checking claims about conflict casualties, aggregating casualties over time and location, and finding contextual information about casualties, such as the cause of death, the actors involved, and other critical information. We demonstrate our system using two case studies, one related to casualties in the Iraqi war and the other related to casualties in the Syrian war.
{"title":"An RDF Data Management System for Conflict Casualties","authors":"Yad Fatah, Mark Nourallah, Lynn Wahab, Fatima K. Abu Salem, Shady Elbassuoni","doi":"10.1145/3459637.3481976","DOIUrl":"https://doi.org/10.1145/3459637.3481976","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
In the classical influence maximization problem we aim to select a set of nodes, called seeds, to start an efficient information diffusion process. More precisely, the goal is to select seeds such that the expected number of nodes reached by the diffusion process is maximized. In this work we study a variant of this problem where an unknown (up to a probability distribution) set of nodes, referred to as co-existing seeds, joins in starting the diffusion process even if not selected. This setting models situations in which some nodes are willing to act as "voluntary seeds" even if not chosen by the campaign organizer. This may for example be due to the positive nature of the information campaign (e.g., public health awareness programs, HIV prevention, financial aid programs), or due to external social driving effects (e.g., nodes are friends of selected seeds in real life or in other social media). In this setting, we study two types of optimization problems. While the first one aims to maximize the expected number of reached nodes, the second one aims to maximize the expected increment in the number of reached nodes in comparison to a non-intervention strategy. The problems (particularly the second one) are motivated by cooperative game theory. For various probability distributions on co-existing seeds, we obtain several algorithms with approximation guarantees as well as hardness and hardness-of-approximation results. We conclude with experiments that demonstrate the usefulness of our approach when co-existing seeds exist.
{"title":"Influence Maximization With Co-Existing Seeds","authors":"R. Becker, Gianlorenzo D'angelo, Hugo Gilbert","doi":"10.1145/3459637.3482439","DOIUrl":"https://doi.org/10.1145/3459637.3482439","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
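The first optimization problem (maximizing expected reach when voluntary seeds may join) can be sketched with a Monte Carlo greedy heuristic. This is an illustration under assumptions, not one of the paper's algorithms and with no approximation guarantee claimed: it assumes the independent cascade diffusion model, assumes each co-existing seed joins independently with a given probability, and uses hypothetical function names.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """One independent-cascade run; returns the number of reached nodes.

    graph maps a node to a list of (neighbor, activation_probability).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, prob in graph.get(node, []):
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, chosen, coexist_prob, rng, runs=200):
    """Average reach when sampled co-existing seeds join the chosen seeds."""
    total = 0
    for _ in range(runs):
        volunteers = {v for v, p in coexist_prob.items() if rng.random() < p}
        total += simulate_cascade(graph, set(chosen) | volunteers, rng)
    return total / runs


def greedy_seeds(graph, nodes, k, coexist_prob, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the best estimated reach."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [n for n in nodes if n not in chosen]
        best = max(
            candidates,
            key=lambda n: expected_spread(
                graph, chosen + [n], coexist_prob, rng, runs
            ),
        )
        chosen.append(best)
    return chosen


# Toy graph: a -> b and d -> e -> f, all edges certain.
toy_graph = {"a": [("b", 1.0)], "d": [("e", 1.0)], "e": [("f", 1.0)]}
toy_nodes = ["a", "b", "d", "e", "f"]
```

On the toy graph, `greedy_seeds(toy_graph, toy_nodes, 1, {})` selects `d`, the node with the largest reach, whereas passing `{"d": 1.0}` as co-existing seeds shifts the choice to `a`, since `d` will start spreading anyway.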
Shaoyuan Huang, Hengda Zhang, Xiaofei Wang, Min Chen, Jianxin Li, Victor C. M. Leung
The arrival of 5G networks has greatly promoted the growth of content delivery services (CDSs). Understanding and predicting the spatio-temporal distribution of CDSs is beneficial to mobile users, Internet content providers, and carriers. Conventional methods for predicting the spatio-temporal distribution of CDSs are mostly base-station (BS) centric, leading to weak generalization and coarse spatial granularity. To improve the spatial accuracy and generalization of modeling, we propose user-centric methods for CDS spatio-temporal analysis. With geocoding and spatio-temporal graph modeling algorithms, CDS records collected from mobile devices are modeled as dynamic graphs with spatio-temporal attributes. Moreover, we propose a spatio-temporal-social multi-feature extraction framework for spatially fine-grained CDS hot spot prediction. Specifically, an edge-enhanced graph convolutional block is designed to encode CDS information based on social relations and spatial dependence features. In addition, we introduce Long Short-Term Memory (LSTM) to further capture the temporal dependence. Experiments on two real-world CDS datasets verify the effectiveness of the proposed framework, and ablation studies are conducted to evaluate the importance of each feature.
{"title":"Spatio-Temporal-Social Multi-Feature-based Fine-Grained Hot Spots Prediction for Content Delivery Services in 5G Era","authors":"Shaoyuan Huang, Hengda Zhang, Xiaofei Wang, Min Chen, Jianxin Li, Victor C. M. Leung","doi":"10.1145/3459637.3482298","DOIUrl":"https://doi.org/10.1145/3459637.3482298","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
Preksha Nema, Alexandros Karatzoglou, Filip Radlinski
Modern recommender systems usually embed users and items into a learned vector space representation. Similarity in this space is used to generate recommendations, and recommendation methods are agnostic to the structure of the embedding space. Motivated by the need for recommendation systems to be more transparent and controllable, we postulate that it is beneficial to assign meaning to some of the dimensions of user and item representations. Disentanglement is one technique commonly used for this purpose. We present a novel supervised disentangling approach for recommendation tasks. Our model learns embeddings where attributes of interest are disentangled, while requiring only a very small number of labeled items at training time. The model can then generate interactive and critiquable recommendations for all users, without requiring any labels at recommendation time, and without sacrificing any recommendation performance. Our approach thus provides users with levers to manipulate, critique and fine-tune recommendations, and gives insight into why particular recommendations are made. Given only user-item interactions at recommendation time, we show that it identifies user tastes with respect to the attributes that have been disentangled, allowing users to manipulate recommendations across these attributes.
{"title":"Disentangling Preference Representations for Recommendation Critiquing with β-VAE","authors":"Preksha Nema, Alexandros Karatzoglou, Filip Radlinski","doi":"10.1145/3459637.3482425","DOIUrl":"https://doi.org/10.1145/3459637.3482425","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
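For reference, the β-VAE named in the title is usually trained by maximizing the objective below. This is the standard unsupervised form; the symbols ($x$ for a user's interaction vector, $z$ for the latent representation, $q_\phi$ and $p_\theta$ for encoder and decoder) are notation assumed here, and whatever supervised terms the paper adds for the labeled attributes are not shown.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - \beta \, D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

Setting $\beta = 1$ recovers the ordinary VAE evidence lower bound, while $\beta > 1$ weights the KL term more heavily, which is what encourages disentangled latent dimensions.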
Haoxiang Zhang, Aécio S. R. Santos, Juliana Freire
With the push for transparency and open data, many datasets and data repositories are becoming available on the Web. This opens new opportunities for data-driven exploration, from empowering analysts to answer new questions and obtain insights to improving predictive models through data augmentation. But as datasets are spread over a plethora of Web sites, finding data that are relevant for a given task is difficult. In this paper, we take a first step towards the construction of domain-specific data lakes. We propose an end-to-end dataset discovery system, targeted at domain experts, which, given a small set of keywords, automatically finds potentially relevant datasets on the Web. The system makes use of search engines to hop across Web sites, uses online learning to incrementally build a model to recognize sites that contain datasets, utilizes a set of discovery actions to broaden the search, and applies a multi-armed-bandit-based algorithm to balance the trade-offs of different discovery actions. We report the results of an extensive experimental evaluation over multiple domains, and demonstrate that our strategy is effective and outperforms state-of-the-art content discovery methods.
{"title":"DSDD","authors":"Haoxiang Zhang, Aécio S. R. Santos, Juliana Freire","doi":"10.1145/3459637.3482427","DOIUrl":"https://doi.org/10.1145/3459637.3482427","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
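Balancing discovery actions with a multi-armed bandit can be sketched as follows. The abstract does not say which bandit algorithm is used, so UCB1 is swapped in here as a common choice; the class name, the action names in the usage note, and the reward definition are all hypothetical.

```python
import math


class UCB1:
    """UCB1 bandit over a fixed set of discovery actions (illustrative)."""

    def __init__(self, actions):
        self.counts = {a: 0 for a in actions}
        self.totals = {a: 0.0 for a in actions}

    def select(self):
        # Try every action once before applying the confidence bound.
        for action, n in self.counts.items():
            if n == 0:
                return action
        rounds = sum(self.counts.values())

        def score(action):
            mean = self.totals[action] / self.counts[action]
            bonus = math.sqrt(2 * math.log(rounds) / self.counts[action])
            return mean + bonus

        return max(self.counts, key=score)

    def update(self, action, reward):
        # reward is assumed to lie in [0, 1], e.g. the fraction of
        # fetched pages that turned out to contain datasets.
        self.counts[action] += 1
        self.totals[action] += reward
```

A crawler loop would call `select()` to pick an action such as `"keyword_search"`, `"related_sites"`, or `"backlinks"` (made-up names), run it, and feed the observed reward back through `update()`, so that productive actions are chosen more often while the others are still explored occasionally.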
In this digital age, people spend a significant portion of their lives online, which has led to an explosion of personal data from users and their activities. Typically, this data is private, and nobody except the user is allowed to look at it. This poses interesting and complex challenges from the point of view of scalable information extraction: information must be extracted under privacy-aware constraints, where there is little data to learn from, yet highly accurate models are needed to run on large amounts of data across different users. Anonymization is typically used to convert private data into publicly accessible data, but this may not always be feasible and may require complex differential privacy guarantees in order to be safe from any potential negative consequences. Other techniques involve building models on a small amount of seen (eyes-on) data and a large amount of unseen (eyes-off) data. In this tutorial, we use emails as representative private data to explain the concepts of scalable IE under privacy-aware constraints.
{"title":"Large-Scale Information Extraction under Privacy-Aware Constraints","authors":"Rajeev Gupta, Ranganath Kondapally","doi":"10.1145/3459637.3482027","DOIUrl":"https://doi.org/10.1145/3459637.3482027","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
Few-shot relation extraction (FSRE) aims to predict the relation for a pair of entities in a sentence by exploring a few labeled instances for each relation type. Current methods mainly rely on meta-learning to learn generalized representations by optimizing the network parameters based on various collections of tasks sampled from training data. However, these methods may suffer from two main issues: 1) insufficient supervision for meta-learning to learn discriminative representations from very few training instances, which are sampled from a large amount of base class data; and 2) spurious correlations between entities and relation types, caused by a biased training procedure that focuses more on entity pairs than on context. To learn more discriminative and unbiased representations for FSRE, this paper proposes a two-stage approach via supervised contrastive learning and sentence- and entity-level prototypical networks. In the first (pre-training) stage, we introduce a supervised contrastive pre-training method, which yields more discriminative representations by learning from all training instances, such that semantically related representations are close to each other and unrelated ones are far apart. In the second (meta-learning) stage, we propose a novel sentence- and entity-level prototypical network equipped with a fine-grained feature-wise fusion strategy to learn unbiased representations, where the networks are initialized with the parameters trained in the first stage. Specifically, the proposed network consists of a sentence branch and an entity branch, taking entire sentences and entity mentions as inputs, respectively. The entity branch explicitly captures the correlation between entity pairs and relations, and then dynamically adjusts the sentence branch's prediction distributions. By doing so, the spurious correlations caused by biased training samples can be properly mitigated. Extensive experiments on two FSRE benchmarks demonstrate the effectiveness of our approach.
{"title":"Learning Discriminative and Unbiased Representations for Few-Shot Relation Extraction","authors":"Jiale Han, Bo Cheng, Guoshun Nan","doi":"10.1145/3459637.3482268","DOIUrl":"https://doi.org/10.1145/3459637.3482268","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
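The prototypical-network step that the second stage builds on can be sketched in a few lines: each relation's prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. This covers only the basic classification rule, not the paper's sentence and entity branches or their fusion; the two-dimensional embeddings and relation names below are invented for illustration.

```python
def class_prototypes(support):
    """Mean embedding per relation; support maps label -> list of vectors."""
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def classify(query, prototypes):
    """Label of the prototype closest to the query embedding."""
    return min(
        prototypes, key=lambda label: squared_distance(query, prototypes[label])
    )


# A 2-way 2-shot episode with made-up sentence embeddings.
support_set = {
    "born_in": [[1.0, 0.0], [0.8, 0.2]],
    "works_for": [[0.0, 1.0], [0.2, 0.8]],
}
protos = class_prototypes(support_set)
prediction = classify([0.9, 0.1], protos)
```

In a real system the vectors would come from an encoder, and the quality of that encoder is what the contrastive pre-training stage is meant to improve.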
The economic policy uncertainty (EPU) index is one of the important text-based indexes in the fields of finance and economics. EPU indexes have been constructed for more than 26 countries to reflect policy uncertainty in country-level economic environments, and they serve as an important leading economic indicator. The EPU indexes are calculated from the number of news articles containing manually selected keywords related to the economy, uncertainty, and policy. We find that the keyword-based EPU indexes contain noise, which affects their explainability and predictability. In our experimental dataset, over 40% of the news articles containing the selected keywords are not related to EPU. Instead of using keywords only, our proposed models take contextual information into account and perform well at identifying the articles unrelated to EPU. The noise-free EPU index performs better than the keyword-based EPU index in both explainability and predictability.
{"title":"Constructing Noise Free Economic Policy Uncertainty Index","authors":"Chung-Chi Chen, Hen-Hsen Huang, Yu-Lieh Huang, Hsin-Hsi Chen","doi":"10.1145/3459637.3482075","DOIUrl":"https://doi.org/10.1145/3459637.3482075","journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management"},"publicationDate":"2021-10-26"}
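The keyword-based counting that the abstract criticizes can be sketched as below, together with an example of the kind of noise it lets through. The keyword lists, function names, and sample sentences are hypothetical; the assumption that an article must match at least one term from each of the three categories is an interpretation of the description, not a detail confirmed by the abstract.

```python
import re

# Hypothetical keyword sets, one per category named in the abstract.
KEYWORDS = {
    "economy": {"economy", "economic"},
    "policy": {"policy", "regulation", "legislation"},
    "uncertainty": {"uncertain", "uncertainty"},
}


def matches_epu_keywords(text):
    """True if the text has at least one keyword from every category."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return all(words & terms for terms in KEYWORDS.values())


def keyword_epu_count(articles):
    """Number of articles counted by the keyword-only rule."""
    return sum(matches_epu_keywords(article) for article in articles)


sample_articles = [
    "Economic policy uncertainty rises as the tax bill stalls.",
    "The team faces an uncertain season under its new economic policy on tickets.",
    "Stocks rallied today.",
]
count = keyword_epu_count(sample_articles)
```

The second sample sentence satisfies the keyword rule even though it has nothing to do with economic policy uncertainty, which is the type of false positive a context-aware classifier would be expected to filter out.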