Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing, and have therefore attracted considerable interest from academic and industry researchers. To date, a great variety of Transformer variants (a.k.a. X-formers) have been proposed; however, a systematic and comprehensive literature review of these variants is still missing. In this survey, we provide a comprehensive review of various X-formers. We first briefly introduce the vanilla Transformer and then propose a new taxonomy of X-formers. Next, we introduce the various X-formers from three perspectives: architectural modification, pre-training, and applications. Finally, we outline some potential directions for future research.
"A survey of transformers" by Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu. AI Open, vol. 3, pp. 111–132 (2022). DOI: 10.1016/j.aiopen.2022.10.001
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.005
Quanyu Dai, Zhenhua Dong, Xu Chen
Debiased recommender models have recently attracted increasing attention from the academic and industry communities. Existing models are mostly based on the inverse propensity score (IPS) technique. However, in the recommendation domain, IPS can be hard to estimate given the sparse and noisy nature of the observed user–item exposure data. To alleviate this problem, in this paper we assume that user preferences are dominated by a small number of latent factors, and propose to cluster the users so that IPS can be computed more accurately over the increased exposure densities within each cluster. This method is similar in spirit to stratification models in applied statistics. However, unlike previous heuristic stratification strategies, we learn the clustering criterion by representing the users with low-rank embeddings, which are further shared with the user representations in the recommender model. Finally, we show that our model has strong connections with the two previous types of debiased recommender models. We conduct extensive experiments on real-world datasets to demonstrate the effectiveness of the proposed method.
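The stratification idea described above can be sketched minimally: group users into clusters (e.g., obtained from k-means on user embeddings), estimate per-item exposure propensity within each stratum, and invert it to obtain IPS weights. This is an illustrative numpy sketch under our own assumptions; the function name and interface are hypothetical, not the paper's implementation.

```python
import numpy as np

def stratified_ips(exposure, clusters, eps=1e-6):
    """Estimate inverse propensity scores by stratifying users into clusters.

    exposure: (n_users, n_items) binary matrix, 1 = item was exposed to user.
    clusters: (n_users,) cluster id per user (e.g., from k-means on embeddings).
    Returns an (n_users, n_items) matrix of IPS weights 1 / p_hat.
    """
    exposure = np.asarray(exposure, dtype=float)
    clusters = np.asarray(clusters)
    propensity = np.zeros_like(exposure)
    for c in np.unique(clusters):
        mask = clusters == c
        # Per-item exposure density inside the stratum serves as the
        # propensity estimate; denser strata give less noisy estimates.
        propensity[mask] = exposure[mask].mean(axis=0)
    # Clip to avoid division by zero for never-exposed items in a stratum.
    return 1.0 / np.clip(propensity, eps, 1.0)
```

Within a stratum, half-exposed items receive weight 2, fully exposed items weight 1, mirroring the usual IPS reweighting of observed feedback.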
"Debiased recommendation with neural stratification" by Quanyu Dai, Zhenhua Dong, Xu Chen. AI Open, vol. 3, pp. 213–217 (2022). DOI: 10.1016/j.aiopen.2022.11.005
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.002
Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente
The National Judicial Examination of China is an essential examination for selecting legal practitioners. In recent years, researchers have tried to use machine learning algorithms to answer examination questions. With the proposal of JEC-QA (Zhong et al., 2020), the judicial examination has become a distinct legal task. The judicial examination data contains two types of questions, i.e., Knowledge-Driven questions and Case-Analysis questions. Both require complex reasoning and text comprehension, making it challenging for computers to answer judicial examination questions. In this paper, we propose Bilinear Convolutional Neural Networks and Attention Networks (BCA), an improved version of the model our team proposed for the Challenge of AI in Law 2021 judicial examination task. It has two essential modules: the Knowledge-Driven Module (KDM) for local feature extraction and the Case-Analysis Module (CAM) for clarifying the semantic differences between the question stem and the options. We also add a post-processing module to correct the results in the final stage. The experimental results show that our system achieves state-of-the-art performance in the offline test of the judicial examination task.
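Bilinear pooling, the core operation behind bilinear CNN features, combines two feature streams through an averaged outer product over spatial or token positions. The sketch below is a generic illustration of that operation, not the BCA codebase; the function name and shapes are our assumptions.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Average outer product of two feature streams over positions.

    feat_a: (positions, d1), feat_b: (positions, d2).
    Returns a (d1, d2) bilinear feature capturing pairwise channel
    interactions between the two streams.
    """
    feat_a = np.asarray(feat_a, dtype=float)
    feat_b = np.asarray(feat_b, dtype=float)
    # Sum of outer products per position, normalized by position count.
    return feat_a.T @ feat_b / feat_a.shape[0]
```

In practice the pooled matrix is flattened and normalized before being fed to a classifier; pooling a stream with itself yields second-order self-interaction features.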
"BCA: Bilinear Convolutional Neural Networks and Attention Networks for legal question answering" by Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente. AI Open, vol. 3, pp. 172–181 (2022). DOI: 10.1016/j.aiopen.2022.11.002
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.07.002
Shu Zhao, Jialin Chen, Jie Chen, Yanping Zhang, Jie Tang
Network embedding (NE) aims to learn low-dimensional vectors for nodes while preserving the network's essential properties (e.g., attributes and structure). Many methods have been proposed to learn node representations, with encouraging results. Recent research has shown that hierarchical labels have potential value for discovering latent hierarchical structures and learning more effective classification information. Nevertheless, most existing network embedding methods either ignore hierarchical labels or learn the label hierarchy separately from the network structure. Learning node embeddings with hierarchical labels faces two challenges: (1) fusing hierarchical labels and network structure is still an arduous task; (2) the data volume imbalance across hierarchical labels is more pronounced than across flat labels. This paper proposes a Hierarchical Label and Attributed Network Structure Fusion model (HANS), which fuses hierarchical labels and nodes through attributes and an attention-based fusion module. In particular, HANS designs a directed hierarchy structure encoder that models label dependencies in three directions (parent–child, child–parent, and sibling) to strengthen the co-occurrence information between labels of different frequencies and reduce the impact of label imbalance. Experiments on real-world datasets demonstrate that the proposed method achieves significantly better performance than state-of-the-art algorithms.
"Hierarchical label with imbalance and attributed network structure fusion for network embedding" by Shu Zhao, Jialin Chen, Jie Chen, Yanping Zhang, Jie Tang. AI Open, vol. 3, pp. 91–100 (2022). DOI: 10.1016/j.aiopen.2022.07.002
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.06.001
Wenwu Zhu, Xin Wang, Pengtao Xie
Conventional machine learning (ML) relies heavily on manual design by machine learning experts to decide learning tasks, data, models, optimization algorithms, and evaluation metrics; this is labor-intensive and time-consuming, and such systems cannot learn autonomously like humans. In education science, self-directed learning, where human learners select learning tasks and materials on their own without hands-on guidance, has been shown to be more effective than passive teacher-guided learning. Inspired by self-directed human learning, we introduce the concept of Self-directed Machine Learning (SDML) and propose a framework for it. Specifically, we design SDML as a self-directed learning process guided by self-awareness, including internal awareness and external awareness. Through self-awareness and without human guidance, the proposed SDML process benefits from self-directed selection of tasks, data, models, optimization strategies, and evaluation metrics. Meanwhile, the learning performance of the SDML process serves as feedback to further improve self-awareness. We propose a mathematical formulation for SDML based on multi-level optimization. Furthermore, we present case studies and potential applications of SDML, followed by a discussion of future research directions. We expect that SDML could enable machines to conduct human-like self-directed learning and provide a new perspective toward artificial general intelligence.
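The abstract does not reproduce the multi-level formulation, but a generic bilevel sketch of the idea (our own notation, not the paper's) would be:

```latex
\min_{s \in \mathcal{S}} \; \mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(s); s\bigr)
\quad \text{s.t.} \quad
\theta^{*}(s) = \argmin_{\theta} \; \mathcal{L}_{\mathrm{train}}(\theta; s),
```

where $s$ collects the self-selected task, data, model, optimization strategy, and evaluation metric, and $\theta$ denotes the model parameters; the outer level plays the role of self-awareness choosing $s$ from the feedback of the inner learning process, and further levels can be stacked in the same pattern.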
"Self-directed machine learning" by Wenwu Zhu, Xin Wang, Pengtao Xie. AI Open, vol. 3, pp. 58–70 (2022). DOI: 10.1016/j.aiopen.2022.06.001
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.09.001
Linmei Hu, Siqi Wei, Ziwang Zhao, Bin Wu
The information age enables people to obtain news online through various channels, yet it also allows false news to spread at unprecedented speed. Fake news is detrimental because it impairs social stability and public trust, which creates an increasing demand for fake news detection (FND). As deep learning (DL) has achieved tremendous success in various domains, it has also been leveraged for FND tasks, surpassing traditional machine-learning-based methods and yielding state-of-the-art performance. In this survey, we present a complete review and analysis of existing DL-based FND methods that exploit various features such as news content, social context, and external knowledge. We organize the methods into supervised, weakly supervised, and unsupervised lines, and for each line we systematically survey the representative methods using different features. Then, we introduce several commonly used FND datasets and give a quantitative analysis of the performance of DL-based FND methods on these datasets. Finally, we analyze the remaining limitations of current approaches and highlight some promising future directions.
"Deep learning for fake news detection: A comprehensive survey" by Linmei Hu, Siqi Wei, Ziwang Zhao, Bin Wu. AI Open, vol. 3, pp. 133–155 (2022). DOI: 10.1016/j.aiopen.2022.09.001
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2021.12.002
Zijie Ye, Haozhe Wu, Jia Jia
The aim of human motion modeling is to understand human behaviors and generate plausible, human-like motion given different priors. With the development of deep learning, researchers increasingly leverage data-driven methods to improve on traditional motion modeling methods. In this paper, we present a comprehensive survey of recent human motion modeling research. We discuss three categories of human motion modeling research: human motion prediction, humanoid motion control, and cross-modal motion synthesis, and provide a detailed review of existing methods. Finally, we discuss the remaining challenges in human motion modeling.
"Human motion modeling with deep learning: A survey" by Zijie Ye, Haozhe Wu, Jia Jia. AI Open, vol. 3, pp. 35–39 (2022). DOI: 10.1016/j.aiopen.2021.12.002
Graph neural networks (GNNs) have been widely adopted for modeling graph-structured data. Most existing GNN studies have focused on designing different strategies to propagate information over the graph structure. After systematic investigation, we observe that the propagation step in GNNs matters, but the resulting performance improvement is insensitive to the location where propagation is applied. Our empirical examination further shows that the performance improvement brought by propagation mostly comes from a phenomenon of distribution alignment, i.e., propagation over graphs actually aligns the underlying distributions of the training and test sets. These findings are instrumental in understanding GNNs, e.g., why decoupled GNNs can work as well as standard GNNs.
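Decoupled GNNs separate feature transformation from propagation, and the propagation step can be written as repeated multiplication by the symmetric-normalized adjacency with self-loops, as in SGC/APPNP-style models. A minimal dense numpy sketch of that step, for illustration only (real implementations use sparse matrices):

```python
import numpy as np

def propagate(adj, x, k=2):
    """k-step feature propagation with D^{-1/2} (A + I) D^{-1/2}.

    adj: (n, n) dense adjacency matrix (no self-loops required).
    x:   (n, d) node feature matrix.
    Returns the smoothed features after k propagation steps; applying
    this before or after the feature transformation gives the two
    placements whose performance the paper finds to be similar.
    """
    a = adj + np.eye(adj.shape[0])          # add self-loops
    d = a.sum(axis=1)                       # node degrees
    a_norm = a / np.sqrt(np.outer(d, d))    # symmetric normalization
    for _ in range(k):
        x = a_norm @ x                      # one smoothing step
    return x
```

On a regular graph, constant node features are a fixed point of this operator, which is one way to see that propagation only redistributes (smooths) information rather than creating it.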
"On the distribution alignment of propagation in graph neural networks" by Qinkai Zheng, Xiao Xia, Kun Zhang, Evgeny Kharlamov, Yuxiao Dong. AI Open, vol. 3, pp. 218–228 (2022). DOI: 10.1016/j.aiopen.2022.11.006
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.001
Xiechao Guo, Ruiping Liu, Dandan Song
Mainstream domain adaptation (DA) methods transfer supervised source-domain knowledge to an unsupervised or semi-supervised target domain, so as to assist the classification task in the target domain. Usually, the supervision contains only the class label of the object. However, when human beings recognize a new object, they not only learn its class label but also relate the object to its parent class, and use this information to learn the similarities and differences between child classes. Inspired by this mechanism, we propose a Hierarchical relation aided Semi-Supervised Domain Adaptation (HSSDA) method, which incorporates hierarchical relations into Semi-Supervised Domain Adaptation (SSDA) to improve classification results. Our model utilizes hierarchical relations by making the parent class label of the labeled data (all source-domain data and part of the target-domain data) part of the supervision, guiding the prototype learning module to encode parent-class information so that prototypes of the same parent class lie closer in the prototype space, which leads to better classification results. Our model performs well on the DomainNet dataset and achieves state-of-the-art results on the semi-supervised DA problem.
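The idea of pulling prototypes that share a parent class closer together can be illustrated with a simple penalty measuring each prototype's deviation from its parent-level mean. This is a hypothetical sketch of the intuition only, not the paper's actual loss; the function name and interface are our assumptions.

```python
import numpy as np

def parent_prototype_penalty(prototypes, parents):
    """Sum of squared distances from each class prototype to the mean
    prototype of its parent class.

    prototypes: (n_classes, d) prototype vectors.
    parents:    (n_classes,) parent-class id per child class.
    Minimizing this pulls prototypes of the same parent class together
    in the prototype space.
    """
    prototypes = np.asarray(prototypes, dtype=float)
    parents = np.asarray(parents)
    penalty = 0.0
    for p in np.unique(parents):
        group = prototypes[parents == p]
        # Deviation of each child prototype from the parent-level centroid.
        penalty += ((group - group.mean(axis=0)) ** 2).sum()
    return penalty
```

Such a term would typically be added to the classification loss with a weighting coefficient, so that the hierarchy shapes the prototype geometry without overriding the per-class supervision.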
"HSSDA: Hierarchical relation aided Semi-Supervised Domain Adaptation" by Xiechao Guo, Ruiping Liu, Dandan Song. AI Open, vol. 3, pp. 156–161 (2022). DOI: 10.1016/j.aiopen.2022.11.001
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.12.002
Yu Cao, Yuanyuan Sun, Ce Xu, Chunnan Li, Jinming Du, Hongfei Lin
"CAILIE 1.0: A dataset for Challenge of AI in Law - Information Extraction V1.0" by Yu Cao, Yuanyuan Sun, Ce Xu, Chunnan Li, Jinming Du, Hongfei Lin. AI Open, vol. 3, pp. 208–212 (2022). DOI: 10.1016/j.aiopen.2022.12.002