MO-Transformer: Extract High-Level Relationship Between Words for Neural Machine Translation
Authors: Sufeng Duan; Hai Zhao
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 5065-5077
Publication date: 2024-11-27
Publication type: Journal Article
DOI: 10.1109/TASLP.2024.3507556
URL: https://ieeexplore.ieee.org/document/10768979/
Journal impact factor: 4.1 (JCR Q1, Acoustics)
Citation count: 0
Abstract
In this paper, we propose an explanation of the representations in self-attention network (SAN) based neural sequence encoders. The explanation regards the information captured by the model as graph structures and the encoding performed by the model as the generation of those graph structures. It applies to existing work on SAN-based models, accounts for the relationship among the ability to capture structural or linguistic information, model depth, and sentence length, and can be extended to other models such as recurrent neural network based models. Based on this explanation, we also propose a revisited multigraph called Multi-order-Graph (MoG), which models the graph structures in a SAN-based model as subgraphs of MoG and recasts the encoding of the SAN-based model as the generation of MoG. We further introduce the MO-Transformer, which enhances the ability to capture multiple subgraphs of different orders and focuses on high-order subgraphs. Experimental results on multiple neural machine translation tasks show that the MO-Transformer improves performance.
About the journal:
The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.