Enhancing Chinese–Braille translation: A two-part approach with token prediction and segmentation labeling
Hailong Yu, Wei Su, Lei Liu, Jing Zhang, Chuan Cai, Cunlu Xu, Huajiu Quan, Yingchun Xie
Displays, Volume 84, Article 102819. Published 2024-09-01. DOI: 10.1016/j.displa.2024.102819
https://www.sciencedirect.com/science/article/pii/S0141938224001835
Journal impact factor: 3.7. JCR: Q1 (Computer Science, Hardware & Architecture).
Citations: 0
Abstract
Assistive systems play a pivotal role in enhancing the quality of life of visually impaired people. Assistive technologies for this group have undergone a remarkable transformation with the advent of deep learning and sophisticated assistive devices. In particular, this paper uses the latest machine translation models and techniques to accomplish the Chinese–Braille translation task, providing convenience for visually impaired individuals. The traditional end-to-end Chinese–Braille translation approach incorporates Braille dots and Braille word segmentation symbols as tokens within the model’s vocabulary. However, our findings reveal that Braille word segmentation is significantly more complex than Braille dot prediction. The paper proposes a novel Two-Part Loss (TPL) method that treats these two tasks separately, leading to significant accuracy improvements. To enhance translation performance further, we introduce a BERT-Enhanced Segmentation Transformer (BEST) method. BEST uses knowledge distillation to transfer knowledge from a pre-trained BERT model to the translation model, mitigating the translation model’s limitations in word segmentation. Additionally, soft label distillation is employed to further improve overall efficacy. The TPL approach achieves average BLEU score improvements of 1.16 and 5.42 for Transformer and GPT models, respectively, on four datasets. In addition, the work presents a two-stage deep learning-based translation approach that outperforms traditional multi-step and end-to-end methods. The proposed two-stage translation method achieves an average BLEU score improvement of 0.85 across four datasets.
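The abstract does not give the TPL formula, so the following is only a minimal sketch of one plausible reading: a cross-entropy term over Braille-cell tokens plus a separate binary term over per-position segmentation labels. The function names, the `seg_weight` parameter, and the choice of a binary boundary label are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_cross_entropy(logits, target_index):
    """Cross-entropy for one position of Braille-cell prediction."""
    return -math.log(softmax(logits)[target_index])


def binary_cross_entropy(logit, label):
    """Stable binary cross-entropy on a raw logit; label is 0 or 1."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def two_part_loss(cell_logits, cell_targets, seg_logits, seg_labels, seg_weight=1.0):
    """Hypothetical two-part loss (assumed form, not taken from the paper).

    cell_logits: one list of vocabulary logits per output position
    cell_targets: index of the correct Braille cell per position
    seg_logits: one raw logit per position (does a word boundary follow?)
    seg_labels: 0/1 segmentation label per position
    seg_weight: assumed weighting between the two parts
    """
    n = len(cell_targets)
    if n == 0 or not (len(cell_logits) == len(seg_logits) == len(seg_labels) == n):
        raise ValueError("all inputs must be non-empty and of equal length")
    cell_loss = sum(
        token_cross_entropy(l, t) for l, t in zip(cell_logits, cell_targets)
    ) / n
    seg_loss = sum(
        binary_cross_entropy(l, y) for l, y in zip(seg_logits, seg_labels)
    ) / n
    return cell_loss + seg_weight * seg_loss, cell_loss, seg_loss


# Toy usage: two positions, a four-cell vocabulary, uninformative logits.
total, cell_part, seg_part = two_part_loss(
    cell_logits=[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    cell_targets=[1, 3],
    seg_logits=[0.0, 0.0],
    seg_labels=[1, 0],
)
```

Keeping the two terms separate makes it possible to inspect or reweight the segmentation part on its own, which fits the abstract's observation that segmentation is the harder of the two tasks.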
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.