VATr++: Choose Your Words Wisely for Handwritten Text Generation

IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 18.6). Pub Date: 2024-10-15. DOI: 10.1109/TPAMI.2024.3481154

Bram Vanherle; Vittorio Pippi; Silvia Cascianelli; Nick Michiels; Frank Van Reeth; Rita Cucchiara

Journal article, vol. 47, no. 2, pp. 934-948. Full text: https://ieeexplore.ieee.org/document/10716806/


Abstract

Styled Handwritten Text Generation (HTG) has received significant attention in recent years, propelled by the success of learning-based solutions employing GANs, Transformers, and, preliminarily, Diffusion Models. Despite this surge in interest, there remains a critical yet understudied aspect – the impact of the input, both visual and textual, on the HTG model training and its subsequent influence on performance. This work extends the VATr (Pippi et al. 2023) Styled-HTG approach by addressing the pre-processing and training issues that it faces, which are common to many HTG models. In particular, we propose generally applicable strategies for input preparation and training regularization that allow the model to achieve better performance and generalization capabilities. Moreover, in this work, we go beyond performance optimization and address a significant hurdle in HTG research – the lack of a standardized evaluation protocol. In particular, we propose a standardization of the evaluation protocol for HTG and conduct a comprehensive benchmarking of existing approaches. By doing so, we aim to establish a foundation for fair and meaningful comparisons between HTG strategies, fostering progress in the field.


