Build a Good Human-Free Prompt Tuning: Jointly Pre-Trained Template and Verbalizer for Few-Shot Classification

IF 10.4 | CAS Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | IEEE Transactions on Knowledge and Data Engineering | Pub Date: 2025-02-18 | DOI: 10.1109/TKDE.2025.3543422
Mouxiang Chen;Han Fu;Chenghao Liu;Xiaoyun Joy Wang;Zhuo Li;Jianling Sun
{"title":"Build a Good Human-Free Prompt Tuning: Jointly Pre-Trained Template and Verbalizer for Few-Shot Classification","authors":"Mouxiang Chen;Han Fu;Chenghao Liu;Xiaoyun Joy Wang;Zhuo Li;Jianling Sun","doi":"10.1109/TKDE.2025.3543422","DOIUrl":null,"url":null,"abstract":"Prompt tuning for pre-trained language models (PLMs) has been an effective approach for few-shot text classification. To make a prediction, a typical prompt tuning method employs a template wrapping the input text into a cloze question, and a verbalizer mapping the output embedding to labels. However, current methods typically depend on handcrafted templates and verbalizers, which require much domain-specific prior knowledge by human efforts. In this work, we investigate how to build a good human-free prompt tuning using soft prompt templates and soft verbalizers, which can be learned directly from data. To address the challenge of data scarcity, we integrate a set of trainable bases for sentence representation to transfer the contextual information into a low-dimensional space. By jointly pre-training the soft prompts and the bases using contrastive learning, the projection space can catch critical semantics at the sentence level, which could be transferred to various downstream tasks. To better bridge the gap between downstream tasks and the pre-training procedure, we formulate the few-shot classification tasks as another contrastive learning problem. We name this Jointly Pretrained Template and Verbalizer (JPTV). Extensive experiments show that this human-free prompt tuning can achieve comparable or even better performance than manual prompt tuning.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 5","pages":"2253-2265"},"PeriodicalIF":10.4000,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10891939/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Prompt tuning for pre-trained language models (PLMs) has been an effective approach for few-shot text classification. To make a prediction, a typical prompt tuning method employs a template that wraps the input text into a cloze question and a verbalizer that maps the output embedding to labels. However, current methods typically depend on handcrafted templates and verbalizers, which require substantial domain-specific prior knowledge and human effort. In this work, we investigate how to build a good human-free prompt tuning method using soft prompt templates and soft verbalizers, both of which can be learned directly from data. To address the challenge of data scarcity, we integrate a set of trainable bases for sentence representation that transfer the contextual information into a low-dimensional space. By jointly pre-training the soft prompts and the bases with contrastive learning, the projection space can capture critical semantics at the sentence level, which can be transferred to various downstream tasks. To better bridge the gap between downstream tasks and the pre-training procedure, we formulate few-shot classification as another contrastive learning problem. We name this approach the Jointly Pre-trained Template and Verbalizer (JPTV). Extensive experiments show that this human-free prompt tuning achieves performance comparable to, or even better than, manual prompt tuning.
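To make the pipeline described in the abstract concrete, here is a minimal sketch (in PyTorch) of how a soft prompt template and a soft verbalizer replace their handcrafted counterparts: trainable prompt vectors are prepended to the input embeddings, and a trainable label-embedding matrix scores the [MASK]-position output instead of a fixed label-word mapping. This is an illustration only, not the authors' JPTV implementation; the backbone name, prompt length, and class count are placeholders.

```python
# Minimal sketch of a soft prompt template + soft verbalizer.
# Illustrative only; assumes a BERT-style masked LM as the PLM.
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "bert-base-uncased"   # placeholder backbone
N_PROMPT, N_CLASSES = 10, 2   # placeholder prompt length / class count

tok = AutoTokenizer.from_pretrained(MODEL)
plm = AutoModelForMaskedLM.from_pretrained(MODEL)
hidden = plm.config.hidden_size

# Soft template: trainable prompt vectors prepended to the input embeddings.
soft_prompt = nn.Parameter(torch.randn(N_PROMPT, hidden) * 0.02)
# Soft verbalizer: trainable label embeddings replace handcrafted label words.
label_emb = nn.Parameter(torch.randn(N_CLASSES, hidden) * 0.02)

def classify(text: str) -> torch.Tensor:
    # Wrap the input as a cloze question with a [MASK] slot.
    enc = tok(text + " It was [MASK].", return_tensors="pt")
    input_emb = plm.get_input_embeddings()(enc["input_ids"])
    # Prepend the soft prompt to the token embeddings.
    emb = torch.cat([soft_prompt.unsqueeze(0), input_emb], dim=1)
    mask = torch.cat(
        [torch.ones(1, N_PROMPT, dtype=torch.long), enc["attention_mask"]],
        dim=1,
    )
    out = plm(inputs_embeds=emb, attention_mask=mask, output_hidden_states=True)
    # Locate the [MASK] position, shifted by the prepended prompt tokens.
    mask_pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0] + N_PROMPT
    h = out.hidden_states[-1][0, mask_pos]   # [MASK] representation
    return h @ label_emb.T                   # scores against soft labels
```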
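The joint pre-training objective can be sketched in the same spirit. The snippet below projects sentence-level representations onto a set of trainable bases (the low-dimensional space mentioned in the abstract) and applies an InfoNCE-style contrastive loss over positive pairs; the basis count, temperature, and pairing scheme are assumptions, not details taken from the paper.

```python
# Illustrative contrastive pre-training over trainable bases (a sketch under
# assumed details; the paper's exact objective may differ).
import torch
import torch.nn.functional as F

HIDDEN, N_BASES, TAU = 768, 64, 0.07   # placeholder sizes / temperature
bases = torch.nn.Parameter(torch.randn(N_BASES, HIDDEN) * 0.02)

def project(h: torch.Tensor) -> torch.Tensor:
    # Coordinates of sentence representations h (batch, HIDDEN) in the
    # low-dimensional space spanned by the trainable bases.
    return F.normalize(h @ bases.T, dim=-1)   # (batch, N_BASES)

def info_nce(h_a: torch.Tensor, h_b: torch.Tensor) -> torch.Tensor:
    # h_a[i] and h_b[i] are two views of the same sentence (positives);
    # every other pair in the batch serves as a negative.
    z_a, z_b = project(h_a), project(h_b)
    logits = z_a @ z_b.T / TAU                # (batch, batch) similarities
    targets = torch.arange(z_a.size(0))       # positives on the diagonal
    return F.cross_entropy(logits, targets)
```

At fine-tuning time, the same loss form can be reused by treating few-shot examples that share a label as positives, which is one way to read the abstract's claim that classification is formulated "as another contrastive learning problem."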
Source journal: IEEE Transactions on Knowledge and Data Engineering
Category: Engineering & Technology - Engineering: Electrical & Electronic
CiteScore: 11.70
Self-citation rate: 3.40%
Annual publications: 515
Review time: 6 months
About the journal: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.
Latest articles from this journal:
2025 Reviewers List
XiYan-SQL: A Novel Multi-Generator Framework for Text-to-SQL
Toward Federated Learning of Deep Graph Neural Networks
HCGBot: Learning Homophilous Context Graphs for Twitter Bot Detection
Optimizing KBQA by Correcting LLM-Generated Non-Executable Logical Form Through Knowledge-Assisted Path Reconstruction