Fast and Functional Structured Data Generators Rooted in Out-of-Equilibrium Physics

Alessandra Carbone, Aurélien Decelle, Lorenzo Rosset, Beatriz Seoane
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 2, pp. 1309-1316. Published 2024-11-11.
DOI: 10.1109/TPAMI.2024.3495999
URL: https://ieeexplore.ieee.org/document/10750287/

Abstract

In this study, we address the challenge of using energy-based models to produce high-quality, label-specific data in complex structured datasets, such as population genetics, RNA, or protein sequence data. Traditional training methods encounter difficulties due to inefficient Markov chain Monte Carlo mixing, which reduces the diversity of synthetic data and increases generation times. To address these issues, we use a novel training algorithm that exploits non-equilibrium effects. This approach, applied to the Restricted Boltzmann Machine, improves the model's ability to correctly classify samples and generate high-quality synthetic data in only a few sampling steps. The effectiveness of this method is demonstrated by its successful application to five different types of data: handwritten digits, mutations of human genomes classified by continental origin, functionally characterized sequences of an enzyme protein family, homologous RNA sequences from specific taxonomies, and real classical piano pieces classified by their composer.
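
To make "label-specific data in only a few sampling steps" concrete, the following is a minimal sketch of label-conditioned block-Gibbs sampling in a binary Restricted Boltzmann Machine. It is an illustration under stated assumptions, not the authors' algorithm or code: the function name `sample_conditioned`, the label-hidden coupling matrix `D`, the layer sizes, and the random (untrained) weights are all hypothetical. It only shows the sampling loop in which chains start from random noise, the label is held fixed, and `k` alternating updates are performed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_conditioned(W, D, a, b, label_onehot, k, rng, n_chains=4):
    """Run k block-Gibbs steps on the visible units with the label clamped.

    W: (n_visible, n_hidden) visible-hidden couplings (hypothetical values)
    D: (n_labels, n_hidden) label-hidden couplings (hypothetical values)
    a: (n_visible,) visible biases; b: (n_hidden,) hidden biases
    """
    n_visible = W.shape[0]
    # Chains start from random binary noise, not from training data.
    v = rng.integers(0, 2, size=(n_chains, n_visible)).astype(float)
    # The clamped label contributes a constant field to every hidden unit.
    label_field = label_onehot @ D
    for _ in range(k):
        # Sample hidden units given visible units and the fixed label.
        p_h = sigmoid(v @ W + label_field + b)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Sample visible units given hidden units.
        p_v = sigmoid(h @ W.T + a)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v

# Toy setup: untrained random parameters, for illustration only.
rng = np.random.default_rng(0)
n_visible, n_hidden, n_labels = 12, 6, 3
W = 0.1 * rng.standard_normal((n_visible, n_hidden))
D = 0.1 * rng.standard_normal((n_labels, n_hidden))
a = np.zeros(n_visible)
b = np.zeros(n_hidden)
label = np.eye(n_labels)[1]  # request samples for label index 1
samples = sample_conditioned(W, D, a, b, label, k=10, rng=rng)
```

With trained parameters, `k` would be the small number of sampling steps the abstract refers to; here the weights are random, so the output is only a shape and type check of the procedure.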