
Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence: Latest Publications

IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.
Pub Date: 2024-01-01 | Epub Date: 2024-03-24 | DOI: 10.1609/aaai.v38i14.29491
Wenjie Wang, Pengfei Tang, Jian Lou, Yuanming Shao, Lance Waller, Yi-An Ko, Li Xiong

Utilizing electronic health records (EHR) for machine learning-driven clinical research has great potential to enhance outcome predictions and treatment personalization. Nonetheless, due to privacy and security concerns, the secondary use of EHR data is regulated, constraining researchers' access to EHR data. Generating synthetic EHR data with deep learning methods is a viable and promising approach to mitigate privacy concerns, offering not only a supplementary resource for downstream applications but also sidestepping the privacy risks associated with real patient data. While prior efforts have concentrated on EHR data synthesis, significant challenges persist: addressing the heterogeneity of features (including temporal and non-temporal features), structurally missing values, and irregularity of the temporal measures, and ensuring rigorous privacy of the real data used for model training. Existing works in this domain have focused on solving only one or two of the aforementioned challenges. In this work, we propose IGAMT, an innovative framework to generate privacy-preserved synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also achieves differential privacy with an enhanced privacy-utility trade-off. Extensive experiments show that IGAMT significantly outperforms baseline and state-of-the-art models in terms of resemblance to real data and performance of downstream applications. Ablation studies also confirm the effectiveness of the techniques applied in IGAMT.

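The abstract does not spell out how differential privacy is enforced; a common mechanism for training deep generative models under DP is per-example gradient clipping plus calibrated Gaussian noise (DP-SGD). The sketch below illustrates only that generic mechanism, not IGAMT's actual training procedure; the clipping norm and noise multiplier are placeholder values.

```python
import numpy as np

def dp_noisy_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                      rng=np.random.default_rng(0)):
    """DP-SGD style update: clip each per-example gradient, average them,
    then add calibrated Gaussian noise. Illustrative only; not IGAMT's
    actual mechanism. `clip_norm` and `noise_multiplier` are placeholders."""
    clipped = []
    for g in per_example_grads:                       # g: flat gradient vector
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return mean_grad + noise

# Usage sketch: 32 per-example gradients of dimension 10.
grads = [np.random.randn(10) for _ in range(32)]
noisy = dp_noisy_gradient(grads)
```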
{"title":"IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.","authors":"Wenjie Wang, Pengfei Tang, Jian Lou, Yuanming Shao, Lance Waller, Yi-An Ko, Li Xiong","doi":"10.1609/aaai.v38i14.29491","DOIUrl":"https://doi.org/10.1609/aaai.v38i14.29491","url":null,"abstract":"<p><p>Utilizing electronic health records (EHR) for machine learning-driven clinical research has great potential to enhance outcome predictions and treatment personalization. Nonetheless, due to privacy and security concerns, the secondary use of EHR data is regulated, constraining researchers' access to EHR data. Generating synthetic EHR data with deep learning methods is a viable and promising approach to mitigate privacy concerns, offering not only a supplementary resource for downstream applications but also sidestepping the privacy risks associated with real patient data. While prior efforts have concentrated on EHR data synthesis, significant challenges persist: addressing the heterogeneity of features including temporal and non-temporal features, structurally missing values, and irregularity of the temporal measures, and ensuring rigorous privacy of the real data used for model training. Existing works in this domain only focused on solving one or two aforementioned challenges. In this work, we propose <i>IGAMT</i>, an innovative framework to generate privacy-preserved synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also achieves differential privacy with enhanced privacy-utility trade-off. Extensive experiments prove that <i>IGAMT</i> significantly outperforms baseline and state-of-the-art models in terms of resemblance to real data and performance of downstream applications. Ablation studies also prove the effectiveness of the techniques applied in <i>IGAMT</i>.</p>","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"38 14","pages":"15634-15643"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606572/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142775537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Erratum to: 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei Zhang, Xiaojun Chang, Hang Xu
The Original Article was published on 26 June 2023.  
Multimodal Deep Generative Models for Remote Medical Applications
Catherine Ordun
Visible-to-Thermal (VT) face translation is an under-studied problem of image-to-image translation that offers an AI-enabled alternative to traditional thermal sensors. Over three phases, my Doctoral Proposal explores developing multimodal deep generative solutions that can be applied to telemedicine applications. These include the contribution of a novel Thermal Face Contrastive GAN (TFC-GAN), exploration of hybridized diffusion-GAN models, application to real clinical thermal data at the National Institutes of Health, and exploration of strategies for Federated Learning (FL) in heterogeneous data settings.
McOmet: Multimodal Fusion Transformer for Physical Audiovisual Commonsense Reasoning
Daoming Zong, Shiliang Sun
Physical commonsense reasoning is essential for building reliable and interpretable AI systems, which involves a general understanding of the physical properties and affordances of everyday objects, how these objects can be manipulated, and how they interact with others. It is fundamentally a multi-modal task, as physical properties are manifested through multiple modalities, including vision and acoustics. In this work, we present a unified framework, named Multimodal Commonsense Transformer (MCOMET), for physical audiovisual commonsense reasoning. MCOMET has two intriguing properties: i) it fully mines higher-ordered temporal relationships across modalities (e.g., pairs, triplets, and quadruplets); and ii) it restricts the cross-modal flow through the feature collection and propagation mechanism along with tight fusion bottlenecks, forcing the model to attend the most relevant parts in each modality and suppressing the dissemination of noisy information. We evaluate our model on a very recent public benchmark, PACS. Results show that MCOMET significantly outperforms a variety of strong baselines, revealing powerful multi-modal commonsense reasoning capabilities.
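One way to read the "tight fusion bottleneck" idea is that each modality attends only within itself plus a small set of shared bottleneck tokens, so cross-modal information must flow through those few tokens. The PyTorch sketch below illustrates that generic pattern; it is not the MCOMET architecture, and the layer choices and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BottleneckFusion(nn.Module):
    """Generic attention-bottleneck fusion sketch (not the MCOMET architecture).
    Each modality is updated together with a small set of shared bottleneck
    tokens, so cross-modal information must pass through the bottleneck."""
    def __init__(self, dim=256, n_bottleneck=4, n_heads=4):
        super().__init__()
        self.bottleneck = nn.Parameter(torch.randn(1, n_bottleneck, dim) * 0.02)
        self.video_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.audio_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)

    def forward(self, video_tokens, audio_tokens):
        b = video_tokens.size(0)
        z = self.bottleneck.expand(b, -1, -1)
        nb = z.size(1)
        # Video stream updates its own tokens together with the bottleneck tokens.
        v = self.video_layer(torch.cat([video_tokens, z], dim=1))
        video_tokens, z_v = v[:, :-nb], v[:, -nb:]
        # Audio stream sees only its own tokens plus the (video-updated) bottleneck.
        a = self.audio_layer(torch.cat([audio_tokens, z_v], dim=1))
        audio_tokens, z = a[:, :-nb], a[:, -nb:]
        return video_tokens, audio_tokens, z
```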
{"title":"McOmet: Multimodal Fusion Transformer for Physical Audiovisual Commonsense Reasoning","authors":"Daoming Zong, Shiliang Sun","doi":"10.1609/aaai.v37i5.25813","DOIUrl":"https://doi.org/10.1609/aaai.v37i5.25813","url":null,"abstract":"Physical commonsense reasoning is essential for building reliable and interpretable AI systems, which involves a general understanding of the physical properties and affordances of everyday objects, how these objects can be manipulated, and how they interact with others. It is fundamentally a multi-modal task, as physical properties are manifested through multiple modalities, including vision and acoustics. In this work, we present a unified framework, named Multimodal Commonsense Transformer (MCOMET), for physical audiovisual commonsense reasoning. MCOMET has two intriguing properties: i) it fully mines higher-ordered temporal relationships across modalities (e.g., pairs, triplets, and quadruplets); and ii) it restricts the cross-modal flow through the feature collection and propagation mechanism along with tight fusion bottlenecks, forcing the model to attend the most relevant parts in each modality and suppressing the dissemination of noisy information. We evaluate our model on a very recent public benchmark, PACS. Results show that MCOMET significantly outperforms a variety of strong baselines, revealing powerful multi-modal commonsense reasoning capabilities.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"42 1","pages":"6621-6629"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75194890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Revisiting Unsupervised Local Descriptor Learning
Wu‐ru Wang, Lei Zhang, Hua Huang
Constructing accurate training tuples is crucial for unsupervised local descriptor learning, yet challenging due to the absence of patch labels. The state-of-the-art approach constructs tuples with heuristic rules, which struggle to precisely depict real-world patch transformations, in spite of enabling fast model convergence. A possible solution to alleviate the problem is the clustering-based approach, which can capture realistic patch variations and learn more accurate class decision boundaries, but suffers from slow model convergence. This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. In addition, HybridDesc also contributes two concrete enhancing mechanisms: (1) a Differentiable Hyperparameter Search (DHS) strategy to find the optimal hyperparameter setting of the rule-based approach so as to provide accurate prior for the clustering-based approach, (2) an On-Demand Clustering (ODC) method to reduce the clustering overhead of the clustering-based approach without eroding its advantage. Extensive experimental results show that HybridDesc can efficiently learn local descriptors that surpass existing unsupervised local descriptors and even rival competitive supervised ones.
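A rough sketch of how rule-based and clustering-based tuple construction might be combined is given below. The jittering transform, the k-means step, and the mixing ratio are all hypothetical placeholders, not the HybridDesc pipeline (which additionally uses the DHS and ODC mechanisms described above).

```python
import numpy as np
from sklearn.cluster import KMeans

def rule_based_pairs(patches, rng=np.random.default_rng(0)):
    """Rule-based positives: each patch paired with a synthetically perturbed copy.
    (Additive noise is a placeholder for heuristic photometric/geometric rules.)"""
    return [(p, p + rng.normal(0.0, 0.05, p.shape)) for p in patches]

def clustering_based_pairs(patches, descriptors, n_clusters=8):
    """Clustering-based positives: patches whose current descriptors land in the
    same cluster are treated as matching pairs (a hypothetical stand-in)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(descriptors)
    pairs = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        pairs.extend((patches[i], patches[j]) for i, j in zip(idx[:-1], idx[1:]))
    return pairs

def hybrid_training_tuples(patches, descriptors, ratio=0.5):
    """Mix the two sources; `ratio` is the (hypothetical) share of rule-based tuples."""
    rb = rule_based_pairs(patches)
    cb = clustering_based_pairs(patches, descriptors)
    k = int(len(rb) * ratio)
    return rb[:k] + cb[:max(0, len(rb) - k)]
```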
{"title":"Revisiting Unsupervised Local Descriptor Learning","authors":"Wu‐ru Wang, Lei Zhang, Hua Huang","doi":"10.1609/aaai.v37i3.25367","DOIUrl":"https://doi.org/10.1609/aaai.v37i3.25367","url":null,"abstract":"Constructing accurate training tuples is crucial for unsupervised local descriptor learning, yet challenging due to the absence of patch labels. The state-of-the-art approach constructs tuples with heuristic rules, which struggle to precisely depict real-world patch transformations, in spite of enabling fast model convergence. A possible solution to alleviate the problem is the clustering-based approach, which can capture realistic patch variations and learn more accurate class decision boundaries, but suffers from slow model convergence. This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. In addition, HybridDesc also contributes two concrete enhancing mechanisms: (1) a Differentiable Hyperparameter Search (DHS) strategy to find the optimal hyperparameter setting of the rule-based approach so as to provide accurate prior for the clustering-based approach, (2) an On-Demand Clustering (ODC) method to reduce the clustering overhead of the clustering-based approach without eroding its advantage. Extensive experimental results show that HybridDesc can efficiently learn local descriptors that surpass existing unsupervised local descriptors and even rival competitive supervised ones.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"26 1","pages":"2680-2688"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75665780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
FC-TrackNet: Fast Convergence Net for 6D Pose Tracking in Synthetic Domains
Di Jia, Qianqian Wang, Jun Cao, Peng Cai, Zhiyang Jin
In this work, we propose a fast convergence track net, or FC-TrackNet, based on a synthetic data-driven approach to maintaining long-term 6D pose tracking. Comparison experiments are performed on two different datasets. The results demonstrate that our approach can achieve a consistent tracking frequency of 90.9 Hz as well as higher accuracy than state-of-the-art approaches.
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System
Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, R. Trillo
The significant development of artificial neural network architectures has facilitated the increasing adoption of automated music composition models over the past few years. However, most existing systems feature algorithmic generative structures based on hard-coded, predefined rules, generally excluding interactive or improvised behaviors. We propose a motion-based music system, MoMusic, as an AI real-time music generation system. MoMusic features a partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions, mathematically abstracted through musical set theory. This model is presented against a dual-dimension grid that produces resulting sounds through a posture recognition mechanism. A camera captures the users' finger movements and trajectories, creating coherent, partially improvised harmonic progressions. MoMusic integrates several timbrical registers, from traditional classical instruments such as the piano to a new "human voice instrument" created using a voice conversion technique. Our research demonstrates MoMusic's interactivity, its ability to inspire musicians, and its ability to generate coherent musical material with various timbrical registers. MoMusic's capabilities could be easily expanded to incorporate different forms of posture-controlled timbrical transformation, rhythmic transformation, dynamic transformation, or even digital sound processing techniques.
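The "partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions" can be pictured, very roughly, as a first-order Markov chain over diatonic chord symbols. The transition table in the sketch below is invented for illustration and is not the model used by MoMusic.

```python
import random

# Hypothetical transition probabilities over diatonic triads in a major key;
# illustrative only, not MoMusic's derived values.
TRANSITIONS = {
    "I":    {"IV": 0.35, "V": 0.35, "vi": 0.20, "ii": 0.10},
    "ii":   {"V": 0.70, "IV": 0.20, "vii°": 0.10},
    "IV":   {"V": 0.45, "I": 0.35, "ii": 0.20},
    "V":    {"I": 0.65, "vi": 0.25, "IV": 0.10},
    "vi":   {"ii": 0.40, "IV": 0.40, "V": 0.20},
    "vii°": {"I": 0.80, "vi": 0.20},
}

def sample_progression(start="I", length=8, rng=random.Random(0)):
    """Sample a partially randomized chord progression from the Markov table."""
    chords = [start]
    for _ in range(length - 1):
        options = TRANSITIONS.get(chords[-1], {"I": 1.0})
        symbols, weights = zip(*options.items())
        chords.append(rng.choices(symbols, weights=weights, k=1)[0])
    return chords

print(sample_progression())   # e.g. a chord sequence such as ['I', 'V', 'vi', ...]
```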
{"title":"MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System","authors":"Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, R. Trillo","doi":"10.1609/aaai.v37i13.26907","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26907","url":null,"abstract":"The significant development of artificial neural network architectures has facilitated the increasing adoption of automated music composition models over the past few years. However, most existing systems feature algorithmic generative structures based on hard code and predefined rules, generally excluding interactive or improvised behaviors. We propose a motion based music system, MoMusic, as a AI real time music generation system. MoMusic features a partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions, mathematically abstracted through musical set theory. This model is presented against a dual dimension grid that produces resulting sounds through a posture recognition mechanism. A camera captures the users' fingers' movement and trajectories, creating coherent, partially improvised harmonic progressions. MoMusic integrates several timbrical registers, from traditional classical instruments such as the piano to a new ''human voice instrument'' created using a voice conversion technique. Our research demonstrates MoMusic's interactiveness, ability to inspire musicians, and ability to generate coherent musical material with various timbrical registers. MoMusic's capabilities could be easily expanded to incorporate different forms of posture controlled timbrical transformation, rhythmic transformation, dynamic transformation, or even digital sound processing techniques.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"28 1","pages":"16057-16062"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74526568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Music-to-Facial Expressions: Emotion-Based Music Visualization for the Hearing Impaired
Yubo Wang, Fengzhou Pan, Danni Liu, Jiaxiong Hu
While music is made to convey messages and emotions, auditory music is not equally accessible to everyone. Music visualization is a common approach to augment the listening experiences of hearing users and to provide music experiences for the hearing-impaired. In this paper, we present a music visualization system that can turn a piece of music into a series of facial expressions representative of the continuously changing sentiments in the music. The resulting facial expressions, recorded as action units, can later animate a static virtual avatar to be emotive synchronously with the music.
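As a rough illustration of how a continuously changing sentiment signal could drive FACS action units, the sketch below maps a (valence, arousal) pair to a few action-unit intensities. The mapping formulas are hypothetical placeholders, not the system described in the paper.

```python
def sentiment_to_action_units(valence: float, arousal: float) -> dict:
    """Map a (valence, arousal) pair in [-1, 1] to illustrative FACS action-unit
    intensities in [0, 1]. The specific formulas are hypothetical placeholders."""
    clamp = lambda x: max(0.0, min(1.0, x))
    return {
        "AU06_cheek_raiser":    clamp(valence),              # smiling-related
        "AU12_lip_corner_pull": clamp(valence),
        "AU04_brow_lowerer":    clamp(-valence),             # frowning-related
        "AU05_upper_lid_raise": clamp(arousal),               # alertness/surprise
        "AU26_jaw_drop":        clamp(arousal - 0.5) * 2.0,   # only at high arousal
    }

# Usage: frame-by-frame sentiment estimated from the music drives the avatar's AUs.
print(sentiment_to_action_units(valence=0.8, arousal=0.3))
```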
{"title":"Music-to-Facial Expressions: Emotion-Based Music Visualization for the Hearing Impaired","authors":"Yubo Wang, Fengzhou Pan, Danni Liu, Jiaxiong Hu","doi":"10.1609/aaai.v37i13.26912","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26912","url":null,"abstract":"While music is made to convey messages and emotions, auditory music is not equally accessible to everyone. Music visualization is a common approach to augment the listening experiences of the hearing users and to provide music experiences for the hearing-impaired. In this paper, we present a music visualization system that can turn the input of a piece of music into a series of facial expressions representative of the continuously changing sentiments in the music. The resulting facial expressions, recorded as action units, can later animate a static virtual avatar to be emotive synchronously with the music.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"43 2","pages":"16096-16102"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72482483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation
Pingsheng Liu, Zhengjie Huang, Xiechi Zhang, Linlin Wang, Gerard de Melo, Xin Lin, Liang Pang, Liang He
Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.
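A generic way to realize persona-aware prompt learning is to prepend learnable prompt vectors, selected by persona, to the dialogue token embeddings before they enter the encoder. The PyTorch sketch below shows only that generic soft-prompt pattern; the persona-indexing scheme and dimensions are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PersonaPromptEncoderInput(nn.Module):
    """Generic soft-prompt sketch: learnable persona prompt vectors are prepended
    to the dialogue token embeddings. Illustrative only; not the paper's model."""
    def __init__(self, num_personas=10, prompt_len=8, hidden_dim=768):
        super().__init__()
        # One bank of prompt vectors per persona id (hypothetical setup).
        self.prompts = nn.Embedding(num_personas * prompt_len, hidden_dim)
        self.prompt_len = prompt_len

    def forward(self, persona_id: torch.Tensor, token_embeddings: torch.Tensor):
        # persona_id: (batch,)   token_embeddings: (batch, seq_len, hidden_dim)
        idx = persona_id.unsqueeze(1) * self.prompt_len + torch.arange(
            self.prompt_len, device=persona_id.device).unsqueeze(0)
        prompt = self.prompts(idx)                    # (batch, prompt_len, hidden_dim)
        return torch.cat([prompt, token_embeddings], dim=1)
```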
{"title":"A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation","authors":"Pingsheng Liu, Zhengjie Huang, Xiechi Zhang, Linlin Wang, Gerard de Melo, Xin Lin, Liang Pang, Liang He","doi":"10.1609/aaai.v37i11.26556","DOIUrl":"https://doi.org/10.1609/aaai.v37i11.26556","url":null,"abstract":"Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"10 1","pages":"13255-13263"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72581913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
On Total-Order HTN Plan Verification with Method Preconditions - An Extension of the CYK Parsing Algorithm
Songtuan Lin, G. Behnke, Simona Ondrčková, R. Barták, P. Bercher
In this paper, we consider the plan verification problem for totally ordered (TO) HTN planning. The problem is proved to be solvable in polynomial time by recognizing its connection to the membership decision problem for context-free grammars. Currently, most HTN plan verification approaches have no special treatment for the TO configuration, and the only one that features such an optimization still relies on an exhaustive search. Hence, in this paper we develop a new TOHTN plan verification approach by extending the standard CYK parsing algorithm, the classical general-purpose decision procedure for context-free membership.
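The connection to context-free membership can be made concrete with the textbook CYK algorithm, which decides in cubic time (in the word length) whether a word is derivable from a grammar in Chomsky normal form. The sketch below is plain CYK membership checking, not the authors' extended TOHTN verification procedure.

```python
def cyk_member(word, grammar, start="S"):
    """Textbook CYK membership test. `grammar` maps a nonterminal to a list of
    productions in Chomsky normal form: either a 1-tuple (terminal,) or a
    2-tuple (B, C) of nonterminals. Not the paper's extended TOHTN procedure."""
    n = len(word)
    if n == 0:
        return False
    # table[i][l] = set of nonterminals deriving word[i : i + l + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, sym in enumerate(word):
        for head, bodies in grammar.items():
            if (sym,) in bodies:
                table[i][0].add(head)
    for length in range(2, n + 1):                     # span length
        for i in range(n - length + 1):                # span start
            for split in range(1, length):             # split point
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for head, bodies in grammar.items():
                    if any(len(b) == 2 and b[0] in left and b[1] in right for b in bodies):
                        table[i][length - 1].add(head)
    return start in table[0][n - 1]

# Example CNF grammar for a^n b^n (n >= 1): S -> AB | AX, X -> SB, A -> a, B -> b
G = {"S": [("A", "B"), ("A", "X")], "X": [("S", "B")], "A": [("a",)], "B": [("b",)]}
print(cyk_member("aabb", G))   # True
print(cyk_member("aab", G))    # False
```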
{"title":"On Total-Order HTN Plan Verification with Method Preconditions - An Extension of the CYK Parsing Algorithm","authors":"Songtuan Lin, G. Behnke, Simona Ondrčková, R. Barták, P. Bercher","doi":"10.1609/aaai.v37i10.26420","DOIUrl":"https://doi.org/10.1609/aaai.v37i10.26420","url":null,"abstract":"In this paper, we consider the plan verification problem for totally ordered (TO) HTN planning. The problem is proved to be solvable in polynomial time by recognizing its connection to the membership decision problem for context-free grammars. Currently, most HTN plan verification approaches do not have special treatments for the TO configuration, and the only one features such an optimization still relies on an exhaustive search. Hence, we will develop a new TOHTN plan verification approach in this paper by extending the standard CYK parsing algorithm which acts as the best decision procedure in general.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"54 1","pages":"12041-12048"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74561317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2