
Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence: Latest Publications

IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.
Pub Date: 2024-01-01 | Epub Date: 2024-03-24 | DOI: 10.1609/aaai.v38i14.29491
Wenjie Wang, Pengfei Tang, Jian Lou, Yuanming Shao, Lance Waller, Yi-An Ko, Li Xiong

Utilizing electronic health records (EHR) for machine learning-driven clinical research has great potential to enhance outcome predictions and treatment personalization. Nonetheless, due to privacy and security concerns, the secondary use of EHR data is regulated, constraining researchers' access to EHR data. Generating synthetic EHR data with deep learning methods is a viable and promising approach to mitigate privacy concerns, offering not only a supplementary resource for downstream applications but also sidestepping the privacy risks associated with real patient data. While prior efforts have concentrated on EHR data synthesis, significant challenges persist: addressing the heterogeneity of features (including temporal and non-temporal features), structurally missing values, and irregularity of the temporal measures, and ensuring rigorous privacy of the real data used for model training. Existing works in this domain have focused on solving only one or two of the aforementioned challenges. In this work, we propose IGAMT, an innovative framework to generate privacy-preserved synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also achieves differential privacy with an enhanced privacy-utility trade-off. Extensive experiments show that IGAMT significantly outperforms baseline and state-of-the-art models in terms of resemblance to real data and performance of downstream applications. Ablation studies also confirm the effectiveness of the techniques applied in IGAMT.

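The abstract does not spell out how differential privacy is enforced; a common mechanism for training deep generative models under DP is per-example gradient clipping plus calibrated Gaussian noise (DP-SGD). The sketch below illustrates only that generic mechanism, not IGAMT's actual training procedure; the clipping norm and noise multiplier are placeholder values.

```python
import numpy as np

def dp_noisy_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                      rng=np.random.default_rng(0)):
    """DP-SGD style update: clip each per-example gradient, average them,
    then add calibrated Gaussian noise. Illustrative only; not IGAMT's
    actual mechanism. `clip_norm` and `noise_multiplier` are placeholders."""
    clipped = []
    for g in per_example_grads:                       # g: flat gradient vector
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return mean_grad + noise

# Usage sketch: 32 per-example gradients of dimension 10.
grads = [np.random.randn(10) for _ in range(32)]
noisy = dp_noisy_gradient(grads)
```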
{"title":"IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity.","authors":"Wenjie Wang, Pengfei Tang, Jian Lou, Yuanming Shao, Lance Waller, Yi-An Ko, Li Xiong","doi":"10.1609/aaai.v38i14.29491","DOIUrl":"https://doi.org/10.1609/aaai.v38i14.29491","url":null,"abstract":"<p><p>Utilizing electronic health records (EHR) for machine learning-driven clinical research has great potential to enhance outcome predictions and treatment personalization. Nonetheless, due to privacy and security concerns, the secondary use of EHR data is regulated, constraining researchers' access to EHR data. Generating synthetic EHR data with deep learning methods is a viable and promising approach to mitigate privacy concerns, offering not only a supplementary resource for downstream applications but also sidestepping the privacy risks associated with real patient data. While prior efforts have concentrated on EHR data synthesis, significant challenges persist: addressing the heterogeneity of features including temporal and non-temporal features, structurally missing values, and irregularity of the temporal measures, and ensuring rigorous privacy of the real data used for model training. Existing works in this domain only focused on solving one or two aforementioned challenges. In this work, we propose <i>IGAMT</i>, an innovative framework to generate privacy-preserved synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also achieves differential privacy with enhanced privacy-utility trade-off. Extensive experiments prove that <i>IGAMT</i> significantly outperforms baseline and state-of-the-art models in terms of resemblance to real data and performance of downstream applications. Ablation studies also prove the effectiveness of the techniques applied in <i>IGAMT</i>.</p>","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"38 14","pages":"15634-15643"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606572/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142775537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Erratum to: 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei Zhang, Xiaojun Chang, Hang Xu
The Original Article was published on 26 June 2023.  
Multimodal Deep Generative Models for Remote Medical Applications
Catherine Ordun
Visible-to-Thermal (VT) face translation is an under-studied problem of image-to-image translation that offers an AI-enabled alternative to traditional thermal sensors. Over three phases, my Doctoral Proposal explores developing multimodal deep generative solutions that can be applied to telemedicine applications. These include the contribution of a novel Thermal Face Contrastive GAN (TFC-GAN), exploration of hybridized diffusion-GAN models, application to real clinical thermal data at the National Institutes of Health, and exploration of strategies for Federated Learning (FL) in heterogeneous data settings.
McOmet: Multimodal Fusion Transformer for Physical Audiovisual Commonsense Reasoning
Daoming Zong, Shiliang Sun
Physical commonsense reasoning is essential for building reliable and interpretable AI systems, which involves a general understanding of the physical properties and affordances of everyday objects, how these objects can be manipulated, and how they interact with others. It is fundamentally a multi-modal task, as physical properties are manifested through multiple modalities, including vision and acoustics. In this work, we present a unified framework, named Multimodal Commonsense Transformer (MCOMET), for physical audiovisual commonsense reasoning. MCOMET has two intriguing properties: i) it fully mines higher-ordered temporal relationships across modalities (e.g., pairs, triplets, and quadruplets); and ii) it restricts the cross-modal flow through the feature collection and propagation mechanism along with tight fusion bottlenecks, forcing the model to attend the most relevant parts in each modality and suppressing the dissemination of noisy information. We evaluate our model on a very recent public benchmark, PACS. Results show that MCOMET significantly outperforms a variety of strong baselines, revealing powerful multi-modal commonsense reasoning capabilities.
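One way to read the "tight fusion bottleneck" idea is that each modality attends only within itself plus a small set of shared bottleneck tokens, so cross-modal information must flow through those few tokens. The PyTorch sketch below illustrates that generic pattern; it is not the MCOMET architecture, and the layer choices and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BottleneckFusion(nn.Module):
    """Generic attention-bottleneck fusion sketch (not the MCOMET architecture).
    Each modality is updated together with a small set of shared bottleneck
    tokens, so cross-modal information must pass through the bottleneck."""
    def __init__(self, dim=256, n_bottleneck=4, n_heads=4):
        super().__init__()
        self.bottleneck = nn.Parameter(torch.randn(1, n_bottleneck, dim) * 0.02)
        self.video_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.audio_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)

    def forward(self, video_tokens, audio_tokens):
        b = video_tokens.size(0)
        z = self.bottleneck.expand(b, -1, -1)
        nb = z.size(1)
        # Video stream updates its own tokens together with the bottleneck tokens.
        v = self.video_layer(torch.cat([video_tokens, z], dim=1))
        video_tokens, z_v = v[:, :-nb], v[:, -nb:]
        # Audio stream sees only its own tokens plus the (video-updated) bottleneck.
        a = self.audio_layer(torch.cat([audio_tokens, z_v], dim=1))
        audio_tokens, z = a[:, :-nb], a[:, -nb:]
        return video_tokens, audio_tokens, z
```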
{"title":"McOmet: Multimodal Fusion Transformer for Physical Audiovisual Commonsense Reasoning","authors":"Daoming Zong, Shiliang Sun","doi":"10.1609/aaai.v37i5.25813","DOIUrl":"https://doi.org/10.1609/aaai.v37i5.25813","url":null,"abstract":"Physical commonsense reasoning is essential for building reliable and interpretable AI systems, which involves a general understanding of the physical properties and affordances of everyday objects, how these objects can be manipulated, and how they interact with others. It is fundamentally a multi-modal task, as physical properties are manifested through multiple modalities, including vision and acoustics. In this work, we present a unified framework, named Multimodal Commonsense Transformer (MCOMET), for physical audiovisual commonsense reasoning. MCOMET has two intriguing properties: i) it fully mines higher-ordered temporal relationships across modalities (e.g., pairs, triplets, and quadruplets); and ii) it restricts the cross-modal flow through the feature collection and propagation mechanism along with tight fusion bottlenecks, forcing the model to attend the most relevant parts in each modality and suppressing the dissemination of noisy information. We evaluate our model on a very recent public benchmark, PACS. Results show that MCOMET significantly outperforms a variety of strong baselines, revealing powerful multi-modal commonsense reasoning capabilities.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"42 1","pages":"6621-6629"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75194890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Revisiting Unsupervised Local Descriptor Learning
Wu‐ru Wang, Lei Zhang, Hua Huang
Constructing accurate training tuples is crucial for unsupervised local descriptor learning, yet challenging due to the absence of patch labels. The state-of-the-art approach constructs tuples with heuristic rules, which struggle to precisely depict real-world patch transformations, in spite of enabling fast model convergence. A possible solution to alleviate the problem is the clustering-based approach, which can capture realistic patch variations and learn more accurate class decision boundaries, but suffers from slow model convergence. This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. In addition, HybridDesc also contributes two concrete enhancing mechanisms: (1) a Differentiable Hyperparameter Search (DHS) strategy to find the optimal hyperparameter setting of the rule-based approach so as to provide accurate prior for the clustering-based approach, (2) an On-Demand Clustering (ODC) method to reduce the clustering overhead of the clustering-based approach without eroding its advantage. Extensive experimental results show that HybridDesc can efficiently learn local descriptors that surpass existing unsupervised local descriptors and even rival competitive supervised ones.
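A rough sketch of how rule-based and clustering-based tuple construction might be combined is given below. The jittering transform, the k-means step, and the mixing ratio are all hypothetical placeholders, not the HybridDesc pipeline (which additionally uses the DHS and ODC mechanisms described above).

```python
import numpy as np
from sklearn.cluster import KMeans

def rule_based_pairs(patches, rng=np.random.default_rng(0)):
    """Rule-based positives: each patch paired with a synthetically perturbed copy.
    (Additive noise is a placeholder for heuristic photometric/geometric rules.)"""
    return [(p, p + rng.normal(0.0, 0.05, p.shape)) for p in patches]

def clustering_based_pairs(patches, descriptors, n_clusters=8):
    """Clustering-based positives: patches whose current descriptors land in the
    same cluster are treated as matching pairs (a hypothetical stand-in)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(descriptors)
    pairs = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        pairs.extend((patches[i], patches[j]) for i, j in zip(idx[:-1], idx[1:]))
    return pairs

def hybrid_training_tuples(patches, descriptors, ratio=0.5):
    """Mix the two sources; `ratio` is the (hypothetical) share of rule-based tuples."""
    rb = rule_based_pairs(patches)
    cb = clustering_based_pairs(patches, descriptors)
    k = int(len(rb) * ratio)
    return rb[:k] + cb[:max(0, len(rb) - k)]
```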
{"title":"Revisiting Unsupervised Local Descriptor Learning","authors":"Wu‐ru Wang, Lei Zhang, Hua Huang","doi":"10.1609/aaai.v37i3.25367","DOIUrl":"https://doi.org/10.1609/aaai.v37i3.25367","url":null,"abstract":"Constructing accurate training tuples is crucial for unsupervised local descriptor learning, yet challenging due to the absence of patch labels. The state-of-the-art approach constructs tuples with heuristic rules, which struggle to precisely depict real-world patch transformations, in spite of enabling fast model convergence. A possible solution to alleviate the problem is the clustering-based approach, which can capture realistic patch variations and learn more accurate class decision boundaries, but suffers from slow model convergence. This paper presents HybridDesc, an unsupervised approach that learns powerful local descriptor models with fast convergence speed by combining the rule-based and clustering-based approaches to construct training tuples. In addition, HybridDesc also contributes two concrete enhancing mechanisms: (1) a Differentiable Hyperparameter Search (DHS) strategy to find the optimal hyperparameter setting of the rule-based approach so as to provide accurate prior for the clustering-based approach, (2) an On-Demand Clustering (ODC) method to reduce the clustering overhead of the clustering-based approach without eroding its advantage. Extensive experimental results show that HybridDesc can efficiently learn local descriptors that surpass existing unsupervised local descriptors and even rival competitive supervised ones.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"26 1","pages":"2680-2688"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75665780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
FC-TrackNet: Fast Convergence Net for 6D Pose Tracking in Synthetic Domains
Di Jia, Qianqian Wang, Jun Cao, Peng Cai, Zhiyang Jin
In this work, we propose a fast convergence track net, or FC-TrackNet, based on a synthetic data-driven approach to maintaining long-term 6D pose tracking. Comparison experiments are performed on two different datasets. The results demonstrate that our approach can achieve a consistent tracking frequency of 90.9 Hz as well as higher accuracy than state-of-the-art approaches.
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System
Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, R. Trillo
The significant development of artificial neural network architectures has facilitated the increasing adoption of automated music composition models over the past few years. However, most existing systems feature algorithmic generative structures based on hard-coded, predefined rules, generally excluding interactive or improvised behaviors. We propose a motion-based music system, MoMusic, as an AI real-time music generation system. MoMusic features a partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions, mathematically abstracted through musical set theory. This model is presented against a dual-dimension grid that produces resulting sounds through a posture recognition mechanism. A camera captures the users' finger movements and trajectories, creating coherent, partially improvised harmonic progressions. MoMusic integrates several timbrical registers, from traditional classical instruments such as the piano to a new "human voice instrument" created using a voice conversion technique. Our research demonstrates MoMusic's interactivity, its ability to inspire musicians, and its ability to generate coherent musical material with various timbrical registers. MoMusic's capabilities could be easily expanded to incorporate different forms of posture-controlled timbrical transformation, rhythmic transformation, dynamic transformation, or even digital sound processing techniques.
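The "partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions" can be pictured, very roughly, as a first-order Markov chain over diatonic chord symbols. The transition table in the sketch below is invented for illustration and is not the model used by MoMusic.

```python
import random

# Hypothetical transition probabilities over diatonic triads in a major key;
# illustrative only, not MoMusic's derived values.
TRANSITIONS = {
    "I":    {"IV": 0.35, "V": 0.35, "vi": 0.20, "ii": 0.10},
    "ii":   {"V": 0.70, "IV": 0.20, "vii°": 0.10},
    "IV":   {"V": 0.45, "I": 0.35, "ii": 0.20},
    "V":    {"I": 0.65, "vi": 0.25, "IV": 0.10},
    "vi":   {"ii": 0.40, "IV": 0.40, "V": 0.20},
    "vii°": {"I": 0.80, "vi": 0.20},
}

def sample_progression(start="I", length=8, rng=random.Random(0)):
    """Sample a partially randomized chord progression from the Markov table."""
    chords = [start]
    for _ in range(length - 1):
        options = TRANSITIONS.get(chords[-1], {"I": 1.0})
        symbols, weights = zip(*options.items())
        chords.append(rng.choices(symbols, weights=weights, k=1)[0])
    return chords

print(sample_progression())   # e.g. a chord sequence such as ['I', 'V', 'vi', ...]
```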
{"title":"MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System","authors":"Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, R. Trillo","doi":"10.1609/aaai.v37i13.26907","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26907","url":null,"abstract":"The significant development of artificial neural network architectures has facilitated the increasing adoption of automated music composition models over the past few years. However, most existing systems feature algorithmic generative structures based on hard code and predefined rules, generally excluding interactive or improvised behaviors. We propose a motion based music system, MoMusic, as a AI real time music generation system. MoMusic features a partially randomized harmonic sequencing model based on a probabilistic analysis of tonal chord progressions, mathematically abstracted through musical set theory. This model is presented against a dual dimension grid that produces resulting sounds through a posture recognition mechanism. A camera captures the users' fingers' movement and trajectories, creating coherent, partially improvised harmonic progressions. MoMusic integrates several timbrical registers, from traditional classical instruments such as the piano to a new ''human voice instrument'' created using a voice conversion technique. Our research demonstrates MoMusic's interactiveness, ability to inspire musicians, and ability to generate coherent musical material with various timbrical registers. MoMusic's capabilities could be easily expanded to incorporate different forms of posture controlled timbrical transformation, rhythmic transformation, dynamic transformation, or even digital sound processing techniques.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"28 1","pages":"16057-16062"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74526568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Music-to-Facial Expressions: Emotion-Based Music Visualization for the Hearing Impaired
Yubo Wang, Fengzhou Pan, Danni Liu, Jiaxiong Hu
While music is made to convey messages and emotions, auditory music is not equally accessible to everyone. Music visualization is a common approach to augment the listening experiences of hearing users and to provide music experiences for the hearing-impaired. In this paper, we present a music visualization system that can turn a piece of music into a series of facial expressions representative of the continuously changing sentiments in the music. The resulting facial expressions, recorded as action units, can later animate a static virtual avatar to be emotive synchronously with the music.
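As a rough illustration of how a continuously changing sentiment signal could drive FACS action units, the sketch below maps a (valence, arousal) pair to a few action-unit intensities. The mapping formulas are hypothetical placeholders, not the system described in the paper.

```python
def sentiment_to_action_units(valence: float, arousal: float) -> dict:
    """Map a (valence, arousal) pair in [-1, 1] to illustrative FACS action-unit
    intensities in [0, 1]. The specific formulas are hypothetical placeholders."""
    clamp = lambda x: max(0.0, min(1.0, x))
    return {
        "AU06_cheek_raiser":    clamp(valence),              # smiling-related
        "AU12_lip_corner_pull": clamp(valence),
        "AU04_brow_lowerer":    clamp(-valence),             # frowning-related
        "AU05_upper_lid_raise": clamp(arousal),               # alertness/surprise
        "AU26_jaw_drop":        clamp(arousal - 0.5) * 2.0,   # only at high arousal
    }

# Usage: frame-by-frame sentiment estimated from the music drives the avatar's AUs.
print(sentiment_to_action_units(valence=0.8, arousal=0.3))
```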
{"title":"Music-to-Facial Expressions: Emotion-Based Music Visualization for the Hearing Impaired","authors":"Yubo Wang, Fengzhou Pan, Danni Liu, Jiaxiong Hu","doi":"10.1609/aaai.v37i13.26912","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26912","url":null,"abstract":"While music is made to convey messages and emotions, auditory music is not equally accessible to everyone. Music visualization is a common approach to augment the listening experiences of the hearing users and to provide music experiences for the hearing-impaired. In this paper, we present a music visualization system that can turn the input of a piece of music into a series of facial expressions representative of the continuously changing sentiments in the music. The resulting facial expressions, recorded as action units, can later animate a static virtual avatar to be emotive synchronously with the music.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"43 2","pages":"16096-16102"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72482483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation
Pingsheng Liu, Zhengjie Huang, Xiechi Zhang, Linlin Wang, Gerard de Melo, Xin Lin, Liang Pang, Liang He
Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.
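A generic way to realize persona-aware prompt learning is to prepend learnable prompt vectors, selected by persona, to the dialogue token embeddings before they enter the encoder. The PyTorch sketch below shows only that generic soft-prompt pattern; the persona-indexing scheme and dimensions are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PersonaPromptEncoderInput(nn.Module):
    """Generic soft-prompt sketch: learnable persona prompt vectors are prepended
    to the dialogue token embeddings. Illustrative only; not the paper's model."""
    def __init__(self, num_personas=10, prompt_len=8, hidden_dim=768):
        super().__init__()
        # One bank of prompt vectors per persona id (hypothetical setup).
        self.prompts = nn.Embedding(num_personas * prompt_len, hidden_dim)
        self.prompt_len = prompt_len

    def forward(self, persona_id: torch.Tensor, token_embeddings: torch.Tensor):
        # persona_id: (batch,)   token_embeddings: (batch, seq_len, hidden_dim)
        idx = persona_id.unsqueeze(1) * self.prompt_len + torch.arange(
            self.prompt_len, device=persona_id.device).unsqueeze(0)
        prompt = self.prompts(idx)                    # (batch, prompt_len, hidden_dim)
        return torch.cat([prompt, token_embeddings], dim=1)
```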
{"title":"A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation","authors":"Pingsheng Liu, Zhengjie Huang, Xiechi Zhang, Linlin Wang, Gerard de Melo, Xin Lin, Liang Pang, Liang He","doi":"10.1609/aaai.v37i11.26556","DOIUrl":"https://doi.org/10.1609/aaai.v37i11.26556","url":null,"abstract":"Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"10 1","pages":"13255-13263"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72581913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
On Total-Order HTN Plan Verification with Method Preconditions - An Extension of the CYK Parsing Algorithm
Songtuan Lin, G. Behnke, Simona Ondrčková, R. Barták, P. Bercher
In this paper, we consider the plan verification problem for totally ordered (TO) HTN planning. The problem is proved to be solvable in polynomial time by recognizing its connection to the membership decision problem for context-free grammars. Currently, most HTN plan verification approaches have no special treatment for the TO configuration, and the only one that features such an optimization still relies on an exhaustive search. Hence, in this paper we develop a new TOHTN plan verification approach by extending the standard CYK parsing algorithm, the classical general-purpose decision procedure for context-free membership.
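The connection to context-free membership can be made concrete with the textbook CYK algorithm, which decides in cubic time (in the word length) whether a word is derivable from a grammar in Chomsky normal form. The sketch below is plain CYK membership checking, not the authors' extended TOHTN verification procedure.

```python
def cyk_member(word, grammar, start="S"):
    """Textbook CYK membership test. `grammar` maps a nonterminal to a list of
    productions in Chomsky normal form: either a 1-tuple (terminal,) or a
    2-tuple (B, C) of nonterminals. Not the paper's extended TOHTN procedure."""
    n = len(word)
    if n == 0:
        return False
    # table[i][l] = set of nonterminals deriving word[i : i + l + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, sym in enumerate(word):
        for head, bodies in grammar.items():
            if (sym,) in bodies:
                table[i][0].add(head)
    for length in range(2, n + 1):                     # span length
        for i in range(n - length + 1):                # span start
            for split in range(1, length):             # split point
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for head, bodies in grammar.items():
                    if any(len(b) == 2 and b[0] in left and b[1] in right for b in bodies):
                        table[i][length - 1].add(head)
    return start in table[0][n - 1]

# Example CNF grammar for a^n b^n (n >= 1): S -> AB | AX, X -> SB, A -> a, B -> b
G = {"S": [("A", "B"), ("A", "X")], "X": [("S", "B")], "A": [("a",)], "B": [("b",)]}
print(cyk_member("aabb", G))   # True
print(cyk_member("aab", G))    # False
```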
{"title":"On Total-Order HTN Plan Verification with Method Preconditions - An Extension of the CYK Parsing Algorithm","authors":"Songtuan Lin, G. Behnke, Simona Ondrčková, R. Barták, P. Bercher","doi":"10.1609/aaai.v37i10.26420","DOIUrl":"https://doi.org/10.1609/aaai.v37i10.26420","url":null,"abstract":"In this paper, we consider the plan verification problem for totally ordered (TO) HTN planning. The problem is proved to be solvable in polynomial time by recognizing its connection to the membership decision problem for context-free grammars. Currently, most HTN plan verification approaches do not have special treatments for the TO configuration, and the only one features such an optimization still relies on an exhaustive search. Hence, we will develop a new TOHTN plan verification approach in this paper by extending the standard CYK parsing algorithm which acts as the best decision procedure in general.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"54 1","pages":"12041-12048"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74561317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2