Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery
Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack H. Noble
doi:10.1117/12.3008830
For those experiencing severe-to-profound sensorineural hearing loss, the cochlear implant (CI) is the preferred treatment. Augmented reality (AR) aided surgery can potentially improve CI procedures and hearing outcomes. Typically, AR solutions for image-guided surgery rely on optical tracking systems to register pre-operative planning information to the display so that hidden anatomy or other important information can be overlaid and co-registered with the view of the surgical scene. In this paper, our goal is to develop a method that permits direct 2D-to-3D registration of the microscope video to the pre-operative Computed Tomography (CT) scan without the need for external tracking equipment. Our proposed solution involves surface mapping of a portion of the incus in surgical recordings and determining the pose of this structure relative to the surgical microscope via the perspective-n-point (PnP) algorithm. This registration can then be applied to pre-operative segmentations of other anatomy-of-interest, as well as the planned electrode insertion trajectory, to co-register this information for the AR display. Our results demonstrate an average rotation error of less than 25 degrees and translation errors of less than 2 mm, 3 mm, and 0.55% along the x, y, and z axes, respectively. Our proposed method has the potential to generalize to other surgical procedures while requiring only a monocular microscope intra-operatively.
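As a concrete illustration of the registration step, the sketch below recovers the microscope pose from 2D-3D incus correspondences with OpenCV's PnP solver and then projects a CT-space point into the view for overlay. The point coordinates and camera intrinsics are invented placeholders; only the solvePnP/projectPoints workflow mirrors the approach the abstract describes.

```python
# Minimal sketch of the pose-estimation step: given 2D pixel locations of
# incus surface points in a microscope frame and their 3D positions in the
# pre-operative CT, recover the camera pose with OpenCV's PnP solver.
# All point arrays and intrinsics below are illustrative placeholders.
import numpy as np
import cv2

# 3D incus surface points in CT coordinates (mm) -- placeholder values.
object_points = np.array([
    [12.1, 34.5, 20.2],
    [12.8, 34.1, 20.9],
    [13.4, 35.0, 21.3],
    [11.9, 35.6, 20.7],
    [12.5, 35.2, 19.8],
    [13.0, 34.7, 20.4],
], dtype=np.float64)

# Matching 2D detections in the microscope frame (pixels) -- placeholders.
image_points = np.array([
    [512.0, 384.0], [530.5, 379.2], [548.1, 390.7],
    [505.3, 402.4], [520.9, 410.6], [537.2, 398.1],
], dtype=np.float64)

# Assumed pinhole intrinsics of a calibrated monocular microscope.
K = np.array([[2800.0, 0.0, 640.0],
              [0.0, 2800.0, 480.0],
              [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)  # assume distortion already corrected

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)  # rotation matrix mapping CT -> camera frame

# Any pre-operative point p_ct (e.g. from a segmentation or the planned
# electrode trajectory) can now be projected into the view for the overlay:
p_ct = np.array([[12.5, 35.0, 20.5]])
overlay_px, _ = cv2.projectPoints(p_ct, rvec, tvec, K, dist_coeffs)
print(overlay_px.squeeze())
```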
Perennial Semantic Data Terms of Use for Decentralized Web
Rui Zhao, Jun Zhao
doi:10.1145/3589334.3645631
In today's digital landscape, the Web has become increasingly centralized, raising concerns about user privacy violations. Decentralized Web architectures, such as Solid, offer a promising solution by empowering users with better control over their data in their personal "Pods". However, a significant challenge remains: users must navigate numerous applications to decide which application can be trusted with access to their data Pods. This often involves reading lengthy and complex Terms of Use agreements, a process that users often find daunting or simply ignore. This compromises user autonomy and impedes detection of data misuse. We propose a novel formal description of Data Terms of Use (DToU), along with a DToU reasoner. Users and applications specify their own parts of the DToU policy with local knowledge, covering permissions, requirements, prohibitions and obligations. Automated reasoning verifies compliance, and also derives policies for output data. This constitutes a "perennial" DToU language, where policy authoring occurs only once, and ongoing automated checks can be conducted across users, applications and activity cycles. Our solution builds the language and the reasoning engine on Turtle, Notation 3 and RDF Surfaces, ensuring seamless integration with other semantic tools for enhanced interoperability. We have successfully integrated this language into the Solid framework and conducted performance benchmarks. We believe this work demonstrates the practicality of a perennial DToU language and the potential for a paradigm shift in how users interact with data and applications in a decentralized Web, offering both improved privacy and usability.
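As a rough illustration of the compliance-checking idea (not the authors' actual DToU vocabulary, which builds on Turtle, Notation 3 and RDF Surfaces), the sketch below encodes a user prohibition and an application's declared operations in RDF and queries for conflicts with rdflib. Every term in the ex: namespace is invented for this sketch.

```python
# Illustrative compliance check in the spirit of a DToU reasoner.
# The ex: vocabulary is invented for this sketch.
from rdflib import Graph

policy_ttl = """
@prefix ex: <http://example.org/dtou#> .

ex:userPolicy ex:prohibits ex:ThirdPartySharing .
ex:appPolicy  ex:performs  ex:ThirdPartySharing ;
              ex:performs  ex:LocalAnalytics .
"""

g = Graph()
g.parse(data=policy_ttl, format="turtle")

# An operation violates the policy if the app performs something
# the user prohibits.
violations = g.query("""
    PREFIX ex: <http://example.org/dtou#>
    SELECT ?op WHERE {
        ex:userPolicy ex:prohibits ?op .
        ex:appPolicy  ex:performs  ?op .
    }
""")
for row in violations:
    print("violation:", row.op)  # -> http://example.org/dtou#ThirdPartySharing
```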
{"title":"Perennial Semantic Data Terms of Use for Decentralized Web","authors":"Rui Zhao, Jun Zhao","doi":"10.1145/3589334.3645631","DOIUrl":"https://doi.org/10.1145/3589334.3645631","url":null,"abstract":"In today's digital landscape, the Web has become increasingly centralized, raising concerns about user privacy violations. Decentralized Web architectures, such as Solid, offer a promising solution by empowering users with better control over their data in their personal `Pods'. However, a significant challenge remains: users must navigate numerous applications to decide which application can be trusted with access to their data Pods. This often involves reading lengthy and complex Terms of Use agreements, a process that users often find daunting or simply ignore. This compromises user autonomy and impedes detection of data misuse. We propose a novel formal description of Data Terms of Use (DToU), along with a DToU reasoner. Users and applications specify their own parts of the DToU policy with local knowledge, covering permissions, requirements, prohibitions and obligations. Automated reasoning verifies compliance, and also derives policies for output data. This constitutes a ``perennial'' DToU language, where the policy authoring only occurs once, and we can conduct ongoing automated checks across users, applications and activity cycles. Our solution is built on Turtle, Notation 3 and RDF Surfaces, for the language and the reasoning engine. It ensures seamless integration with other semantic tools for enhanced interoperability. We have successfully integrated this language into the Solid framework, and conducted performance benchmark. We believe this work demonstrates a practicality of a perennial DToU language and the potential of a paradigm shift to how users interact with data and applications in a decentralized Web, offering both improved privacy and usability.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"18 S25","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast-Forward Reality: Authoring Error-Free Context-Aware Policies with Real-Time Unit Tests in Extended Reality
Xun Qian, Tianyi Wang, Xuhai Xu, Tanya R. Jonker, Kashyap Todi
doi:10.1145/3613904.3642158
Advances in ubiquitous computing have enabled end-user authoring of context-aware policies (CAPs) that control smart devices based on specific contexts of the user and environment. However, authoring CAPs accurately and avoiding run-time errors is challenging for end-users, as it is difficult to foresee CAP behaviors under complex real-world conditions. We propose Fast-Forward Reality, an Extended Reality (XR) based authoring workflow that enables end-users to iteratively author and refine CAPs by validating their behaviors via simulated unit test cases. We develop a computational approach to automatically generate test cases based on the authored CAP and the user's context history. Our system delivers each test case with immersive visualizations in XR, enabling users to verify the CAP behavior and identify necessary refinements. We evaluated Fast-Forward Reality in a user study (N=12). Our authoring and validation process improved the accuracy of CAPs, and users provided positive feedback on the system's usability.
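The paper's test-case generator is not specified in detail here; the toy sketch below conveys the basic idea of pairing historical contexts with a CAP's expected behavior. The CAP encoding and the example contexts are invented for illustration.

```python
# Toy sketch of generating unit test cases for a context-aware policy (CAP)
# from a user's context history. The CAP format and sampling strategy are
# invented for illustration; the paper's generator is more involved.
from dataclasses import dataclass

@dataclass
class CAP:
    condition: callable  # context dict -> bool
    action: str

# Example CAP: turn on the lights when the user is home after 7pm.
cap = CAP(condition=lambda c: c["at_home"] and c["hour"] >= 19,
          action="lights_on")

# Context history: past snapshots of (user, environment) state.
history = [
    {"at_home": True,  "hour": 20},
    {"at_home": True,  "hour": 9},
    {"at_home": False, "hour": 21},
]

def generate_test_cases(cap, history):
    """Pair each historical context with the CAP's expected behavior,
    so the user can review both firing and non-firing cases in XR."""
    return [(ctx, cap.action if cap.condition(ctx) else "no_action")
            for ctx in history]

for ctx, expected in generate_test_cases(cap, history):
    print(ctx, "->", expected)
```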
Proactive Recommendation with Iterative Preference Guidance
Shuxian Bi, Wenjie Wang, Hang Pan, Fuli Feng, Xiangnan He
doi:10.1145/3589335.3651548
Recommender systems mainly tailor personalized recommendations according to user interests learned from user feedback. However, such recommender systems passively cater to user interests and even reinforce existing interests in the feedback loop, leading to problems like filter bubbles and opinion polarization. To counteract this, proactive recommendation actively steers users towards developing new interests in a target item or topic by strategically modulating recommendation sequences. Existing work for proactive recommendation faces significant hurdles: 1) overlooking the user feedback in the guidance process; 2) lacking explicit modeling of the guiding objective; and 3) insufficient flexibility for integration into existing industrial recommender systems. To address these issues, we introduce an Iterative Preference Guidance (IPG) framework. IPG performs proactive recommendation in a flexible post-processing manner by ranking items according to their IPG scores that consider both interaction probability and guiding value. These scores are explicitly estimated with iteratively updated user representation that considers the most recent user interactions. Extensive experiments validate that IPG can effectively guide user interests toward target interests with a reasonable trade-off in recommender accuracy. The code is available at https://github.com/GabyUSTC/IPG-Rec.
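A minimal sketch of the post-processing idea follows: re-rank candidates by a score combining interaction probability with alignment to the target interest, updating the user representation after each simulated interaction. The score form and update rule are simplified illustrations, not the paper's exact formulation (see the linked repository for that).

```python
# Minimal sketch of IPG-style post-processing: rank candidate items by a
# score that considers both interaction probability and guiding value,
# with the user representation iteratively updated from recent feedback.
import numpy as np

rng = np.random.default_rng(0)
d = 16
user = rng.normal(size=d)          # current user representation
target = rng.normal(size=d)        # embedding of the target interest
items = rng.normal(size=(100, d))  # candidate item embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(5):
    p_interact = sigmoid(items @ user)      # interaction probability
    guiding_value = items @ target          # alignment with the target
    ipg_score = p_interact * guiding_value  # combined score (illustrative)
    best = int(np.argmax(ipg_score))
    # Assume the user interacts with the recommended item; fold it into
    # the user representation so the next round uses fresh feedback.
    user = 0.9 * user + 0.1 * items[best]
    print(f"step {step}: item {best}, score {ipg_score[best]:.3f}")
```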
{"title":"Proactive Recommendation with Iterative Preference Guidance","authors":"Shuxian Bi, Wenjie Wang, Hang Pan, Fuli Feng, Xiangnan He","doi":"10.1145/3589335.3651548","DOIUrl":"https://doi.org/10.1145/3589335.3651548","url":null,"abstract":"Recommender systems mainly tailor personalized recommendations according to user interests learned from user feedback. However, such recommender systems passively cater to user interests and even reinforce existing interests in the feedback loop, leading to problems like filter bubbles and opinion polarization. To counteract this, proactive recommendation actively steers users towards developing new interests in a target item or topic by strategically modulating recommendation sequences. Existing work for proactive recommendation faces significant hurdles: 1) overlooking the user feedback in the guidance process; 2) lacking explicit modeling of the guiding objective; and 3) insufficient flexibility for integration into existing industrial recommender systems. To address these issues, we introduce an Iterative Preference Guidance (IPG) framework. IPG performs proactive recommendation in a flexible post-processing manner by ranking items according to their IPG scores that consider both interaction probability and guiding value. These scores are explicitly estimated with iteratively updated user representation that considers the most recent user interactions. Extensive experiments validate that IPG can effectively guide user interests toward target interests with a reasonable trade-off in recommender accuracy. The code is available at https://github.com/GabyUSTC/IPG-Rec.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"75 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation
Anastasios Arsenos, D. Kollias, Evangelos Petrongonas, Christos Skliros, S. Kollias
doi:10.1109/icassp48485.2024.10448096
In the context of single domain generalisation, the objective is for models that have been exclusively trained on data from a single domain to demonstrate strong performance when confronted with various unfamiliar domains. In this paper, we introduce a novel model referred to as Contrastive Uncertainty Domain Generalisation Network (CUDGNet). The key idea is to augment the source capacity in both input and label spaces through the fictitious domain generator and jointly learn the domain invariant representation of each class through contrastive learning. Extensive experiments on two Single Source Domain Generalisation (SSDG) datasets demonstrate the effectiveness of our approach, which surpasses the state-of-the-art single-DG methods by up to 7.08%. Our method also provides efficient uncertainty estimation at inference time from a single forward pass through the generator subnetwork.
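The sketch below shows a generic supervised-contrastive ingredient of the kind the abstract describes: pulling together same-class embeddings from the source and a fictitious (augmented) domain while pushing apart different classes. It is an NT-Xent-style stand-in, not the exact CUDGNet objective.

```python
# Sketch of the contrastive ingredient: align representations of the same
# class across the source domain and a fictitious domain, repel different
# classes. Generic NT-Xent-style loss, not the exact CUDGNet objective.
import torch
import torch.nn.functional as F

def domain_contrastive_loss(z_src, z_fict, labels, tau=0.1):
    """z_src, z_fict: (N, d) embeddings of the same batch under the source
    and fictitious domains; labels: (N,) class ids."""
    z = F.normalize(torch.cat([z_src, z_fict], dim=0), dim=1)  # (2N, d)
    y = torch.cat([labels, labels], dim=0)                     # (2N,)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))       # exclude self-pairs
    pos = (y[:, None] == y[None, :])        # same-class mask
    pos.fill_diagonal_(False)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(log_prob[pos]).mean()

z_src = torch.randn(8, 32)
z_fict = torch.randn(8, 32)
labels = torch.randint(0, 4, (8,))
print(domain_contrastive_loss(z_src, z_fict, labels))
```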
{"title":"Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation","authors":"Anastasios Arsenos, D. Kollias, Evangelos Petrongonas, Christos Skliros, S. Kollias","doi":"10.1109/icassp48485.2024.10448096","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10448096","url":null,"abstract":"In the context of single domain generalisation, the objective is for models that have been exclusively trained on data from a single domain to demonstrate strong performance when confronted with various unfamiliar domains. In this paper, we introduce a novel model referred to as Contrastive Uncertainty Domain Generalisation Network (CUDGNet). The key idea is to augment the source capacity in both input and label spaces through the fictitious domain generator and jointly learn the domain invariant representation of each class through contrastive learning. Extensive experiments on two Single Source Domain Generalisation (SSDG) datasets demonstrate the effectiveness of our approach, which surpasses the state-of-the-art single-DG methods by up to $7.08%$. Our method also provides efficient uncertainty estimation at inference time from a single forward pass through the generator subnetwork.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"113 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140394947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From Files to Streams: Revisiting Web History and Exploring Potentials for Future Prospects
Lucas Vogel, Thomas Springer, Matthias Wählisch
doi:10.1145/3589335.3652001
Over the last 30 years, the World Wide Web has changed significantly. In this paper, we argue that common practices for preparing web pages for delivery conflict with efforts to present content with minimal latency, a fundamental goal that has driven changes in the WWW. To bolster our arguments, we revisit the reasons that led to changes to HTTP and compare them systematically with techniques for preparing web pages. We find that the structure of many web pages leverages features of HTTP/1.1 but hinders the use of recent HTTP features to present content quickly. To improve the situation, we propose fine-grained content segmentation. This would make it possible to exploit the streaming capabilities of recent HTTP versions and render content as quickly as possible without changing underlying protocols or web browsers.
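A minimal sketch of what fine-grained segmentation could look like at the delivery layer, assuming a Flask server for illustration: the page is emitted as independently renderable segments through a generator body, which Flask sends with chunked transfer encoding so the browser can paint as bytes arrive. The segment contents are placeholders, not the paper's system.

```python
# Minimal illustration of fine-grained content segmentation: instead of
# assembling a page and sending one file, stream independently renderable
# segments so the browser can start painting early. Flask delivers a
# generator body via chunked transfer encoding.
import time
from flask import Flask, Response

app = Flask(__name__)

SEGMENTS = [
    "<html><head><title>Streamed page</title></head><body>",
    "<header><h1>Above-the-fold content first</h1></header>",
    "<main><p>Article body arrives as it becomes available...</p></main>",
    "<footer>Footer last</footer></body></html>",
]

@app.route("/")
def page():
    def generate():
        for seg in SEGMENTS:
            yield seg
            time.sleep(0.1)  # simulate per-segment generation latency
    return Response(generate(), mimetype="text/html")

if __name__ == "__main__":
    app.run(port=8080)
```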
{"title":"From Files to Streams: Revisiting Web History and Exploring Potentials for Future Prospects","authors":"Lucas Vogel, Thomas Springer, Matthias Wählisch","doi":"10.1145/3589335.3652001","DOIUrl":"https://doi.org/10.1145/3589335.3652001","url":null,"abstract":"Over the last 30 years, the World Wide Web has changed significantly. In this paper, we argue that common practices to prepare web pages for delivery conflict with many efforts to present content with minimal latency, one fundamental goal that pushed changes in the WWW. To bolster our arguments, we revisit reasons that led to changes of HTTP and compare them systematically with techniques to prepare web pages. We found that the structure of many web pages leverages features of HTTP/1.1 but hinders the use of recent HTTP features to present content quickly. To improve the situation in the future, we propose fine-grained content segmentation. This would allow to exploit streaming capabilities of recent HTTP versions and to render content as quickly as possible without changing underlying protocols or web browsers.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"96 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-12DOI: 10.1609/aaai.v38i18.29966
Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation
Di Mi, Yanjun Zhang, Leo Yu Zhang, Shengshan Hu, Qi Zhong, Haizhuan Yuan, Shirui Pan
doi:10.1609/aaai.v38i18.29966
Model extraction attacks (MEAs) enable an attacker to replicate the functionality of a victim deep neural network (DNN) model by only querying its API service remotely, posing a severe threat to the security and integrity of pay-per-query DNN-based services. Although the majority of current research on MEAs has concentrated on neural classifiers, image-to-image translation (I2IT) tasks are increasingly prevalent in everyday applications. However, techniques developed for MEA of DNN classifiers cannot be directly transferred to the case of I2IT, so the vulnerability of I2IT models to MEAs is often underestimated. This paper unveils the threat of MEA in I2IT tasks from a new perspective. Diverging from the traditional approach of bridging the distribution gap between attacker queries and victim training samples, we opt to mitigate the effect caused by the different distributions, known as the domain shift. This is achieved by introducing a new regularization term that penalizes high-frequency noise, and seeking a flatter minimum to avoid overfitting to the shifted distribution. Extensive experiments on different image translation tasks, including image super-resolution and style transfer, are performed on different backbone victim models, and the new design consistently outperforms the baseline by a large margin across all metrics. A few real-life I2IT APIs are also verified to be extremely vulnerable to our attack, emphasizing the need for enhanced defenses and potentially revised API publishing policies.
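The sketch below illustrates one plausible reading of the high-frequency regularizer: penalize the high-frequency FFT energy of the surrogate's output alongside a reconstruction term. The mask radius and loss weighting are illustrative choices, not the paper's exact formulation.

```python
# Sketch of a high-frequency penalty of the kind the abstract describes:
# suppress high-frequency energy in the surrogate model's output so the
# extracted model does not overfit noise induced by the domain shift.
import torch

def high_frequency_penalty(img, radius_frac=0.25):
    """img: (B, C, H, W). Returns mean magnitude of FFT coefficients
    outside a centered low-frequency disk."""
    B, C, H, W = img.shape
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    dist = ((yy - H // 2) ** 2 + (xx - W // 2) ** 2).float().sqrt()
    high = dist > radius_frac * min(H, W)  # high-frequency mask
    return spec.abs()[..., high].mean()

# Inside the extraction loop, the surrogate loss might combine a pixel
# reconstruction term with the penalty (weight 0.1 is illustrative):
pred = torch.rand(4, 3, 64, 64, requires_grad=True)
target = torch.rand(4, 3, 64, 64)
loss = torch.nn.functional.l1_loss(pred, target) \
       + 0.1 * high_frequency_penalty(pred)
loss.backward()
```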
AI-Assisted Causal Pathway Diagram for Human-Centered Design
Ruican Zhong, Donghoon Shin, Rosemary Meza, P. Klasnja, Lucas Colusso, Gary Hsieh
doi:10.1145/3613904.3642179
This paper explores the integration of causal pathway diagrams (CPD) into human-centered design (HCD), investigating how these diagrams can enhance the early stages of the design process. A dedicated CPD plugin for the online collaborative whiteboard platform Miro was developed to streamline diagram creation and offer real-time AI-driven guidance. Through a user study with designers (N=20), we found that CPD's branching and its emphasis on causal connections supported both divergent and convergent processes during design. CPD can also facilitate communication among stakeholders. Additionally, we found our plugin significantly reduces designers' cognitive workload and increases their creativity during brainstorming, highlighting the implications of AI-assisted tools in supporting creative work and evidence-based designs.
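For readers unfamiliar with CPDs, the sketch below shows one plausible encoding of a causal pathway diagram as a directed graph, plus a trivial check in the spirit of automated guidance. The example pathway and the check are invented; the actual Miro plugin stores richer per-node metadata.

```python
# Illustrative encoding of a causal pathway diagram as a directed graph.
# Node names and the example pathway are invented for this sketch.
import networkx as nx

cpd = nx.DiGraph()
# intervention -> mechanism -> (branching) outcomes
cpd.add_edge("reminder notification", "perceived accountability")
cpd.add_edge("perceived accountability", "daily activity logging")
cpd.add_edge("perceived accountability", "notification fatigue")

# A simple check in the spirit of AI-driven guidance: flag pathway
# endpoints that are not outcomes the designer marked as desired.
desired = {"daily activity logging"}
for node in cpd.nodes:
    if cpd.out_degree(node) == 0 and node not in desired:
        print(f"dangling pathway endpoint: {node!r}")
```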
{"title":"AI-Assisted Causal Pathway Diagram for Human-Centered Design","authors":"Ruican Zhong, Donghoon Shin, Rosemary Meza, P. Klasnja, Lucas Colusso, Gary Hsieh","doi":"10.1145/3613904.3642179","DOIUrl":"https://doi.org/10.1145/3613904.3642179","url":null,"abstract":"This paper explores the integration of causal pathway diagrams (CPD) into human-centered design (HCD), investigating how these diagrams can enhance the early stages of the design process. A dedicated CPD plugin for the online collaborative whiteboard platform Miro was developed to streamline diagram creation and offer real-time AI-driven guidance. Through a user study with designers (N=20), we found that CPD's branching and its emphasis on causal connections supported both divergent and convergent processes during design. CPD can also facilitate communication among stakeholders. Additionally, we found our plugin significantly reduces designers' cognitive workload and increases their creativity during brainstorming, highlighting the implications of AI-assisted tools in supporting creative work and evidence-based designs.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"63 S291","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140394815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From Paper to Card: Transforming Design Implications with Generative AI
Donghoon Shin, Lucy Lu Wang, Gary Hsieh
doi:10.1145/3613904.3642266
Communicating design implications is common within the HCI community when publishing academic papers, yet these papers are rarely read and used by designers. One solution is to use design cards as a form of translational resource that communicates valuable insights from papers in a more digestible and accessible format to assist in design processes. However, creating design cards can be time-consuming, and authors may lack the resources and know-how to produce cards. Through an iterative design process, we built a system that helps create design cards from academic papers using an LLM and a text-to-image model. Our evaluation with designers (N=21) and authors of selected papers (N=12) revealed that designers perceived the design implications from our design cards as more inspiring and generative than reading the original paper texts, and the authors viewed our system as an effective way of communicating their design implications. We also propose future enhancements for AI-generated design cards.
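A hedged sketch of such a paper-to-card pipeline is below: an LLM distills a designer-facing implication and an illustration prompt, and a text-to-image model renders the card art. The call_llm and call_text_to_image callables are hypothetical stand-ins for whichever model APIs are available; they are not from the paper.

```python
# Hedged sketch of a paper-to-card pipeline. The model-calling functions
# are hypothetical stand-ins, injected as callables so the pipeline stays
# independent of any particular LLM or text-to-image API.
from dataclasses import dataclass

@dataclass
class DesignCard:
    implication: str   # short, designer-facing statement
    image_prompt: str  # prompt used to generate the card illustration

def paper_to_card(paper_text, call_llm, call_text_to_image):
    # Step 1: distill one actionable design implication from the paper.
    implication = call_llm(
        "Summarize one actionable design implication from this paper "
        "in a single sentence for designers:\n" + paper_text
    )
    # Step 2: turn the implication into an illustration prompt.
    image_prompt = call_llm(
        "Write a concise illustration prompt for this design implication: "
        + implication
    )
    # Step 3: render the card art with a text-to-image model.
    image = call_text_to_image(image_prompt)
    return DesignCard(implication, image_prompt), image
```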
{"title":"From Paper to Card: Transforming Design Implications with Generative AI","authors":"Donghoon Shin, Lucy Lu Wang, Gary Hsieh","doi":"10.1145/3613904.3642266","DOIUrl":"https://doi.org/10.1145/3613904.3642266","url":null,"abstract":"Communicating design implications is common within the HCI community when publishing academic papers, yet these papers are rarely read and used by designers. One solution is to use design cards as a form of translational resource that communicates valuable insights from papers in a more digestible and accessible format to assist in design processes. However, creating design cards can be time-consuming, and authors may lack the resources/know-how to produce cards. Through an iterative design process, we built a system that helps create design cards from academic papers using an LLM and text-to-image model. Our evaluation with designers (N=21) and authors of selected papers (N=12) revealed that designers perceived the design implications from our design cards as more inspiring and generative, compared to reading original paper texts, and the authors viewed our system as an effective way of communicating their design implications. We also propose future enhancements for AI-generated design cards.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"113 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}