Latest arXiv Papers

HyperPredict: Estimating Hyperparameter Effects for Instance-Specific Regularization in Deformable Image Registration
Pub Date : 2024-03-04 DOI: 10.59275/j.melba.2024-d434
Aisha L. Shuaibu, Ivor J. A. Simpson
Methods for medical image registration infer geometric transformations that align pairs, or groups, of images by maximising an image similarity metric. This problem is ill-posed, as several solutions may have equivalent likelihoods; moreover, optimising purely for image similarity can yield implausible deformable transformations. For these reasons, regularization terms are essential to obtain meaningful registration results. However, this requires the introduction of at least one hyperparameter, often termed λ, which serves as a trade-off between loss terms. In some approaches and situations, the quality of the estimated transformation greatly depends on hyperparameter choice, and different choices may be required depending on the characteristics of the data. Analyzing the effect of these hyperparameters requires labelled data, which is not commonly available at test time. In this paper, we propose a novel method for evaluating the influence of hyperparameters and subsequently selecting an optimal value for a given pair of images. Our approach, which we call HyperPredict, implements a Multi-Layer Perceptron that learns the effect of selecting particular hyperparameters for registering an image pair by predicting the resulting segmentation overlap and measures of deformation smoothness. This approach enables us to select optimal hyperparameters at test time without requiring labelled data, removing the need for a one-size-fits-all cross-validation approach. Furthermore, the criteria used to define optimal hyperparameters are flexible post-training, allowing us to efficiently choose specific properties (e.g. overlap of specific anatomical regions of interest, smoothness/plausibility of the final displacement field). We evaluate our proposed method on the OASIS brain MR standard benchmark dataset using a recent deep learning approach (cLapIRN) and an algorithmic method (Niftyreg). Our results demonstrate good performance in predicting the effects of regularization hyperparameters and highlight the benefits of our image-pair specific approach to hyperparameter selection.
Citations: 0
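As a rough illustration of the test-time selection step described in the HyperPredict abstract, the sketch below passes candidate λ values through a predictor and keeps the one with the highest predicted overlap among those that satisfy a smoothness constraint. The `toy_predictor` function is a made-up stand-in for the trained MLP, and the function names, the candidate grid, and the fold-fraction constraint are all assumptions for illustration, not the authors' implementation.

```python
import math
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical predictor signature: lam -> (predicted_dice, predicted_fold_fraction).
Predictor = Callable[[float], Tuple[float, float]]


def toy_predictor(lam: float) -> Tuple[float, float]:
    """Made-up stand-in for a trained MLP evaluated on one image pair."""
    # Predicted overlap peaks near lam = 0.1 (a parabola in log10(lam)).
    dice = 0.80 - 0.05 * (math.log10(lam) + 1.0) ** 2
    # Predicted fraction of folded voxels shrinks as regularization grows.
    folds = 0.02 * math.exp(-20.0 * lam)
    return dice, folds


def select_lambda(predict: Predictor,
                  candidates: Iterable[float],
                  max_folds: float) -> Optional[float]:
    """Return the candidate with the highest predicted Dice whose predicted
    fold fraction stays at or below max_folds, or None if none qualifies."""
    best_lam, best_dice = None, -math.inf
    for lam in candidates:
        dice, folds = predict(lam)
        if folds <= max_folds and dice > best_dice:
            best_lam, best_dice = lam, dice
    return best_lam


candidates = [0.01, 0.03, 0.1, 0.3, 1.0]
# Criterion 1: overlap only (the constraint is loose enough to be inactive).
unconstrained = select_lambda(toy_predictor, candidates, max_folds=1.0)
# Criterion 2: same predictor, but near-fold-free deformations are required.
smooth_only = select_lambda(toy_predictor, candidates, max_folds=0.001)
```

Because the selection criterion lives outside the predictor, it can be changed after training without refitting anything, which is the flexibility the abstract points to. With this toy predictor the two criteria pick different values of λ.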
KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification
Pub Date : 2024-03-04 DOI: 10.1109/ICASSP48485.2024.10447643
Bo Li, Yuyan Chen, Liang Zeng
Multi-Label Text Classification (MLTC) is a fundamental task in the field of Natural Language Processing (NLP) that involves the assignment of multiple labels to a given text. MLTC has gained significant importance and has been widely applied in various domains such as topic recognition, recommendation systems, sentiment analysis, and information retrieval. However, traditional machine learning and deep neural networks have not yet addressed certain issues, such as documents that are brief but carry a large number of labels, and how to establish relationships between the labels. The value of external knowledge for MLTC has also been demonstrated. To address these issues, we propose a novel approach known as the Knowledge-enhanced Doc-Label Attention Network (KeNet). Specifically, we design an attention network that incorporates external knowledge, label embedding, and a comprehensive attention mechanism. In contrast to conventional methods, we use a comprehensive representation of documents, knowledge and labels to predict all labels for each single text. Our approach has been validated by comprehensive experiments conducted on three multi-label datasets. Experimental results demonstrate that our method outperforms state-of-the-art MLTC methods. Additionally, a case study is undertaken to illustrate the practical implementation of KeNet.
Citations: 0
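The KeNet abstract mentions label embeddings combined with an attention mechanism. The sketch below shows a generic label-wise attention step, in which each label embedding attends over the token vectors of a document to produce one label-specific document vector. It is a minimal illustration with made-up 2-dimensional embeddings. It is not the KeNet architecture, and the external-knowledge component is not modelled here.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def label_attention(tokens: List[Vector], labels: List[Vector]) -> List[Vector]:
    """For each label embedding, attend over the token vectors and return
    one label-specific document vector per label."""
    dim = len(tokens[0])
    doc_vectors = []
    for label in labels:
        # Attention weights: softmax of label-token dot products.
        weights = softmax([dot(label, tok) for tok in tokens])
        # Weighted sum of token vectors under those weights.
        doc_vectors.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return doc_vectors


# Toy embeddings (made-up values): three tokens, two labels.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [[4.0, 0.0], [0.0, 4.0]]
label_docs = label_attention(tokens, labels)
```

Each label ends up with its own view of the document, so a short text can still yield a distinct representation for every candidate label before a per-label classifier is applied.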
Ivie: Lightweight Anchored Explanations of Just-Generated Code
Pub Date : 2024-03-04 DOI: 10.1145/3613904.3642239
Litao Yan, Alyssa Hwang, Zhiyuan Wu, Andrew Head
Programming assistants have reshaped the experience of programming into one where programmers spend less time writing and more time critically examining code. In this paper, we explore how programming assistants can be extended to accelerate the inspection of generated code. We introduce an extension to the programming assistant called Ivie, or instantly visible in-situ explanations. When using Ivie, a programmer's generated code is instantly accompanied by explanations positioned just adjacent to the code. Our design was optimized for extremely low-cost invocation and dismissal. Explanations are compact and informative. They describe meaningful expressions, from individual variables to entire blocks of code. We present an implementation of Ivie that forks VS Code, applying a modern LLM for timely segmentation and explanation of generated code. In a lab study, we compared Ivie to a contemporary baseline tool for code understanding. Ivie improved understanding of generated code, and was received by programmers as a highly useful, low distraction, desirable complement to the programming assistant.
Citations: 0
Towards A Diffractive Analysis of Prompt-Based Generative AI
Pub Date : 2024-03-04 DOI: 10.1145/3613904.3641971
Nina Rajcic, M. T. Llano, Jon McCormack
Recent developments in prompt-based generative AI have given rise to discourse surrounding the perceived ethical concerns, economic implications, and consequences for the future of cultural production. As generative imagery becomes pervasive in mainstream society, dominated primarily by emerging industry leaders, we encourage the CHI community to take on a role of inquiry: to investigate the numerous ways in which generative AI has the potential to augment, and already is augmenting, human creativity. In this paper, we conducted a diffractive analysis exploring the potential role of prompt-based interfaces in artists' creative practice. Over a two-week period, seven visual artists were given access to a personalised instance of Stable Diffusion, fine-tuned on a dataset of their work. In the following diffractive analysis, we identified two dominant modes adopted by participants: AI for ideation and AI for production. We furthermore present a number of ethical design considerations for the future development of generative AI interfaces.
Citations: 0
Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views
Pub Date : 2024-03-04 DOI: 10.1609/aaai.v38i3.27968
Shuai Guo, Q. Wang, Yijie Gao, Rong Xie, Li Song
Novel-view synthesis with sparse input views is important for real-world applications like AR/VR and autonomous driving. Recent methods have integrated depth information into NeRFs for sparse input synthesis, leveraging depth priors for geometric and spatial understanding. However, most existing works tend to overlook inaccuracies within depth maps and have low time efficiency. To address these issues, we propose a depth-guided robust and fast point cloud fusion NeRF for sparse inputs. We treat radiance fields as an explicit voxel grid of features. A point cloud is constructed for each input view, characterized within the voxel grid using matrices and vectors. We accumulate the point cloud of each input view to construct the fused point cloud of the entire scene. Each voxel determines its density and appearance by referring to the point cloud of the entire scene. Through point cloud fusion and voxel grid fine-tuning, inaccuracies in depth values are refined or substituted by those from other views. Moreover, our method can achieve faster reconstruction and greater compactness through effective vector-matrix decomposition. Experimental results underline the superior performance and time efficiency of our approach compared to state-of-the-art baselines.
Citations: 0
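The point cloud fusion NeRF abstract credits vector-matrix decomposition for its compactness. The abstract does not spell out the factorisation, so the sketch below assumes the common form in which a 3D grid is approximated by a sum of vector-times-matrix components along one axis ordering. All names and values are hypothetical, and the point is only to show why storage drops compared with a dense grid.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def vm_lookup(vectors: List[Vector], matrices: List[Matrix],
              i: int, j: int, k: int) -> float:
    """Value of the factorised grid at voxel (i, j, k):
    sum over components r of vectors[r][i] * matrices[r][j][k]."""
    return sum(v[i] * m[j][k] for v, m in zip(vectors, matrices))


def dense_size(n: int) -> int:
    """Number of stored values in a dense n x n x n grid."""
    return n ** 3


def vm_size(n: int, rank: int) -> int:
    """Number of stored values for `rank` vector-matrix components."""
    return rank * (n + n * n)


# Two made-up components on a 2 x 2 x 2 grid.
vectors = [[1.0, 2.0], [0.5, 0.0]]
matrices = [[[1.0, 0.0], [0.0, 1.0]],
            [[2.0, 2.0], [2.0, 2.0]]]
value = vm_lookup(vectors, matrices, 1, 0, 0)
```

Under these assumptions, storage grows with the square of the grid resolution per component rather than with its cube, so a modest number of components stays far smaller than the dense grid at typical resolutions.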
Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation
Pub Date : 2024-03-04 DOI: 10.1145/3613904.3642450
Wazeer Zulfikar, Samantha Chan, Pattie Maes
People have to remember an ever-expanding volume of information. Wearables that use information capture and retrieval for memory augmentation can help but can be disruptive and cumbersome in real-world tasks, such as in social settings. To address this, we developed Memoro, a wearable audio-based memory assistant with a concise user interface. Memoro uses a large language model (LLM) to infer the user's memory needs in a conversational context, semantically search memories, and present minimal suggestions. The assistant has two interaction modes: Query Mode for voicing queries and Queryless Mode for on-demand predictive assistance without an explicit query. Our study of participants (N=20) engaged in a real-time conversation demonstrated that using Memoro reduced device interaction time and increased recall confidence while preserving conversational quality. We report quantitative results and discuss the preferences and experiences of users. This work contributes towards utilizing LLMs to design wearable memory augmentation systems that are minimally disruptive.
Citations: 0
From Agent Autonomy to Casual Collaboration: A Design Investigation on Help-Seeking Urban Robots
Pub Date : 2024-03-04 DOI: 10.1145/3613904.3642389
Xinyan Yu, Marius Hoggenmueller, M. Tomitsch
As intelligent agents transition from controlled to uncontrolled environments, they face challenges that sometimes exceed their operational capabilities. In many scenarios, they rely on assistance from bystanders to overcome those challenges. Using robots that get stuck in urban settings as an example, we investigate how agents can prompt bystanders into providing assistance. We conducted four focus group sessions with 17 participants that involved bodystorming, where participants assumed the role of robots and bystander pedestrians in role-playing activities. Generating insights from both assumed robot and bystander perspectives, we were able to identify potential non-verbal help-seeking strategies (i.e., addressing bystanders, cueing intentions, and displaying emotions) and factors shaping the assistive behaviours of bystanders. Drawing on these findings, we offer design considerations for help-seeking urban robots and other agents operating in uncontrolled environments to foster casual collaboration, encompass expressiveness, align with agent social categories, and curate appropriate incentives.
Citations: 0
Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection
Pub Date : 2024-03-04 DOI: 10.62051/ijcsit.v2n1.01
Zhenglin Li, Haibei Zhu, Houze Liu, Jintong Song, Qishuo Cheng
This study conducts a thorough examination of malware detection using machine learning techniques, focusing on the evaluation of various classification models using the Mal-API-2019 dataset. The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively. Both ensemble and non-ensemble machine learning methods, such as Random Forest, XGBoost, K Nearest Neighbor (KNN), and Neural Networks, are explored. Special emphasis is placed on the importance of data pre-processing techniques, particularly TF-IDF representation and Principal Component Analysis, in improving model performance. Results indicate that ensemble methods, particularly Random Forest and XGBoost, exhibit superior accuracy, precision, and recall compared to others, highlighting their effectiveness in malware detection. The paper also discusses limitations and potential future directions, emphasizing the need for continuous adaptation to address the evolving nature of malware. This research contributes to ongoing discussions in cybersecurity and provides practical insights for developing more robust malware detection systems in the digital era.
Citations: 6
What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
Pub Date : 2024-03-04 DOI: 10.1109/icassp48485.2024.10445878
Kuan-Hsun Ho, J. Hung, Berlin Chen
This study introduces a reformed Sinc-convolution (Sincconv) framework tailored for the encoder component of deep networks for speech enhancement (SE). The reformed Sincconv, based on parametrized sinc functions as band-pass filters, offers notable advantages in terms of training efficiency, filter diversity, and interpretability. The reformed Sincconv is evaluated in conjunction with various SE models, showcasing its ability to boost SE performance. Furthermore, the reformed Sincconv provides valuable insights into the specific frequency components that are prioritized in an SE scenario. This opens up a new direction for SE research and improves our knowledge of the operating dynamics of SE models.
Citations: 0
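As an illustration of the idea named in the Sinc-convolution abstract above (parametrized sinc functions used as band-pass filters), the following is a minimal NumPy sketch of a generic sinc band-pass kernel. It is not the authors' reformed Sincconv: the function names `sinc_bandpass` and `gain_at`, the Hamming window, the 16 kHz sample rate, the kernel size, and the fixed cutoff values are all assumptions made for this example. In a trainable layer the two cutoff frequencies would typically be learned parameters rather than constants.

```python
import numpy as np


def sinc_bandpass(f_low, f_high, kernel_size=101, sample_rate=16000):
    """Band-pass FIR kernel built as the difference of two sinc low-pass filters.

    f_low and f_high are cutoff frequencies in Hz (hypothetical fixed values
    here; a learnable layer would optimise them during training).
    """
    # Symmetric time axis in samples, centred on zero.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # Cutoffs expressed in cycles per sample.
    f1 = f_low / sample_rate
    f2 = f_high / sample_rate
    # np.sinc(x) = sin(pi x) / (pi x), so 2 f sinc(2 f n) is an ideal
    # low-pass impulse response with cutoff f.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # A Hamming window reduces the ripple caused by truncating the sinc.
    return kernel * np.hamming(kernel_size)


def gain_at(kernel, freq_hz, sample_rate=16000):
    """Magnitude of the kernel's frequency response at one frequency."""
    n = np.arange(len(kernel))
    return abs(np.sum(kernel * np.exp(-2j * np.pi * freq_hz / sample_rate * n)))


# Example: a 1-3 kHz band; compare the gain inside and outside the band.
kernel = sinc_bandpass(1000.0, 3000.0)
print(gain_at(kernel, 2000.0), gain_at(kernel, 200.0))
```

Because each filter in this construction is described by only two numbers, inspecting the cutoffs after training would be one way to see which frequency bands a model emphasizes, which is the kind of interpretability the abstract refers to.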
Free Proxies Unmasked: A Vulnerability and Longitudinal Analysis of Free Proxy Services
Pub Date : 2024-03-04 DOI: 10.14722/madweb.2024.23035
Naif Mehanna, Walter Rudametkin, Pierre Laperdrix, Antoine Vastel
Free proxies have been widespread since the early days of the Web, helping users bypass geo-blocked content and conceal their IP addresses. Various proxy providers promise faster Internet or increased privacy while advertising lists of hundreds of readily available free proxies. However, while paid proxy services advertise support for encrypted connections and high stability, free proxies often lack such guarantees, making them prone to malicious activities such as eavesdropping or content modification. Furthermore, there is a market that encourages exploiting devices to install proxies. In this paper, we present a 30-month longitudinal study analyzing the stability, security, and potential manipulation of free web proxies that we collected from 11 providers. Our collection resulted in over 640,600 proxies, which we cumulatively tested daily. We find that only 34.5% of proxies were active at least once during our tests, showcasing the general instability of free proxies. Geographically, a majority of proxies originate from the US and China. Leveraging the Shodan search engine, we identified 4,452 distinct vulnerabilities on the proxies' IP addresses, including 1,755 vulnerabilities that allow unauthorized remote code execution and 2,036 that enable privilege escalation on the host device. Through software analysis of the proxies' IP addresses, we find that 42,206 of them appear to run on MikroTik routers. Worryingly, we also discovered 16,923 proxies that manipulate content, indicating potential malicious intent by proxy owners. Ultimately, our research reveals that the use of free web proxies poses significant risks to users' privacy and security. The instability, vulnerabilities, and potential for malicious actions uncovered in our analysis lead us to strongly caution users against relying on free proxies.
Citations: 0