Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.10192
Philip A. LeMaitre, Marius Krumm, H. Briegel
With the impressive progress of deep learning, applications relying on machine learning are increasingly being integrated into daily life. However, most deep learning models have an opaque, oracle-like nature making it difficult to interpret and understand their decisions. This problem led to the development of the field known as eXplainable Artificial Intelligence (XAI). One method in this field known as Projective Simulation (PS) models a chain-of-thought as a random walk of a particle on a graph with vertices that have concepts attached to them. While this description has various benefits, including the possibility of quantization, it cannot be naturally used to model thoughts that combine several concepts simultaneously. To overcome this limitation, we introduce Multi-Excitation Projective Simulation (mePS), a generalization that considers a chain-of-thought to be a random walk of several particles on a hypergraph. A definition for a dynamic hypergraph is put forward to describe the agent's training history along with applications to AI and hypergraph visualization. An inductive bias inspired by the remarkably successful few-body interaction models used in quantum many-body physics is formalized for our classical mePS framework and employed to tackle the exponential complexity associated with naive implementations of hypergraphs. We prove that our inductive bias reduces the complexity from exponential to polynomial, with the exponent representing the cutoff on how many particles can interact. We numerically apply our method to two toy environments and a more complex scenario modelling the diagnosis of a broken computer. These environments demonstrate the resource savings provided by an appropriate choice of inductive bias, as well as showcasing aspects of interpretability. A quantum model for mePS is also briefly outlined and some future directions for it are discussed.
{"title":"Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias","authors":"Philip A. LeMaitre, Marius Krumm, H. Briegel","doi":"10.48550/arXiv.2402.10192","DOIUrl":"https://doi.org/10.48550/arXiv.2402.10192","url":null,"abstract":"With the impressive progress of deep learning, applications relying on machine learning are increasingly being integrated into daily life. However, most deep learning models have an opaque, oracle-like nature making it difficult to interpret and understand their decisions. This problem led to the development of the field known as eXplainable Artificial Intelligence (XAI). One method in this field known as Projective Simulation (PS) models a chain-of-thought as a random walk of a particle on a graph with vertices that have concepts attached to them. While this description has various benefits, including the possibility of quantization, it cannot be naturally used to model thoughts that combine several concepts simultaneously. To overcome this limitation, we introduce Multi-Excitation Projective Simulation (mePS), a generalization that considers a chain-of-thought to be a random walk of several particles on a hypergraph. A definition for a dynamic hypergraph is put forward to describe the agent's training history along with applications to AI and hypergraph visualization. An inductive bias inspired by the remarkably successful few-body interaction models used in quantum many-body physics is formalized for our classical mePS framework and employed to tackle the exponential complexity associated with naive implementations of hypergraphs. We prove that our inductive bias reduces the complexity from exponential to polynomial, with the exponent representing the cutoff on how many particles can interact. We numerically apply our method to two toy environments and a more complex scenario modelling the diagnosis of a broken computer. These environments demonstrate the resource savings provided by an appropriate choice of inductive bias, as well as showcasing aspects of interpretability. A quantum model for mePS is also briefly outlined and some future directions for it are discussed.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"16 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent advances in training large language models (LLMs) using massive amounts of solely textual data lead to strong generalization across many domains and tasks, including document-specific tasks. Opposed to that there is a trend to train multi-modal transformer architectures tailored for document understanding that are designed specifically to fuse textual inputs with the corresponding document layout. This involves a separate fine-tuning step for which additional training data is required. At present, no document transformers with comparable generalization to LLMs are available That raises the question which type of model is to be preferred for document understanding tasks. In this paper we investigate the possibility to use purely text-based LLMs for document-specific tasks by using layout enrichment. We explore drop-in modifications and rule-based methods to enrich purely textual LLM prompts with layout information. In our experiments we investigate the effects on the commercial ChatGPT model and the open-source LLM Solar. We demonstrate that using our approach both LLMs show improved performance on various standard document benchmarks. In addition, we study the impact of noisy OCR and layout errors, as well as the limitations of LLMs when it comes to utilizing document layout. Our results indicate that layout enrichment can improve the performance of purely text-based LLMs for document understanding by up to 15% compared to just using plain document text. In conclusion, this approach should be considered for the best model choice between text-based LLM or multi-modal document transformers.
{"title":"LAPDoc: Layout-Aware Prompting for Documents","authors":"Marcel Lamott, Yves-Noel Weweler, A. Ulges, Faisal Shafait, Dirk Krechel, Darko Obradovic","doi":"10.48550/arXiv.2402.09841","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09841","url":null,"abstract":"Recent advances in training large language models (LLMs) using massive amounts of solely textual data lead to strong generalization across many domains and tasks, including document-specific tasks. Opposed to that there is a trend to train multi-modal transformer architectures tailored for document understanding that are designed specifically to fuse textual inputs with the corresponding document layout. This involves a separate fine-tuning step for which additional training data is required. At present, no document transformers with comparable generalization to LLMs are available That raises the question which type of model is to be preferred for document understanding tasks. In this paper we investigate the possibility to use purely text-based LLMs for document-specific tasks by using layout enrichment. We explore drop-in modifications and rule-based methods to enrich purely textual LLM prompts with layout information. In our experiments we investigate the effects on the commercial ChatGPT model and the open-source LLM Solar. We demonstrate that using our approach both LLMs show improved performance on various standard document benchmarks. In addition, we study the impact of noisy OCR and layout errors, as well as the limitations of LLMs when it comes to utilizing document layout. Our results indicate that layout enrichment can improve the performance of purely text-based LLMs for document understanding by up to 15% compared to just using plain document text. In conclusion, this approach should be considered for the best model choice between text-based LLM or multi-modal document transformers.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"9 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139963064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.10107
Cheng Kang, Xinye Chen, Yong Hu, Daniel Novak
Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. While recent research has shown significant success in complex text generation with language models, the memory and computational power are still very demanding and fall short of expectations, which naturally results in low portability and instability for the models. To mitigate these issues, numerous well-established methods were proposed for neural network quantization. To further enhance their portability of independent deployment as well as improve their stability evaluated by language perplexity, we propose a novel approach called the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds upon the recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This leads to a gradient-based controller for the generation tasks, and more stable intermediate latent variables are obtained, which naturally brings in an accelerated convergence as well as better controllability. Additionally, the adaption fine-tuning method is employed to reduce tunable weights. Experimental results on five challenging fine-grained control tasks demonstrate that QE-CDLM compares favorably to existing methods in terms of quality and feasibility, achieving better perplexity and lightweight fine-tuning.
{"title":"Quantized Embedding Vectors for Controllable Diffusion Language Models","authors":"Cheng Kang, Xinye Chen, Yong Hu, Daniel Novak","doi":"10.48550/arXiv.2402.10107","DOIUrl":"https://doi.org/10.48550/arXiv.2402.10107","url":null,"abstract":"Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. While recent research has shown significant success in complex text generation with language models, the memory and computational power are still very demanding and fall short of expectations, which naturally results in low portability and instability for the models. To mitigate these issues, numerous well-established methods were proposed for neural network quantization. To further enhance their portability of independent deployment as well as improve their stability evaluated by language perplexity, we propose a novel approach called the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds upon the recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This leads to a gradient-based controller for the generation tasks, and more stable intermediate latent variables are obtained, which naturally brings in an accelerated convergence as well as better controllability. Additionally, the adaption fine-tuning method is employed to reduce tunable weights. Experimental results on five challenging fine-grained control tasks demonstrate that QE-CDLM compares favorably to existing methods in terms of quality and feasibility, achieving better perplexity and lightweight fine-tuning.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139963082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09657
Jiacheng Yao, Weihong Xu, Zhaohui Yang, Xiaohu You, M. Bennis, H. V. Poor
In this paper, we quantitatively compare these two effective communication schemes, i.e., digital and analog ones, for wireless federated learning (FL) over resource-constrained networks, highlighting their essential differences as well as their respective application scenarios. We first examine both digital and analog transmission methods, together with a unified and fair comparison scheme under practical constraints. A universal convergence analysis under various imperfections is established for FL performance evaluation in wireless networks. These analytical results reveal that the fundamental difference between the two paradigms lies in whether communication and computation are jointly designed or not. The digital schemes decouple the communication design from specific FL tasks, making it difficult to support simultaneous uplink transmission of massive devices with limited bandwidth. In contrast, the analog communication allows over-the-air computation (AirComp), thus achieving efficient spectrum utilization. However, computation-oriented analog transmission reduces power efficiency, and its performance is sensitive to computational errors. Finally, numerical simulations are conducted to verify these theoretical observations.
{"title":"Digital versus Analog Transmissions for Federated Learning over Wireless Networks","authors":"Jiacheng Yao, Weihong Xu, Zhaohui Yang, Xiaohu You, M. Bennis, H. V. Poor","doi":"10.48550/arXiv.2402.09657","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09657","url":null,"abstract":"In this paper, we quantitatively compare these two effective communication schemes, i.e., digital and analog ones, for wireless federated learning (FL) over resource-constrained networks, highlighting their essential differences as well as their respective application scenarios. We first examine both digital and analog transmission methods, together with a unified and fair comparison scheme under practical constraints. A universal convergence analysis under various imperfections is established for FL performance evaluation in wireless networks. These analytical results reveal that the fundamental difference between the two paradigms lies in whether communication and computation are jointly designed or not. The digital schemes decouple the communication design from specific FL tasks, making it difficult to support simultaneous uplink transmission of massive devices with limited bandwidth. In contrast, the analog communication allows over-the-air computation (AirComp), thus achieving efficient spectrum utilization. However, computation-oriented analog transmission reduces power efficiency, and its performance is sensitive to computational errors. Finally, numerical simulations are conducted to verify these theoretical observations.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"11 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139963124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.10065
Achraf Azize, Debabrota Basu
We study the per-datum Membership Inference Attacks (MIAs), where an attacker aims to infer whether a fixed target datum has been included in the input dataset of an algorithm and thus, violates privacy. First, we define the membership leakage of a datum as the advantage of the optimal adversary targeting to identify it. Then, we quantify the per-datum membership leakage for the empirical mean, and show that it depends on the Mahalanobis distance between the target datum and the data-generating distribution. We further assess the effect of two privacy defences, i.e. adding Gaussian noise and sub-sampling. We quantify exactly how both of them decrease the per-datum membership leakage. Our analysis builds on a novel proof technique that combines an Edgeworth expansion of the likelihood ratio test and a Lindeberg-Feller central limit theorem. Our analysis connects the existing likelihood ratio and scalar product attacks, and also justifies different canary selection strategies used in the privacy auditing literature. Finally, our experiments demonstrate the impacts of the leakage score, the sub-sampling ratio and the noise scale on the per-datum membership leakage as indicated by the theory.
{"title":"How Much Does Each Datapoint Leak Your Privacy? Quantifying the Per-datum Membership Leakage","authors":"Achraf Azize, Debabrota Basu","doi":"10.48550/arXiv.2402.10065","DOIUrl":"https://doi.org/10.48550/arXiv.2402.10065","url":null,"abstract":"We study the per-datum Membership Inference Attacks (MIAs), where an attacker aims to infer whether a fixed target datum has been included in the input dataset of an algorithm and thus, violates privacy. First, we define the membership leakage of a datum as the advantage of the optimal adversary targeting to identify it. Then, we quantify the per-datum membership leakage for the empirical mean, and show that it depends on the Mahalanobis distance between the target datum and the data-generating distribution. We further assess the effect of two privacy defences, i.e. adding Gaussian noise and sub-sampling. We quantify exactly how both of them decrease the per-datum membership leakage. Our analysis builds on a novel proof technique that combines an Edgeworth expansion of the likelihood ratio test and a Lindeberg-Feller central limit theorem. Our analysis connects the existing likelihood ratio and scalar product attacks, and also justifies different canary selection strategies used in the privacy auditing literature. Finally, our experiments demonstrate the impacts of the leakage score, the sub-sampling ratio and the noise scale on the per-datum membership leakage as indicated by the theory.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"14 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139963341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09683
Zeya Chen, Ruth Schmidt
Designing seamless, frictionless user experiences has long been a dominant trend in both applied behavioral science and artificial intelligence (AI), in which the goal of making desirable actions easy and efficient informs efforts to minimize friction in user experiences. However, in some settings, friction can be genuinely beneficial, such as the insertion of deliberate delays to increase reflection, preventing individuals from resorting to automatic or biased behaviors, and enhancing opportunities for unexpected discoveries. More recently, the popularization and availability of AI on a widespread scale has only increased the need to examine how friction can help or hinder users of AI; it also suggests a need to consider how positive friction can benefit AI practitioners, both during development processes (e.g., working with diverse teams) and to inform how AI is designed into offerings. This paper first proposes a"positive friction"model that can help characterize how friction is currently beneficial in user and developer experiences with AI, diagnose the potential need for friction where it may not yet exist in these contexts, and inform how positive friction can be used to generate solutions, especially as advances in AI continue to be progress and new opportunities emerge. It then explores this model in the context of AI users and developers by proposing the value of taking a hybrid"AI+human"lens, and concludes by suggesting questions for further exploration.
{"title":"Exploring a Behavioral Model of \"Positive Friction\" in Human-AI Interaction","authors":"Zeya Chen, Ruth Schmidt","doi":"10.48550/arXiv.2402.09683","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09683","url":null,"abstract":"Designing seamless, frictionless user experiences has long been a dominant trend in both applied behavioral science and artificial intelligence (AI), in which the goal of making desirable actions easy and efficient informs efforts to minimize friction in user experiences. However, in some settings, friction can be genuinely beneficial, such as the insertion of deliberate delays to increase reflection, preventing individuals from resorting to automatic or biased behaviors, and enhancing opportunities for unexpected discoveries. More recently, the popularization and availability of AI on a widespread scale has only increased the need to examine how friction can help or hinder users of AI; it also suggests a need to consider how positive friction can benefit AI practitioners, both during development processes (e.g., working with diverse teams) and to inform how AI is designed into offerings. This paper first proposes a\"positive friction\"model that can help characterize how friction is currently beneficial in user and developer experiences with AI, diagnose the potential need for friction where it may not yet exist in these contexts, and inform how positive friction can be used to generate solutions, especially as advances in AI continue to be progress and new opportunities emerge. It then explores this model in the context of AI users and developers by proposing the value of taking a hybrid\"AI+human\"lens, and concludes by suggesting questions for further exploration.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"25 23","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09870
P. Koelewijn, Siep Weiland, Roland T'oth
This paper considers the equilibrium-free stability and performance analysis of discrete-time nonlinear systems. We consider two types of equilibrium-free notions. Namely, the universal shifted concept, which considers stability and performance w.r.t. all equilibrium points of the system, and the incremental concept, which considers stability and performance between trajectories of the system. In this paper, we show how universal shifted stability and performance of discrete-time systems can be analyzed by making use of the time-difference dynamics. Moreover, we extend the existing results for incremental dissipativity for discrete-time systems based on dissipativity analysis of the differential dynamics to more general state-dependent storage functions for less conservative results. Finally, we show how both these equilibrium-free notions can be cast as a convex analysis problem by making use of the linear parameter-varying framework, which is also demonstrated by means of an example.
{"title":"Convex Equilibrium-Free Stability and Performance Analysis of Discrete-Time Nonlinear Systems","authors":"P. Koelewijn, Siep Weiland, Roland T'oth","doi":"10.48550/arXiv.2402.09870","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09870","url":null,"abstract":"This paper considers the equilibrium-free stability and performance analysis of discrete-time nonlinear systems. We consider two types of equilibrium-free notions. Namely, the universal shifted concept, which considers stability and performance w.r.t. all equilibrium points of the system, and the incremental concept, which considers stability and performance between trajectories of the system. In this paper, we show how universal shifted stability and performance of discrete-time systems can be analyzed by making use of the time-difference dynamics. Moreover, we extend the existing results for incremental dissipativity for discrete-time systems based on dissipativity analysis of the differential dynamics to more general state-dependent storage functions for less conservative results. Finally, we show how both these equilibrium-free notions can be cast as a convex analysis problem by making use of the linear parameter-varying framework, which is also demonstrated by means of an example.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"23 22","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09790
Chiara Garavelli, A. Aldieri, M. Palanca, Luca Patruno, M. Viceconti
Vertebral fractures prediction in clinics lacks of accuracy. The most used scores have limitations in distinguishing between subjects at risk or not. Finite element (FE) models generated from computed tomography (CT) of these patients may improve the predictive capability. Many models have already been proposed but the most of them considered the single vertebral body, excluding from the analysis the role of the inter-vertebral discs in the distribution of the load through the spine. Multi-vertebral models instead allow to examine more complex boundary condition. However, CT scans do not provide subject-specif information about the material properties of the disc. Consequently, the goal of the study was to validate a multi-vertebral FE model with subject specific modelling of the vertebral bone and population-based properties assigned to the disc, idealizing them with a linear isotropic material. Boundary condition were assigned in order to reproduce an experimental test performed on the same specimen and recorded using digital image correlation technique (DIC). FE and DIC strains on the vertebral surfaces are compared point-wise. Young's modulus values in the range 25-30 MPa allowed to achieve a comparable order of magnitude between experimental and computational data. However, the two distribution remained strongly different. To conclude, subject-specific material properties need to be assigned also to the discs as well as to the vertebrae to achieve acceptable accuracy in the assessment of the fracture risk.
临床上对椎体骨折的预测缺乏准确性。最常用的评分在区分受试者是否存在风险方面存在局限性。根据这些患者的计算机断层扫描(CT)生成的有限元(FE)模型可以提高预测能力。目前已经提出了许多模型,但其中大多数都只考虑了单个椎体,将椎间盘在脊柱负荷分布中的作用排除在分析之外。多椎体模型则可以研究更复杂的边界条件。然而,CT 扫描无法提供有关椎间盘材料特性的特定信息。因此,本研究的目标是验证一个多椎体 FE 模型,该模型具有针对特定受试者的椎骨建模和基于人群的椎间盘属性,将其理想化为线性各向同性材料。设定边界条件是为了重现在同一试样上进行的实验测试,该测试使用数字图像相关技术(DIC)记录。对椎体表面的 FE 应变和 DIC 应变进行了点对点比较。杨氏模量值在 25-30 兆帕之间,因此实验数据和计算数据的数量级相当。然而,两者的分布仍然存在很大差异。总之,需要为椎间盘和椎体分配特定的材料属性,以便在评估骨折风险时达到可接受的准确性。
{"title":"Multi-vertebral CT-based FE models implementing linear isotropic population-based material properties for the intervertebral discs cannot accurately predict strains","authors":"Chiara Garavelli, A. Aldieri, M. Palanca, Luca Patruno, M. Viceconti","doi":"10.48550/arXiv.2402.09790","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09790","url":null,"abstract":"Vertebral fractures prediction in clinics lacks of accuracy. The most used scores have limitations in distinguishing between subjects at risk or not. Finite element (FE) models generated from computed tomography (CT) of these patients may improve the predictive capability. Many models have already been proposed but the most of them considered the single vertebral body, excluding from the analysis the role of the inter-vertebral discs in the distribution of the load through the spine. Multi-vertebral models instead allow to examine more complex boundary condition. However, CT scans do not provide subject-specif information about the material properties of the disc. Consequently, the goal of the study was to validate a multi-vertebral FE model with subject specific modelling of the vertebral bone and population-based properties assigned to the disc, idealizing them with a linear isotropic material. Boundary condition were assigned in order to reproduce an experimental test performed on the same specimen and recorded using digital image correlation technique (DIC). FE and DIC strains on the vertebral surfaces are compared point-wise. Young's modulus values in the range 25-30 MPa allowed to achieve a comparable order of magnitude between experimental and computational data. However, the two distribution remained strongly different. To conclude, subject-specific material properties need to be assigned also to the discs as well as to the vertebrae to achieve acceptable accuracy in the assessment of the fracture risk.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"16 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09731
Jianming Xian
Deep convolutional neural networks (CNNs) based approaches have achieved great performance in video matting. Many of these methods can produce accurate alpha estimation for the target body but typically yield fuzzy or incorrect target edges. This is usually caused by the following reasons: 1) The current methods always treat the target body and edge indiscriminately; 2) Target body dominates the whole target with only a tiny proportion target edge. For the first problem, we propose a CNN-based module that separately optimizes the matting target body and edge (SOBE). And on this basis, we introduce a real-time, trimap-free video matting method via progressively optimizing the matting target body and edge (POBEVM) that is much lighter than previous approaches and achieves significant improvements in the predicted target edge. For the second problem, we propose an Edge-L1-Loss (ELL) function that enforces our network on the matting target edge. Experiments demonstrate our method outperforms prior trimap-free matting methods on both Distinctions-646 (D646) and VideoMatte240K(VM) dataset, especially in edge optimization.
{"title":"POBEVM: Real-time Video Matting via Progressively Optimize the Target Body and Edge","authors":"Jianming Xian","doi":"10.48550/arXiv.2402.09731","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09731","url":null,"abstract":"Deep convolutional neural networks (CNNs) based approaches have achieved great performance in video matting. Many of these methods can produce accurate alpha estimation for the target body but typically yield fuzzy or incorrect target edges. This is usually caused by the following reasons: 1) The current methods always treat the target body and edge indiscriminately; 2) Target body dominates the whole target with only a tiny proportion target edge. For the first problem, we propose a CNN-based module that separately optimizes the matting target body and edge (SOBE). And on this basis, we introduce a real-time, trimap-free video matting method via progressively optimizing the matting target body and edge (POBEVM) that is much lighter than previous approaches and achieves significant improvements in the predicted target edge. For the second problem, we propose an Edge-L1-Loss (ELL) function that enforces our network on the matting target edge. Experiments demonstrate our method outperforms prior trimap-free matting methods on both Distinctions-646 (D646) and VideoMatte240K(VM) dataset, especially in edge optimization.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"29 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-15DOI: 10.48550/arXiv.2402.09734
Paulo Garcia
Ensuring artificial intelligence behaves in such a way that is aligned with human values is commonly referred to as the alignment challenge. Prior work has shown that rational agents, behaving in such a way that maximizes a utility function, will inevitably behave in such a way that is not aligned with human values, especially as their level of intelligence goes up. Prior work has also shown that there is no"one true utility function"; solutions must include a more holistic approach to alignment. This paper describes oblivious agents: agents that are architected in such a way that their effective utility function is an aggregation of a known and hidden sub-functions. The hidden component, to be maximized, is internally implemented as a black box, preventing the agent from examining it. The known component, to be minimized, is knowledge of the hidden sub-function. Architectural constraints further influence how agent actions can evolve its internal environment model. We show that an oblivious agent, behaving rationally, constructs an internal approximation of designers' intentions (i.e., infers alignment), and, as a consequence of its architecture and effective utility function, behaves in such a way that maximizes alignment; i.e., maximizing the approximated intention function. We show that, paradoxically, it does this for whatever utility function is used as the hidden component and, in contrast with extant techniques, chances of alignment actually improve as agent intelligence grows.
{"title":"Agents Need Not Know Their Purpose","authors":"Paulo Garcia","doi":"10.48550/arXiv.2402.09734","DOIUrl":"https://doi.org/10.48550/arXiv.2402.09734","url":null,"abstract":"Ensuring artificial intelligence behaves in such a way that is aligned with human values is commonly referred to as the alignment challenge. Prior work has shown that rational agents, behaving in such a way that maximizes a utility function, will inevitably behave in such a way that is not aligned with human values, especially as their level of intelligence goes up. Prior work has also shown that there is no\"one true utility function\"; solutions must include a more holistic approach to alignment. This paper describes oblivious agents: agents that are architected in such a way that their effective utility function is an aggregation of a known and hidden sub-functions. The hidden component, to be maximized, is internally implemented as a black box, preventing the agent from examining it. The known component, to be minimized, is knowledge of the hidden sub-function. Architectural constraints further influence how agent actions can evolve its internal environment model. We show that an oblivious agent, behaving rationally, constructs an internal approximation of designers' intentions (i.e., infers alignment), and, as a consequence of its architecture and effective utility function, behaves in such a way that maximizes alignment; i.e., maximizing the approximated intention function. We show that, paradoxically, it does this for whatever utility function is used as the hidden component and, in contrast with extant techniques, chances of alignment actually improve as agent intelligence grows.","PeriodicalId":8425,"journal":{"name":"ArXiv","volume":"26 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139962620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}