Aditya Mitra, Anisha Ghosh, Sibi Chakkaravarthy Sethuraman, Devi Priya V S
Recent increases in endpoint-based security events and threats have compelled enterprise operations to switch to virtual desktop infrastructure and web-based applications. In addition to reducing potential hazards, this has guaranteed a consistent desktop environment for every user. On the other hand, the attack surface is greatly increased because all endpoints are connected to the company network, which could harbor malware and other advanced persistent threats. This results in a considerable loss of system resources on each individual endpoint. Hence, our work proposes a standard called Colaboot that enables machines throughout a company to boot from a single operating system in order to address these problems and guarantee a consistent operating system environment that could be easily updated to the most recent security patches across all workstations.
{"title":"Colaboot: A Cloud-based Diskless PC Booting Mechanism","authors":"Aditya Mitra, Anisha Ghosh, Sibi Chakkaravarthy Sethuraman, Devi Priya V S","doi":"arxiv-2408.17045","DOIUrl":"https://doi.org/arxiv-2408.17045","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-30"}
Xiang Li, Lizhou Fan, Hanbo Wu, Kunping Chen, Xiaoxiao Yu, Chao Che, Zhifeng Cai, Xiuhong Niu, Aihua Cao, Xin Ma
Autism Spectrum Disorder (ASD) is a rapidly growing neurodevelopmental disorder. Performing a timely intervention is crucial for the growth of young children with ASD, but traditional clinical screening methods lack objectivity. This study introduces an innovative approach to early detection of ASD. The contributions are threefold. First, this work proposes a novel Parent-Child Dyads Block-Play (PCB) protocol, grounded in kinesiological and neuroscientific research, to identify behavioral patterns distinguishing ASD from typically developing (TD) toddlers. Second, we have compiled a substantial video dataset, featuring 40 ASD and 89 TD toddlers engaged in block play with parents. This dataset exceeds previous efforts on both the scale of participants and the length of individual sessions. Third, our approach to action analysis in videos employs a hybrid deep learning framework, integrating a two-stream graph convolution network with attention-enhanced xLSTM (2sGCN-AxLSTM). This framework is adept at capturing dynamic interactions between toddlers and parents by extracting spatial features correlated with upper body and head movements and focusing on global contextual information of action sequences over time. By learning these global features with spatio-temporal correlations, our 2sGCN-AxLSTM effectively analyzes dynamic human behavior patterns and demonstrates an unprecedented accuracy of 89.6% in early detection of ASD. Our approach shows strong potential for enhancing early ASD diagnosis by accurately analyzing parent-child interactions, providing a critical tool to support timely and informed clinical decision-making.
{"title":"Enhancing Autism Spectrum Disorder Early Detection with the Parent-Child Dyads Block-Play Protocol and an Attention-enhanced GCN-xLSTM Hybrid Deep Learning Framework","authors":"Xiang Li, Lizhou Fan, Hanbo Wu, Kunping Chen, Xiaoxiao Yu, Chao Che, Zhifeng Cai, Xiuhong Niu, Aihua Cao, Xin Ma","doi":"arxiv-2408.16924","DOIUrl":"https://doi.org/arxiv-2408.16924","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-29"}
Daniil Filienko, Yinzhou Wang, Caroline El Jazmi, Serena Xie, Trevor Cohen, Martine De Cock, Weichao Yuwen
While Large Language Models (LLMs) are being quickly adapted to many domains, including healthcare, their strengths and pitfalls remain under-explored. In our study, we examine the effects of prompt engineering to guide Large Language Models (LLMs) in delivering parts of a Problem-Solving Therapy (PST) session via text, particularly during the symptom identification and assessment phase for personalized goal setting. We present evaluation results of the models' performances by automatic metrics and experienced medical professionals. We demonstrate that the models' capability to deliver protocolized therapy can be improved with the proper use of prompt engineering methods, albeit with limitations. To our knowledge, this study is among the first to assess the effects of various prompting techniques in enhancing a generalist model's ability to deliver psychotherapy, focusing on overall quality, consistency, and empathy. Exploring LLMs' potential in delivering psychotherapy holds promise with the current shortage of mental health professionals amid significant needs, enhancing the potential utility of AI-based and AI-enhanced care services.
{"title":"Toward Large Language Models as a Therapeutic Tool: Comparing Prompting Techniques to Improve GPT-Delivered Problem-Solving Therapy","authors":"Daniil Filienko, Yinzhou Wang, Caroline El Jazmi, Serena Xie, Trevor Cohen, Martine De Cock, Weichao Yuwen","doi":"arxiv-2409.00112","DOIUrl":"https://doi.org/arxiv-2409.00112","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-27"}
Michela Taufer, Valerio Pascucci, Christine R. Kirkpatric, Ian T. Foster
The urgent need for data democratization in scientific research was the focal point of a panel discussion at SC23 in Denver, Colorado, from November 12 to 17, 2023. This article summarizes the outcomes of that discussion and subsequent conversations. We advocate for strategic investments in financial, human, and technological resources for sustainable data democratization. Emphasizing that data is central to scientific discovery and AI deployment, we highlight barriers such as limited access, inadequate financial incentives for cross-domain collaboration, and a shortage of workforce development initiatives. Our recommendations aim to guide decision-makers in fostering an inclusive research community, breaking down research silos, and developing a skilled workforce to advance scientific discovery.
{"title":"Sustainable Data Democratization: A Multifaceted Investment for an Equitable Future","authors":"Michela Taufer, Valerio Pascucci, Christine R. Kirkpatric, Ian T. Foster","doi":"arxiv-2408.14627","DOIUrl":"https://doi.org/arxiv-2408.14627","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-26"}
Javier Conde, Andres Munoz-Arcentales, Johnny Choque, Gabriel Huecas, Álvaro Alonso
We study the benefits and challenges of using Linked Open Data in smart city applications and propose a set of open source, highly scalable tools within the case of a public-rental bicycle system, which can act as a reference guide for other smart city applications.
{"title":"Overcoming the Barriers of Using Linked Open Data in Smart City Applications","authors":"Javier Conde, Andres Munoz-Arcentales, Johnny Choque, Gabriel Huecas, Álvaro Alonso","doi":"arxiv-2408.14315","DOIUrl":"https://doi.org/arxiv-2408.14315","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-26"}
The proliferation of Internet of Things (IoT) devices has raised significant concerns regarding their security vulnerabilities. This paper explores the security risks associated with smart light systems, focusing on covert communication channels. Drawing upon previous research highlighting vulnerabilities in communication protocols and encryption flaws, the study investigates the potential for exploiting smart light systems for covert data transmission. Specifically, the paper replicates and analyzes an attack method introduced by Ronen and Shamir, which utilizes the Philips Hue White lighting system to create a covert channel through visible light communication (VLC). Experimental results demonstrate the feasibility of transmitting data covertly through subtle variations in brightness levels, leveraging the inherent functionality of smart light bulbs. Despite limitations imposed by device constraints and communication protocols, the study underscores the need for heightened awareness and security measures in IoT environments. Ultimately, the findings emphasize the importance of implementing robust security practices and exercising caution when deploying networked IoT devices in sensitive environments.
{"title":"Security Concerns in IoT Light Bulbs: Investigating Covert Channels","authors":"Ravisha Rohilla, Janvi Panwar","doi":"arxiv-2408.14613","DOIUrl":"https://doi.org/arxiv-2408.14613","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-26"}
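The brightness-based covert channel described in the smart-light abstract above can be illustrated with a minimal Python sketch. It only models the encoding idea (one bit per brightness step) and does not talk to any real bulb or bridge. The constants `BASE_LEVEL` and `DELTA`, the 0-254 brightness scale, and all function names are assumptions made for illustration; they are not taken from the paper or from the Philips Hue API.

```python
from typing import List

# Hypothetical values for illustration only (not from the paper).
BASE_LEVEL = 200  # assumed nominal brightness on an assumed 0-254 scale
DELTA = 1         # assumed offset small enough to be hard to notice


def bytes_to_bits(data: bytes) -> List[int]:
    """Expand each byte into 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits: List[int]) -> bytes:
    """Pack a bit list (length must be a multiple of 8) back into bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def encode(data: bytes) -> List[int]:
    """Map each bit to a brightness level: BASE_LEVEL for 0, BASE_LEVEL + DELTA for 1."""
    return [BASE_LEVEL + DELTA * bit for bit in bytes_to_bits(data)]


def decode(levels: List[int]) -> bytes:
    """Recover the payload by thresholding the observed brightness levels."""
    return bits_to_bytes([1 if level > BASE_LEVEL else 0 for level in levels])


if __name__ == "__main__":
    levels = encode(b"hi")
    print(levels)
    print(decode(levels))
```

In a real setting the receiver would observe noisy light measurements rather than exact levels, so the threshold in `decode` stands in for whatever detection method an actual receiver would use.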
The evaluation of modeling languages for augmented reality applications poses particular challenges due to the three-dimensional environment they target. The previously introduced Augmented Reality Workflow Modeling Language (ARWFML) enables the model-based creation of augmented reality scenarios without programming knowledge. Building upon the first design cycle of the language's specification, this paper presents two further design iterations for refining the language based on multi-faceted evaluations. These include a comparative evaluation of implementation options and workflow capabilities, the introduction of a 3D notation, and the development of a new 3D modeling environment. On this basis, a comprehensibility study of the language was conducted. Thereby, we show how modeling languages for augmented reality can be evolved towards a maturity level suitable for empirical evaluations.
{"title":"Multi-Faceted Evaluation of Modeling Languages for Augmented Reality Applications -- The Case of ARWFML","authors":"Fabian Muff, Hans-Georg Fill","doi":"arxiv-2408.14137","DOIUrl":"https://doi.org/arxiv-2408.14137","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-26"}
Haozhao Zhang, Zhe Zhang, Zhiqiang Zheng, Varghese Jacob
This paper proposes a new paradigm: generative blockchain, which aims to transform conventional blockchain technology by combining transaction generation and recording, rather than focusing solely on transaction recording. Central to our design is a novel consensus mechanism, Proof-of-Merit (PoM), specifically crafted for environments where businesses must solve complex problems before transactions can be recorded. PoM integrates the generation and recording of transactions within a unified blockchain system, fundamentally differing from prevailing consensus mechanisms that primarily record existing transactions. We demonstrate PoM on a ride service on-demand platform, where the task of solving complex transaction-generating problems is delegated to a pool of independent problem solvers. These solvers generate transactions, and their solutions are selected based on merit. The winning solvers then register these transactions onto the blockchain and are rewarded accordingly. We introduce a Decentralized Control Parameter (DCP) to balance two key performance metrics: efficiency and equity. The applicability of our generative blockchain is illustrated through a ridesharing context, where matchers (solvers) are tasked with matching riders to drivers. We demonstrate PoM's performance and nuanced properties using agent-based simulation, exploring how to find the optimal DCP value to achieve a desirable balance of efficiency and equity in a generative blockchain.
{"title":"Generative Blockchain: Transforming Blockchain from Transaction Recording to Transaction Generation through Proof-of-Merit","authors":"Haozhao Zhang, Zhe Zhang, Zhiqiang Zheng, Varghese Jacob","doi":"arxiv-2408.13367","DOIUrl":"https://doi.org/arxiv-2408.13367","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-23"}
Ji Liu, Alvin Gonzales, Benchen Huang, Zain Hamid Saleem, Paul Hovland
Quantum computing carries significant potential for addressing practical problems. However, currently available quantum devices suffer from noisy quantum gates, which degrade the fidelity of executed quantum circuits. Therefore, quantum circuit optimization is crucial for obtaining useful results. In this paper, we present QuCLEAR, a compilation framework designed to optimize quantum circuits. QuCLEAR significantly reduces both the two-qubit gate count and the circuit depth through two novel optimization steps. First, we introduce the concept of Clifford Extraction, which extracts Clifford subcircuits to the end of the circuit while optimizing the gates. Second, since Clifford circuits are classically simulatable, we propose Clifford Absorption, which efficiently processes the extracted Clifford subcircuits classically. We demonstrate our framework on quantum simulation circuits, which have wide-ranging applications in quantum chemistry simulation, many-body physics, and combinatorial optimization problems. Near-term algorithms such as VQE and QAOA also fall within this category. Experimental results across various benchmarks show that QuCLEAR achieves up to a 77.7% reduction in CNOT gate count and up to an 84.1% reduction in entangling depth compared to state-of-the-art methods.
{"title":"QuCLEAR: Clifford Extraction and Absorption for Significant Reduction in Quantum Circuit Size","authors":"Ji Liu, Alvin Gonzales, Benchen Huang, Zain Hamid Saleem, Paul Hovland","doi":"arxiv-2408.13316","DOIUrl":"https://doi.org/arxiv-2408.13316","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-23"}
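The QuCLEAR abstract above relies on moving Clifford gates past the rest of a circuit. The sketch below is not QuCLEAR's algorithm; it only illustrates the standard fact that makes such moves possible: conjugating a Pauli string by a Clifford gate (here a CNOT) yields another Pauli string, so a Pauli rotation can be rewritten as a rotation about the conjugated Pauli with the Clifford pushed to one side. Pauli strings are stored as binary (x, z) vectors, the overall sign is deliberately ignored, and all function names are hypothetical.

```python
from typing import List, Tuple

# A Pauli string as (x bits, z bits): I=(0,0), X=(1,0), Y=(1,1), Z=(0,1).
# The overall sign/phase is ignored in this sketch.
Pauli = Tuple[List[int], List[int]]

_TO_BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
_TO_NAME = {bits: name for name, bits in _TO_BITS.items()}


def from_label(label: str) -> Pauli:
    """Parse a label such as 'ZZ' (qubit 0 first) into binary form."""
    return [_TO_BITS[ch][0] for ch in label], [_TO_BITS[ch][1] for ch in label]


def to_label(pauli: Pauli) -> str:
    """Convert the binary form back into a label."""
    return "".join(_TO_NAME[(x, z)] for x, z in zip(*pauli))


def conjugate_by_cnot(pauli: Pauli, control: int, target: int) -> Pauli:
    """Return CNOT * P * CNOT up to sign.

    Uses the standard update rules: X on the control spreads to the target,
    and Z on the target spreads to the control.
    """
    x, z = list(pauli[0]), list(pauli[1])
    x[target] ^= x[control]
    z[control] ^= z[target]
    return x, z


def weight(pauli: Pauli) -> int:
    """Number of qubits on which the Pauli string acts non-trivially."""
    return sum(1 for x, z in zip(*pauli) if x or z)


if __name__ == "__main__":
    zz = from_label("ZZ")
    reduced = conjugate_by_cnot(zz, control=0, target=1)
    print(to_label(zz), "->", to_label(reduced))
    print(weight(zz), "->", weight(reduced))
```

Here the two-qubit Pauli `ZZ` becomes the single-qubit Pauli `IZ` under CNOT conjugation, which is the kind of weight reduction that lets a two-qubit rotation be traded for a single-qubit rotation plus Clifford gates.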
The recently introduced Segment Anything Model (SAM), a Visual Foundation Model (VFM), has demonstrated impressive capabilities in zero-shot segmentation tasks across diverse natural image datasets. Despite its success, SAM encounters noticeable performance degradation when applied to specific domains, such as medical images. Current efforts to address this issue have involved fine-tuning strategies, intended to bolster the generalizability of the vanilla SAM. However, these approaches still predominantly require domain-specific, expert-level prompts during the evaluation phase, which severely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting based fine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM model. Specifically, SAM-SP leverages the output from the previous iteration of the model itself as prompts to guide subsequent iterations of the model. This self-prompting module endeavors to learn how to generate useful prompts autonomously and alleviates the dependence on expert prompts during the evaluation phase, significantly broadening SAM's applicability. Additionally, we integrate a self-distillation module to enhance the self-prompting process further. Extensive experiments across various domain-specific datasets validate the effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the reliance on expert prompts but also exhibits superior segmentation performance compared with the state-of-the-art task-specific segmentation approaches, the vanilla SAM, and SAM-based approaches.
{"title":"SAM-SP: Self-Prompting Makes SAM Great Again","authors":"Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang","doi":"arxiv-2408.12364","DOIUrl":"https://doi.org/arxiv-2408.12364","journal":{"name":"arXiv - CS - Emerging Technologies"},"publicationDate":"2024-08-22"}