
arXiv - CS - Operating Systems: Latest Publications

Dynamic Voltage and Frequency Scaling for Intermittent Computing
Pub Date: 2024-01-15 (arXiv:2401.08710)
Andrea Maioli, Kevin A. Quinones, Saad Ahmed, Muhammad H. Alizai, Luca Mottola
We present hardware/software techniques to intelligently regulate the supply voltage and clock frequency of intermittently-computing devices. These devices rely on ambient energy harvesting to power their operation and small capacitors as energy buffers. Statically setting their clock frequency fails to capture the unique relations these devices expose between capacitor voltage, energy efficiency at a given operating frequency, and the corresponding operating range. Existing dynamic voltage and frequency scaling techniques are also largely inapplicable due to extreme energy scarcity and peculiar hardware features. We introduce two hardware/software co-designs that accommodate the distinct hardware features and function within a constrained energy envelope, offering varied trade-offs and functionalities. Our experimental evaluation combines tests on custom-manufactured hardware and detailed emulation experiments. The data gathered indicate that our approaches result in up to 3.75x reduced energy consumption and 12x swifter execution times compared to the considered baselines, all while utilizing smaller capacitors to accomplish identical workloads.
Citations: 0
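The abstract's core observation is that the highest safe clock frequency and the energy efficiency of an intermittent device both depend on the instantaneous capacitor voltage, so a static setting is always wrong some of the time. A minimal illustrative policy (not the paper's actual co-design; the frequency/voltage pairs below are hypothetical) is a table of operating points, scanned fastest-first:

```python
# Illustrative sketch: pick the highest clock frequency whose minimum
# operating voltage is still met by the current capacitor voltage.
# The operating points below are made up for illustration.
OPERATING_POINTS = [
    # (frequency_mhz, minimum_supply_voltage), sorted fastest-first
    (48, 3.0),
    (24, 2.4),
    (8, 1.8),
]

def select_frequency(capacitor_voltage):
    """Return the fastest frequency the current voltage can sustain,
    or None if the device must stay off (below every operating point)."""
    for freq_mhz, v_min in OPERATING_POINTS:
        if capacitor_voltage >= v_min:
            return freq_mhz
    return None
```

A real design would also weigh energy per instruction at each point, since the most efficient frequency is not necessarily the fastest sustainable one.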
When eBPF Meets Machine Learning: On-the-fly OS Kernel Compartmentalization
Pub Date: 2024-01-11 (arXiv:2401.05641)
Zicheng Wang, Tiejin Chen, Qinrun Dai, Yueqi Chen, Hua Wei, Qingkai Zeng
Compartmentalization effectively prevents initial corruption from turning into a successful attack. This paper presents O2C, a pioneering system designed to enforce OS kernel compartmentalization on the fly. It not only provides immediate remediation for sudden threats but also maintains consistent system availability throughout the enforcement process. O2C is empowered by the newest advancements of the eBPF ecosystem, which allow eBPF programs that perform enforcement actions to be instrumented into the kernel at runtime. O2C takes the lead in embedding a machine learning model into eBPF programs, addressing unique challenges in on-the-fly compartmentalization. Our comprehensive evaluation shows that O2C effectively confines damage within the compartment. Further, we validate that a decision tree is optimally suited for O2C owing to its advantages in processing tabular data, its explainable nature, and its compliance with the eBPF ecosystem. Last but not least, O2C is lightweight, showing negligible overhead and excellent scalability system-wide.
Citations: 0
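One reason a decision tree fits the eBPF setting, as the abstract argues, is that a trained tree unrolls into bounded, loop-free branches that the eBPF verifier can accept. A toy sketch (this is not O2C's model; the features and thresholds are invented) shows both the evaluation and how a tree could be compiled to branch-only C-like code:

```python
# Toy decision tree encoded as nested tuples:
# (feature_name, threshold, left_subtree, right_subtree); leaves are labels.
TREE = ("syscall_rate", 100.0,
        ("mem_writes", 50.0, "benign", "suspicious"),
        "suspicious")

def evaluate(tree, features):
    """Walk the tuple-encoded tree using a dict of feature values."""
    if isinstance(tree, str):
        return tree
    feature, threshold, left, right = tree
    branch = left if features[feature] <= threshold else right
    return evaluate(branch, features)

def to_c(tree, indent=0):
    """Emit loop-free C-style branches, the shape one might inline into
    an eBPF program (illustrative only; not real eBPF bytecode)."""
    pad = "    " * indent
    if isinstance(tree, str):
        return f"{pad}return {tree.upper()};\n"
    feature, threshold, left, right = tree
    return (f"{pad}if (ctx->{feature} <= {threshold}) {{\n"
            + to_c(left, indent + 1)
            + f"{pad}}} else {{\n"
            + to_c(right, indent + 1)
            + f"{pad}}}\n")
```

Because the tree depth bounds the branch count, the generated code has no loops at all, which sidesteps the verifier's loop restrictions entirely.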
RASP for LSASS: Preventing Mimikatz-Related Attacks
Pub Date: 2023-12-30 (arXiv:2401.00316)
Anna Revazova, Igor Korkin
The Windows authentication infrastructure relies on the Local Security Authority (LSA) system, with its integral component being lsass.exe. Regrettably, this framework is not impervious, presenting vulnerabilities that attract threat actors with malicious intent. By exploiting documented vulnerabilities sourced from the CVE database or leveraging sophisticated tools such as mimikatz, adversaries can successfully compromise user password-address information. In this comprehensive analysis, we delve into proactive measures aimed at fortifying the local authentication subsystem against potential threats. Moreover, we present empirical evidence derived from practical assessments of various defensive methodologies, including those articulated previously. This examination not only underscores the importance of proactive security measures but also assesses the practical efficacy of these strategies in real-world contexts.
Citations: 0
ALPC Is In Danger: ALPChecker Detects Spoofing and Blinding
Pub Date: 2023-12-30 (arXiv:2401.01376)
Anastasiia Kropova, Igor Korkin
The purpose of this study is to evaluate the possibility of implementing an attack on an ALPC connection in the Windows operating system through the kernel, without covertly closing the connection from programs and the operating system, and to propose a method of protection against this type of attack. Asynchronous Local Procedure Call (ALPC) technology is used in various Windows information protection systems, including antivirus (AV) systems and Endpoint Detection and Response (EDR) systems. To conceal malicious software, attackers need to disrupt the operation of AV and EDR tools, which in turn can be achieved by destructive impact on the components of the ALPC technology. Examples of such attacks already exist and are covered in this paper. To counteract such new threats, it is necessary to advance the improvement of information security systems, and this ALPC security research was conducted to that end. The most difficult case, a Windows kernel driver attack, was considered. Three attacks on the ALPC connection were carried out, based on changing the ALPC structures in kernel memory, which led to the creation of illegitimate connections in the system and the disruption of correct connections. The ALPChecker protection tool has been developed. The tool was successfully tested on the three demonstrated attacks.
Citations: 0
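The spoofing the abstract describes works by rewriting ALPC structures in kernel memory so that an illegitimate endpoint stands in for a real one. A toy model (these are plain Python objects, not the real kernel port structures) captures the invariant a checker like ALPChecker could verify: the two endpoints of a legitimate connection reference each other, and spoofing breaks that mutual link:

```python
# Toy model of an ALPC-style connection: each endpoint records its peer.
# A legitimate connection is mutually linked; a spoofed one is not.
class Port:
    def __init__(self, name):
        self.name = name
        self.peer = None  # set when a connection is established

def connect(client, server):
    """Establish a well-formed connection: both ends point at each other."""
    client.peer, server.peer = server, client

def is_spoofed(client, server):
    """Flag a connection whose endpoints are not mutually linked,
    e.g. after an attacker redirects one side's peer pointer."""
    return client.peer is not server or server.peer is not client
```

The real check would walk kernel objects rather than Python attributes, but the consistency test is the same shape: every reference in one structure must be confirmed by the structure it points to.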
Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching
Pub Date: 2023-12-23 (arXiv:2401.06362)
Pengmiao Zhang, Neelesh Gupta, Rajgopal Kannan, Viktor K. Prasanna
Attention-based Neural Networks (NN) have demonstrated their effectiveness in accurate memory access prediction, an essential step in data prefetching. However, the substantial computational overheads associated with these models result in high inference latency, limiting their feasibility as practical prefetchers. To close the gap, we propose a new approach based on tabularization that significantly reduces model complexity and inference latency without sacrificing prediction accuracy. Our novel tabularization methodology takes as input a distilled, yet highly accurate attention-based model for memory access prediction and efficiently converts its expensive matrix multiplications into a hierarchy of fast table lookups. As an exemplar of the above approach, we develop DART, a prefetcher comprised of a simple hierarchy of tables. With a modest 0.09 drop in F1-score, DART reduces 99.99% of arithmetic operations from the large attention-based model and 91.83% from the distilled model. DART accelerates the large model inference by 170x and the distilled model by 9.4x. DART has comparable latency and storage costs to the state-of-the-art rule-based prefetcher BO but surpasses it by 6.1% in IPC improvement, resulting in a 37.6% speed-up. DART outperforms the state-of-the-art NN-based prefetchers TransFetch by 33.1% and Voyager by 37.2% in terms of IPC improvement, primarily due to its low prefetching latency.
Citations: 0
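The central trick, turning matrix multiplications into table lookups, can be illustrated with a product-quantization-style sketch (an assumed simplification, not DART's actual scheme): quantize each input sub-vector to its nearest prototype offline, precompute each prototype's partial dot product with the weights, and replace the multiply-accumulate at inference time with an index into that table:

```python
# Hypothetical 2-D example: four prototypes and one weight vector.
PROTOTYPES = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
WEIGHTS = (0.5, -2.0)  # one weight per input dimension

# Offline step: prototype index -> precomputed partial dot product.
TABLE = [sum(p * w for p, w in zip(proto, WEIGHTS)) for proto in PROTOTYPES]

def nearest_prototype(subvec):
    """Index of the prototype closest to subvec (squared Euclidean)."""
    return min(range(len(PROTOTYPES)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(PROTOTYPES[i], subvec)))

def dot_via_lookup(subvec):
    """Approximate WEIGHTS . subvec with a single table lookup:
    no multiplications happen on the inference path."""
    return TABLE[nearest_prototype(subvec)]
```

The accuracy cost is the quantization error (here, (0.9, 0.1) maps to the (1, 0) prototype), which is the trade the abstract quantifies as a 0.09 F1 drop; in a full design the nearest-prototype search itself is also cheap, e.g. via hashing or a small comparison tree.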
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Pub Date: 2023-12-16 (arXiv:2312.12456)
Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
This paper introduces PowerInfer, a high-speed Large Language Model (LLM) inference engine for a personal computer (PC) equipped with a single consumer-grade GPU. The key insight underlying the design of PowerInfer is exploiting the high locality inherent in LLM inference, characterized by a power-law distribution in neuron activation. This distribution indicates that a small subset of neurons, termed hot neurons, are consistently activated across inputs, while the majority, cold neurons, vary based on specific inputs. PowerInfer exploits this insight to design a GPU-CPU hybrid inference engine: hot-activated neurons are preloaded onto the GPU for fast access, while cold-activated neurons are computed on the CPU, thus significantly reducing GPU memory demands and CPU-GPU data transfers. PowerInfer further integrates adaptive predictors and neuron-aware sparse operators, optimizing the efficiency of neuron activation and computational sparsity. Evaluation shows that PowerInfer attains an average token generation rate of 13.20 tokens/s, with a peak of 29.08 tokens/s, across various LLMs (including OPT-175B) on a single NVIDIA RTX 4090 GPU, only 18% lower than that achieved by a top-tier server-grade A100 GPU. This significantly outperforms llama.cpp by up to 11.69x while retaining model accuracy.
Citations: 0
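The hot/cold split the abstract describes boils down to profiling how often each neuron fires across a set of inputs and placing the frequently firing ones on the GPU. A minimal sketch of that partitioning step (the 80% threshold and counts are hypothetical, not PowerInfer's calibration):

```python
# Illustrative hot/cold partitioning from activation profiles:
# neurons active on a large fraction of profiled inputs are "hot"
# (candidates for GPU preloading); the rest stay on the CPU.
def partition_neurons(activation_counts, total_inputs, hot_threshold=0.8):
    """Split neuron ids by activation frequency.

    activation_counts: {neuron_id: number of inputs it fired on}
    Returns (hot_ids, cold_ids)."""
    hot, cold = [], []
    for neuron_id, count in activation_counts.items():
        target = hot if count / total_inputs >= hot_threshold else cold
        target.append(neuron_id)
    return hot, cold
```

The power-law claim is what makes this split pay off: a small hot set covers most activations, so a modest consumer GPU's memory suffices for the latency-critical part of the computation.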
On a Foundation Model for Operating Systems
Pub Date: 2023-12-13 (arXiv:2312.07813)
Divyanshu Saxena, Nihal Sharma, Donghyun Kim, Rohit Dwivedula, Jiayi Chen, Chenxi Yang, Sriram Ravula, Zichao Hu, Aditya Akella, Sebastian Angel, Joydeep Biswas, Swarat Chaudhuri, Isil Dillig, Alex Dimakis, P. Brighten Godfrey, Daehyeok Kim, Chris Rossbach, Gang Wang
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components, such as the CPU, memory, and network subsystems, are interrelated, and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and the creation of the next generation of operating systems for the evolving computing landscape.
Citations: 0
Security, extensibility, and redundancy in the Metabolic Operating System
Pub Date: 2023-12-11 (arXiv:2401.01357)
Samuel T. King
People living with Type 1 Diabetes (T1D) lose the ability to produce insulin naturally. To compensate, they inject synthetic insulin. One common way to inject insulin is through automated insulin delivery systems, which use sensors to monitor the user's metabolic state and an insulin pump device to adjust insulin to adapt. In this paper, we present the Metabolic Operating System, a new automated insulin delivery system that we designed from the ground up using security-first principles. From an architecture perspective, we apply separation principles to simplify the core system and isolate non-critical functionality from the core closed-loop algorithm. From an algorithmic perspective, we evaluate trends in insulin technology and formulate a simple, but effective, algorithm given the state of the art. From a safety perspective, we build in multiple layers of redundancy to ensure that the person using our system remains safe. Fundamentally, this is a paper on real-world experiences building and running an automated insulin delivery system. We report on the design iterations we made based on experiences working with one individual using our system. Our evaluation shows that an automated insulin delivery system built from the ground up using security-first principles can still help manage T1D effectively. Our source code is open source and available on GitHub (link omitted).
Citations: 0
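"Multiple layers of redundancy" in a dosing loop typically means every dose proposed by the control algorithm must pass several independent checks, and any single failure fails safe to zero. A hypothetical sketch of that pattern (the limits and check set below are invented for illustration and are not the paper's clinical parameters):

```python
# Hypothetical fail-safe gate around a dosing decision: the proposed
# dose is delivered only if every independent safety check passes.
MAX_DOSE_UNITS = 2.0       # assumed per-cycle cap, not from the paper
MIN_GLUCOSE_MG_DL = 80     # assumed hypoglycemia guard threshold

def safe_dose(proposed_dose, glucose_mg_dl, sensor_fresh):
    """Return the dose actually delivered after redundant checks."""
    checks = [
        sensor_fresh,                          # stale sensor data -> no dose
        glucose_mg_dl >= MIN_GLUCOSE_MG_DL,    # never dose when already low
        0 <= proposed_dose <= MAX_DOSE_UNITS,  # hard cap on any single dose
    ]
    return proposed_dose if all(checks) else 0.0
```

The design choice worth noting is that the checks are evaluated independently of the algorithm that produced the dose, so a bug in the closed-loop logic cannot bypass them.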
KEN: Kernel Extensions using Natural Language
Pub Date: 2023-12-09 (arXiv:2312.05531)
Yusheng Zheng, Yiwei Yang, Maolin Chen, Andrew Quinn
The ability to modify and extend an operating system is an important feature for improving a system's security, reliability, and performance. The extended Berkeley Packet Filter (eBPF) ecosystem has emerged as the standard mechanism for extending the Linux kernel and has recently been ported to Windows. eBPF programs inject new logic into the kernel that the system will execute before or after existing logic. While the eBPF ecosystem provides a flexible mechanism for kernel extension, it is difficult for developers to write eBPF programs today. An eBPF developer must have deep knowledge of the internals of the operating system to determine where to place logic, and must cope with the programming limitations on the control flow and data accesses of their eBPF program enforced by the eBPF verifier. This paper presents KEN, an alternative framework that alleviates the difficulty of writing an eBPF program by allowing kernel extensions to be written in natural language. KEN uses recent advances in large language models (LLMs) to synthesize an eBPF program given a user's English-language prompt. To ensure that the LLM's output is semantically equivalent to the user's prompt, KEN employs a combination of LLM-empowered program comprehension, symbolic execution, and a series of feedback loops. KEN's key novelty is the combination of these techniques. In particular, the system uses symbolic execution in a novel structure that allows it to combine the results of program synthesis and program comprehension, building on the recent success that LLMs have shown for each of these tasks individually. To evaluate KEN, we developed a new corpus of natural language prompts for eBPF programs. We show that KEN produces correct eBPF programs on 80% of them, an improvement of a factor of 2.67 compared to an LLM-empowered program synthesis baseline.
SYSFLOW: Efficient Execution Platform for IoT Devices
Pub Date : 2023-12-08 DOI: arxiv-2312.04871
Jun Lu, Zhenya Ma, Yinggang Gao, Ju Ren, Yaoxue Zhang
Traditional executable delivery models pose challenges for IoT devices with limited storage, necessitating the download of complete executables and dependencies. Network solutions like NFS, designed for data files, incur high IO overhead for irregular access patterns. This paper introduces SYSFLOW, a lightweight network-based executable delivery system for IoT. SYSFLOW delivers executables on demand, redirecting local disk IO to the server through optimized network IO. To improve cache hit rates, SYSFLOW employs server-side action-based prefetching, reducing latency by 45.1% to 75.8% compared to native Linux filesystems on SD cards. In wired environments, SYSFLOW's latency is up to 67.7% lower than NFS. In wireless scenarios, SYSFLOW performs 22.9% worse than native Linux — remaining comparable — while outperforming NFS by up to 60.7%. While SYSFLOW's power consumption may be 6.7% higher than NFS, it offers overall energy savings due to shorter processing time.
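The server-side action-based prefetching the abstract mentions can be illustrated with a toy model: record which block historically follows which, and pull the likely successor into the cache alongside each demand fetch. This is an invented sketch, not SYSFLOW's code — a real implementation would run on the server, handle eviction, and key predictions on richer "actions" than single block IDs.

```python
from collections import defaultdict, Counter

class ActionPrefetcher:
    """Toy action-based prefetcher: learn block-to-block transitions and
    prefetch the most common successor of each demand-fetched block."""

    def __init__(self, fetch_from_server):
        self.fetch = fetch_from_server        # callable: block_id -> bytes
        self.cache = {}
        self.successors = defaultdict(Counter)  # block -> Counter of next blocks
        self.last = None

    def read(self, block_id):
        # Learn the transition from the previously accessed block.
        if self.last is not None:
            self.successors[self.last][block_id] += 1
        self.last = block_id
        if block_id not in self.cache:
            self.cache[block_id] = self.fetch(block_id)   # demand miss
        # Prefetch the historically most common successor, if known.
        if self.successors[block_id]:
            nxt = self.successors[block_id].most_common(1)[0][0]
            if nxt not in self.cache:
                self.cache[nxt] = self.fetch(nxt)
        return self.cache[block_id]
```

After one pass over an access sequence, a repeated pass demand-fetches a block and finds its usual successor already resident, which is how such schemes convert irregular access patterns into fewer round trips.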