
arXiv - CS - Programming Languages: Latest Publications

Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML
Pub Date : 2024-07-19 DOI: arxiv-2407.14461
Manasvi Goyal, Andrea Zonca, Amy Roberts, Jim Pivarski, Ianna Osborne
In some fields, scientific data formats differ across experiments due to specialized hardware and data acquisition systems. Researchers need to develop, document, and maintain experiment-specific analysis software to interact with these data formats. This software is often tightly coupled with a particular data format. The proliferation of custom data formats has been a prominent challenge for small to mid-scale experiments. The widespread adoption of ROOT has largely mitigated this problem for the Large Hadron Collider experiments. However, many smaller experiments continue to use custom data formats to meet specific research needs. Therefore, simplifying the process of accessing a unique data format for analysis holds immense value for scientific communities within HEP. We have added Awkward Arrays as a target language for Kaitai Struct for this purpose. Researchers can describe their custom data format in the Kaitai Struct YAML (KSY) language. The Kaitai Struct Compiler generates C++ code to fill the LayoutBuilder buffers using the KSY format. In a few steps, the Kaitai Struct Awkward Runtime API can convert the generated C++ code into a compiled Python module. Finally, the raw data can be passed to the module to produce Awkward Arrays. This paper introduces the Awkward Target for the Kaitai Struct Compiler and the Kaitai Struct Awkward Runtime API. It also demonstrates the conversion of a given KSY for a specific custom file format to Awkward Arrays.
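To make the end-to-end flow concrete, here is a minimal, hand-written Python sketch of what the generated module ultimately delivers: parsing a binary stream into an Awkward Array. The record layout (a uint32 hit count followed by that many float32 energies) is invented for illustration; a real KSY description compiled with the Awkward target would generate the equivalent C++-backed parser automatically.

```python
# Illustrative only: a hand-written parser for a made-up binary layout,
# producing the ragged Awkward Array that KSY-generated code would build.
import struct
import awkward as ak

def parse_events(raw: bytes):
    events, offset = [], 0
    while offset < len(raw):
        (n_hits,) = struct.unpack_from("<I", raw, offset)            # uint32 hit count
        offset += 4
        energies = struct.unpack_from(f"<{n_hits}f", raw, offset)    # n_hits float32 values
        offset += 4 * n_hits
        events.append({"n_hits": n_hits, "energies": list(energies)})
    return ak.Array(events)  # ragged array of records

# Two events: one with 2 hits, one with 1 hit.
raw = struct.pack("<I2f", 2, 1.5, 2.5) + struct.pack("<I1f", 1, 3.0)
print(ak.to_list(parse_events(raw)))
```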
Citations: 0
Approximate Relational Reasoning for Higher-Order Probabilistic Programs
Pub Date : 2024-07-19 DOI: arxiv-2407.14107
Philipp G. Haselwarter, Kwing Hei Li, Alejandro Aguirre, Simon Oddershede Gregersen, Joseph Tassarotti, Lars Birkedal
Properties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general state. In this paper we develop Approxis, a higher-order approximate relational separation logic for reasoning about approximate equivalence of programs written in an expressive ML-like language with discrete probabilistic sampling, higher-order functions, and higher-order state. The Approxis logic recasts the concept of error credits in the relational setting to reason about relational approximation, which allows for expressive notions of modularity and composition, a range of new approximate relational rules, and an internalization of a standard limiting argument for showing exact probabilistic equivalences by approximation. We also use Approxis to develop a logical relation model that quantifies over error credits, which can be used to prove exact contextual equivalence. We demonstrate the flexibility of our approach on a range of examples, including the PRP/PRF switching lemma, IND$-CPA security of an encryption scheme, and a collection of rejection samplers. All of the results have been mechanized in the Coq proof assistant and the Iris separation logic framework.
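As a rough intuition for the quantitative reasoning involved (this is a plain Python sketch, not the paper's Coq/Iris development), consider a bounded rejection sampler: each extra round shrinks the probability of hitting the fallback branch, and it is this kind of residual probability that error credits account for when relating the sampler to an ideal uniform one.

```python
# Illustrative sketch: a bounded rejection sampler for a uniform value in {0..4}
# built from three coin flips per round. The fallback branch is reached with
# probability (3/8)**rounds, which quantifies how far it is from ideal sampling.
import random

def uniform5_bounded(rounds: int) -> int:
    for _ in range(rounds):
        x = random.getrandbits(3)   # uniform in {0..7}
        if x < 5:
            return x                # accepted: exactly uniform on {0..4}
    return 0                        # fallback, probability (3/8)**rounds

print([uniform5_bounded(rounds=10) for _ in range(5)])
```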
Citations: 0
A Comprehensive Guide to Combining R and Python code for Data Science, Machine Learning and Reinforcement Learning
Pub Date : 2024-07-19 DOI: arxiv-2407.14695
Alejandro L. García Navarro, Nataliia Koneva, Alfonso Sánchez-Macián, José Alberto Hernández
Python has gained widespread popularity in the fields of machine learning, artificial intelligence, and data engineering due to its effectiveness and extensive libraries. R, on its side, remains a dominant language for statistical analysis and visualization. However, certain libraries have become outdated, limiting their functionality and performance. Users can use Python's advanced machine learning and AI capabilities alongside R's robust statistical packages by combining these two programming languages. This paper explores using R's reticulate package to call Python from R, providing practical examples and highlighting scenarios where this integration enhances productivity and analytical capabilities. With a few hello-world code snippets, we demonstrate how to run Python's scikit-learn, PyTorch and OpenAI Gym libraries for building Machine Learning, Deep Learning, and Reinforcement Learning projects easily.
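For orientation, the sketch below shows only the Python side of such a hello-world workload (a scikit-learn classifier on the iris dataset); the R reticulate glue that the paper uses to drive code like this from R is not reproduced here.

```python
# Python workload of the kind reticulate would invoke from R:
# fit and score a scikit-learn classifier.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```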
Citations: 0
Reactive graphs in action (extended version)
Pub Date : 2024-07-19 DOI: arxiv-2407.14705
David Tinoco, Alexandre Madeira, Manuel A. Martins, José Proença
Reactive graphs are transition structures in which edges become active and inactive during their evolution; they were introduced by Dov Gabbay from a mathematical perspective. This paper presents Marge (https://fm-dcc.github.io/MARGe), a web-based tool to visualise and analyse reactive graphs enriched with labels. Marge animates the operational semantics of reactive graphs and offers different graphical views to provide insights over concrete systems. We motivate the applicability of reactive graphs for adaptive systems and for featured transition systems, using Marge to tighten the gap between the existing theoretical models and their usage to analyse concrete systems.
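A minimal sketch of the underlying idea, under a deliberately simplified model (this is not Marge's exact operational semantics): edges carry labels, and taking an active edge can switch other edges on or off.

```python
# Toy labelled reactive graph: traversing an active edge can activate or
# deactivate other edges, changing which transitions are available next.
from dataclasses import dataclass, field

@dataclass
class Edge:
    src: str
    label: str
    dst: str
    active: bool = True
    turns_on: list = field(default_factory=list)   # indices of edges activated when taken
    turns_off: list = field(default_factory=list)  # indices of edges deactivated when taken

edges = [
    Edge("s0", "a", "s1", turns_off=[0], turns_on=[1]),  # taking edge 0 disables itself, enables edge 1
    Edge("s1", "b", "s0", active=False),
]

def step(state: str, label: str) -> str:
    for e in edges:
        if e.active and e.src == state and e.label == label:
            for i in e.turns_on:
                edges[i].active = True
            for i in e.turns_off:
                edges[i].active = False
            return e.dst
    raise ValueError(f"no active {label}-edge from {state}")

s = step("s0", "a")   # moves to s1; edge 0 is now inactive, edge 1 active
s = step(s, "b")      # allowed only because edge 1 was activated above
print(s)
```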
Citations: 0
Fusing Gathers with Integer Linear Programming
Pub Date : 2024-07-18 DOI: arxiv-2407.13585
David van Balen, Gabriele Keller, Ivo Gabede Wolff, Trevor L. McDonell
We present an Integer Linear Programming based approach to finding the optimal fusion strategy for combinator-based parallel programs. While combinator-based languages or libraries provide a convenient interface for programming parallel hardware, fusing combinators into more complex operations is essential to achieve the desired performance. Our approach is suitable not only for languages with the usual map, fold, scan, indexing and scatter operations, but also for gather operations, which access arrays in arbitrary order, and therefore goes beyond traditional producer-consumer fusion. It can be parametrised with appropriate cost functions, and is fast enough to be suitable for just-in-time compilation.
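To illustrate what is being fused (the ILP formulation itself is not shown), the sketch below contrasts an unfused gather-then-map pipeline, which materialises an intermediate array, with a fused single pass over the output; deciding which such pairs to fuse, under a cost function, is the choice the ILP makes.

```python
# Producer-consumer fusion of a gather followed by a map, illustrated directly.
import numpy as np

def unfused(xs: np.ndarray, idx: np.ndarray) -> np.ndarray:
    gathered = xs[idx]          # intermediate array is materialised
    return gathered * 2 + 1     # second pass over memory

def fused(xs: np.ndarray, idx: np.ndarray) -> np.ndarray:
    out = np.empty(len(idx), dtype=xs.dtype)
    for i, j in enumerate(idx): # one pass: gather and map per output element
        out[i] = xs[j] * 2 + 1
    return out

xs = np.arange(10.0)
idx = np.array([7, 0, 3, 3])    # gathers may read the source in arbitrary order
assert np.array_equal(unfused(xs, idx), fused(xs, idx))
print(fused(xs, idx))
```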
Citations: 0
Compressing Structured Tensor Algebra
Pub Date : 2024-07-18 DOI: arxiv-2407.13726
Mahdi Ghorbani, Emilien Bauer, Tobias Grosser, Amir Shaikhha
Tensor algebra is a crucial component of data-intensive workloads such as machine learning and scientific computing. As the complexity of data grows, scientists often encounter a dilemma between the highly specialized dense tensor algebra and the efficient structure-aware algorithms provided by sparse tensor algebra. In this paper, we introduce DASTAC, a framework that propagates a tensor's captured high-level structure down to low-level code generation by incorporating techniques such as automatic data layout compression, polyhedral analysis, and affine code generation. Our methodology reduces memory footprint by automatically detecting the best data layout, benefits heavily from polyhedral optimizations, leverages further optimizations, and enables parallelization through MLIR. Through extensive experimentation, we show that DASTAC achieves a 1 to 2 orders of magnitude speedup over TACO, a state-of-the-art sparse tensor compiler, and StructTensor, a state-of-the-art structured tensor algebra compiler, with a significantly lower memory footprint.
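As a toy illustration of structure-aware layout compression (not DASTAC itself), a matrix known to be diagonal can be stored as its diagonal alone, and a matrix-vector product specialised to that compressed layout does linear rather than quadratic work.

```python
# Exploiting known structure: the same matvec on a dense layout and on a
# compressed (diagonal-only) layout.
import numpy as np

def matvec_dense(A: np.ndarray, x: np.ndarray) -> np.ndarray:
    return A @ x                      # general layout, O(n^2) work

def matvec_diagonal(diag: np.ndarray, x: np.ndarray) -> np.ndarray:
    return diag * x                   # compressed layout, O(n) work

diag = np.array([1.0, 2.0, 3.0, 4.0])
A = np.diag(diag)                     # dense materialisation of the same data
x = np.ones(4)

assert np.allclose(matvec_dense(A, x), matvec_diagonal(diag, x))
print(matvec_diagonal(diag, x))
```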
Citations: 0
Unsafe Impedance: Safe Languages and Safe by Design Software
Pub Date : 2024-07-17 DOI: arxiv-2407.13046
Lee Barney, Adolfo Neto
In December 2023, security agencies from five countries in North America, Europe, and the South Pacific produced a document encouraging senior executives in all software-producing organizations to take responsibility for and oversight of the security of the software their organizations produce. In February 2024, the White House released a cybersecurity outline, highlighting the December document. In this work we review the safe languages listed in these documents, and compare the safety of those languages with Erlang and Elixir, two BEAM languages. These security agencies' declaration of some languages as safe is necessary but insufficient to make wise decisions regarding what language to use when creating code. We propose an additional way of looking at languages and the ease with which unsafe code can be written and used. We call this new perspective unsafe impedance. We then go on to use unsafe impedance to examine nine languages that are considered to be safe. Finally, we suggest that business processes include what we refer to as an Unsafe Acceptance Process. This Unsafe Acceptance Process can be used as part of the memory-safe roadmaps suggested by these agencies. Unsafe Acceptance Processes can aid organizations in their production of safe-by-design software.
Citations: 0
PyTond: Efficient Python Data Science on the Shoulders of Databases
Pub Date : 2024-07-16 DOI: arxiv-2407.11616
Hesam Shahrokhi, Amirali Kaboli, Mahdi Ghorbani, Amir Shaikhha
Python data science libraries such as Pandas and NumPy have recently gained immense popularity. Although these libraries are feature-rich and easy to use, their scalability limitations require more robust computational resources. In this paper, we present PyTond, an efficient approach to push the processing of data science workloads down into the database engines that are already known for their big data handling capabilities. Compared to the previous work, by introducing TondIR, our approach can capture a more comprehensive set of workloads and data layouts. Moreover, by doing IR-level optimizations, we generate better SQL code that improves the query processing by the underlying database engine. Our evaluation results show promising performance improvement compared to Python and other alternatives for diverse data science workloads.
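The sketch below conveys the general idea with a hypothetical workload, not PyTond's actual TondIR or generated SQL: the same filter-groupby-aggregate is written once with Pandas in-process and once as SQL that a database engine executes.

```python
# The same aggregation computed in-process with Pandas and pushed down to a
# database engine as SQL (SQLite here, purely for illustration).
import sqlite3
import pandas as pd

df = pd.DataFrame({"dept": ["a", "a", "b", "b"], "salary": [10, 20, 30, 40]})

# In-process Pandas version.
pandas_result = df[df.salary > 15].groupby("dept", as_index=False)["salary"].mean()

# Pushed-down SQL version of the same workload.
conn = sqlite3.connect(":memory:")
df.to_sql("emp", conn, index=False)
sql_result = pd.read_sql_query(
    "SELECT dept, AVG(salary) AS salary FROM emp WHERE salary > 15 GROUP BY dept",
    conn,
)

print(pandas_result)
print(sql_result)
```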
Citations: 0
Haskelite: A Tracing Interpreter Based on a Pattern-Matching Calculus
Pub Date : 2024-07-16 DOI: arxiv-2407.11831
Pedro Vasconcelos, Rodrigo Marques
Many Haskell textbooks explain the evaluation of pure functional programs as a process of stepwise rewriting using equations. However, usual implementation techniques perform program transformations that make producing the corresponding tracing evaluations difficult. This paper presents a tracing interpreter for a subset of Haskell based on the pattern matching calculus of Kahl. We start from a big-step semantics in the style of Launchbury and develop a small-step semantics in the style of Sestoft's machines. This machine is used in the implementation of a step-by-step educational interpreter. We also discuss some implementation decisions and present illustrative examples.
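As a rough illustration of the kind of trace such an interpreter produces (a plain Python sketch, not Haskelite's pattern-matching calculus), the snippet below prints the equation-by-equation rewriting of length [1, 2] in the textbook style.

```python
# Print a stepwise, equation-style rewriting trace for computing the length
# of a list, mimicking how textbooks present evaluation of pure programs.
def trace_length(xs):
    steps, expr, count = [], list(xs), 0
    while True:
        prefix = ["1"] * count
        steps.append(" + ".join(prefix + [f"length {expr}"]))
        if not expr:
            steps.append(" + ".join(prefix + ["0"]))
            break
        expr = expr[1:]              # length (x:xs) = 1 + length xs
        count += 1
    steps.append(str(count))         # fold the additions into the result
    return steps

for line in trace_length([1, 2]):
    print("=", line)
# = length [1, 2]
# = 1 + length [2]
# = 1 + 1 + length []
# = 1 + 1 + 0
# = 2
```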
Citations: 0
Modal Effect Types
Pub Date : 2024-07-16 DOI: arxiv-2407.11816
Wenhao Tang, Leo White, Stephen Dolan, Daniel Hillerström, Sam Lindley, Anton Lorenzen
We propose a novel type system for effects and handlers using modal types. Conventional effect systems attach effects to function types, which can lead to verbose effect-polymorphic types, especially for higher-order functions. Our modal effect system provides succinct types for higher-order first-class functions without losing modularity and reusability. The core idea is to decouple effects from function types and instead to track effects through relative and absolute modalities, which represent transformations on the ambient effects provided by the context. We formalise the idea of modal effect types in a multimodal System F-style core calculus Met with effects and handlers. Met supports modular effectful programming via modalities without relying on effect variables. We encode a practical fragment of a conventional row-based effect system with effect polymorphism, which captures most common use-cases, into Met in order to formally demonstrate the expressive power of modal effect types. To recover the full power of conventional effect systems beyond this fragment, we seamlessly extend Met to Mete with effect variables. We propose a surface language Metel for Mete with a sound and complete type inference algorithm inspired by FreezeML.
Citations: 0