Aaron Jarmusch, Felipe Cabarcas, Swaroop Pophale, Andrew Kallai, Johannes Doerfert, Luke Peyralans, Seyong Lee, Joel Denny, Sunita Chandrasekaran
Software developers must adapt to keep up with the changing capabilities of platforms so that they can utilize the power of High-Performance Computers (HPC), including exascale systems. OpenMP, a directive-based parallel programming model, allows developers to add directives to existing C, C++, or Fortran code to obtain node-level parallelism without compromising performance. This paper describes our CI/CD efforts to enable easy evaluation of OpenMP support across different compilers, using existing testsuites and benchmark suites on HPC platforms. Our main contributions are (1) the setup of a Continuous Integration (CI) and Continuous Development (CD) workflow that captures bugs and provides faster feedback to compiler developers, (2) an evaluation of OpenMP (offloading) implementations supported by AMD, HPE, GNU, LLVM, and Intel, and (3) an evaluation of compiler quality across different heterogeneous HPC platforms. Through comprehensive testing in the CI/CD workflow, we aim to provide a thorough understanding of the current state of OpenMP (offloading) support in different compilers and on heterogeneous platforms consisting of CPUs and GPUs from NVIDIA, AMD, and Intel.
"CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations" (arXiv:2408.11777, 2024-08-21)
Martin Fink, Dimitrios Stavrakakis, Dennis Sprokholt, Soham Chakraborty, Jan-Erik Ekberg, Pramod Bhatotia
WebAssembly (WASM) is an immensely versatile and increasingly popular compilation target. It executes applications written in several languages (e.g., C/C++) with near-native performance in various domains (e.g., mobile, edge, cloud). Despite WASM's sandboxing feature, which isolates applications from other instances and the host platform, WASM does not inherently provide any memory safety guarantees for applications written in low-level, unsafe languages. To this end, we propose Cage, a hardware-accelerated toolchain for WASM that supports unmodified applications compiled to WASM and utilizes diverse Arm hardware features to enrich the memory safety properties of WASM. Specifically, Cage leverages Arm's Memory Tagging Extension (MTE) to (i) provide spatial and temporal memory safety for heap and stack allocations and (ii) improve the performance of WASM's sandboxing mechanism. Cage further employs Arm's Pointer Authentication (PAC) to prevent leaked pointers from being reused by other WASM instances, thus enhancing WASM's security properties. We implement our system based on 64-bit WASM, providing a WASM compiler and runtime with support for Arm's MTE and PAC. On top of that, Cage's LLVM-based compiler toolchain transforms unmodified applications to provide spatial and temporal memory safety for stack and heap allocations and to prevent function pointer reuse. Our evaluation on real hardware shows that Cage incurs minimal runtime (<5.8%) and memory (<3.7%) overheads and can improve the performance of WASM's sandboxing mechanism, achieving a speedup of over 5.1% while offering efficient memory safety guarantees.
"Cage: Hardware-Accelerated Safe WebAssembly" (arXiv:2408.11456, 2024-08-21)
First-order logic has been established as an important tool for modeling and verifying intricate systems such as distributed protocols and concurrent systems. These systems are parametric in the number of nodes in the network or the number of threads, which is finite in any system instance, but unbounded. One disadvantage of first-order logic is that it cannot distinguish between finite and infinite structures, leading to spurious counterexamples. To mitigate this, we offer a verification approach that captures only finite system instances. Our approach is an adaptation of the cutoff method to systems modeled in first-order logic. The idea is to show that any safety violation in a system instance of size larger than some bound can be simulated by a safety violation in a system of a smaller size. The simulation provides an inductive argument for correctness in finite instances, reducing the problem to showing safety of instances with bounded size. To this end, we develop a framework to (i) encode such simulation relations in first-order logic and to (ii) validate the simulation relation by a set of verification conditions given to an SMT solver. We apply our approach to verify safety of a set of examples, some of which cannot be proven by a first-order inductive invariant.
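The cutoff reduction described above can be stated compactly (notation chosen here for illustration, not the paper's): writing $\mathcal{S}_n$ for the system instance with $n$ nodes and $B$ for the cutoff bound, one proves

```latex
\forall n > B:\quad
  \bigl(\mathcal{S}_n \not\models \mathit{Safe}\bigr)
  \;\Longrightarrow\;
  \bigl(\exists\, m < n:\ \mathcal{S}_m \not\models \mathit{Safe}\bigr)
```

so that, by induction on $n$, checking $\mathcal{S}_m \models \mathit{Safe}$ for all $m \le B$ establishes safety of every finite instance.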
Raz Lotan, Eden Frenkel, Sharon Shoham. "Proving Cutoff Bounds for Safety Properties in First-Order Logic" (arXiv:2408.10685, 2024-08-20)
We propose a stable model semantics for higher-order logic programs. Our semantics is developed using Approximation Fixpoint Theory (AFT), a powerful formalism that has successfully been used to give meaning to diverse non-monotonic formalisms. The proposed semantics generalizes the classical two-valued stable model semantics of (Gelfond and Lifschitz 1988) as well as the three-valued one of (Przymusinski 1990), retaining their desirable properties. Due to the use of AFT, we also get for free alternative semantics for higher-order logic programs, namely supported model, Kripke-Kleene, and well-founded. Additionally, we define a broad class of stratified higher-order logic programs and demonstrate that they have a unique two-valued higher-order stable model which coincides with the well-founded semantics of such programs. We provide a number of examples in different application domains, which demonstrate that higher-order logic programming under the stable model semantics is a powerful and versatile formalism, which can potentially form the basis of novel ASP systems.
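For reference, the classical two-valued semantics being generalized is defined via the Gelfond-Lifschitz reduct (a standard definition, restated here rather than quoted from the paper): given a program $P$ and an interpretation $M$,

```latex
P^{M} \;=\; \bigl\{\, a \leftarrow b_1,\dots,b_k
  \;\big|\;
  (a \leftarrow b_1,\dots,b_k,\ \neg c_1,\dots,\neg c_m) \in P,\ \
  M \cap \{c_1,\dots,c_m\} = \emptyset \,\bigr\}
```

and $M$ is a stable model of $P$ iff $M$ is the least model of the negation-free program $P^{M}$. The contribution above lifts this construction to the higher-order setting through AFT.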
Bart Bogaerts, Angelos Charalambidis, Giannos Chatziagapis, Babis Kostopoulos, Samuele Pollaci, Panos Rondogiannis. "The Stable Model Semantics for Higher-Order Logic Programming" (arXiv:2408.10563, 2024-08-20)
We show how (well-established) type systems based on non-idempotent intersection types can be extended to characterize termination properties of functional programming languages with pattern matching features. To model such programming languages, we use a (weak and closed) λ-calculus integrating a pattern matching mechanism on algebraic data types (ADTs). Remarkably, we also show that this language not only encodes Plotkin's CBV and CBN λ-calculi as well as other subsuming frameworks, such as the bang-calculus, but can also be used to interpret the semantics of effectful languages with exceptions. After a thorough study of the untyped language, we introduce a type system based on intersection types, and we show through purely logical methods that the set of terminating terms of the language corresponds exactly to that of well-typed terms. Moreover, by considering non-idempotent intersection types, this characterization turns out to be quantitative, i.e., the size of the type derivation of a term t gives an upper bound for the number of evaluation steps from t to its normal form.
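The quantitative guarantee has the shape typical of non-idempotent intersection type results (the notation below is illustrative, not the paper's):

```latex
\Phi \,\triangleright\, \Gamma \vdash t : \sigma
\quad\Longrightarrow\quad
\exists\, v,\, k:\ \ t \longrightarrow^{k} v
\ \ \text{with}\ \ k \,\le\, \mathsf{size}(\Phi)
```

where $\Phi$ is the typing derivation. Non-idempotency is what makes this counting possible: each use of an argument consumes one element of an intersection, so the derivation size tracks resource usage.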
Sandra Alves, Delia Kesner, Miguel Ramos. "Extending the Quantitative Pattern-Matching Paradigm" (arXiv:2408.11007, 2024-08-20)
Existing visual assistive technologies are built for simple and common use cases, and have few avenues for blind people to customize their functionalities. Drawing from prior work on DIY assistive technology, this paper investigates end-user programming as a means for users to create and customize visual access programs to meet their unique needs. We introduce ProgramAlly, a system for creating custom filters for visual information, e.g., 'find NUMBER on BUS', leveraging three end-user programming approaches: block programming, natural language, and programming by example. To implement ProgramAlly, we designed a representation of visual filtering tasks based on scenarios encountered by blind people, and integrated a set of on-device and cloud models for generating and running these programs. In user studies with 12 blind adults, we found that participants preferred different programming modalities depending on the task, and envisioned using visual access programs to address unique accessibility challenges that are otherwise difficult with existing applications. Through ProgramAlly, we present an exploration of how blind end-users can create visual access programs to customize and control their experiences.
Jaylin Herskovitz, Andi Xu, Rahaf Alharbi, Anhong Guo. "ProgramAlly: Creating Custom Visual Access Programs via Multi-Modal End-User Programming" (arXiv:2408.10499, 2024-08-20)
Daniel Jurjo-Rivas, Jose F. Morales, Pedro López-García, Manuel V. Hermenegildo
Variable sharing is a fundamental property in the static analysis of logic programs, since it is instrumental for ensuring correctness and increasing precision while inferring many useful program properties, such as modes, determinacy, non-failure, and cost. This has motivated significant work on developing abstract domains to improve the precision and performance of sharing analyses. Much of this work has centered on the family of set-sharing domains because of the high precision they offer. However, this precision comes at a price: scalability to a wide set of realistic programs remains challenging, which hinders wider adoption. In this work, rather than defining new sharing abstract domains, we focus instead on developing techniques that can be incorporated into analyzers to address aspects known to affect the efficiency of these domains, such as the number of variables, without affecting precision. These techniques are inspired by others used in the context of compiler optimizations, such as expression reassociation and variable trimming. We present several such techniques and provide an extensive experimental evaluation over 1100 program modules taken from both production code and classical benchmarks, including the Spectector cache analyzer, the s(CASP) system, the libraries of the Ciao system, the LPdoc documenter, and the PLAI analyzer itself. The experimental results are quite encouraging: we obtained significant speed-ups and, more importantly, the number of modules that require a timeout was cut in half. As a result, many more programs can be analyzed precisely in reasonable times.
"Abstract Environment Trimming" (arXiv:2408.09848, 2024-08-19)
Traditional implementations of strongly-typed functional programming languages often miss the root cause of type errors. As a consequence, type error messages are often misleading and confusing, particularly for students learning such a language. We describe Tyro, a type error localization tool that determines the optimal source of an error for ill-typed programs, following fundamental ideas by Pavlinovic et al.: we first translate typing constraints into SMT (Satisfiability Modulo Theories) using an intermediate representation that is more readable than the actual SMT encoding; during this phase we apply a new encoding for polymorphic types. Second, we translate our intermediate representation into an actual SMT encoding and take advantage of recent advancements in off-the-shelf SMT solvers to effectively find optimal error sources for ill-typed programs. Our design maintains the separation of heuristic and search also present in prior and similar work. In addition, using an intermediate representation to facilitate the safe generation of the SMT encoding increases the modularity, re-usability, and trustworthiness of the overall architecture. We believe this design principle will apply to many other tools that leverage SMT solvers. Our experimental evaluation reinforces that the SMT approach finds accurate error sources, using both expert-labeled programs and an automated method for larger-scale analysis. Compared to prior work, Tyro lays the basis for large-scale evaluation of error localization techniques, which can be integrated into programming environments and enable us to understand the impact of precise error messages for students in practice.
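The optimization underlying this line of work (as in Pavlinovic et al.'s formulation) can be stated as a weighted MaxSMT problem: given the set $C$ of typing constraints generated from the program, each weighted by a heuristic $w$ (e.g., the size of the expression it came from), an optimal error source is a minimum-weight set of constraints whose removal restores satisfiability:

```latex
\min_{E \subseteq C}\ \sum_{c \in E} w(c)
\qquad \text{subject to} \qquad
C \setminus E \ \text{satisfiable}
```

The locations that produced the constraints in $E$ are then reported as the likely error source, which is how the heuristic (the weights) stays separate from the search (the solver).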
Max Kopinsky, Brigitte Pientka, Xujie Si. "Modernizing SMT-Based Type Error Localization" (arXiv:2408.09034, 2024-08-16)
Sebastian Wolff, Ekanshdeep Gupta, Zafer Esen, Hossein Hojjat, Philipp Rümmer, Thomas Wies
Memory safety is an essential correctness property of software systems. For programs operating on linked heap-allocated data structures, the problem of proving memory safety boils down to analyzing the possible shapes of data structures, leading to the field of shape analysis. This paper presents a novel reduction-based approach to memory safety analysis that relies on two forms of abstraction: flow abstraction, representing global properties of the heap graph through local flow equations; and view abstraction, which enables verification tools to reason symbolically about an unbounded number of heap objects. In combination, the two abstractions make it possible to reduce memory-safety proofs to proofs about heap-less imperative programs that can be discharged using off-the-shelf software verification tools without built-in support for heap reasoning. Using an empirical evaluation on a broad range of programs, the paper shows that the reduction approach can effectively verify memory safety for sequential and concurrent programs operating on different kinds of linked data structures, including singly-linked, doubly-linked, and nested lists as well as trees.
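The "local flow equations" mentioned above typically take the following shape in the flow-framework literature (a generic form for illustration, not necessarily the paper's exact definition): the flow of a heap node is its external inflow plus whatever its graph predecessors transfer to it along edges,

```latex
\mathsf{flow}(n) \;=\; \mathsf{in}(n) \;+\;
  \sum_{(n',\,n) \,\in\, E} \mathsf{edge}(n',n)\bigl(\mathsf{flow}(n')\bigr)
```

so a global heap property (e.g., reachability or reference counts) becomes a fixpoint of purely local, per-node equations, which is what makes the reduction to heap-less arithmetic programs possible.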
"Arithmetizing Shape Analysis" (arXiv:2408.09037, 2024-08-16)
This thesis embarks on a comprehensive exploration of formal computational models that underlie typed programming languages. We focus on programming calculi, both functional (sequential) and concurrent, as they provide a compelling rigorous framework for evaluating program semantics and for developing analyses and program verification techniques. This is the full version of the thesis containing appendices.
Joseph William Neal Paulus. "On the Expressivity of Typed Concurrent Calculi" (arXiv:2408.07915, 2024-08-15)