Title: A theory of fine-grained lineage for functions on structured objects
Authors: Sylvain Hallé, Hugo Tremblay
Journal: Theoretical Computer Science, volume 1039, Article 115192
Publication date: 2025-03-20
DOI: 10.1016/j.tcs.2025.115192
URL: https://www.sciencedirect.com/science/article/pii/S0304397525001306
Publication type: Journal Article
Impact factor: 1.0 (JCR Q3, Computer Science, Theory & Methods)
Open access: no
Citation count: 0
Abstract
Lineage is the process of keeping track of the relationship between the inputs of a data processing task and the parts of the output they help produce. Depending on its precise definition, lineage can be seen as a form of database provenance or a means of tracking information flow in computer programs, or it can be used to express causality and provide counter-examples for the falsity of a logical statement. In this paper, we establish the formal foundations of a notion of lineage for arbitrary abstract functions manipulating objects that are “composite”, that is, objects that can be made of multiple other objects. We formally define three notions of lineage over functions, called explanation, participation and extraction; we then establish explanation relationships for a set of elementary functions and for compositions thereof. Finally, a fully functional implementation of these concepts is presented and experimentally evaluated.
About the journal:
Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logical and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.