Research on Pricing of Data Based on Bi-level Programming Model
Authors: Yurong Ding, Yingjie Tian
Journal: Annals of Data Science
Published: 2024-06-16
DOI: 10.1007/s40745-024-00549-w
URL: https://link.springer.com/article/10.1007/s40745-024-00549-w
Effective value measurement and pricing methods can greatly promote the healthy development of data sharing, exchange and reuse. However, the uncertainty of data value and the neglect of interactivity lead to information asymmetry in the transaction process. A sound pricing system and a well-designed data trading market (hereafter, the data market) can broadly promote data transactions. We take a three-agent data market as an example to construct a sound data trading process. The three agents are the data owner, who provides data records; the model buyer, who is interested in buying machine learning (ML) model instances; and the data broker, who mediates between the data owner and the model buyer. Based on the desired properties of a data market, such as truthfulness, revenue maximization, version control, fairness and non-arbitrage, we propose a data pricing method based on different model versions. First, we use market research to construct a revenue maximization (RM) problem that prices the different versions of ML models, and we solve it with the RM-ILP process. However, the RM model based on market research has two major problems. First, the model buyer has no incentive to tell the truth: the model buyer can lie in the market research to obtain a lower model price. Second, it requires the data broker to release the version menu in advance, which makes the data market operate inefficiently. To address these defects of the RM transaction model, we analyze model buyers' behavior and formulate the revenue maximization function over different data versions as a bi-level linear programming model. We further add the incentive compatibility constraint and the individual rationality constraint, taking both the utility of the model buyer and the revenue of the data broker into account. This reflects the consumer-driven model of data transactions. Finally, the RM-BLP process is proposed to transform the RM problem into an equivalent single-level integer programming problem, which we solve with the Gurobi solver. The validity of the model is verified by experiments.
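The bi-level structure described in the abstract can be illustrated with a toy sketch. This is not the paper's RM-ILP or RM-BLP formulation, and it does not use Gurobi: it simply enumerates a small price grid by brute force. The buyer valuations, the price grid, the function names and the tie-breaking rule (a buyer who is indifferent picks the more expensive version, the usual "optimistic" bi-level convention) are all assumptions made for illustration. The upper level is the data broker choosing one price per model version; the lower level is each model buyer choosing the version that maximizes their own utility, and opting out when every version gives negative utility (individual rationality).

```python
from itertools import product


def buyer_choice(valuations, prices):
    """Lower level: index of the version this buyer picks, or None to opt out.

    A version is only considered if utility = valuation - price is
    non-negative (individual rationality). Among those, the buyer picks the
    highest utility; ties go to the higher price (assumed optimistic rule).
    """
    best, best_key = None, None
    for k, (v, p) in enumerate(zip(valuations, prices)):
        utility = v - p
        if utility < 0:
            continue
        key = (utility, p)
        if best_key is None or key > best_key:
            best, best_key = k, key
    return best


def best_menu(buyers, price_grid):
    """Upper level: enumerate every price vector and keep the best revenue."""
    n_versions = len(buyers[0])
    best_prices, best_revenue = None, -1
    for prices in product(price_grid, repeat=n_versions):
        revenue = 0
        for valuations in buyers:
            k = buyer_choice(valuations, prices)
            if k is not None:
                revenue += prices[k]
        if revenue > best_revenue:
            best_prices, best_revenue = prices, revenue
    return best_prices, best_revenue


# Hypothetical data: each tuple holds one buyer's valuation for
# (version 1, version 2) of the ML model.
buyers = [(4, 6), (2, 9), (5, 5)]
prices, revenue = best_menu(buyers, range(0, 11))
print(prices, revenue)
```

Because each buyer self-selects the best version under the posted prices, no buyer in this sketch gains by choosing a different version, which is the role the incentive compatibility constraint plays in a menu-based formulation. Brute-force enumeration grows exponentially with the number of versions, which is why a single-level integer programming reformulation handed to a solver is attractive at realistic scale.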
About the journal:
Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results and case studies in data science. Although data science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management and virtualization to discover knowledge from Big Data, it should have its own scientific content, such as axioms, laws and rules, which are fundamentally important for experts in different fields to explore their own interests in Big Data. ADS encourages contributors to address such challenging problems on this exchange platform. At present, how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.