Terms such as the Digital Twin of an Organization (DTO) and Hyperautomation (HA) illustrate the desire to autonomously manage and orchestrate processes, just like we aim for autonomously driving cars. Autonomous driving and Autonomous Process Execution Management (APEM) have in common that the goals are pretty straightforward and that each year progress is made, but fully autonomous driving and fully autonomous process execution are more a dream than a reality. For cars, the Society of Automotive Engineers (SAE) identified six levels (0-5), ranging from no driving automation (SAE, Level 0) to full driving automation (SAE, Level 5). This short article defines six levels of Autonomous Process Execution Management (APEM). The goal is to show that the transition from one level to the next will be gradual, just like for self-driving cars.
{"title":"Six Levels of Autonomous Process Execution Management (APEM)","authors":"Wil van der Aalst","doi":"arxiv-2204.11328","DOIUrl":"https://doi.org/arxiv-2204.11328","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-24","publicationTypes":"Journal Article"}
Interactive machine learning (IML) is a field of research that explores how to leverage both human and computational abilities in decision making systems. IML represents a collaboration between multiple complementary human and machine intelligent systems working as a team, each with their own unique abilities and limitations. This teamwork might mean that both systems take actions at the same time, or in sequence. Two major open research questions in the field of IML are: "How should we design systems that can learn to make better decisions over time with human interaction?" and "How should we evaluate the design and deployment of such systems?" A lack of appropriate consideration for the humans involved can lead to problematic system behaviour, and issues of fairness, accountability, and transparency. Thus, our goal with this work is to present a human-centred guide to designing and evaluating IML systems while mitigating risks. This guide is intended to be used by machine learning practitioners who are responsible for the health, safety, and well-being of interacting humans. An obligation of responsibility for public interaction means acting with integrity, honesty, fairness, and abiding by applicable legal statutes. With these values and principles in mind, we as a machine learning research community can better achieve goals of augmenting human skills and abilities. This practical guide therefore aims to support many of the responsible decisions necessary throughout the iterative design, development, and dissemination of IML systems.
{"title":"A Brief Guide to Designing and Evaluating Human-Centered Interactive Machine Learning","authors":"Kory W. Mathewson, Patrick M. Pilarski","doi":"arxiv-2204.09622","DOIUrl":"https://doi.org/arxiv-2204.09622","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-20","publicationTypes":"Journal Article"}
In this meta-ethnography, we explore three different angles of Ethical AI design and implementation in a top-down/bottom-up framework, including the philosophical ethical viewpoint, the technical perspective, and framing through a political lens. We will discuss the values and drawbacks of individual and hybrid approaches within this framework. Examples of approaches include ethics either being determined by corporations and governments (coming from the top), or ethics being called for by the people (coming from the bottom), as well as top-down, bottom-up, and hybrid technicalities of how AI is developed within a moral construct, in consideration of its developers and users, with expected and unexpected consequences and long-term impact. This investigation includes real-world case studies, philosophical debate, and theoretical future thought experimentation based on historical facts, current world circumstances, and possible ensuing realities.
{"title":"Contextualizing Artificially Intelligent Morality: A Meta-Ethnography of Top-Down, Bottom-Up, and Hybrid Models for Theoretical and Applied Ethics in Artificial Intelligence","authors":"Jennafer S. Roberts, Laura N. Montoya","doi":"arxiv-2204.07612","DOIUrl":"https://doi.org/arxiv-2204.07612","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-15","publicationTypes":"Journal Article"}
David Patterson, Joseph Gonzalez, Urs Hölzle, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, Jeff Dean
Machine Learning (ML) workloads have rapidly grown in importance, but have raised concerns about their carbon footprint. Four best practices can reduce ML training energy by up to 100x and CO2 emissions by up to 1000x. By following best practices, overall ML energy use (across research, development, and production) held steady at <15% of Google's total energy use for the past three years. If the whole ML field were to adopt best practices, total carbon emissions from training would decrease. Hence, we recommend that ML papers include emissions explicitly to foster competition on more than just model quality. Estimates of emissions in papers that omitted them have been off by 100x-100,000x, so publishing emissions has the added benefit of ensuring accurate accounting. Given the importance of climate change, we must get the numbers right to make certain that we work on its biggest challenges.
{"title":"The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink","authors":"David Patterson, Joseph Gonzalez, Urs Hölzle, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, Jeff Dean","doi":"arxiv-2204.05149","DOIUrl":"https://doi.org/arxiv-2204.05149","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-11","publicationTypes":"Journal Article"}
David Leslie, Michael Katell, Mhairi Aitken, Jatinder Singh, Morgan Briggs, Rosamund Powell, Cami Rincón, Thompson Chengeta, Abeba Birhane, Antonella Perini, Smera Jayadeva, Anjali Mazumder
The Advancing Data Justice Research and Practice (ADJRP) project aims to widen the lens of current thinking around data justice and to provide actionable resources that will help policymakers, practitioners, and impacted communities gain a broader understanding of what equitable, freedom-promoting, and rights-sustaining data collection, governance, and use should look like in increasingly dynamic and global data innovation ecosystems. In this integrated literature review we hope to lay the conceptual groundwork needed to support this aspiration. The introduction motivates the broadening of data justice that is undertaken by the literature review which follows. First, we address how certain limitations of the current study of data justice drive the need for a re-location of data justice research and practice. We map out the strengths and shortcomings of the contemporary state of the art and then elaborate on the challenges faced by our own effort to broaden the data justice perspective in the decolonial context. The body of the literature review covers seven thematic areas. For each theme, the ADJRP team has systematically collected and analysed key texts in order to tell the critical empirical story of how existing social structures and power dynamics present challenges to data justice and related justice fields. In each case, this critical empirical story is also supplemented by the transformational story of how activists, policymakers, and academics are challenging longstanding structures of inequity to advance social justice in data innovation ecosystems and adjacent areas of technological practice.
{"title":"Advancing Data Justice Research and Practice: An Integrated Literature Review","authors":"David Leslie, Michael Katell, Mhairi Aitken, Jatinder Singh, Morgan Briggs, Rosamund Powell, Cami Rincón, Thompson Chengeta, Abeba Birhane, Antonella Perini, Smera Jayadeva, Anjali Mazumder","doi":"arxiv-2204.03090","DOIUrl":"https://doi.org/arxiv-2204.03090","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-06","publicationTypes":"Journal Article"}
This article focuses on the connection between the possibility of quantum computers, the predictability of complex quantum systems in nature, and the issue of free will.
{"title":"Quantum Computers, Predictability, and Free Will","authors":"Gil Kalai","doi":"arxiv-2204.02768","DOIUrl":"https://doi.org/arxiv-2204.02768","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-04-05","publicationTypes":"Journal Article"}
From 1967 to 1974, an Electrologica X8 computer was installed at the Institute for Nuclear Research (IKO) in Amsterdam, primarily for online and offline evaluation of experimental data, an application quite different from those of its `brother' X8s. During that time, the nuclear detection system `BOL' was in operation to study nuclear reactions. The BOL detector embodied a new and bold concept. It consisted of a large number of state-of-the-art detection units, mounted in a spherical arrangement around a target in a beam of nuclear particles. Two minicomputers performed data acquisition and control of the experiment and supported online visual display of acquired data. The X8 computer, networked with the minicomputers, allowed fast high-level data processing and analysis. Pioneering work in both experimental nuclear physics and programming turned out to be a surprisingly good combination. For the network with the X8 and the minicomputers, advanced software layers were developed to efficiently and flexibly program extensive data handling.
{"title":"The EL-X8 computer and the BOL detector Networking, programming, time-sharing and data-handling in the Amsterdam nuclear research project `BOL' A personal historical review","authors":"René van Dantzig","doi":"arxiv-2203.11280","DOIUrl":"https://doi.org/arxiv-2203.11280","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-03-09","publicationTypes":"Journal Article"}
Couples' relationships affect the physical health and emotional well-being of partners. Automatically recognizing each partner's emotions could give a better understanding of their individual emotional well-being, enable interventions, and provide clinical benefits. In this paper, we summarize and synthesize works that have focused on developing and evaluating systems to automatically recognize the emotions of each partner based on couples' interaction or conversation contexts. We identified 28 articles from IEEE, ACM, Web of Science, and Google Scholar that were published between 2010 and 2021. We detail the datasets, features, algorithms, evaluation, and results of each work and present the main themes. We also discuss current challenges and research gaps, and propose future research directions. In summary, most works have used audio data collected in the lab, with annotations done by external experts, and have used supervised machine learning approaches for binary classification of positive and negative affect. Performance results leave room for improvement, and significant research gaps remain, such as the absence of recognition using data from daily life. This survey will enable new researchers to get an overview of this field and eventually enable the development of emotion recognition systems to inform interventions to improve the emotional well-being of couples.
{"title":"Emotion Recognition among Couples: A Survey","authors":"George Boateng, Elgar Fleisch, Tobias Kowatsch","doi":"arxiv-2202.08430","DOIUrl":"https://doi.org/arxiv-2202.08430","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-02-17","publicationTypes":"Journal Article"}
Smart Cities are happening everywhere around us and yet they are still incomprehensibly far from directly impacting everyday life. What needs to happen to make cities really smart? Digital Twins (DTs) represent their Physical Twin (PT) in the real world through models, sensed data, context awareness, and interactions. A Digital Twin of a city appears to offer the right combination to make the Smart City accessible and thus usable. However, without appropriate interfaces, the complexity of a city cannot be represented. Ultimately, fully leveraging the potential of Smart Cities requires going beyond the Digital Twin. Can this issue be addressed? I advance embedding the Digital Twin into the Physical Twin, i.e., Fused Twins. This fusion allows access to data where it is generated, in a context that can make it easily understandable. The Fused Twins paradigm is the formalization of this vision. Prototypes of Fused Twins are appearing at breakneck speed in different domains, but Smart Cities will be the context where Fused Twins will predominantly be seen in the future. This paper reviews Digital Twins to understand how Fused Twins can be constructed from Augmented Reality, Geographic Information Systems, Building/City Information Models, and Digital Twins, and provides an overview of current research and future directions.
{"title":"The Hitchhiker's Guide to Fused Twins -- A Conceptualization to Access Digital Twins in situ in Smart Cities","authors":"Jascha Grübel","doi":"arxiv-2202.07104","DOIUrl":"https://doi.org/arxiv-2202.07104","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-02-15","publicationTypes":"Journal Article"}
The modern increase in data production is driven by multiple factors, and several stakeholders from various sectors contribute to it. Although drawing a comparison of the sizes at stake for different big data players is hard due to the lack of official data, this report tries to reconstruct the yearly orders of magnitude generated by some of the most important organizations by mining several online sources. The estimation is based on retrieving meaningful unitary data production measures for each of the big data sources considered, and the yearly amounts are then obtained by conjecturing reasonable per-unit sizes. The final result is summarized in the form of a bubble plot.
{"title":"Survey of Big Data sizes in 2021","authors":"Luca Clissa","doi":"arxiv-2202.07659","DOIUrl":"https://doi.org/arxiv-2202.07659","journal":{"name":"arXiv - CS - General Literature"},"publicationDate":"2022-02-15","publicationTypes":"Journal Article"}
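The estimation approach described in the "Survey of Big Data sizes in 2021" abstract (a unitary production measure multiplied by a conjectured per-unit size) can be illustrated with a minimal sketch. The function names and every number below are hypothetical placeholders chosen for illustration; none of them come from the report.

```python
import math


def yearly_bytes(units_per_year: float, bytes_per_unit: float) -> float:
    """Yearly data volume as (units produced per year) x (assumed size per unit)."""
    return units_per_year * bytes_per_unit


def order_of_magnitude(n_bytes: float) -> int:
    """Base-10 order of magnitude of a byte count (e.g. 15 for petabyte scale)."""
    return math.floor(math.log10(n_bytes))


# Hypothetical source: 1e9 items per year, each conjectured to be about 2 MB.
# These figures are placeholders, not estimates from the report.
example_total = yearly_bytes(units_per_year=1e9, bytes_per_unit=2e6)
example_order = order_of_magnitude(example_total)
```

Because the per-unit size is a conjecture, only the order of magnitude of such a result should be read as meaningful, which matches the abstract's stated aim of reconstructing yearly orders of magnitude.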