Anees Al-Najjar, Nageswara S. V. Rao, Ramanan Sankaran, Helia Zandi, Debangshu Mukherjee, Maxim Ziatdinov, Craig Bridges
We propose a framework for developing cyber solutions that support the remote steering of science instruments and the collection of measurements over instrument-computing ecosystems. It is based on provisioning separate data and control connections at the network level, and on developing software modules consisting of Python wrappers for instrument commands and Pyro server-client code that makes them available across the ecosystem network. We demonstrate automated measurement transfers and remote steering operations in a microscopy use case for materials research over an ecosystem of Nion microscopes and computing platforms connected over site networks. The proposed framework is under further refinement and is being adapted to science workflows with automated remote steering of experiments for autonomous chemistry laboratories and smart energy grid simulations.
{"title":"Cyber Framework for Steering and Measurements Collection Over Instrument-Computing Ecosystems","authors":"Anees Al-Najjar, Nageswara S. V. Rao, Ramanan Sankaran, Helia Zandi, Debangshu Mukherjee, Maxim Ziatdinov, Craig Bridges","doi":"arxiv-2307.06883","DOIUrl":"https://doi.org/arxiv-2307.06883","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-07-12","publicationTypes":"Journal Article"}
Experiments require human decisions in the design process, which in turn are reformulated and summarized as inputs into a system (computational or otherwise) to generate the experimental design. I leverage this system to promote a language of experimental designs by proposing a novel computational framework, called "the grammar of experimental designs", to specify experimental designs based on an object-oriented programming system that declaratively encapsulates the experimental structure. The framework aims to engage human cognition by building experimental designs with modular functions that modify a targeted singular element of the experimental design object. The syntax and semantics of the framework are built upon consideration from multiple perspectives. While the core framework is language-agnostic, the framework is implemented in the `edibble` R-package. A range of examples is shown to demonstrate the utility of the framework.
{"title":"Towards a unified language in experimental designs propagated by a software framework","authors":"Emi Tanaka","doi":"arxiv-2307.11593","DOIUrl":"https://doi.org/arxiv-2307.11593","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-07-11","publicationTypes":"Journal Article"}
Complexity is an important characteristic of any business process. The key assumption of much research in Business Process Management is that process complexity has a negative impact on process performance. So far, behavioral studies have measured complexity based on the perception of process stakeholders. The aim of this study is to investigate if such a connection can be supported based on the analysis of event log data. To do so, we employ a set of 38 metrics that capture different dimensions of process complexity. We use these metrics to build various regression models that explain process performance in terms of throughput time. We find that process complexity as captured in event logs explains the throughput time of process executions to a considerable extent, with the respective R-squared reaching up to 0.96. Our study offers implications for empirical research on process performance and can serve as a toolbox for practitioners.
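To make the reported R-squared concrete, the following is a minimal sketch of regressing throughput time on a single complexity metric and computing the coefficient of determination. It is an illustration only: the study uses 38 metrics and several regression models, whereas this sketch uses one hypothetical metric (`variant_count`) and made-up throughput values, and the helper names `fit_line` and `r_squared` are assumptions.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: one complexity metric per event log and the
# corresponding mean throughput time in days (not taken from the study).
variant_count = [3, 5, 8, 12, 20]
throughput_days = [2.1, 3.0, 4.2, 6.1, 9.8]

slope, intercept = fit_line(variant_count, throughput_days)
r2 = r_squared(variant_count, throughput_days, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

An R-squared close to 1 means the complexity metric accounts for almost all of the variation in throughput time in the sample; with many metrics, an adjusted R-squared would be the more cautious figure to report.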
{"title":"The Impact of Process Complexity on Process Performance: A Study using Event Log Data","authors":"Maxim Vidgof, Bastian Wurm, Jan Mendling","doi":"arxiv-2307.06106","DOIUrl":"https://doi.org/arxiv-2307.06106","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-07-11","publicationTypes":"Journal Article"}
This paper presents a novel approach to scientific discovery using an artificial intelligence (AI) environment known as ChatGPT, developed by OpenAI. This is the first paper entirely generated with outputs from ChatGPT. We demonstrate how ChatGPT can be instructed through a gamification environment to define and benchmark hypothetical physical theories. Through this environment, ChatGPT successfully simulates the creation of a new improved model, called GPT$^4$, which combines the concepts of GPT in AI (generative pretrained transformer) and GPT in physics (generalized probabilistic theory). We show that GPT$^4$ can use its built-in mathematical and statistical capabilities to simulate and analyze physical laws and phenomena. As a demonstration of its language capabilities, GPT$^4$ also generates a limerick about itself. Overall, our results demonstrate the promising potential for human-AI collaboration in scientific discovery, as well as the importance of designing systems that effectively integrate AI's capabilities with human intelligence.
{"title":"Towards The Ultimate Brain: Exploring Scientific Discovery with ChatGPT AI","authors":"Gerardo Adesso","doi":"arxiv-2308.12400","DOIUrl":"https://doi.org/arxiv-2308.12400","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-07-08","publicationTypes":"Journal Article"}
In data storage and transmission, file compression is a common technique for reducing data volume, and with it storage space, transmission time, and bandwidth. However, compression performance differs significantly across file formats, and so do the benefits. In this paper, approximately 178 GB of data in 22 file formats was collected and compressed with the Zlib algorithm in order to compare the compression gains of different file types. The experimental results show that some file types compress poorly, with almost unchanged file size and long compression time, resulting in lower gains; other file types show significantly reduced file size and shorter compression time, so compression effectively reduces their data volume. Based on these results, the paper proposes compressing selectively by file type in data storage and transmission in order to obtain the maximum compression yield.
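The kind of measurement described above can be sketched with Python's standard `zlib` module. This is a minimal illustration, not the authors' benchmark: the two byte strings are hypothetical stand-ins for a highly compressible and a poorly compressible file type, and `compression_gain` is an assumed helper name.

```python
import os
import zlib


def compression_gain(data: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by zlib at the given level."""
    if not data:
        return 0.0
    compressed = zlib.compress(data, level)
    return 1.0 - len(compressed) / len(data)


# Hypothetical stand-ins for two file types:
# repetitive text (e.g. logs or CSV) versus high-entropy bytes
# (e.g. media or archives that are already compressed).
text_like = b"timestamp,value\n" * 4096
random_like = os.urandom(len(text_like))

text_gain = compression_gain(text_like)
random_gain = compression_gain(random_like)
print(f"text-like gain:   {text_gain:.3f}")
print(f"random-like gain: {random_gain:.3f}")
```

A gain near zero (or slightly negative, because of container overhead) signals a format for which spending compression time is unlikely to pay off, which is the basis for compressing selectively by file type.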
{"title":"Compression Performance Analysis of Different File Formats","authors":"Han Yang, Guangjun Qin, Yongqing Hu","doi":"arxiv-2308.12275","DOIUrl":"https://doi.org/arxiv-2308.12275","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-07-08","publicationTypes":"Journal Article"}
Mohamed Mokbel (University of Minnesota, Minneapolis, USA); Mahmoud Sakr (Université Libre, Brussels, Belgium); Li Xiong (Emory University, Atlanta, USA); Andreas Züfle (Emory University, Atlanta, USA); Jussara Almeida (Federal University of Minas Gerais, Belo Horizonte, Brazil); Walid Aref (Purdue University, West Lafayette, USA); Gennady Andrienko (Fraunhofer IAIS, St. Augustin, Germany); Natalia Andrienko (Fraunhofer IAIS, St. Augustin, Germany); Yang Cao (Kyoto University, Kyoto, Japan); Sanjay Chawla (Qatar Computing Research Institute, Doha, Qatar); Reynold Cheng (University of Hong Kong, Hong Kong, China); Panos Chrysanthis (University of Pittsburgh, Pennsylvania, USA); Xiqi Fei (George Mason University, Fairfax, USA); Gabriel Ghinita (University of Massachusetts at Boston, Boston, USA); Anita Graser (Austrian Institute of Technology, Vienna, Austria); Dimitrios Gunopulos (University of Athens, Greece); Christian Jensen (Aalborg University, Denmark); Joon-Sook Kim (Oak Ridge National Laboratory, USA); Kyoung-Sook Kim (AIST, Tokyo Waterfront, Japan); Peer Kröger (University of Kiel, Germany); John Krumm (University of Southern California, Los Angeles, USA); Johannes Lauer (HERE Technologies, Germany); Amr Magdy (University of California, Riverside, USA); Mario Nascimento (Northeastern University, Boston, USA); Siva Ravada (Oracle Corp., Nashua, USA); Matthias Renz (University of Kiel, Germany); Dimitris Sacharidis (Université Libre, Brussels, Belgium); Cyrus Shahabi (University of Southern California, Los Angeles, USA); Flora Salim (University of New South Wales, Sydney, Australia); Mohamed Sarwat (Arizona State University, Tempe); Maxime Schoemans (Université Libre, Brussels, Belgium); Bettina Speckmann (TU Eindhoven, Netherlands); Egemen Tanin (University of Melbourne, Australia); Yannis Theodoridis (University of Piraeus, Greece); Kristian Torp (Aalborg University, Denmark); Goce Trajcevski (Iowa State University, Ames, USA); Marc van Kreveld (Utrecht University, Netherlands); Carola Wenk (Tulane University, New Orleans, USA); Martin Werner (Technical University of Munich, Munich, Germany); Raymond Wong (Hong Kong University of Science and Technology, Hong Kong, China); Song Wu (Université Libre, Brussels, Belgium); Jianqiu Xu (Nanjing University of Aeronautics and Astronautics, Nanjing, China); Moustafa Youssef (AUC and Alexandria University, Egypt); Demetris Zeinalipour (University of Cyprus, Nicosia, Cyprus); Mengxuan Zhang (Iowa State University, Ames, USA); Esteban Zimányi (Université Libre, Brussels, Belgium)
Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years.
{"title":"Towards Mobility Data Science (Vision Paper)","authors":"Mohamed Mokbel, Mahmoud Sakr, Li Xiong, Andreas Züfle, Jussara Almeida, Walid Aref, Gennady Andrienko, Natalia Andrienko, Yang Cao, Sanjay Chawla, Reynold Cheng, Panos Chrysanthis, Xiqi Fei, Gabriel Ghinita, Anita Graser, Dimitrios Gunopulos, Christian Jensen, Joon-Sook Kim, Kyoung-Sook Kim, Peer Kröger, John Krumm, Johannes Lauer, Amr Magdy, Mario Nascimento, Siva Ravada, Matthias Renz, Dimitris Sacharidis, Cyrus Shahabi, Flora Salim, Mohamed Sarwat, Maxime Schoemans, Bettina Speckmann, Egemen Tanin, Yannis Theodoridis, Kristian Torp, Goce Trajcevski, Marc van Kreveld, Carola Wenk, Martin Werner, Raymond Wong, Song Wu, Jianqiu Xu, Moustafa Youssef, Demetris Zeinalipour, Mengxuan Zhang, Esteban Zimányi","doi":"arxiv-2307.05717","DOIUrl":"https://doi.org/arxiv-2307.05717","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-06-21","publicationTypes":"Journal Article"}
Nicolas Lambert (RIATE, CNRS); Timothée Giraud (RIATE, CNRS); Ronan Ysebaert (RIATE, UPCité)
This chapter explores cartographic communication in depth through a cartographic multirepresentation exercise. Using a single dataset on world population, the chapter presents a series of 13 different maps to illustrate how mapping is primarily a matter of choices and methods.
{"title":"Enjeux de communication dans la multireprésentation cartographique reproductible","authors":"Nicolas Lambert, Timothée Giraud, Ronan Ysebaert","doi":"arxiv-2306.10862","DOIUrl":"https://doi.org/arxiv-2306.10862","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-06-19","publicationTypes":"Journal Article"}
Origin-destination (OD) flow modeling is an extensively researched subject across multiple disciplines, such as the investigation of travel demand in transportation and spatial interaction modeling in geography. However, researchers from different fields tend to employ their own research paradigms and lack interdisciplinary communication, which prevents the cross-fertilization of knowledge and the development of novel solutions to challenges. This article presents a systematic interdisciplinary survey that scrutinizes OD flows comprehensively and holistically, from using fundamental theory to study the mechanisms of population mobility to solving practical problems with engineering techniques such as computational models. Specifically, regional economics, urban geography, and sociophysics are adept at employing theoretical research methods to explore the underlying mechanisms of OD flows. They have developed three influential theoretical models: the gravity model, the intervening opportunities model, and the radiation model. These models focus on the fundamental influences of distance, opportunities, and population on OD flows, respectively. Meanwhile, fields such as transportation, urban planning, and computer science primarily focus on addressing four practical problems: OD prediction, OD construction, OD estimation, and OD forecasting. Advanced computational models, such as deep learning models, have gradually been introduced to address these problems more effectively. Finally, based on the existing research, this survey summarizes current challenges and outlines future directions for this topic. Through this survey, we aim to break down the barriers between disciplines in OD flow-related research, fostering interdisciplinary perspectives and modes of thinking.
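As a small illustration of the first of the three theoretical models, the sketch below evaluates the commonly used power-law form of the gravity model, in which the flow between two places grows with their populations and decays with distance. The abstract does not state a formula, so the functional form, the function name `gravity_flow`, and all parameter values here are assumptions chosen for illustration; in practice the exponents are fitted to observed flows.

```python
def gravity_flow(origin_pop: float, dest_pop: float, distance_km: float,
                 g: float = 1.0, alpha: float = 1.0, beta: float = 1.0,
                 gamma: float = 2.0) -> float:
    """Power-law gravity model: g * P_i^alpha * P_j^beta / d_ij^gamma.

    All default parameter values are illustrative assumptions.
    """
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return g * origin_pop ** alpha * dest_pop ** beta / distance_km ** gamma


# Hypothetical regions: same populations, two different separations.
near = gravity_flow(1000, 2000, 10)
far = gravity_flow(1000, 2000, 20)
print(near, far)  # doubling the distance quarters the flow when gamma = 2
```

The other two models replace the distance term with counts of intervening opportunities or intervening population, which is why the abstract associates the three models with distance, opportunities, and population respectively.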
{"title":"An Interdisciplinary Survey on Origin-destination Flows Modeling: Theory and Techniques","authors":"Can Rong, Jingtao Ding, Yong Li","doi":"arxiv-2306.10048","DOIUrl":"https://doi.org/arxiv-2306.10048","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-06-12","publicationTypes":"Journal Article"}
This paper proposes a new method of natural language acquisition for robots that does not require the conversion of speech to text. Folks'Talks employs voice2voice technology that enables a robot to understand the meaning of what it is told and to have the ability to learn and understand new languages - inclusive of accent, dialect, and physiological differences. To do this, sound processing and computer vision are incorporated to give the robot a sense of spatiotemporal causality. The "language model" we are proposing equips a robot to imitate a natural speaker's conversational behavior by thinking contextually and articulating its surroundings.
{"title":"Robo Sapiens","authors":"Chaim Ash, Amelia Hans","doi":"arxiv-2310.08323","DOIUrl":"https://doi.org/arxiv-2310.08323","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-06-05","publicationTypes":"Journal Article"}
Terrance Liu, Jingwu Tang, Giuseppe Vietri, Zhiwei Steven Wu
We study the problem of efficiently generating differentially private synthetic data that approximate the statistical properties of an underlying sensitive dataset. In recent years, there has been a growing line of work that approaches this problem using first-order optimization techniques. However, such techniques are restricted to optimizing differentiable objectives only, severely limiting the types of analyses that can be conducted. For example, first-order mechanisms have been primarily successful in approximating statistical queries only in the form of marginals for discrete data domains. In some cases, one can circumvent such issues by relaxing the task's objective to maintain differentiability. However, even when possible, these approaches impose a fundamental limitation in which modifications to the minimization problem become additional sources of error. Therefore, we propose Private-GSD, a private genetic algorithm based on zeroth-order optimization heuristics that do not require modifying the original objective. As a result, it avoids the aforementioned limitations of first-order optimization. We empirically evaluate Private-GSD against baseline algorithms on data derived from the American Community Survey across a variety of statistics (otherwise known as statistical queries), for both discrete and real-valued attributes. We show that Private-GSD outperforms the state-of-the-art methods on non-differentiable queries while matching their accuracy in approximating differentiable ones.
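To show why a zeroth-order search can handle objectives that gradient methods cannot, here is a toy, non-private sketch of an elitist, mutation-only genetic search that fits a synthetic table to threshold (non-differentiable) query answers. It is not Private-GSD: it adds no privacy noise, uses no crossover, and every name, query, and parameter below is a hypothetical choice for illustration. A private method would fit noisy query answers instead of the exact targets used here.

```python
import random


def query_answers(rows):
    """Two threshold (non-differentiable) queries over (age, income) rows."""
    n = len(rows)
    q1 = sum(1 for age, _ in rows if age >= 65) / n
    q2 = sum(1 for age, income in rows if age < 30 and income > 50_000) / n
    return (q1, q2)


def error(rows, target):
    """Largest absolute gap between the candidate's answers and the targets."""
    return max(abs(a - t) for a, t in zip(query_answers(rows), target))


def random_row(rng):
    return (rng.randint(18, 90), rng.randint(0, 100_000))


def mutate(rows, rng):
    """Copy a candidate dataset and resample one randomly chosen row."""
    child = list(rows)
    child[rng.randrange(len(child))] = random_row(rng)
    return child


def evolve(target, n_rows=50, pop_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    population = [[random_row(rng) for _ in range(n_rows)]
                  for _ in range(pop_size)]
    initial_best = min(error(p, target) for p in population)
    for _ in range(generations):
        # Only objective values are used: no gradients are needed.
        population.sort(key=lambda p: error(p, target))
        elite = population[: pop_size // 4]  # elites survive unchanged
        children = [mutate(rng.choice(elite), rng)
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    best = min(population, key=lambda p: error(p, target))
    return best, initial_best, error(best, target)


# Hypothetical target answers (exact, not noisy, in this non-private toy).
target = (0.20, 0.10)
best_rows, initial_err, final_err = evolve(target)
print(f"error before: {initial_err:.3f}, after: {final_err:.3f}")
```

Because the elite candidates are carried over unchanged, the best error can never get worse from one generation to the next, and the objective is used as-is rather than being relaxed into a differentiable surrogate.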
{"title":"Generating Private Synthetic Data with Genetic Algorithms","authors":"Terrance Liu, Jingwu Tang, Giuseppe Vietri, Zhiwei Steven Wu","doi":"arxiv-2306.03257","DOIUrl":"https://doi.org/arxiv-2306.03257","journal":{"name":"arXiv - CS - Other Computer Science"},"publicationDate":"2023-06-05","publicationTypes":"Journal Article"}