Broaden Multidisciplinary Data Science Research by an Innovative Cyberinfrastructure Platform
Authors: Dan Lo, Kai Qian, Yong Shi, H. Shahriar, Chung Ng
Published in: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 2022-06-01
DOI: 10.1109/COMPSAC54236.2022.00074 (https://doi.org/10.1109/COMPSAC54236.2022.00074)
Citation count: 0
Abstract
Data science, machine learning, and distributed computational models have evolved dramatically over the last decade. Cloud and cluster computing are mature and ready to process big data. Data-driven research and decision-making have become the trend in multiple disciplines. However, very few organizations have experienced the full impact or competitive advantage of their advanced data analytics initiatives, despite significant investments in data science and machine learning. A number of issues contribute to this, including the difficulty of maintaining and configuring a cluster, complex transitions from one platform to another, sophisticated programming interfaces to machine learning libraries, network congestion, and, most importantly, a lack of well-trained personnel to sanitize and analyze data. We propose a flexible heterogeneous computing cluster built from off-the-shelf computers, with a Blockly programming interface for multidisciplinary users such as cybersecurity analysts, biologists, geologists, musicians, and choreographers.