Parallel Data Indexing and Storage on a COTS Cluster
Anil L. Pereira
2020 International Conference on Computational Science and Computational Intelligence (CSCI), 2020-12-01
DOI: 10.1109/CSCI51800.2020.00236 (https://doi.org/10.1109/CSCI51800.2020.00236)
This paper proposes a data transfer, indexing and storage system for a commodity-off-the-shelf (COTS) cluster that uses parallel processing, asynchronous file input/output, direct memory access and asynchronous User Datagram Protocol (UDP) sockets. It also describes a performance evaluation framework for the system. Two considerations motivate the design. First, as data communication networks support higher data rates thanks to fiber-optic cables and more efficient network devices, better data transfer and storage methods are required to exploit the speed of the networks. Second, applications in particle physics, climate modeling and weapon systems simulation generate petabytes of data from a single experiment. The challenge is to index and store the data as soon as it is produced and preprocessed by several instruments.
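The abstract does not give implementation details, so the following is only a minimal sketch of how asynchronous UDP reception can be decoupled from indexing and storage on a single node. Everything named here is an assumption made for illustration: the record layout (an 8-byte key followed by the payload), the classes `IndexedStore` and `IngestProtocol`, the `writer` and `serve` coroutines, and the default port. Blocking file writes are handed to a worker thread as a stand-in for true asynchronous file I/O; direct memory access and the multi-node parallelism of the proposed system are not modeled.

```python
import asyncio
import os
import struct

# Hypothetical record layout (an assumption): 8-byte big-endian key, then payload.
HEADER = struct.Struct("!Q")


class IndexedStore:
    """Append-only data file plus an in-memory index: key -> (offset, length)."""

    def __init__(self, path):
        self._file = open(path, "w+b")
        self.index = {}

    def append(self, key, payload):
        # Blocking write; run from a worker thread so the event loop stays free.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(payload)
        self._file.flush()
        self.index[key] = (offset, len(payload))

    def get(self, key):
        offset, length = self.index[key]
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()


class IngestProtocol(asyncio.DatagramProtocol):
    """Parses each datagram and queues it; never blocks on disk."""

    def __init__(self, queue):
        self.queue = queue

    def datagram_received(self, data, addr):
        if len(data) < HEADER.size:
            return  # drop datagrams too short to carry a key
        (key,) = HEADER.unpack_from(data)
        self.queue.put_nowait((key, data[HEADER.size:]))


async def writer(queue, store):
    """Single consumer: drains the queue and stores records in arrival order."""
    loop = asyncio.get_running_loop()
    while True:
        item = await queue.get()
        if item is None:  # sentinel: stop writing
            break
        key, payload = item
        await loop.run_in_executor(None, store.append, key, payload)


async def serve(path, host="127.0.0.1", port=9999):
    """Bind a UDP socket and store incoming records until a sentinel is queued."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    store = IndexedStore(path)
    transport, _ = await loop.create_datagram_endpoint(
        lambda: IngestProtocol(queue), local_addr=(host, port)
    )
    try:
        await writer(queue, store)
    finally:
        transport.close()
        store.close()
```

The queue between the socket callback and the single writer task is a design choice of this sketch: the receive path only parses and enqueues, so slow disk writes do not stall reception, and a single consumer keeps file offsets consistent without locks. A later record with the same key simply replaces the earlier index entry.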