{"title":"The Role of NonSQL Databases in Big Data","authors":"Antonio Sarasa Cabezuelo","doi":"10.1201/9780429507670-5","DOIUrl":"https://doi.org/10.1201/9780429507670-5","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"34 1","pages":"93-112"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81707665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-01-01 DOI: 10.1201/9780429507670-15
Christophe Thovex
{"title":"When Big Data and Data Science Prefigured Ambient Intelligence","authors":"Christophe Thovex","doi":"10.1201/9780429507670-15","DOIUrl":"https://doi.org/10.1201/9780429507670-15","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"222 1","pages":"319-342"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75166339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Barranco-Chamorro, S. Muñoz-Armayones, A. Romero-Losada, F. Romero-Campero
{"title":"Multivariate Projection Techniques to Reduce Dimensionality in Large Datasets","authors":"I. Barranco-Chamorro, S. Muñoz-Armayones, A. Romero-Losada, F. Romero-Campero","doi":"10.1201/9780429507670-7","DOIUrl":"https://doi.org/10.1201/9780429507670-7","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"12 1","pages":"133-160"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81739128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-01-01 DOI: 10.1201/9780429507670-11
Theodora Chaspari, Adela C. Timmons, G. Margolin
{"title":"Population-Specific and Personalized (PSP) Models of Human Behavior for Leveraging Smart and Connected Data","authors":"Theodora Chaspari, Adela C. Timmons, G. Margolin","doi":"10.1201/9780429507670-11","DOIUrl":"https://doi.org/10.1201/9780429507670-11","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"1 1","pages":"243-258"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80264648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-01-01 DOI: 10.1201/9780429507670-16
Edward T. Chen
{"title":"Ethical Issues and Considerations of Big Data","authors":"Edward T. Chen","doi":"10.1201/9780429507670-16","DOIUrl":"https://doi.org/10.1201/9780429507670-16","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"342 1","pages":"343-358"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78053473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zheng Yan, Xueqin Liang, Wenxiu Ding, Xixun Yu, Mingjun Wang, R. Deng
{"title":"Encrypted Big Data Deduplication in Cloud Storage","authors":"Zheng Yan, Xueqin Liang, Wenxiu Ding, Xixun Yu, Mingjun Wang, R. Deng","doi":"10.1201/9780429507670-4","DOIUrl":"https://doi.org/10.1201/9780429507670-4","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"7 1","pages":"63-92"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81436262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shashikant Ilager, R. Wankar, Raghavendra Kune, R. Buyya
Due to the surge in the volume of data generated and the rapid advancement of Artificial Intelligence (AI) techniques such as machine learning and deep learning, traditional computing models have become inadequate for processing enormous volumes of data and the complex application logic needed to extract intrinsic information. Computing accelerators such as graphics processing units (GPUs) have become the de facto SIMD computing systems for many big data and machine learning applications. At the same time, computing has gradually shifted from the conventional ownership-based model to the subscription-based cloud computing model. However, the lack of programming models and frameworks for developing cloud-native applications that seamlessly use both CPU and GPU resources in the cloud has become a bottleneck for rapid application development. To support this demand for simultaneous use of heterogeneous resources, new programming models and frameworks are needed to manage the underlying resources effectively. Aneka has emerged as a popular PaaS computing model for developing cloud applications on the .NET platform, supporting multiple programming models such as Thread, Task, and MapReduce in a single container. Because Aneka addresses MIMD application development on CPU-based resources, whereas GPU programming models such as CUDA are designed for SIMD application development, this chapter discusses a GPU PaaS computing model for Aneka Clouds that enables rapid cloud application development on .NET platforms. Popular open-source GPU libraries are integrated into the existing Aneka Task programming model, and the scheduling policies are extended to identify GPU machines automatically and schedule the respective tasks accordingly. A case study on image processing, built using the Aneka PaaS SDKs and the CUDA library, demonstrates the system.
{"title":"GPU PaaS Computation Model in Aneka Cloud Computing Environment","authors":"Shashikant Ilager, R. Wankar, Raghavendra Kune, R. Buyya","doi":"10.1201/9780429507670-2","DOIUrl":"https://doi.org/10.1201/9780429507670-2","url":null,"abstract":"Due to the surge in the volume of data generated and rapid advancement in Artificial Intelligence (AI) techniques like machine learning and deep learning, the existing traditional computing models have become inadequate to process an enormous volume of data and the complex application logic for extracting intrinsic information. Computing accelerators such as Graphics processing units (GPUs) have become de facto SIMD computing system for many big data and machine learning applications. On the other hand, the traditional computing model has gradually switched from conventional ownership-based computing to subscription-based cloud computing model. However, the lack of programming models and frameworks to develop cloud-native applications in a seamless manner to utilize both CPU and GPU resources in the cloud has become a bottleneck for rapid application development. To support this application demand for simultaneous heterogeneous resource usage, programming models and new frameworks are needed to manage the underlying resources effectively. Aneka is emerged as a popular PaaS computing model for the development of Cloud applications using multiple programming models like Thread, Task, and MapReduce in a single container .NET platform. Since, Aneka addresses MIMD application development that uses CPU based resources and GPU programming like CUDA is designed for SIMD application development, here, the chapter discusses GPU PaaS computing model for Aneka Clouds for rapid cloud application development for .NET platforms. The popular opensource GPU libraries are utilized and integrated it into the existing Aneka task programming model. The scheduling policies are extended that automatically identify GPU machines and schedule respective tasks accordingly. A case study on image processing is discussed to demonstrate the system, which has been built using PaaS Aneka SDKs and CUDA library.","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"13 1","pages":"19-40"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85612353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This work has been developed under the Fundacao para a Ciencia e Tecnologia PDLAB project UID/MULTI/04111/2016.
{"title":"The Role of Smart Data in Inference of Human Behavior and Interaction","authors":"Rute C. Sofia, Liliana Carvalho, F. M. Pereira","doi":"10.1201/9780429507670-9","DOIUrl":"https://doi.org/10.1201/9780429507670-9","url":null,"abstract":"This work has been developed under the Fundacao para a Ciencia e Tecnologia PDLAB project UID/MULTI/04111/2016.","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"37 1","pages":"191-214"},"PeriodicalIF":0.0,"publicationDate":"2018-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78056696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-07-07 DOI: 10.1109/BIGCOMP.2019.8679482
Zhuoyue Zhao, Eric Lo, Kenny Q. Zhu, Chris Liu
The Apache Spark stack has enabled fast large-scale data processing. Despite its rich library of statistical models and inference algorithms, it does not give domain users the ability to develop their own models. The emergence of probabilistic programming languages has shown the promise of developing sophisticated probabilistic models in a succinct and programmatic way. These frameworks have the potential to automatically generate inference algorithms for user-defined models and to answer various statistical queries about the model. It is the perfect time to unite these two directions to produce a programmable big data analysis framework. We thus propose InferSpark, a probabilistic programming framework on top of Apache Spark. Efficient statistical inference can be easily implemented on this framework, and the inference process can leverage the distributed main-memory processing power of Spark. This framework makes statistical inference on big data possible and speeds up the penetration of probabilistic programming into the data engineering domain.
{"title":"InferSpark: Statistical Inference at Scale","authors":"Zhuoyue Zhao, Eric Lo, Kenny Q. Zhu, Chris Liu","doi":"10.1109/BIGCOMP.2019.8679482","DOIUrl":"https://doi.org/10.1109/BIGCOMP.2019.8679482","url":null,"abstract":"The Apache Spark stack has enabled fast large-scale data processing. Despite a rich library of statistical models and inference algorithms, it does not give domain users the ability to develop their own models. The emergence of probabilistic programming languages has showed the promise of developing sophisticated probabilistic models in a succinct and programmatic way. These frameworks have the potential of automatically generating inference algorithms for the user defined models and answering various statistical queries about the model. It is a perfect time to unite these two great directions to produce a programmable big data analysis framework. We thus propose, InferSpark, a probabilistic programming framework on top of Apache Spark. Efficient statistical inference can be easily implemented on this framework and inference process can leverage the distributed main memory processing power of Spark. This framework makes statistical inference on big data possible and speed up the penetration of probabilistic programming into the data engineering domain.","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"5 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2017-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81834788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-01-01 DOI: 10.1109/BIGCOMP.2014.6741439
Pengfei Jia, Chunkai Zhang, Zhenyu He
{"title":"A new sampling approach for classification of imbalanced data sets with high density","authors":"Pengfei Jia, Chunkai Zhang, Zhenyu He","doi":"10.1109/BIGCOMP.2014.6741439","DOIUrl":"https://doi.org/10.1109/BIGCOMP.2014.6741439","url":null,"abstract":"","PeriodicalId":93400,"journal":{"name":"... International Conference on Big Data and Smart Computing. International Conference on Big Data and Smart Computing","volume":"738 1","pages":"217-222"},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76474726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}