Title: Maintaining real-time synchrony on SpiNNaker
Authors: Sergio Davies, Alexander D. Rast, F. Galluppi, S. Furber
DOI: 10.1145/2016604.2016622

As an asynchronous universal multiprocessor for real-time neural simulation, SpiNNaker presents timing concerns not present in synchronous systems. In this paper we present a series of tools that solve the problem of synchronising a multichip distributed simulation containing multiple independent time domains. These tools hint at an important neural modelling capability of the SpiNNaker system: the ability to decouple the system time from the model time, leading to an abstract-time neural modelling platform.
Title: Dynamic co-management of persistent RAM main memory and storage resources
Authors: J. Jung, Sangyeun Cho
DOI: 10.1145/2016604.2016620

This paper proposes Memorage, a novel system architecture that synergistically manages persistent RAM (PRAM) main memory and a PRAM storage device. Memorage leverages the existing OS virtual memory manager to globally manage PRAM resources and improve their utilization. Preliminary experimental and analytical evaluation suggests that Memorage can improve the performance of memory-intensive workloads by 4.6X on average, and by up to 9.4X, under the examined configuration. It also increases PRAM utilization and significantly extends the longevity of the PRAM main memory (by 8X).
Title: Bounding the effect of partition camping in GPU kernels
Authors: Ashwin M. Aji, Mayank Daga, Wu-chun Feng
DOI: 10.1145/2016604.2016637

Current GPU tools and performance models provide common architectural insights that guide programmers in writing optimal code. We challenge and complement these performance models and tools by modeling and analyzing a lesser-known but severe performance pitfall, called partition camping, in NVIDIA GPUs. Partition camping is caused by memory accesses that are skewed towards a subset of the available memory partitions, and it may degrade the performance of GPU kernels by up to seven-fold. No existing tool can detect the partition camping effect in GPU kernels.

Unlike traditional performance modeling approaches, we predict a performance range that bounds the partition camping effect in a GPU kernel. Predicting a performance range, rather than an exact performance figure, is more realistic given the large performance variations induced by partition camping. We build the prediction model by first characterizing the effects of partition camping with an in-house suite of micro-benchmarks, and then applying rigorous statistical regression to the micro-benchmark data to predict the performance bounds of real GPU kernels with and without the partition camping effect. We test the accuracy of the model by analyzing three real applications with known memory access patterns and partition camping effects; the geometric mean of the prediction errors is within 12% of the actual execution times.

We also develop and present an easy-to-use, spreadsheet-based tool called CampProf, a visual front-end to our performance range prediction model that can be used to gauge the degree of partition camping in GPU kernels. Lastly, we demonstrate how CampProf can be used to visually monitor performance improvements as the partition camping effect is removed from a kernel.
Title: MPOpt-Cell: a high-performance data-flow programming environment for the CELL BE processor
Authors: A. Franceschelli, P. Burgio, Giuseppe Tagliavini, A. Marongiu, M. Ruggiero, M. Lombardi, Alessio Bonfietti, M. Milano, L. Benini
DOI: 10.1145/2016604.2016618

We present MPOpt-Cell, an architecture-aware framework for high-productivity development and efficient execution of stream applications on the CELL BE processor. It enables developers to quickly build Synchronous Data Flow (SDF) applications using a simple and intuitive programming interface based on a set of compiler directives that capture the key abstractions of SDF. The compiler backend and system runtime efficiently manage hardware resources.
Title: On-the-fly detection of precise loop nests across procedures on a dynamic binary translation system
Authors: Yukinori Sato, Y. Inoguchi, Tadao Nakamura
DOI: 10.1145/2016604.2016634

Loop structures in programs have long been regarded as a primary source of parallelism in sequential code. In this paper, we present a new technique that dynamically detects precise loop structures, together with their inter-procedural nesting, on a dynamic binary translation system. Taking precompiled application binary code as input, our mechanism generates simple but precise markers when the binary image is loaded, and at runtime monitors loop structures and their inter-procedural nesting on the fly using a Loop-Call Context Graph. We implement our mechanism and evaluate it using the SPEC CPU benchmark suite. The results show that our mechanism successfully reveals precise loop structures with inter-procedural loop nesting, and that it reduces loop-analysis overhead compared with existing approaches. These results indicate that our mechanism can be applied to runtime optimization and parallelization, as well as to providing hints for performance tuning.
Title: CnC-Hadoop: a graphical coordination language for distributed multiscale parallelism
Authors: Riyaz Haque, David M. Peixotto, Vivek Sarkar
DOI: 10.1145/2016604.2016626

The information-technology platform is being radically transformed by the widespread adoption of the cloud computing model, supported by data centers containing large numbers of multicore servers. While cloud computing platforms can potentially enable a rich variety of distributed applications, the need to exploit multiscale parallelism at both the inter-node and intra-node level poses significant new challenges for software. Recent advances in the Google MapReduce and Hadoop frameworks have led to simplified programming models for a restricted class of distributed batch-processing applications. However, these frameworks do not support richer distributed application structures beyond map-reduce, and they do not offer any solution for exploiting shared-memory multicore parallelism at the intra-node level.
Title: Performance analysis and optimization of molecular dynamics simulation on Godson-T many-core processor
Authors: Liu Peng, A. Nakano, Guangming Tan, P. Vashishta, Dongrui Fan, Hao Zhang, R. Kalia, Fenglong Song
DOI: 10.1145/2016604.2016643

Molecular dynamics (MD) simulation has broad applications, but its irregular memory-access pattern makes performance optimization a challenge. This paper presents a joint application/architecture study to enhance the on-chip parallelism of MD on a Godson-T-like many-core architecture. First, a preprocessing step leveraging an adaptive divide-and-conquer framework is designed to exploit locality through the memory hierarchy with software-controlled memory. We then propose three incremental optimization strategies: (1) a novel data layout that reorganizes linked-list cell data structures to improve data locality; (2) an on-chip locality-aware parallel algorithm to enhance data reuse; and (3) a pipelining algorithm to hide the latency to shared memory. Experiments on a Godson-T simulator exhibit a strong-scaling parallel efficiency of 0.99 on 64 cores, a result confirmed by an FPGA emulator. Detailed analysis shows that the optimizations that use architectural features to maximize data locality and enhance data reuse benefit scalability most. Furthermore, a simple performance model suggests that the optimization scheme is likely to scale well toward exascale. Certain architectural features are found to be essential for these optimizations, which could guide future hardware development.
Title: AstroLIT: enabling simulation-based microarchitecture comparison between Intel® and Transmeta designs
Authors: Guilherme Ottoni, G. Chinya, Gerolf Hoflehner, Jamison D. Collins, Amit Kumar, E. Schuchman, D. Ditzel, Ronak Singhal, Hong Wang
DOI: 10.1145/2016604.2016629

While the out-of-order engine has been the mainstream microarchitecture-design paradigm for achieving high performance, Transmeta took a different approach using dynamic binary translation (BT). To enable a detailed comparison of these two radically different processor-design approaches, it is natural to leverage well-established simulation-based methodologies. However, BT-based processor designs pose new challenges to standard sampling-based simulation methodologies. This paper describes these challenges and introduces the AstroLIT methodology to address them.
Title: An MPSoC design approach for multiple use-cases of throughput constrainted applications
Authors: A. Shabbir, S. Stuijk, Akash Kumar, H. Corporaal, B. Mesman
DOI: 10.1145/2016604.2016628

Modern multimedia systems must support a variety of use-cases. Multi-processor Systems-on-Chip (MPSoCs) are used to realize these systems. A system designer has to dimension an MPSoC such that the performance constraints of the applications are satisfied in all use-cases. In this paper, we present an approach to designing MPSoCs that meet the throughput constraints of a set of applications while minimizing resource requirements.
Title: Multi- and many-core data mining with adaptive sparse grids
Authors: A. Heinecke, D. Pflüger
DOI: 10.1145/2016604.2016640

Gaining knowledge from vast datasets is a central challenge in today's data-driven applications. Sparse grids provide a numerical method for both classification and regression in data mining that scales only linearly in the number of data points and is thus well-suited for huge amounts of data. Due to the recursive nature of sparse grid algorithms, they pose a challenge for parallelization on modern hardware architectures such as accelerators. In this paper, we present parallelizations on several current task- and data-parallel platforms, covering multi-core CPUs with vector units, GPUs, and hybrid systems. Furthermore, we analyze the suitability of parallel programming languages for the implementation.

On the hardware side, we restrict ourselves to the x86 platform with SSE and AVX vector extensions and to NVIDIA's Fermi architecture for GPUs. We consider multi-core CPU and GPU architectures independently, as well as hybrid systems with up to 12 cores and 2 Fermi GPUs. With respect to parallel programming, we examine both the open standard OpenCL and Intel Array Building Blocks, a recently introduced high-level programming approach. As the baseline, we use the best results obtained with classically parallelized sparse grid algorithms and their OpenMP-parallelized intrinsics counterparts (SSE and AVX instructions), reporting both single- and double-precision measurements. The large datasets we use are a real-life dataset stemming from astrophysics and an artificial one that exhibits challenging properties. In all settings we achieve excellent results, obtaining speedups of more than 60 using single precision on a hybrid system.