Model-based compute orchestration for resource-constrained repeating flows
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091089
Nazario Irizarry
Designing controllers to orchestrate repetitive compute flows, in both embedded and multi-node heterogeneous compute systems, is a tedious activity that grows harder as more constraints are placed on the compute elements and the system, and as internal connections become more complex. When there are multiple buses, networks, data buffers, and processor choices, it becomes difficult to manually analyze the timing characteristics and resource-utilization profiles of the most beneficial flow solutions. The controller design must consider the sequencing of operations, the movement of data, the utilization of limited resources, and the mechanics of controlling the system, all while satisfying system limitations. This paper presents a model for expressing resources, constraints, and flows, then automatically finds a flow solution and generates a controller. Automation frees the engineer to analyze timing profiles and to implement generic interfaces through which the generated controller can interact with and command the system.
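The paper's model is not reproduced here, but a minimal sketch of the kind of information such a model must capture — resources, constraints, and flow steps — might look as follows. All type and field names are hypothetical illustrations, not the paper's schema:

```cpp
#include <string>
#include <vector>

// Hypothetical model types: a sketch of what a flow model for controller
// generation needs to express, not the paper's actual API.
struct Resource {            // e.g. a bus, DMA channel, buffer, or processor
    std::string name;
    int capacity;            // maximum concurrent users of this resource
};

struct Step {                // one operation in a repeating flow
    std::string op;                  // what to run (e.g. "fft", "dma_copy")
    std::vector<int> deps;           // indices of steps that must finish first
    std::vector<std::string> uses;   // resources held while the step runs
    double duration;                 // estimated execution time
};

struct Flow {
    // A solver searches for a schedule that respects `deps` and never
    // oversubscribes any resource's capacity; the generated controller
    // then replays that schedule each iteration.
    std::vector<Step> steps;
};
```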
{"title":"Model-based compute orchestration for resource-constrained repeating flows","authors":"Nazario Irizarry","doi":"10.1109/HPEC.2017.8091089","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091089","url":null,"abstract":"Designing controllers to orchestrate repetitive compute flows in both embedded and multi-node heterogeneous compute systems can be a tedious activity that gets increasingly difficult as more constraints are placed on compute elements and the system and as internal connections get more complex. It becomes difficult to manually analyze the timing characteristics and resource utilization profiles for the most beneficial flow solutions when there are multiple busses, networks, data buffers, and processor choices. The controller design must consider the sequencing of the operations, the movement of data, the utilization of limited resources, and the mechanics of controlling the system while satisfying system limitations. This paper presents a model for expressing resources, constraints, and flows, then automatically finding a flow solution and generating a controller. Automation frees the engineer to analyze timing profiles and to implement generic interfaces that the generated controller can use to interact with and command the system automatically.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127758466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploiting half precision arithmetic in Nvidia GPUs
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091072
Nhut-Minh Ho, W. Wong
With the growing importance of deep learning and energy-saving approximate computing, half-precision floating-point arithmetic (FP16) is fast gaining popularity. Nvidia's recent Pascal architecture was the first GPU to offer FP16 support. However, when actual products shipped, programmers soon realized that naïvely replacing single-precision (FP32) code with half precision led to disappointing performance, even when they were willing to tolerate the increased error that reduced precision brings. In this paper, we develop an automated conversion framework that helps users migrate their CUDA code to better exploit Pascal's half-precision capability. Using our tools and techniques, we successfully convert many benchmarks from single-precision arithmetic to half-precision equivalents and achieve significant speedups in many cases; in the best case, a 3× speedup over the FP32 version. We also discuss some new issues and opportunities that the Pascal GPUs bring.
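To illustrate the kind of code such a conversion targets, here is a minimal CUDA sketch of an AXPY in FP32 and in packed FP16 using the standard cuda_fp16.h intrinsics (__half2, __hfma2), which execute two half-precision FMAs per instruction on hardware with native FP16 arithmetic (compute capability 5.3+, e.g. Pascal P100). This is a generic example, not output of the paper's framework:

```cuda
#include <cuda_fp16.h>

// FP32 baseline: y = a*x + y, one FMA per thread per element.
__global__ void axpy_fp32(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// FP16 version on __half2 pairs: each __hfma2 performs two half-precision
// fused multiply-adds, doubling arithmetic throughput where FP16 is native.
// n2 is the element count in __half2 units (half the FP32 element count).
__global__ void axpy_fp16(int n2, __half2 a, const __half2* x, __half2* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n2) y[i] = __hfma2(a, x[i], y[i]);
}
```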
{"title":"Exploiting half precision arithmetic in Nvidia GPUs","authors":"Nhut-Minh Ho, W. Wong","doi":"10.1109/HPEC.2017.8091072","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091072","url":null,"abstract":"With the growing importance of deep learning and energy-saving approximate computing, half precision floating point arithmetic (FP16) is fast gaining popularity. Nvidia's recent Pascal architecture was the first GPU that offered FP16 support. However, when actual products were shipped, programmers soon realized that a naïve replacement of single precision (FP32) code with half precision led to disappointing performance results, even if they are willing to tolerate the increase in error precision reduction brings. In this paper, we developed an automated conversion framework to help users migrate their CUDA code to better exploit Pascal's half precision capability. Using our tools and techniques, we successfully convert many benchmarks from single precision arithmetic to half precision equivalent, and achieved significant speedup improvement in many cases. In the best case, a 3× speedup over the FP32 version was achieved. We shall also discuss some new issues and opportunities that the Pascal GPUs brought.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125882934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TriX: Triangle counting at extreme scale
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091036
Yang Hu, P. Kumar, Guy Swope, Huimin Huang
Triangle counting is widely used in many applications, including spam detection, link recommendation, and social network analysis. The DARPA Graph Challenge seeks a scalable solution for triangle counting on big graphs. In this paper we present TriX, a scalable triangle-counting framework comprising a 2-D graph-partitioning strategy and a binary-search-based intersection algorithm designed for GPUs. The 2-D partition balances work across multiple GPUs, while binary-search-based intersection achieves fine-grained parallelism on each GPU via intra-warp scheduling and coalesced memory access. TriX scales to a large number of GPUs and counts the triangles of a billion-node graph (2 billion nodes, 64 billion edges) within 35 minutes, achieving over 16 million traversed edges per second (TEPS).
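A much-simplified single-GPU sketch of binary-search-based intersection is shown below; it omits TriX's 2-D partitioning and intra-warp load balancing. The CSR arrays and edge list are assumed inputs with sorted adjacency lists:

```cuda
// For each directed edge (u, v), look up every neighbor w of u in v's sorted
// adjacency list via binary search; each hit is one triangle wedge closed.
// Simplified sketch -- no 2-D partitioning or warp-level scheduling.
__global__ void count_triangles(const int* rowPtr, const int* colIdx,
                                const int* edgeU, const int* edgeV,
                                int numEdges, unsigned long long* count) {
    int e = blockIdx.x * blockDim.x + threadIdx.x;
    if (e >= numEdges) return;
    int u = edgeU[e], v = edgeV[e];
    unsigned long long local = 0;
    for (int i = rowPtr[u]; i < rowPtr[u + 1]; ++i) {
        int w = colIdx[i];
        int lo = rowPtr[v], hi = rowPtr[v + 1] - 1;   // search w in N(v)
        while (lo <= hi) {
            int mid = (lo + hi) >> 1;
            int c = colIdx[mid];
            if (c == w) { ++local; break; }
            else if (c < w) lo = mid + 1;
            else hi = mid - 1;
        }
    }
    if (local) atomicAdd(count, local);
}
```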
{"title":"TriX: Triangle counting at extreme scale","authors":"Yang Hu, P. Kumar, Guy Swope, Huimin Huang","doi":"10.1109/HPEC.2017.8091036","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091036","url":null,"abstract":"Triangle counting is widely used in many applications including spam detection, link recommendation, and social network analysis. The DARPA Graph Challenge seeks a scalable solution for triangle counting on big graphs. In this paper we present TriX, a scalable triangle counting framework, which is comprised of a 2-D graph partition strategy and a binary search based intersection algorithm designed for GPUs. The 2-D partition provides balanced work division among multiple GPUs. On the other hand, binary search based intersection achieves fine-grained parallelism on GPUs via intra-warp scheduling and coalesced memory access. TriX is able to scale to a large number of GPUs, and count triangles on billion-node graph (2 billion node, 64 billion edges) within 35 minutes, achieving over 16 million traverse edges per second (TEPS).","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130586647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A top-down scheme of descriptive time series data analysis for healthy life: Introducing a fuzzy amended interaction network
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091065
R. Rajaei, B. Shafai, A. Ramezani
Not only are networks ubiquitous in the real world, but networked dynamics also provide a more precise scheme for better understanding the surrounding phenomena and data. This network-centric approach can be applied to analyze time series data of any type, and the abundance of time series observations demands inference of causality in addition to accurate prediction. In this paper, a fuzzy amended interaction network underlying generalized Lotka-Volterra dynamics, referred to as FuzzIN, is introduced. FuzzIN offers a top-down method to predict and describe potential connectivity information embedded in time series. Using FuzzIN, this paper studies the effects of healthcare systems on population health across 21 OECD countries between 1999 and 2012 via OECD Health Data. FuzzIN performs well thanks to its ability to handle the nonlinearities, complex interconnectivities, and uncertainties in the observed data, and it outperforms comparable statistical methods. The inferred relationships and the healthcare systems' performance are discussed in terms of FuzzIN's parameters and rules. These estimates can be used to highlight health indicators and problems and to inform the development and implementation of effective, targeted public health policies and activities.
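For reference, generalized Lotka-Volterra dynamics take the standard form below, where x_i is the i-th state (here, a health indicator), r_i its intrinsic growth rate, and a_ij the interaction strength exerted by indicator j on indicator i; FuzzIN's fuzzy amendment of this system is not reproduced here.

```latex
\frac{dx_i}{dt} = x_i \Big( r_i + \sum_{j=1}^{N} a_{ij}\, x_j \Big), \qquad i = 1, \dots, N
```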
{"title":"A top-down scheme of descriptive time series data analysis for healthy life: Introducing a fuzzy amended interaction network","authors":"R. Rajaei, B. Shafai, A. Ramezani","doi":"10.1109/HPEC.2017.8091065","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091065","url":null,"abstract":"Not only networks are ubiquitous in real world, but also networked dynamics provide a more precise scheme required to better understanding of surrounding phenomena and data. This network-centric approach can be applied to analyze time series data of any type. An abundant prevalence of time series observations demand inference of causality in addition to accurate prediction. In this paper, a fuzzy improved interaction network underlying generalized Lotka-Volterra dynamics is introduced and referred to as FuzzIN. FuzzIN offers a top-down method to predict and describe potential connectivity information embedded in time series. Using FuzzIN, the current paper tries to study the effects of healthcare systems in population health across 21 OECD countries between 1999 and 2012 via OECD Health Data. It is shown that FuzzIN performs well due to its capability of handling nonlinearities, complex interconnectivities and uncertainties in the observed data and excels compared statistical methods. Hence, the relationships are inferred and healthcare systems' performance is discussed by FuzzIN parameters and rules. These estimates can be used to highlight health indicators and problems and to make awareness of development and implementation of effective, targeted public health policies and activities.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"595 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130866702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parallel triangle counting and k-truss identification using graph-centric methods
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091037
C. Voegele, Yi-Shan Lu, Sreepathi Pai, K. Pingali
We describe CPU and GPU implementations of parallel triangle counting and k-truss identification in the Galois and IrGL systems. Both systems are based on a graph-centric abstraction called the operator formulation of algorithms. Depending on the input graph, our implementations are two to three orders of magnitude faster than the reference implementations provided by the IEEE HPEC Static Graph Challenge.
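The operator formulation expresses an algorithm as an operator applied to active graph elements drawn from a worklist. A minimal sequential sketch of that pattern is shown below; the names are hypothetical illustrations, not the Galois API, and Galois adds parallel, conflict-safe scheduling on top of this abstraction:

```cpp
#include <deque>

// Sketch of the operator formulation: a worklist of active nodes and an
// operator applied to each; the operator may activate further nodes.
template <class Graph, class Operator>
void for_each_active(Graph& g, std::deque<int>& worklist, Operator op) {
    while (!worklist.empty()) {
        int node = worklist.front();
        worklist.pop_front();
        op(g, node, worklist);   // e.g. count triangles incident on `node`
    }
}
```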
{"title":"Parallel triangle counting and k-truss identification using graph-centric methods","authors":"C. Voegele, Yi-Shan Lu, Sreepathi Pai, K. Pingali","doi":"10.1109/HPEC.2017.8091037","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091037","url":null,"abstract":"We describe CPU and GPU implementations of parallel triangle-counting and k-truss identification in the Galois and IrGL systems. Both systems are based on a graph-centric abstraction called the operator formulation of algorithms. Depending on the input graph, our implementations are two to three orders of magnitude faster than the reference implementations provided by the IEEE HPEC static graph challenge.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131911007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous, independent management of dynamic graphs on GPUs
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091058
Martin Winter, Rhaleb Zayer, M. Steinberger
In this paper, we present a new dynamic graph data structure built to deliver high update rates while keeping a low memory footprint, using autonomous memory management directly on the GPU. Transferring memory management to the GPU enables efficient updates to the graph structure and fast initialization, as no additional memory-allocation calls or reallocation procedures are necessary; they are handled directly on the device. Compared to previous work, this optimized approach achieves significantly lower initialization times (up to 300× faster) and much higher update rates for large changes to the graph structure, with equal rates for small changes. The framework provides different update implementations tailored to specific graph properties, enabling over 100 million updates per second while keeping tens of millions of vertices and hundreds of millions of edges in memory without transferring data back and forth between device and host.
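One common way to realize device-side memory management is to carve allocations out of a large pre-allocated pool with an atomic bump pointer, so the host never issues malloc/realloc calls while the graph grows. The sketch below shows only that idea and is not the paper's framework, which additionally handles reuse and per-vertex edge blocks:

```cuda
// Simplified device-side "allocator": edge storage is grabbed from a
// pre-allocated pool by atomically advancing an offset. Real frameworks add
// free lists and block reuse; this only shows why no host-side cudaMalloc
// is needed during updates.
__device__ unsigned long long d_poolOffset = 0;

__device__ int* alloc_edge_slots(int* d_pool, int nSlots) {
    unsigned long long off =
        atomicAdd(&d_poolOffset, (unsigned long long)nSlots);
    return d_pool + off;   // caller must ensure the pool is large enough
}
```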
{"title":"Autonomous, independent management of dynamic graphs on GPUs","authors":"Martin Winter, Rhaleb Zayer, M. Steinberger","doi":"10.1109/HPEC.2017.8091058","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091058","url":null,"abstract":"In this paper, we present a new, dynamic graph data structure, built to deliver high update rates while keeping a low memory footprint using autonomous memory management directly on the GPU. By transferring the memory management to the GPU, efficient updating of the graph structure and fast initialization times are enabled as no additional memory allocation calls or reallocation procedures are necessary since they are handled directly on the device. In comparison to previous work, this optimized approach allows for significantly lower initialization times (up to 300× faster) and much higher update rates for significant changes to the graph structure and equal rates for small changes. The framework provides different update implementations tailored specifically to different graph properties, enabling over 100 million of updates per second and keeping tens of millions of vertices and hundreds of millions of edges in memory without transferring data back and forth between device and host.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129545146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Triangle counting via vectorized set intersection
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091053
Shahir Mowlaei
In this paper we propose a vectorized sorted-set intersection approach for counting the exact number of triangles of a graph on CPU cores. The computation is factored into reordering and counting kernels, where the reordering kernel builds upon the Reverse Cuthill-McKee heuristic.
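The scalar core of sorted-set intersection is the classic two-pointer merge sketched below; the paper's counting kernel vectorizes this pattern with SIMD compares, and its reordering kernel (Reverse Cuthill-McKee) makes adjacency lists land close together in memory first. Only the scalar form is shown:

```cpp
#include <cstddef>

// Two-pointer intersection of two sorted adjacency lists; returns the number
// of common neighbors. A vectorized version tests several candidates per
// step with SIMD compares but follows the same merge logic.
static size_t intersect_count(const int* a, size_t na,
                              const int* b, size_t nb) {
    size_t i = 0, j = 0, count = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j])      ++i;
        else if (a[i] > b[j]) ++j;
        else { ++count; ++i; ++j; }
    }
    return count;
}
```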
{"title":"Triangle counting via vectorized set intersection","authors":"Shahir Mowlaei","doi":"10.1109/HPEC.2017.8091053","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091053","url":null,"abstract":"In this paper we propose a vectorized sorted set intersection approach for the task of counting the exact number of triangles of a graph on CPU cores. The computation is factorized into reordering and counting kernels where the reordering kernel builds upon the Reverse Cuthill-McKee heuristic.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131338419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal data layout for block-level random accesses to scratchpad
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091088
Shreyas G. Singapura, R. Kannan, V. Prasanna
3D memory is an increasingly popular technology for overcoming the performance gap between memory and processors. It has led to new architectures with scratchpad memory, which offer high bandwidth and user-controlled access. Ideally, this scratchpad memory would deliver peak bandwidth for any random block access. However, 3D memories guarantee high bandwidth only for certain "ideal" access patterns, and actual bandwidth is significantly lower for other patterns. In this paper, we address the challenge of achieving high bandwidth for random block accesses to 3D memory. We present an optimal data layout that achieves maximum bandwidth for each vault irrespective of which block in the vault is accessed. The layout, expressed as a mapping function determined by the architecture parameters, exploits inter-layer pipelining to map the elements of each block among the layers of a vault in a specific pattern. By doing so, it can absorb the latency of accesses to banks in the same layer and, more importantly, hide the latency of accesses to different rows in the same bank, irrespective of the block being accessed. We compare the performance of our proposed data layout with an existing layout using PARSEC 2.0 benchmarks. Our experimental results demonstrate up to 56% improvement in access time over the existing data layout across various workloads.
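The heart of such a layout is a mapping function from an element's index within a block to a (layer, bank, row) location that round-robins consecutive elements across layers, so a block access pipelines across layers instead of queuing on one bank. The sketch below is a hypothetical illustration of that idea; parameter names and the exact mapping are not the paper's notation:

```cpp
struct Location { int layer, bank, row; };

// Hypothetical mapping of element k of a block to a location inside one
// vault: consecutive elements go to successive layers (inter-layer
// pipelining), then spread across banks within a layer.
Location map_element(int k, int numLayers, int banksPerLayer, int rowSize) {
    Location loc;
    loc.layer = k % numLayers;              // round-robin across layers
    int perLayerIdx = k / numLayers;        // index within the chosen layer
    loc.bank = perLayerIdx % banksPerLayer; // then round-robin across banks
    loc.row  = perLayerIdx / banksPerLayer / rowSize;
    return loc;
}
```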
{"title":"Optimal data layout for block-level random accesses to scratchpad","authors":"Shreyas G. Singapura, R. Kannan, V. Prasanna","doi":"10.1109/HPEC.2017.8091088","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091088","url":null,"abstract":"3D memory is becoming an increasingly popular technology to overcome the performance gap between memory and processors. It has led to the development of new architectures with scratchpad memory, which offer high bandwidth and user-controlled access features. The ideal performance of this scratchpad memory is peak bandwidth for any random block access. However, 3D memories come with their constraints on the \"ideal\" access patterns for which high bandwidth is guaranteed and the actual bandwidth is significantly lower for other access patterns. In this paper, we address the challenge of achieving high bandwidth for random block accesses to 3D memory. We present optimal data layout which achieves maximum bandwidth for each vault irrespective of the block accessed in a vault. Our data layout expressed as a mapping function determined by the architecture parameters exploits inter-layer pipelining to map the elements of each block among various layers of a vault in a specific pattern. By doing so, our data layout can absorb the latency of accesses to banks in the same layer and more importantly, hide the latency of accesses to different rows in the same bank irrespective of the block being accessed. We compare the performance of our proposed data layout with existing data layout using PARSEC 2.0 benchmarks. Our experimental results demonstrate as high as 56% improvement in access time in comparison with the existing data layout across various workloads.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122233811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mixed data layout kernels for vectorized complex arithmetic
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091024
Doru-Thom Popovici, F. Franchetti, Tze Meng Low
Implementing complex arithmetic routines with Single Instruction Multiple Data (SIMD) instructions requires instructions that are usually not found in their real-arithmetic counterparts. These instructions, such as shuffles and addsub, are often bottlenecks for complex arithmetic kernels, as modern architectures can usually perform more real arithmetic operations than they can execute instructions for complex arithmetic. In this work, we focus on using a variety of data layouts (mixed format) for storing complex numbers at different stages of the computation so as to limit the use of these instructions. Using complex matrix multiplication and Fast Fourier Transforms (FFTs) as our examples, we demonstrate that performance improvements of up to 2× can be attained with the mixed format within the computational routines. We also describe how existing algorithms can be easily modified to implement the mixed-format complex layout.
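The underlying layout trade-off is between interleaved storage (re, im, re, im, ...), whose vectorized complex multiply needs shuffle and addsub instructions, and split storage (all reals, then all imaginaries), where the same multiply becomes plain multiplies, adds, and subtracts that vectorize directly. A scalar sketch of the split-format multiply illustrates the property the mixed format exploits inside compute kernels; this is a generic example, not the paper's kernels:

```cpp
// Split (planar) complex layout: real and imaginary parts in separate
// arrays. Every iteration is pure mul/add/sub with no lane shuffles, so a
// compiler or hand-written SIMD code can vectorize it directly.
void cmul_split(const float* ar, const float* ai,
                const float* br, const float* bi,
                float* cr, float* ci, int n) {
    for (int i = 0; i < n; ++i) {
        cr[i] = ar[i] * br[i] - ai[i] * bi[i];   // real part of a*b
        ci[i] = ar[i] * bi[i] + ai[i] * br[i];   // imaginary part of a*b
    }
}
```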
{"title":"Mixed data layout kernels for vectorized complex arithmetic","authors":"Doru-Thom Popovici, F. Franchetti, Tze Meng Low","doi":"10.1109/HPEC.2017.8091024","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091024","url":null,"abstract":"Implementing complex arithmetic routines with Single Instruction Multiple Data (SIMD) instructions requires the use of instructions that are usually not found in their real arithmetic counter-parts. These instructions, such as shuffles and addsub, are often bottlenecks for many complex arithmetic kernels as modern architectures usually can perform more real arithmetic operations than execute instructions for complex arithmetic. In this work, we focus on using a variety of data layouts (mixed format) for storing complex numbers at different stages of the computation so as to limit the use of these instructions. Using complex matrix multiplication and Fast Fourier Transforms (FFTs) as our examples, we demonstrate that performance improvements of up to 2× can be attained with mixed format within the computational routines. We also described how existing algorithms can be easily modified to implement the mixed format complex layout.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123564841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
First look: Linear algebra-based triangle counting without matrix multiplication
Pub Date: 2017-09-01 | DOI: 10.1109/HPEC.2017.8091046
Tze Meng Low, Varun Nagaraj Rao, Matthew Kay Fei Lee, Doru-Thom Popovici, F. Franchetti, Scott McMillan
Linear algebra-based approaches to exact triangle counting often require sparse matrix multiplication as a primitive operation. Non-linear-algebra approaches to the same problem often assume that the adjacency matrix of the graph is not available. In this paper, we show that both approaches can be unified into a single approach that separates the data format from the algorithm design. By not casting triangle counting as matrix multiplication, a different algorithm that counts each triangle exactly once can be identified. In addition, by choosing the appropriate sparse matrix format, we show that the same algorithm is equivalent to the compact-forward algorithm, which was derived under the assumption that the adjacency matrix is not available. Our approach yields an initial implementation that is between 69 and more than 2000 times faster than the reference implementation, and the initial implementation is easily parallelized on shared-memory systems.
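A sequential sketch of the count-each-triangle-once idea: if each vertex stores only its sorted lower-numbered neighbors (the lower-triangular part of the adjacency matrix in CSR form), then intersecting the restricted lists of u and v for every stored edge (u, v) finds each triangle at exactly one of its three edges, with no matrix multiply. This is an illustration of the principle under those assumptions, not the paper's implementation:

```cpp
// rowPtr/colIdx: CSR of the lower-triangular adjacency, lists sorted.
// For triangle a < b < c, only the edge (c, b) closes it: N<(c) ∩ N<(b)
// contains a, so the triangle is counted exactly once.
unsigned long long triangle_count(const int* rowPtr, const int* colIdx,
                                  int numVertices) {
    unsigned long long total = 0;
    for (int u = 0; u < numVertices; ++u) {
        for (int e = rowPtr[u]; e < rowPtr[u + 1]; ++e) {
            int v = colIdx[e];                 // v < u by construction
            int i = rowPtr[u], j = rowPtr[v];  // merge-intersect the lists
            while (i < rowPtr[u + 1] && j < rowPtr[v + 1]) {
                if (colIdx[i] < colIdx[j])      ++i;
                else if (colIdx[i] > colIdx[j]) ++j;
                else { ++total; ++i; ++j; }
            }
        }
    }
    return total;
}
```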
{"title":"First look: Linear algebra-based triangle counting without matrix multiplication","authors":"Tze Meng Low, Varun Nagaraj Rao, Matthew Kay Fei Lee, Doru-Thom Popovici, F. Franchetti, Scott McMillan","doi":"10.1109/HPEC.2017.8091046","DOIUrl":"https://doi.org/10.1109/HPEC.2017.8091046","url":null,"abstract":"Linear algebra-based approaches to exact triangle counting often require sparse matrix multiplication as a primitive operation. Non-linear algebra approaches to the same problem often assume that the adjacency matrix of the graph is not available. In this paper, we show that both approaches can be unified into a single approach that separates the data format from the algorithm design. By not casting the triangle counting algorithm into matrix multiplication, a different algorithm that counts each triangle exactly once can be identified. In addition, by choosing the appropriate sparse matrix format, we show that the same algorithm is equivalent to the compact-forward algorithm attained assuming that the adjacency matrix of the graph is not available. We show that our approach yields an initial implementation that is between 69 and more than 2000 times faster than the reference implementation. We also show that the initial implementation can be easily parallelized on shared memory systems.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123710798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}