
Journal of Information Science and Engineering, Vol. 32 No. 3, pp. 517-539 (May 2016)

An OpenMP Programming Toolkit for Hybrid CPU/GPU Clusters Based on Software Unified Memory*

Department of Electrical Engineering
National Kaohsiung University of Applied Sciences
Kaohsiung, 807 Taiwan
E-mail: {sunneo, jaredlin}

Recently, hybrid CPU/GPU clusters have drawn much attention from researchers in high-performance computing because of their high energy efficiency and adaptable resource exploitation. However, programming hybrid CPU/GPU clusters is very complex because it requires users to learn new programming interfaces such as CUDA and OpenCL and to combine them with MPI and OpenMP. To address this problem, this paper proposes a novel OpenMP toolkit called HyCOMP (Hybrid Cluster OpenMP) for hybrid CPU/GPU clusters. The toolkit is built on a novel page-based distributed shared memory (DSM) system called SUM (software unified memory), which aims to emulate a virtual shared memory space over distributed CPUs and GPUs. Compared to traditional page-based DSM systems, SUM effectively prevents the GPU performance degradation caused by the latency of handling the enormous number of page faults that arise from host-to-device memory copies. Moreover, HyCOMP automatically balances load across heterogeneous processors. Consequently, HyCOMP dramatically reduces the programming complexity of hybrid CPU/GPU clusters while maintaining the execution performance of user programs.
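
To make the programming model concrete, the following is a minimal sketch of the kind of source a user would write when the only interface is OpenMP, with no CUDA, OpenCL, or MPI calls. It is not taken from the paper: the function names `saxpy` and `dot` are hypothetical, only standard OpenMP directives are used, and whether HyCOMP accepts exactly these directives and clauses is an assumption.

```c
#include <stddef.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i] for i in [0, n).
 * The arrays are ordinary shared arrays; no explicit host-to-device
 * copies or message passing appear in the user code. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Standard OpenMP work-sharing: loop iterations are divided among
     * the available workers. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical example reduction: returns the sum of x[i] * y[i]. */
double dot(size_t n, const float *x, const float *y)
{
    double sum = 0.0;

    /* Each worker accumulates a private partial sum; the partial sums
     * are combined when the loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; ++i) {
        sum += (double)x[i] * (double)y[i];
    }
    return sum;
}
```

With a conventional compiler this builds as a shared-memory program (for example with an OpenMP flag such as `-fopenmp` in GCC), and it still compiles and runs serially if the pragmas are ignored. The abstract's claim is that the same style of source can target a hybrid CPU/GPU cluster, with data placement and load balance handled by the toolkit rather than by the programmer.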

Keywords: hybrid CPU/GPU cluster, OpenMP, distributed shared memory, page-based, pre-fetching, load balance


Received January 29, 2015; revised May 21, 2015; accepted June 24, 2015.
Communicated by Cho-Li Wang.
+ Corresponding author.
* This work was supported by the National Science Council of Taiwan under research project NSC-99-2221-E-151-055-MY3.