Performance Comparison with OpenMP Parallelization for Multi-core Systems
IEEE 9th International Symposium on Parallel and Distributed Processing with Applications (ISPA), 2011
@inproceedings{yang2011performance,
title={Performance Comparison with OpenMP Parallelization for Multi-core Systems},
author={Yang, C.T. and Chang, T.C. and Wang, H.Y. and Chu, W.C.C. and Chang, C.H.},
booktitle={Parallel and Distributed Processing with Applications (ISPA), 2011 IEEE 9th International Symposium on},
pages={232--237},
year={2011},
organization={IEEE}
}
Multi-core processors are taking an ever larger share of the market, and programmers must cope with the disruption this shift brings. Semiconductor scaling limits, together with the associated power and thermal challenges, constrain performance growth for single-core microprocessors, which has led many microprocessor vendors to turn to multi-core chip organizations instead. Explicit parallelization of software, by the programmer or by the compiler, is therefore the key to enhancing performance on multi-core chips, which makes parallel processing both an opportunity and a challenge. In this paper, we ask whether there is an effective way to reduce the time spent rewriting programs, or to parallelize them automatically for multiprocessing, and thereby speed up execution. We discuss several tools that automatically generate OpenMP directives from serial C/C++ code, and compare their output with each other and with the original serial C/C++ code, on both a general-purpose computer and an embedded system. We also compare several tools specifically designed to extract as much data parallelism as possible from C and Fortran kernels and translate them into NVIDIA CUDA or OpenCL, in order to measure how much speedup they provide.
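To make the idea of generating OpenMP directives from serial code concrete, the following is a minimal sketch in C of a serial loop next to the same loop annotated with OpenMP pragmas. It is not taken from the paper, and it does not reproduce the output of any particular tool; the function names (`saxpy_serial`, `saxpy_openmp`, `sum_openmp`) are hypothetical and chosen only for illustration. It assumes the loop iterations are independent, which is the property such a tool would have to establish before inserting a directive.

```c
/* Hypothetical illustration: these kernels are assumptions made for this
 * sketch and are not code from the paper or from any specific tool. */

/* Serial version: y[i] = a * x[i] + y[i] for every element. */
void saxpy_serial(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Same loop with an OpenMP work-sharing directive. Each iteration touches
 * only its own element of y, so iterations can be split across threads.
 * Without OpenMP support enabled, the pragma is ignored and the loop
 * simply runs serially. */
void saxpy_openmp(int n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* A loop that accumulates into one variable needs a reduction clause:
 * each thread keeps a private partial sum, and the partial sums are
 * combined when the loop ends. */
long long sum_openmp(int n, const int *v)
{
    long long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag. Comparing the run time of `saxpy_serial` and `saxpy_openmp` on large arrays is the kind of measurement the abstract describes, although the actual speedup depends on the hardware and on the problem size.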
August 8, 2011 by hgpu