In this thesis, we present a CUDA implementation of two sub-steps of the Parallel Multilevel Partition of Unity Method (PMPUM). The PMPUM is a method for approximating the solutions of partial differential equations (PDEs); its main computational effort is caused by the integration of the weak formulation. An efficient CUDA implementation of the required steps could therefore speed up a given PMPUM implementation. The core of this thesis is an analysis of the applicability of CUDA to the PMPUM. To this end, the two required steps, the decomposition of the domain and the integration, were implemented using CUDA. The analysis shows that using CUDA can speed up the implementation, and it identifies the limitations of our implementation. We give recommendations for overcoming these limitations and expect the performance to increase further once they are applied.