https://hgpu.org/?p=24543
Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures