https://hgpu.org/?p=18237
clMF: A fine-grained and portable alternating least squares algorithm for parallel matrix factorization