https://hgpu.org/?p=28343
Accelerating 128-bit Floating-Point Matrix Multiplication on FPGAs