Posts

Nov, 12

Sailfish: a flexible multi-GPU implementation of the lattice Boltzmann method

We present Sailfish, an open source fluid simulation package implementing the lattice Boltzmann method (LBM) on modern Graphics Processing Units (GPUs) using CUDA/OpenCL. We take a novel approach to GPU code implementation and use run-time code generation techniques and a high-level programming language (Python) to achieve state-of-the-art performance, while allowing easy […]
Nov, 12

High speed cipher cracking: the case of Keeloq on CUDA

Graphics Processing Units (GPUs) are increasingly popular in high-performance computing for their ability to provide computational power for massively parallel problems at a reduced cost. However, the programming model exposed by the GPGPU software development tools is often insufficient to achieve full performance, and a major rethinking of algorithmic choices is needed. […]
Nov, 12

A Hybrid GPU/CPU FFT Library for Large FFT Problems

Graphics Processing Units (GPUs) have proven to be a promising platform for accelerating large Fast Fourier Transform (FFT) computations. However, current GPU-based FFT implementations use only the GPU for computation and employ the CPU as a mere memory-transfer controller. The computational power of today’s high-performance CPUs is wasted. In this project, a hybrid optimization framework […]
Nov, 12

Performance Evaluation of R with Intel Xeon Phi Coprocessor

Over the years, R has been adopted as a major data analysis and mining tool in many domains. As Big Data overwhelms those fields, the computational needs and workload of existing R solutions increase significantly. With recent hardware and software developments, it is possible to enable massive parallelism with existing R solutions with little […]
Nov, 12

GPU-Based Sparse Voxel Octree Raytracing for Rendering of Procedurally Generated Terrain

Within the field of Computer Graphics, there have been two competing approaches to rendering, namely rasterisation and raytracing. Rasterisation has long been the dominant of the two methods for realtime rendering. With recent developments in graphics hardware, however, raytracing is starting to gain popularity once again. At […]
Nov, 11

Accelerating calculations of RNA secondary structure partition functions using GPUs

BACKGROUND: RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts […]
Nov, 11

Preliminary Experiments with XKaapi on Intel Xeon Phi Coprocessor

This paper presents preliminary performance comparisons of parallel applications developed natively for the Intel Xeon Phi accelerator using three different parallel programming environments and their associated runtime systems. We compare Intel OpenMP, Intel CilkPlus and XKaapi on the same benchmark suite and provide comparisons between an Intel Xeon Phi coprocessor and a Sandy […]
Nov, 11

Explorations of the Viability of ARM and Xeon Phi for Physics Processing

We report on our investigations into the viability of the ARM processor and the Intel Xeon Phi co-processor for scientific computing. We describe our experience porting software to these processors and running benchmarks using real physics applications to explore the potential of these processors for production physics processing.
Nov, 11

A High Performance Random Number Generator Using Heterogeneous Computing Platform

The power of high performance computing (HPC) depends heavily on the ability to efficiently exploit huge amounts of parallelism. Random or pseudo-random numbers are very important for the efficient implementation of stochastic algorithms. Multi-core CPUs and many-core Graphics Processing Units (GPUs) are accelerators well suited to producing large quantities of random numbers. Nevertheless, the GPU does […]
Nov, 11

First Steps Towards More Numerical Reproducibility

Questions about whether numerical simulations are reproducible have been raised in several sensitive applications. Numerical reproducibility failures mainly come from the finite precision of computer arithmetic. The results of floating-point computations depend on the arithmetic precision and on the order of arithmetic operations. Massively parallel HPC which merges, for instance, many-core CPU and GPU, […]
Nov, 10

Optimization of real-time ultrasound PCIe data streaming and OpenCL processing for SAFT imaging

Our goal is to develop a complete ultrasound platform based on real-time SAFT (Synthetic Aperture Focusing Technique) GPU processing. We are planning to integrate all the ultrasound modules and processing resources (GPU) in a single rack enclosure with the PCIe switch fabric backplane. The first developed module (RX64) provides acquisition and streaming of 64 ultrasound […]
Nov, 10

Toward Better Computation Models for Modern Machines

Modern computers are not random access machines (RAMs). They have a memory hierarchy, multiple cores, and virtual memory. We address the computational cost of address translation in virtual memory and the difficulties in designing parallel algorithms on modern many-core machines. The starting point for our work on virtual memory is the observation that […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
