Posts
Jun, 2
GPU-accelerated generation of correctly-rounded elementary functions
The IEEE 754-2008 standard recommends the correct rounding of elementary functions. This requires solving the Table Maker’s Dilemma, which takes a huge amount of CPU computation time. In this paper we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with partial SIMD execution (Single […]
Jun, 2
GPUburn: A System to Test and Mitigate GPU Hardware Failures
Due to many factors, such as high transistor density, high frequency, and low voltage, today’s processors are more than ever subject to hardware failures. These errors have various impacts depending on the location of the error and the type of processor. Because of the hierarchical structure of the compute units and work scheduling, the hardware […]
Jun, 2
Novel Multi-Layer Network Decomposition Boosting Acceleration of Multi-core Algorithms
Complex networks are a technique for the modeling and analysis of large data sets in many scientific and engineering disciplines. Due to their excessive size, conventional algorithms and single-core processors struggle with the efficient processing of such networks. Employing multi-core graphics processing units (GPUs) could provide sufficient processing power for the analysis of such […]
Jun, 2
On GPU Fourier Transformations
The Fourier Transform is one of the most influential mathematical equations of our time. The Discrete Fourier Transform (DFT) (which is equal to the Fourier Transform for signals with equally spaced samples) has been improved by a more efficient algorithm called the Fast Fourier Transform contributed by Cooley-Tukey[8] and Gentleman-Sande[11]. Improvements since then have […]
Jun, 2
Task scheduling in hybrid CPU-GPU systems
The distribution of workload among available computational units is an essential problem for every parallel system. It has been addressed thoroughly from many perspectives, such as thread scheduling in operating systems, task scheduling in frameworks for parallel computations, or constrained scheduling in real-time systems. However, each system has unique properties and requirements, thus we cannot […]
May, 31
Composition and Reuse with Compiled Domain-Specific Languages
Programmers who need high performance currently rely on low-level, architecture-specific programming models (e.g. OpenMP for CMPs, CUDA for GPUs, MPI for clusters). Performance optimization with these frameworks usually requires expertise in the specific programming model and a deep understanding of the target architecture. Domain-specific languages (DSLs) are a promising alternative, allowing compilers to map problem-specific […]
May, 31
Geometric Optimisation using Karva for Graphical Processing Units
Population-based evolutionary algorithms continue to play an important role in artificially intelligent systems, but cannot always easily use parallel computation. We have combined a geometric (any-space) particle swarm optimisation algorithm with use of Ferreira’s Karva language of gene expression programming to produce a hybrid that can accelerate the genetic operators and which can rapidly […]
May, 31
GPU Multiple Sequence Alignment Fourier-Space Cross-Correlation Alignment
The aim of this project is to explore the possible application of Graphics Processors (GPUs) to accelerate sequence alignment by Fourier-space cross-correlation. Aligning signals using cross-correlations is a well-studied approach in the world of signal processing, but has found relatively little reception in the realm of computational genomics. As long as […]
May, 31
Improved Performance of CaFE and IRIS Model Fitting Using CUDA
Label-free optical biosensors are known to be accurate and reliable tools for measuring and monitoring certain biomolecular interactions. In recent years, new techniques and technologies have emerged that enable high-throughput biosensing at lower system size, cost, and complexity. In particular, the LED-based Interferometric Reflectance Imaging Sensor (IRIS) has been demonstrated as a viable alternative to […]
May, 31
Tiling optimizations for stencil computations
This thesis studies the techniques of tiling optimizations for stencil programs. Traditionally, research on tiling optimizations mainly focuses on tessellating tiling, atomic tiles and regular tile shapes. This thesis studies several novel tiling techniques which are out of the scope of traditional research. In order to represent a general tiling scheme uniformly, a unified tiling […]
May, 31
CUDA 5.5 Features and Release Candidate (RC) program (webinar)
The CUDA 5.5 RC is now available to CUDA Registered Developers on https://developer.nvidia.com/registered-developer-programs. Ujval Kapasi, NVIDIA CUDA Product Manager, will provide an overview of the new features of CUDA 5.5. CUDA Registered Developers already have access to the toolkit downloads – but we’ll provide brief instructions on how to download and install the binaries and […]
May, 30
Performance Tradeoff Spectrum of Integer and Floating Point Applications
Floating point precision and performance, and the ratio of floating point units to integer processing elements on a graphics processing unit accelerator, all continue to present complex tradeoffs for optimising core utilisation on modern devices. We investigate various hybrid CPU and GPU combinations using a range of different GPU models occupying different points in this […]