Posts
Nov, 12
Efficiently Computing Tensor Eigenvalues on a GPU
The tensor eigenproblem has many important applications, generating both mathematical and application-specific interest in the properties of tensor eigenpairs and methods for computing them. A tensor is an m-way array, generalizing the concept of a matrix (a 2-way array). Kolda and Mayo have recently introduced a generalization of the matrix power method for computing real-valued […]
Nov, 12
rCUDA: Reducing the number of GPU-based accelerators in high performance clusters
The increasing computing requirements for GPUs (Graphics Processing Units) have favoured the design and marketing of commodity devices that nowadays can also be used to accelerate general purpose computing. Therefore, future high performance clusters intended for HPC (High Performance Computing) will likely include such devices. However, high-end GPU-based accelerators used in HPC feature a considerable […]
Nov, 12
A Run-Time Adaptive FPGA Architecture for Monte Carlo Simulations
Field Programmable Gate Arrays (FPGAs) are now considered to be one of the preferred computing platforms for high performance computing applications, such as Monte Carlo simulations, due to their large computational power and low power consumption. Unlike other state-of-the-art computing platforms, such as General Purpose Processors (GPPs) and General Purpose Graphics Processing Units (GPGPUs), FPGAs […]
Nov, 12
Evaluation of an accelerator architecture for Speckle Reducing Anisotropic Diffusion
Increasing chip power density has brought application specific accelerator architectures to the forefront as an energy and area efficient solution. While GPGPU systems take advantage of specialized hardware to perform computationally intensive tasks faster than chip multiprocessor (CMP) systems, accelerators are hardware units that are designed to execute a specific application efficiently. Real-time ultrasound imaging […]
Nov, 12
A multi-GPU acceleration for 3D imaging of the prostate
Transrectal Electric Impedance Tomography (TREIT) has been proposed jointly with ultrasound (US) imaging of the prostate to enhance the standard clinical imaging. Reconstructing TREIT images involves a solution of an inverse problem. The reconstruction is based on two steps: solving and updating an estimate of the dielectric property distribution through solution of an inverse problem. […]
Nov, 12
Sustainable GPU Computing at Scale
General purpose GPU (GPGPU) computing has produced the fastest running supercomputers in the world. For continued sustainable progress, GPU computing at scale also needs to address two open issues: a) how to increase applications' mean time between failures (MTBF) as we increase supercomputers' component counts, and b) how to minimize unnecessary energy consumption. Since energy consumption […]
Nov, 12
Exploiting Heterogeneity for Energy Efficiency in Chip Multiprocessors
Heterogeneous multicores are envisioned to be a promising design paradigm to combat today’s challenges of power, memory, and reliability walls that are impeding chip design using deep submicron technology. Future multicores are expected to integrate multiple different cores, including GPGPUs, custom accelerators and configurable cores. In this paper, we introduce an important dimension (technology) using which heterogeneity […]
Nov, 12
GSNP: A DNA Single-Nucleotide Polymorphism Detection System with GPU Acceleration
We have developed GSNP, a software package with GPU acceleration, for single-nucleotide polymorphism detection on DNA sequences generated from second-generation sequencing equipment. Compared with SOAPsnp, a popular, high-performance CPU-based SNP detection tool, GSNP has several distinguishing features: First, we design a sparse data representation format to reduce memory access as well as branch divergence. Second, […]
Nov, 12
Multithread Content Based File Chunking System in CPU-GPGPU Heterogeneous Architecture
The fast development of the Graphics Processing Unit (GPU) has led to the popularity of general-purpose usage of the GPU (GPGPU). Most modern computers now have a CPU-GPGPU heterogeneous architecture, with the CPU used as the host processor. In this work, we propose a multithreaded file chunking prototype system, which is able to exploit the hardware organization of the […]
Nov, 12
Creating HW/SW co-designed MPSoPC’s from high level programming models
FPGA densities have continued to follow Moore’s law and can now support a complete multiprocessor system on a programmable chip. The benefits of the FPGA include the ability to build a customized MPSoC system consisting of heterogeneous processing resources, interconnects and memory hierarchies that best match the requirements of each application. In this paper we outline […]
Nov, 12
A translator framework for Dynamic Programming problems
The advent of multicore systems, together with the potential acceleration offered by graphics processing units, has given us unprecedented low-cost computational capability. The new systems alleviate some well-known, important architectural problems at the expense of a considerably higher programmability wall. The heterogeneity, both at the architectural and programming level at the […]
Nov, 12
Compiling for a heterogeneous vector image processor
We present a new compilation strategy, implemented at a small cost, to optimize image applications developed on top of a high level image processing library for a heterogeneous processor with a vector image processing accelerator. The library provides the semantics of the image computations. The pipelined structure of the accelerator makes it possible to compute whole expressions […]