Posts

Apr, 17

Feasibility Analysis of Bilateral Filtering by General Purpose Graphical Processing Unit Computing

Digital Image Processing is an evergreen area of research in the signal processing domain. Denoising of digital images is one of the most fundamental operations performed in the pre-processing stage of almost all image processing operations. This makes denoising one of the lucrative research areas within the broad area of […]
Apr, 17

Bayesian Neural Networks for Genetic Association Studies of Complex Disease

Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association […]
Apr, 16

CISE 2014 – Asian Conference on Computer and Information Science and Engineering, CISE 2014

The Asian Conference on Computer and Information Science and Engineering will incorporate all topics within the field of computer and information science. This inaugural event promises to attract experts within the field of computer and information science and engineering, and allow for professors, researchers and university students to collaborate on this ever-growing field.
Apr, 16

Performance-aware component composition for GPU-based systems

This thesis addresses issues associated with efficiently programming modern heterogeneous GPU-based systems, containing multicore CPUs and one or more programmable Graphics Processing Units (GPUs). We use ideas from component-based programming to address programming, performance and portability issues of these heterogeneous systems. Specifically, we present three approaches that all use the idea of having multiple implementations […]
Apr, 16

On optimization techniques for the matrix multiplication on hybrid CPU+GPU platforms

The use of auto-tuning techniques in a matrix multiplication routine for hybrid CPU+GPU platforms is analyzed. Basic models of the execution time of the hybrid routine and information obtained during its installation are used to optimize the execution time with a balanced assignment of the computation to the computing components in the heterogeneous system. Satisfactory […]
Apr, 16

Dynamic Instrumentation and Optimization for GPU Applications

Parallel architectures like GPUs are a tantalizing compute fabric for performance-hungry developers. While GPUs enable order-of-magnitude performance increases in many data-parallel application domains, writing efficient codes that can actually manifest those increases is a non-trivial endeavor, typically requiring developers to exercise specialized architectural features exposed directly in the programming model. Achieving good performance on GPUs […]
Apr, 16

New Efficient Method To Solve Longest Overlap Region Problem For Noncoding DNA Sequence

Early hardware limitations of the GPU (lack of synchronization primitives and limited memory caching mechanisms) can make GPU-based computation inefficient, while emerging DNA sequencing technologies open up more opportunities for molecular biology. This paper presents the issues of parallel implementation of the longest overlap region problem on a multiprocessor GPU using the Compute Unified Device Architecture […]
Apr, 16

A Way For Accelerating The DNA Sequence Reconstruction Problem By CUDA

Traditionally, the shotgun method is used to cut a DNA sequence into pieces, and the original DNA sequence must then be reconstructed from those pieces; this is a widely used method for DNA assembly. Emerging DNA sequencing technologies open up more opportunities for molecular biology. This paper introduces a new method to improve the […]
Apr, 14

Fast Burrows Wheeler Compression Using CPU and GPU

In this paper, we present an all-core implementation of the Burrows Wheeler Compression algorithm that exploits all computing resources on a system. Our focus is to provide significant benefit to everyday users on common end-to-end applications by exploiting the parallelism of multiple CPU cores and many-core GPUs on their machines. The all-core framework is suitable for […]
Apr, 14

Scheduling Dataflow Execution Across Multiple Accelerators

Dataflow execution engines such as MapReduce, DryadLINQ and PTask have enjoyed success because they simplify development for a class of important parallel applications. Expressing the computation as a dataflow graph allows the runtime, and not the programmer, to own problems such as synchronization, data movement and scheduling – leveraging dynamic information to inform strategy and […]
Apr, 14

A First Order Primal-Dual Algorithm for Nonconvex TV^q Regularization

We propose an efficient first order primal-dual method for solving variational problems with nonconvex regularization such as TV^q. It is based on the recent idea in [1] to reformulate an existing primal-dual algorithm for convex optimization using Moreau’s identity. A systematic comparison to recent state of the art algorithms for nonconvex optimization (iteratively reweighted l1 […]
Apr, 14

An Approach to Efficient FEM Simulations on Graphics Processing Units Using CUDA

The paper presents a highly efficient way of simulating the dynamic behavior of deformable objects by means of the finite element method (FEM) with computations performed on Graphics Processing Units (GPU). The presented implementation reduces bottlenecks related to memory accesses by grouping the necessary data per node pairs, in contrast to the classical way done […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors