1829

Posts

Nov, 27

Predictive Runtime Code Scheduling for Heterogeneous Architectures

Heterogeneous architectures are currently widespread. With the advent of easy-to-program general purpose GPUs, virtually every recent desktop computer is a heterogeneous system. Combining the CPU and the GPU brings great amounts of processing power. However, such architectures are often used in a restricted way for domain-specific applications like scientific applications and games, and they tend […]
Nov, 27

Exploiting graphical processing units for data-parallel scientific applications

Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms – regular mesh field […]
Nov, 27

CUDA-MEME: Accelerating Motif Discovery in Biological Sequences Using CUDA-enabled Graphics Processing Units

Motif discovery in biological sequences is of prime importance and a major challenge in computational biology. Consequently, numerous motif discovery tools have been developed to date. However, the rapid growth of both genomic sequence and gene transcription data establishes the need for the development of scalable motif discovery tools. An approach to improve the runtime […]
Nov, 27

Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment

The Smith-Waterman (SW) algorithm is the most accurate sequence alignment approach used by computational biologists for DNA matching. However, its computational complexity makes SW impractical to use in a clinical environment compared to much faster but less accurate sequence alignment techniques such as BLAST. The high-performance computing community is examining alternative multicore architectures such as […]
Nov, 27

Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects

The emergence and continuing use of multi-core architectures and graphics processing units require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of the now-prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) and Matrix Algebra on GPU and Multicore Architectures (MAGMA) are two […]
Nov, 27

The multikernel: a new OS architecture for scalable multicore systems

Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for […]
Nov, 27

Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs

Compute Unified Device Architecture (CUDA) was used to design and implement molecular dynamics (MD) simulations on graphics processing units (GPUs). With an NVIDIA Tesla C870, a 20- to 60-fold speedup over one core of the Intel Xeon 5430 CPU was achieved, reaching up to 150 Gflops. MD simulation of cavity flow and particle-bubble interaction […]
Nov, 27

Solving Sparse Linear Systems on NVIDIA Tesla GPUs

Current many-core GPUs have enormous processing power, and unlocking this power for general-purpose computing is very attractive due to their low cost and efficient power utilization. However, the fine-grained parallelism and the stream-programming model supported by these GPUs require a paradigm shift, especially for algorithm designers. In this paper we present the design of a […]
Nov, 27

The Virtual Marathon: Parallel Computing Supports Crowd Simulations

To be realistic, an urban model must include appropriate numbers of pedestrians, vehicles, and other dynamic entities. Using a parallel computing architecture, researchers simulated a marathon with more than a million participants. To simulate participant behavior, they used fuzzy logic on a GPU to perform millions of inferences in real time.
Nov, 27

Harnessing graphics processors for the fast computation of acoustic likelihoods in speech recognition

In large vocabulary continuous speech recognition (LVCSR), the acoustic model computations often account for the largest processing overhead. Our weighted finite state transducer (WFST) based decoding engine can use a commodity graphics processing unit (GPU) to perform the acoustic computations, moving this burden off the main processor. In this paper we describe our new […]
Nov, 27

Accuracy and performance of graphics processors: A Quantum Monte Carlo application case study

The tradeoffs between accuracy and performance are as yet an unsolved problem when dealing with Graphics Processing Units (GPUs) as general-purpose computation devices. Their high performance and low cost make them a desirable target for scientific computation, and new language efforts help address the programming challenges of data parallel algorithms and memory management. But […]
Nov, 26

Evaluating the use of GPUs in liver image segmentation and HMMER database searches

In this paper we present the results of parallelizing two life sciences applications, Markov random fields-based (MRF) liver segmentation and HMMER’s Viterbi algorithm, using GPUs. We relate our experiences in porting both applications to the GPU as well as the techniques and optimizations that are most beneficial. The unique characteristics of both algorithms are demonstrated […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors