Posts
Nov, 27
Quantum computer simulation using the CUDA programming model
Quantum computing is emerging as a field of great theoretical interest. Its simulation is a problem with high memory and computational requirements, which makes the use of parallel platforms advisable. In this work we deal with the simulation of an ideal quantum computer on the Compute Unified Device Architecture (CUDA), as such a problem […]
Nov, 27
Predictive Runtime Code Scheduling for Heterogeneous Architectures
Heterogeneous architectures are currently widespread. With the advent of easy-to-program general purpose GPUs, virtually every recent desktop computer is a heterogeneous system. Combining the CPU and the GPU brings great amounts of processing power. However, such architectures are often used in a restricted way for domain-specific applications like scientific applications and games, and they tend […]
Nov, 27
Exploiting graphical processing units for data-parallel scientific applications
Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms – regular mesh field […]
Nov, 27
CUDA-MEME: Accelerating Motif Discovery in Biological Sequences Using CUDA-enabled Graphics Processing Units
Motif discovery in biological sequences is of prime importance and a major challenge in computational biology. Consequently, numerous motif discovery tools have been developed to date. However, the rapid growth of both genomic sequence and gene transcription data establishes the need for scalable motif discovery tools. An approach to improve the runtime […]
Nov, 27
Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment
The Smith-Waterman (SW) algorithm is the most accurate sequence alignment approach used by computational biologists for DNA matching. However, its computational complexity makes SW impractical to use in a clinical environment compared to much faster but less accurate sequence alignment techniques such as BLAST. The high-performance computing community is examining alternative multicore architectures such as […]
Nov, 27
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
The emergence and continuing use of multi-core architectures and graphics processing units require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of the now prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) and Matrix Algebra on GPU and Multicore Architectures (MAGMA) are two […]
Nov, 27
The multikernel: a new OS architecture for scalable multicore systems
Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for […]
Nov, 27
Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs
Compute Unified Device Architecture (CUDA) was used to design and implement molecular dynamics (MD) simulations on graphics processing units (GPUs). With an NVIDIA Tesla C870, a 20- to 60-fold speedup over one core of the Intel Xeon 5430 CPU was achieved, reaching up to 150 Gflops. MD simulation of cavity flow and particle-bubble interaction […]
Nov, 27
Solving Sparse Linear Systems on NVIDIA Tesla GPUs
Current many-core GPUs have enormous processing power, and unlocking this power for general-purpose computing is very attractive due to their low cost and efficient power utilization. However, the fine-grained parallelism and the stream-programming model supported by these GPUs require a paradigm shift, especially for algorithm designers. In this paper we present the design of a […]
Nov, 27
The Virtual Marathon: Parallel Computing Supports Crowd Simulations
To be realistic, an urban model must include appropriate numbers of pedestrians, vehicles, and other dynamic entities. Using a parallel computing architecture, researchers simulated a marathon with more than a million participants. To simulate participant behavior, they used fuzzy logic on a GPU to perform millions of inferences in real time.
Nov, 27
Harnessing graphics processors for the fast computation of acoustic likelihoods in speech recognition
In large vocabulary continuous speech recognition (LVCSR), the acoustic model computations often account for the largest processing overhead. Our weighted finite state transducer (WFST) based decoding engine can utilize a commodity graphics processing unit (GPU) to perform the acoustic computations, moving this burden off the main processor. In this paper we describe our new […]
Nov, 27
Accuracy and performance of graphics processors: A Quantum Monte Carlo application case study
The tradeoff between accuracy and performance is as yet an unsolved problem when dealing with Graphics Processing Units (GPUs) as general-purpose computation devices. Their high performance and low cost make them a desirable target for scientific computation, and new language efforts help address the programming challenges of data parallel algorithms and memory management. But […]