Posts
Nov, 27
An Extension of the StarSs Programming Model for Platforms with Multiple GPUs
While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/power ratio can be attained using specialized hardware accelerators. These accelerators require specific SDK or programming languages which are not always easy to program. Thus, the impact of the new programming paradigms on the […]
Nov, 27
Parallel smoothing of quad meshes
Abstract For use in real-time applications, we present a fast algorithm for converting a quad mesh to a smooth, piecewise polynomial surface on the Graphics Processing Unit (GPU). The surface has well-defined normals everywhere and closely mimics the shape of Catmull-Clark subdivision surfaces. It consists of bicubic splines wherever possible, and a new class of […]
Nov, 27
Quantum computer simulation using the CUDA programming model
Quantum computing emerges as a field that captures a great theoretical interest. Its simulation represents a problem with high memory and computational requirements which makes advisable the use of parallel platforms. In this work we deal with the simulation of an ideal quantum computer on the Compute Unified Device Architecture (CUDA), as such a problem […]
Nov, 27
Predictive Runtime Code Scheduling for Heterogeneous Architectures
Heterogeneous architectures are currently widespread. With the advent of easy-to-program general purpose GPUs, virtually every recent desktop computer is a heterogeneous system. Combining the CPU and the GPU brings great amounts of processing power. However, such architectures are often used in a restricted way for domain-specific applications like scientific applications and games, and they tend […]
Nov, 27
Exploiting graphical processing units for data-parallel scientific applications
Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms – regular mesh field […]
Nov, 27
CUDA-MEME: Accelerating Motif Discovery in Biological Sequences Using CUDA-enabled Graphics Processing Units
Motif discovery in biological sequences is of prime importance and a major challenge in computational biology. Consequently, numerous motif discovery tools have been developed to date. However, the rapid growth of both genomic sequence and gene transcription data, establishes the need for the development of scalable motif discovery tools. An approach to improve the runtime […]
Nov, 27
Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment
The Smith-Waterman (SW) algorithm is the most accurate sequence alignment approach used by computational biologists for DNA matching. However it’s computational complexity makes SW impractical to use in clinical environment compared to much faster but less accurate sequence alignment technique such as BLAST. High performance computing community is examining alternative multi core architectures such as […]
Nov, 27
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
The emergence and continuing use of multi-core architectures and graphics processing units require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of now prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) and Matrix Algebra on GPU and Multics Architectures (MAGMA) are two […]
Nov, 27
The multikernel: a new OS architecture for scalable multicore systems
Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for […]
Nov, 27
Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs
Compute Unified Device Architecture (CUDA) was used to design and implement molecular dynamics (MD) simulations on graphics processing units (GPU). With an NVIDIA Tesla C870, a 20-60 fold speedup over that of one core of the Intel Xeon 5430 CPU was achieved, reaching up to 150 Gflops. MD simulation of cavity flow and particle-bubble interaction […]
Nov, 27
Solving Sparse Linear Systems on NVIDIA Tesla GPUs
Current many-core GPUs have enormous processing power, and unlocking this power for general-purpose computing is very attractive due to their low cost and efficient power utilization. However, the fine-grained parallelism and the stream-programming model supported by these GPUs require a paradigm shift, especially for algorithm designers. In this paper we present the design of a […]
Nov, 27
The Virtual Marathon: Parallel Computing Supports Crowd Simulations
To be realistic, an urban model must include appropriate numbers of pedestrians, vehicles, and other dynamic entities. Using a parallel computing architecture, researchers simulated a marathon with more than a million participants. To simulate participant behavior, they used fuzzy logic on a GPU to perform millions of inferences in real time.