
Posts

Apr, 7

Robust Computational Tools for Multiple Testing With Genetic Association Studies

Resolving the interplay of the genetic components of a complex disease is a challenging endeavor. Over the past several years, genome-wide association studies (GWAS) have emerged as a popular approach to locating common genetic variation within the human genome associated with disease risk. Assessing genotype-phenotype associations across hundreds of thousands of genetic markers using the […]
Apr, 7

GPU-based Line Probing Techniques for Mikami Routing Algorithm

The graphics processing unit (GPU), which contains hundreds of processing cores, is becoming a popular device for high-performance computation in the multi-core era. Because GPUs demand strictly regular computation, designing suitable algorithms is a key challenge for achieving speed-up. In this paper, we propose a parallel CUDA-Mikami routing algorithm on NVIDIA’s GPU. A 32-bit routing grid encoding is proposed […]
Apr, 7

An Efficient Parallel GPU Evaluation of Small Angle X-Ray Scattering Profiles

The inference of protein structure from experimental data is of crucial interest in science, medicine and biotechnology. Unfortunately, high-resolution experimental methods cannot yet provide a detailed analysis of the ensemble of conformations adopted under physiological conditions. Low-resolution techniques are often better suited for this task. Small angle X-ray scattering (SAXS) plays a major […]
Apr, 6

OpenCL framework for a CPU, GPU, and FPGA Platform

With the availability of multi-core processors, high capacity FPGAs, and GPUs, a heterogeneous platform with tremendous raw computing capacity can be constructed consisting of any number of these computing elements. However, one of the major challenges for constructing such a platform is the lack of a standardized framework under which an application’s computational task and […]
Apr, 6

High-Performance Energy-Efficient Multicore Embedded Computing

With Moore’s law supplying billions of transistors on-chip, embedded systems are undergoing a transition from single-core to multicore to exploit this high transistor density for high performance. Embedded systems differ from traditional high-performance supercomputers in that power is a first-order constraint for embedded systems, whereas performance is the major benchmark for supercomputers. The increase in on-chip […]
Apr, 6

Fast Spoken Query Detection Using Lower-Bound Dynamic Time Warping on Graphical Processing Units

In this paper we present a fast unsupervised spoken term detection system based on lower-bound Dynamic Time Warping (DTW) search on Graphical Processing Units (GPUs). The lower-bound estimate and the K nearest neighbor DTW search are carefully designed to fit the GPU parallel computing architecture. In a spoken term detection task on the TIMIT corpus, […]
Apr, 6

Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing

Ray tracing is an attractive technique for visualizing scientific data because it can produce high quality images that faithfully represent physically-based phenomena. Its embarrassingly parallel reputation makes it a natural candidate for visualizing large data sets on distributed memory clusters, especially for machines without specialized graphics hardware. Unfortunately, the traditional recursive ray tracing algorithm is […]
Apr, 6

A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

We introduce a hardware acceleration technique for the parallel finite difference time domain (FDTD) method using the SSE (Streaming SIMD Extensions, where SIMD stands for single instruction, multiple data) instruction set. Applying the SSE instruction set to the parallel FDTD method achieved a significant improvement in simulation performance. The benchmarks of the SSE acceleration on both the […]
Apr, 6

GPGPU for orbital function evaluation with a new updating scheme

We accelerated an ab-initio QMC electronic structure calculation by using GPGPU. The bottleneck of the calculation for extended solid systems is replaced by CUDA-GPGPU subroutine kernels which build up spline basis set expansions of electronic orbital functions at each Monte Carlo step. We achieved 30.8 times faster evaluation for the bottleneck, confirmed on the […]
Apr, 5

Visualization of Pareto Solutions by Spherical Self-Organizing Map and Its Acceleration on a GPU

In this study, we visualize Pareto-optimal solutions derived from multiple-objective optimization using spherical self-organizing maps (SOMs) that lay out SOM data in three dimensions. There has been a wide range of studies involving plane SOMs where Pareto-optimal solutions are mapped to a plane. However, plane SOMs have an issue that similar data differing in a […]
Apr, 5

A Programmable Processing Array Architecture Supporting Dynamic Task Scheduling and Module-Level Prefetching

Massively Parallel Processing Arrays (MPPA) constitute programmable hardware accelerators that excel in the execution of applications exhibiting Data-Level Parallelism (DLP). The concept of employing such programmable accelerators as sidekicks to the more traditional, general-purpose processing cores has very recently entered the mainstream; both Intel and AMD have introduced processor architectures integrating a Graphics Processing Unit […]
Apr, 5

Enhancing Performance of Simulations using GPGPU

General Purpose GPU computing, or GPGPU, is the use of a GPU (graphics processing unit) to do general purpose scientific and engineering computing. The model for GPU computing is to use a CPU and GPU together in a heterogeneous co-processing computing platform. The sequential part of the application runs on the CPU and the computationally-intensive […]


HGPU group © 2010-2026 hgpu.org

All rights belong to the respective authors
