Posts

Jun, 2

Sparse matrix computations on manycore GPU’s

Modern microprocessors are becoming increasingly parallel devices, and GPUs are at the leading edge of this trend. Designing parallel algorithms for manycore chips like the GPU can present interesting challenges, particularly for computations on sparse data structures. One particularly common example is the collection of sparse matrix solvers and combinatorial graph algorithms that form the […]
Jun, 2

A Reverse-Projecting Pixel-Level Painting Algorithm

Traditional texturing using a set of two-dimensional image maps is an established and widespread practice. However, it is difficult to parameterize a model in texture space, particularly with representations such as implicit surfaces, subdivision surfaces, and very dense or detailed polygonal meshes. Based on an adaptive octree textures definition, this paper proposes a direct […]
Jun, 2

Massively Parallel Finite Element Simulator for Full-Chip STI Stress Analysis

In modern integrated circuit (IC) designs with feature size finer than 90nm, the stress among different material layers is playing an important role in determining device performance. The stress can be classified into two categories, stress deliberately introduced during semiconductor process, and stress unintentionally formed through the synergy of different processing steps. Among different types […]
Jun, 2

RNA secondary structure prediction using dynamic programming algorithm – A review and proposed work

Ribonucleic acid (RNA) plays a fundamental role in cellular life forms, and its function is directly related to its structure. RNA secondary structure prediction is a significant area of study for many scientists seeking insights into potential drug interactions or innovative new treatment methodologies. Predicting structure can overcome many issues related to physical […]
Jun, 2

Accelerating All-Atom Normal Mode Analysis with Graphics Processing Unit

All-atom normal mode analysis (NMA) is an efficient way to predict the collective motions in a given macromolecule, which is essential for the understanding of protein biological function and drug design. However, the calculations are limited in time scale mainly because the required diagonalization of the Hessian matrix by Householder-QR transformation is a computationally exhausting […]
Jun, 2

MapCG: writing parallel program portable between CPU and GPU

Graphics Processing Units (GPUs) have recently been playing an important role in the general-purpose computing market. The common approach to programming GPUs today is to write GPU-specific code with low-level GPU APIs such as CUDA. Although this approach can achieve very good performance, it raises serious portability issues: programmers are required to […]
Jun, 2

Acceleration of LOD-FDTD Method Using Fundamental Scheme on Graphics Processor Units

This letter presents the acceleration of the locally one-dimensional finite-difference time-domain (LOD-FDTD) method using the fundamental scheme on graphics processor units (GPUs). Compared to the conventional scheme, the fundamental LOD-FDTD (denoted as FLOD-FDTD) scheme has its right-hand sides cast in the simplest form without involving matrix operators. This leads to a substantial reduction in floating-point operations as […]
Jun, 2

Swarm’s flight: Accelerating the particles using C-CUDA

With the development of Graphics Processing Units (GPUs) and the Compute Unified Device Architecture (CUDA) platform, several areas of knowledge are benefiting from reduced computing time. Our goal is to show how optimization algorithms inspired by Swarm Intelligence can take advantage of this technology. In this paper, we provide an implementation […]
Jun, 2

Spherical harmonic transform on heterogeneous architectures using hybrid programming

Spherical Harmonic Transforms (SHT) are at the heart of many scientific and practical applications ranging from climate modeling to cosmological observations. In many of these areas a new wave of exciting, cutting-edge science goals has recently been proposed, calling for simulations and analyses of actual experimental or observational data at very high resolutions, accompanied […]
Jun, 1

GPS forward model computing study on CPU/GPU co-processing parallel system using CUDA

Profiles of refraction and bending angle, which are computed through the forward model for GPSRO (Global Positioning System radio occultation), are extremely important for assimilating GPS radio occultation data into the forecast system of NWP (Numerical Weather Prediction). The daily processing of GPS RO data in the assimilation system takes a considerable amount of time, thus there is an […]
Jun, 1

GPUMLib: A new Library to combine Machine Learning algorithms with Graphics Processing Units

The Graphics Processing Unit (GPU) is a highly parallel, many-core device with enormous computational power, especially well-suited to address Machine Learning (ML) problems that can be expressed as data-parallel computations. As problems become increasingly demanding, parallel implementations of ML algorithms become critical for developing hybrid intelligent real-world applications. The relatively low cost of GPUs combined […]
Jun, 1

A GPU/CUDA implementation of the collection-diffusion model to compute SER of large area and complex circuits

This work reports the CUDA implementation of the collection-diffusion model to compute the soft-error rate (SER) of large area and/or complex circuits on graphics processing units (GPUs). We detail the time parallelization introduced in the algorithm to accelerate the SER calculation by one order of magnitude. Code performance is evaluated on an NVIDIA Tesla C1060 […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors