high performance computing on graphics processing units: hgpu.org

Posts

Mar, 18

Volume Raycasting Performance Using DirectCompute

Volume rendering is a fairly old technique for representing images, dating back to the 1980s. It is very useful in the medical field for visualizing the results of computed tomography (CT) and magnetic resonance tomography (MRT) in 3D. Apart from these two major applications for volume rendering, there aren’t many other fields of usage […]

Mar, 18

GPU-Based Cloud Service for Smith-Waterman algorithm Using Frequency Distance Filtration Scheme

As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time-consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware as graphics processing […]

CUDA

Mar, 18

Parallelization with Different API on Multicore Architecture

Soft matter as a research topic extends over fields from a multitude of disciplines. Biological systems are nearly exclusively composed of soft matter. Nearly everything that animals eat is considered soft matter. Large parts of chemistry deal with soft matter, such as the whole field of polymers. Many materials, especially modern ones, are soft matter. […]

CUDA

Mar, 18

GPU Accelerated Multiple Deoxyribose Nucleic Acid Sequence Parallel Matching

In this paper, a comparative evaluation of massively parallel implementations of suffix trees and suffix arrays for accelerating genome sequence matching is presented, based on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU (Kepler architecture). Due to the more regular execution flow of the indexed binary search algorithm, the more efficient use of the […]

CUDA

Mar, 16

Accelerating Computer Vision Algorithms Using OpenCL on Mobile GPU – A Case Study

Recently, general-purpose computing on graphics processing units (GPGPU) has been enabled on mobile devices thanks to the emerging heterogeneous programming models such as OpenCL. The capability of GPGPU on mobile devices opens a new era for mobile computing and can enable many computationally demanding computer vision algorithms on mobile devices. As a case study, this […]

OpenCL

Mar, 16

Forensics on GPU Coprocessing in Databases – Research Challenges, First Experiments, and Countermeasures

Recently, using GPUs for coprocessing in database systems has been shown to be beneficial. However, information systems processing confidential data cannot benefit from GPU acceleration yet because knowledge of security issues and forensic examinations on GPUs is still fragmentary. In this paper, we point out key challenges and research questions related to forensics and anti-forensics on […]

CUDA

Mar, 16

Efficiency of Parallelization of Neural Network Algorithm on Graphic Cards

In this paper we test the efficiency of parallelization using graphics cards. There are many applications where such systems are commonly used, so we chose the domain of artificial neural networks. Currently sold graphics cards give us strong potential for speeding up calculations, and card vendors provide us with even more, giving […]

CUDA

Mar, 16

Astronomical Photometric Data Reduction Using GPGPU

Astronomical photometry is one of the sciences that benefit from recent technological developments, which increase the quality and quantity of the processed data. Planned projects such as the European SOLARIS and the American LSST promise to generate an amount of data that will be a challenge for modern astronomical data […]

CUDA

Mar, 16

Exploring power efficiency and optimizations targeting heterogeneous applications

Graphics processing units (GPUs) have become widely accepted as the computing platform of choice in many high performance computing domains, due to the potential for approaching or exceeding the performance of a large cluster of CPUs for many parallel applications. The availability of programming standards such as OpenCL makes the use of GPUs even more […]

OpenCL

Mar, 15

iTree: Exploring Time-Varying Data using Indexable Tree

Significant advances have been made in time-varying data analysis and visualization, mainly in improving our ability to identify temporal trends and classify the underlying data. However, the ability to perform cost-effective data querying and indexing is often not incorporated, which poses a serious limitation as the size of time-varying data continues to grow. In this […]

CUDA

Mar, 15

Real-time Rendering of Melting Objects in Video Games

We present a method for simulating the melting and flowing of material in burning objects fast enough to be of use in video games where most of the graphical and computational resources are needed elsewhere. The standard practice of using particle engines or fluid dynamics for melting is far too costly for use in this […]

CUDA

Mar, 15

Convergence and Scalarization for Data-Parallel Architectures

Modern throughput processors such as GPUs achieve high performance and efficiency by exploiting data parallelism in application kernels expressed as threaded code. One drawback of this approach compared to conventional vector architectures is redundant execution of instructions that are common across multiple threads, resulting in energy inefficiency due to excess instruction dispatch, register file accesses, […]

CUDA
