Posts

Apr, 8

Development of methods for the processing of mining images using genetic algorithms

In this paper we describe an extension of the FOTOM system's capabilities for segmenting specific mining images. We focus on methods that are inherently resistant to the noise present in the experimental pit at VSB Technical University. Here, we describe procedures employing proven active contours and evolutionary algorithms for recognizing points of interest in the […]
Apr, 8

Highly Scalable Multiplication for Distributed Sparse Multivariate Polynomials on Many-core Systems

We present a highly scalable algorithm for multiplying sparse multivariate polynomials represented in a distributed format. This algorithm targets not only shared-memory multicore computers, but also computer clusters or specialized hardware attached to a host computer, such as graphics processing units or many-core coprocessors. The scalability on the large number of […]
Apr, 7

Atomic-free Irregular Computations on GPUs

Atomic instructions are a key ingredient of codes that operate on irregular data structures like trees and graphs. It is well known that atomics can be expensive, especially on massively parallel GPUs, and are often on the critical path of a program. In this paper, we present two high-level methods to eliminate atomics in irregular […]
Apr, 7

Exploring complex quantum systems with a hybrid CPU-GPU computing platform

One of the most striking features of quantum mechanics is that the resources required to find the states of a composite system grow exponentially with the size of the system. This is also the origin of the two main bottlenecks in numerical studies of complex quantum systems, which are (i) diagonalizations of big matrices and […]
Apr, 7

Speed up Large Integer Multiplication Using Fourier Transforms and CUDA Technology

Multiplying large integers is an operation that has many applications in Computational Science. Many cryptographic algorithms require operations on very large subsets of the integer numbers. Using Fast Fourier Transforms (FFT) and a Graphics Processing Unit (GPU), we can speed up integer multiplication and make an effective multiplication algorithm. CUDA technology is used to perform FFT on […]
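
The general idea of FFT-based integer multiplication can be sketched on the CPU as follows. This is a minimal illustration and not the paper's CUDA implementation: the function names `fft` and `multiply`, the decimal-digit representation, and the recursive radix-2 transform are all assumptions chosen for readability. Digits are treated as polynomial coefficients, the two polynomials are multiplied pointwise in the frequency domain, and the rounded result is recombined into an integer.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two.

    The inverse transform is left unscaled; the caller divides by the length.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply(a, b):
    """Multiply two non-negative integers by convolving their decimal digits."""
    # Least significant digit first, so index i is the coefficient of 10**i.
    digits_a = [int(ch) for ch in reversed(str(a))]
    digits_b = [int(ch) for ch in reversed(str(b))]

    # Pad to a power of two large enough to hold the full product.
    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    freq_a = fft(digits_a + [0] * (size - len(digits_a)))
    freq_b = fft(digits_b + [0] * (size - len(digits_b)))

    # Pointwise product in the frequency domain equals convolution of digits.
    product = fft([x * y for x, y in zip(freq_a, freq_b)], invert=True)
    coeffs = [int(round(c.real / size)) for c in product]

    # Recombine coefficients; Python's big integers absorb the carries.
    return sum(coeff * 10 ** power for power, coeff in enumerate(coeffs))
```

Because the sketch relies on double-precision floating point, rounding errors limit how many digits it can handle reliably; production implementations typically use larger digit bases together with careful error bounds or number-theoretic transforms.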
Apr, 7

Optimizing Sparse Matrix-Matrix Multiplication for the GPU

Sparse matrix-matrix multiplication (SpMM) is a key operation in numerous areas from information to the physical sciences. Implementing SpMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpMM operation into three, […]
Apr, 7

A new CUDA-based GPU implementation of the two-dimensional Athena code

We present a new version of the Athena code, which solves magnetohydrodynamic equations in two-dimensional space. This new implementation, which we have named Athena-GPU, uses the CUDA architecture to allow the code to execute on a Graphics Processing Unit (GPU). The Athena-GPU code is an unofficial, modified version of the Athena code which was originally designed for Central […]
Apr, 6

23rd Annual International Conference on Computer Science and Software Engineering, CASCON 2013

CASCON 2013 is the 23rd annual international conference hosted by CAS Research, IBM Canada Software Lab. Using the motto, “Innovation that matters”, this conference provides an exciting forum for exchanging ideas and experience in the ever-expanding and critical fields of software engineering and computing. The theme of this year, “Ecosystem of Engagement”, highlights the confluence […]
Apr, 6

OpenCL C++

With the success of programming models such as Khronos’ OpenCL, heterogeneous computing is going mainstream. However, these models are low-level, even when considering them as systems programming models. For example, OpenCL is effectively an extended subset of C99, limited to the type-unsafe procedural abstraction that C has provided for more than 30 years. Computer […]
Apr, 6

A Performance Study of Zero Crossing Rate (ZCR) on Graphics Processors (GPUs) Using CUDA

The ability to harness the power of the Graphics Processing Unit (GPU) enables us to show dramatic increases in computing performance using a parallel computing platform and programming model such as NVIDIA CUDA. Compute Unified Device Architecture (CUDA) is NVIDIA's graphics programming API to perform General Purpose Graphics Processing Unit Programming (GPGPU). The General Purpose […]
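
For readers unfamiliar with the quantity being benchmarked, the zero crossing rate of a signal frame can be sketched in a few lines of sequential code. This is an illustrative CPU reference only, not the paper's CUDA kernel; the function name `zero_crossing_rate` and the normalization by the number of adjacent sample pairs are assumptions, and other definitions normalize differently.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative, which is one common convention.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)
```

Each pair comparison is independent of the others, which is what makes the computation a natural candidate for data-parallel execution followed by a reduction.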
Apr, 6

Improving GPU Performance Prediction with Data Transfer Modeling

Accelerators such as graphics processors (GPUs) have become increasingly popular for high performance scientific computing. Often, much effort is invested in creating and optimizing GPU code without any guaranteed performance benefit. To reduce this risk, performance models can be used to project a kernel’s GPU performance potential before it is ported. However, raw GPU execution […]
Apr, 6

Real-Time Object-Space Edge Detection using OpenCL

At its most basic, object-space edge detection iterates through all polygonal edges in each mesh to find those edges that satisfy one or more edge tests. Those that do are expanded and rendered, while the remainder are ignored. These 3D edges, and their resulting accuracy and customizability, set object-space methods apart from all other categories […]
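
The iteration described above can be sketched with one possible edge test. The excerpt does not say which tests the paper uses, so the silhouette test below (an edge passes when one adjacent triangle faces the viewer and the other faces away) is an assumption, as are the function name `silhouette_edges`, the triangle-list mesh layout, and the counter-clockwise winding convention. It is plain sequential Python rather than OpenCL.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def silhouette_edges(vertices, faces, eye):
    """Return mesh edges shared by one front-facing and one back-facing triangle.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) vertex indices, counter-clockwise seen from outside
    eye:      viewer position as an (x, y, z) tuple
    """
    # Classify every triangle as front-facing (True) or back-facing (False).
    facing = []
    for i, j, k in faces:
        normal = _cross(_sub(vertices[j], vertices[i]), _sub(vertices[k], vertices[i]))
        facing.append(_dot(normal, _sub(eye, vertices[i])) > 0)

    # Map each undirected edge to the triangles that share it.
    edge_faces = {}
    for face_index, (i, j, k) in enumerate(faces):
        for a, b in ((i, j), (j, k), (k, i)):
            edge_faces.setdefault((min(a, b), max(a, b)), []).append(face_index)

    # Apply the edge test to every edge; edges that fail are simply ignored.
    return sorted(
        edge
        for edge, shared in edge_faces.items()
        if len(shared) == 2 and facing[shared[0]] != facing[shared[1]]
    )
```

For a tetrahedron viewed from directly below its base, only the three base edges pass the test. Since each edge is tested independently, the final loop maps naturally onto one work-item per edge in a parallel setting.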

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
