Sep, 13

Analysis of GPU-based convolution for acoustic wave propagation modeling with finite differences: Fortran to CUDA-C step-by-step

By projecting observed microseismic data backward in time to when fracturing occurred, it is possible to locate the fracture events in space, assuming a correct velocity model. In order to achieve this task in near real-time, a robust computational system to handle backward propagation, or Reverse Time Migration (RTM), is required. We can then test […]
Sep, 13

Performance and Power Optimization of GPU Architectures for General-purpose Computing

Power-performance efficiency has become a central focus that is challenging in heterogeneous processing platforms as the power constraints have to be established without hindering the high performance. In this dissertation, a framework for optimizing the power and performance of GPUs in the context of general-purpose computing in GPUs (GPGPU) is proposed. To optimize the leakage […]
Sep, 13

HTML5 WebSocket protocol and its application to distributed computing

HTML5 WebSocket protocol brings real time communication in web browsers to a new level. Daily, new products are designed to stay permanently connected to the web. WebSocket is the technology enabling this revolution. WebSockets are supported by all current browsers, but it is still a new technology in constant evolution. WebSockets are slowly replacing older […]
Sep, 11

Pattern Matching in OpenCL: GPU vs CPU Energy Consumption on Two Mobile Chipsets

Adaptations of the Aho-Corasick (AC) algorithm on high performance graphics processors (also called GPUs) have garnered increasing attention in recent years. However, no results have been reported regarding their implementations on mobile GPUs. In this paper, we show that implementing a state-of-the-art Aho-Corasick parallel algorithm on a mobile GPU delivers significant speedups. We study a […]
Sep, 11

Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications

GPUs have been proven very effective for structured applications. However, emerging data intensive applications are increasingly unstructured – irregular in their memory and control flow behavior over massive data sets. While the irregularity in these applications can result in poor workload balance among fine-grained threads or coarse-grained blocks, one can still observe dynamically formed pockets […]
Sep, 11

Parallelized Seeded Region Growing using CUDA

This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with intent to overcome the theoretical weakness of SRG algorithm of its computation time being directly proportional to the size of a segmented region. The segmentation performance of the proposed CUDA-based SRG is compared […]
Sep, 11

Ray Traced Rendering Using GPGPU Devices

Ray tracing is a very popular way to draw 3-D scenes onto a 2-D image. The technique produces a very high degree of visual realism with regard to shadows, reflection, and refraction. The drawback of this technique is the fact that it is extremely computationally expensive. This expense has been a barrier to using ray […]
Sep, 11

Enhancing R with Advanced Compilation Tools and Methods

I describe an approach to compiling common idioms in R code directly to native machine code and illustrate it with several examples. Not only can this yield significant performance gains, but it allows us to use new approaches to computing in R. Importantly, the compilation requires no changes to R itself, but is done entirely […]
Sep, 9

Parallel Multi-dimensional Range Query Processing with R-Trees on GPU

The general purpose computing on graphics processing unit (GP-GPU) has emerged as a new cost effective parallel computing paradigm in high performance computing research that enables large amount of data to be processed in parallel. Large scale scientific data intensive applications have been playing an important role in modern high performance computing research. A common […]
Sep, 9

Improvements to Physically Based Cloth Simulation

Physically based cloth simulation in computer graphics has come a long way since the 1980s. Although extensive methods have been developed, physically based cloth animation remains challenging in a number of aspects, including the efficient simulation of complex internal dynamics, better performance and the generation of more effects of friction in collisions, to name but […]
Sep, 9

A Validation Testsuite for OpenACC 1.0

Directive-based programming models provide high-level of abstraction thus hiding complex low-level details of the underlying hardware from the programmer. One such model is OpenACC that is also a portable programming model allowing programmers to write applications that offload portions of work from a host CPU to an attached accelerator (GPU or a similar device). The […]
Sep, 9

A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing

The past years have witnessed many dedicated open-source projects that built and maintain implementations of Support Vector Machines (SVM), parallelized for GPU, multi-core CPUs and distributed systems. Up to this point, no comparable effort has been made to parallelize the Elastic Net, despite its popularity in many high impact applications, including genetics, neuroscience and systems […]
Page 30 of 781« First...1020...2829303132...405060...Last »

* * *

* * *

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors

Contact us: