Posts

Jan, 13

Convolutional Neural Networks for Human Activity Recognition using Mobile Sensors

A variety of real-life mobile sensing applications are becoming available, especially in the life-logging, fitness tracking and health monitoring domains. These applications use mobile sensors embedded in smart phones to recognize human activities in order to get a better understanding of human behavior. While progress has been made, human activity recognition remains a challenging task. […]
Jan, 13

A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of architecture of CUDA-enabled GPUs. The main contribution of this paper is to present an efficient implementation of the O(n^3)-time dynamic programming algorithm for solving the optimal triangulation problem for a convex n-gon in the HMM. Although the HMM can […]
Jan, 13

Thorough Evaluation of GPU Shared Memory Load and Store Instructions

This work focuses on measuring the number of GPU clock cycles necessary to execute load/store instructions in both bank conflict and bank conflict-free shared memory access patterns. To this end, a varying number of parameters have been considered in the experiments, including the number of warps (w), the number of memory bank conflicts (k) as […]
Jan, 12

A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches

Recent trends of CMOS scaling and an increasing number of on-chip cores have led to a large increase in the size of on-chip caches. Since SRAM has low density and consumes a large amount of leakage power, its use in designing on-chip caches has become more challenging. To address this issue, researchers are exploring the use of […]
Jan, 10

Face Recognition: A Tutorial on Computational Aspects

Face recognition is a sophisticated problem requiring a significant commitment of computer resources. A modern GPU architecture provides a practical platform for performing face recognition in real time. The majority of the calculations of an eigenpicture implementation of face recognition are matrix multiplications. For this type of computation, a conventional computer GPU is capable of […]
Jan, 10

Dynamic Feature-Adaptive Subdivision

Feature-adaptive subdivision (FAS) is one of the state-of-the-art real-time rendering methods for subdivision surfaces on modern GPUs. It enables efficient and accurate rendering of subdivision surfaces in many interactive applications, such as video games or authoring tools. In this paper, we present dynamic feature-adaptive subdivision (DFAS), which improves upon FAS by enabling an independent […]
Jan, 10

Digital Signal Processing using Stream High Performance Computing: A 512-input Broadband Correlator for Radio Astronomy

A "large-N" correlator that makes use of Field Programmable Gate Arrays and Graphics Processing Units has been deployed as the digital signal processing system for the Long Wavelength Array station at Owens Valley Radio Observatory (LWA-OV), to enable the Large Aperture Experiment to Detect the Dark Ages (LEDA). The system samples a ~100MHz baseband and […]
Jan, 10

Exploring GPU Memory Performance Using Digital Image Processing Algorithms

Leveraging the incredible parallel computational power of graphics processing units (GPUs) is a proven method for accelerating general applications. Efficient utilization of the GPU remains one of the greatest challenges facing programmers. The performance of GPU applications is extremely reliant on memory performance, to the point that it can be considered a critical bottleneck. This […]
Jan, 10

Image Super-Resolution Using Deep Convolutional Networks

We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can […]
Jan, 9

International Workshop on OpenCL

The International Workshop on OpenCL (IWOCL – “eye-wok-ul”) is an annual meeting and community of users, researchers, developers and suppliers that share best practice, and promote the evolution and advancement of the OpenCL standard for parallel programming of heterogeneous systems.
Jan, 8

CHO: A Benchmark Suite for OpenCL-based FPGA Accelerators

Programming FPGAs with OpenCL-based high-level synthesis frameworks is gaining attention, with a number of commercial and research frameworks announced. However, there are no benchmarks for evaluating these frameworks. To this end, we present the CHO benchmark suite, an extension of CHStone, a commonly used C-based high-level synthesis benchmark suite, for OpenCL. We characterise CHO at various […]
Jan, 8

Cardiac Dysrhythmia Detection with GPU-Accelerated Neural Networks

Cardiac dysrhythmia is responsible for over half a million deaths in the United States annually. In this work, we evaluate the performance of neural networks on classifying electrocardiogram (ECG) sequences as normal or abnormal (arrhythmia). Using neural networks as our primary learning model, we explain our model’s performance and discuss hyperparameter tuning. Comparing the results […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there). Compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors