Posts

Mar, 14

4th International Conference on Image, Vision and Computing (ICIVC 2015), 2015

2015 4th International Conference on Image, Vision and Computing (ICIVC 2015), http://www.icivc.org/. Date: September 19-20, 2015. Venue: Hong Kong. Submission Deadline: 2015-06-05. Topics: Image acquisition; Detection and Estimation of Signal Parameters; Image processing; Signal Identification; Medical image processing; Nonlinear Signals and Systems; Pattern recognition and analysis; Time-Frequency Signal Analysis; Visualization; Signal Reconstruction; Image coding and […]
Mar, 12

GPGPU Performance and Power Estimation Using Machine Learning

Graphics Processing Units (GPUs) have numerous configuration and design options, including core frequency, number of parallel compute units (CUs), and available memory bandwidth. At many stages of the design process, it is important to estimate how application performance and power are impacted by these options. This paper describes a GPU performance and power estimation model […]
Mar, 12

Implementing Machine Learning Algorithms on GPUs for Real-Time Traffic Sign Classification

This paper investigates traffic sign classification, which is an important problem to solve for autonomous driving. Linear discriminant analysis and convolutional neural networks achieved an accuracy of 98.25% and 98.75% respectively when classifying eight different types of traffic signs. The CNN was implemented on a GPU for real-time traffic sign classification: testing time for the […]
Mar, 12

CUDA accelerated large scale vehicular area network simulator

Both size and computational activities of Vehicular Area Network (VANET) are growing. Simulation of VANETs not only requires the simulation of network standards, but also the mobility of nodes. Such a dynamic system involves computation of node distance, routing protocols, application layer, data send, data receive, etc. The simulation model of VANET requires both hardware and […]
Mar, 12

RadixBoost: A Hardware Acceleration Structure for Scalable Radix Sort on Graphic Processors

In this paper, we propose RadixBoost, a hardware acceleration structure for scalable 32-bit integer radix sort on GPU. The whole structure is integrated into a GPU microarchitecture as a special functional unit and can be started by new instructions. Our design enables a significantly faster sorting procedure for general purpose GPU computing. The RadixBoost architecture […]
Mar, 12

FastTree: A Hardware KD-Tree Construction Acceleration Engine for Real-Time Ray Tracing

The ray tracing algorithm is well-known for its ability to generate photo-realistic rendering effects. Recent years have witnessed a renewed momentum in pushing it to real-time for better user experience. Today the construction of acceleration structures, e.g., kd-tree, has become the bottleneck of ray tracing. A dedicated hardware architecture, FastTree, was proposed for kd-tree construction […]
Mar, 8

HOCL: A Family of Embedded Languages

We address the increasingly varied capabilities of specialized computing platforms by introducing a growing family of functionally-limited mini-languages, implemented as embedded domain specific languages (EDSLs) in Haskell, that may be composed to harness the computational features offered by a variety of hardware platforms. This development is based on a novel modular representation of the EDSL […]
Mar, 8

Converting Data-Parallelism to Task-Parallelism by Rewrites: Purely Functional Programs Across Multiple GPUs

High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. To this end, we present a compositional translation that fissions data-parallel programs in the Accelerate language, allowing subsequent compiler and […]
Mar, 8

An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems

Graph processing is increasingly used in knowledge economies and in science, in advanced marketing, social networking, bioinformatics, etc. A number of graph-processing systems, including the GPU-enabled Medusa and Totem, have been developed recently. Understanding their performance is key to system selection, tuning, and improvement. Previous performance evaluation studies have been conducted for CPU-based graph-processing systems, […]
Mar, 8

HPerf: A Lightweight Profiler for Task Distribution on CPU+GPU Platforms

Heterogeneous computing has emerged as one of the major computing platforms in many domains. Although there have been several proposals to aid programming for heterogeneous computing platforms, optimizing applications on heterogeneous computing platforms is not an easy task. Identifying which parallel regions (or tasks) should run on GPUs or CPUs is one of the critical […]
Mar, 8

Generating Performance Portable Code using Rewrite Rules: From High-level Functional Expressions to High-Performance OpenCL Code

Computing systems have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort. This results in a tension between performance and code portability. Typically, code is either tuned in a low-level imperative language using hardware-specific optimizations to […]
Mar, 6

Lyra2: Password Hashing Scheme with improved security against time-memory trade-offs

We present Lyra2, a password hashing scheme (PHS) based on cryptographic sponges. Lyra2 was designed to be strictly sequential (i.e., not easily parallelizable), providing strong security even against attackers that use multiple processing cores (e.g., custom hardware or a powerful GPU). At the same time, it is very simple to implement in software and allows […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors