Sep, 3

Fast 4D Sheared Filtering for Interactive Rendering of Distribution Effects

Soft shadows, depth of field, and diffuse global illumination are common distribution effects, usually rendered by Monte Carlo ray tracing. Physically correct, noise-free images can require hundreds or thousands of ray samples per pixel, and take a long time to compute. Recent approaches have exploited sparse sampling and filtering; the filtering is either fast (axis-aligned), […]
Sep, 3

DeepPy: Pythonic deep learning

This technical report introduces DeepPy, a deep learning framework built on top of NumPy with GPU acceleration. DeepPy bridges the gap between high-performance neural networks and the ease of development from Python/NumPy. Users with a background in scientific computing in Python will quickly be able to understand and change the DeepPy codebase as it […]
Aug, 31

Deep Learning on FPGAs

The recent successes of deep learning are largely attributed to the advancement of hardware acceleration technologies, which can accommodate the incredible growth of data sizes and model complexity. The current solution involves using clusters of graphics processing units (GPU) to achieve performance beyond that of general purpose processors (GPP), but the use of field programmable […]
Aug, 31

SafeGPU: Contract- and Library-Based GPGPU for Object-Oriented Languages

Using GPUs as general-purpose processors has revolutionized parallel computing by providing, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to their widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper proposes a programming approach, […]
Aug, 31

Optimization of RAID Erasure Coding Algorithms for Intel Xeon Phi

In this work we describe and consider some features of implementing RAID erasure coding algorithms for the Intel Xeon Phi coprocessor. We propose some algorithmic and technical improvements to encoding and decoding performance in both native and offload modes. The proposed approaches are designed to maximize the efficiency of the Intel MIC architecture. We suggest a new approach to […]
Aug, 31

Flux tubes at Finite Temperature

We show the flux tubes produced by static quark-antiquark, quark-quark and quark-gluon charges at finite temperature. The sources are placed in the lattice with fundamental and adjoint Polyakov loops. We compute the square densities of the chromomagnetic and chromoelectric fields above and below the phase transition. Our results are gauge invariant and produced in pure […]
Aug, 31

A Performance Model and Optimization Strategies for Automatic GPU Code Generation of PDE Systems Described by a Domain-Specific Language

Stencil computations are a class of algorithms operating on multi-dimensional arrays also called grid functions (GFs), which update array elements using their nearest-neighbors. This type of computation forms the basis for computer simulations across almost every field of science, such as computational fluid dynamics. Its mostly regular data access patterns potentially enable it to take […]
Aug, 28

Exploring Task Parallelism for Heterogeneous Systems Using Multicore Task Management API

Current trends in multicore platform design indicate that heterogeneous systems are here to stay. Such systems include processors with specialized accelerators supporting different instruction sets and different types of memory spaces among several other features. Unfortunately, these features increase the effort for programming and porting applications to different target platforms. To solve this problem, effective […]
Aug, 28

DawnCC: a Source-to-Source Automatic Parallelizer of C and C++ Programs

Dedicated graphics processing chips have become a standard component in most modern systems, making their powerful parallel computing capabilities more accessible to developers. Amongst the tools created to aid programmers in the task of parallelizing applications, directive-based standards are some of the most widely used. These standards, such as OpenACC and OpenMP, facilitate the conversion […]
Aug, 28

Benchmarking State-of-the-Art Deep Learning Software Tools

Deep learning has been shown to be a successful machine learning method for a variety of tasks, and its popularity has resulted in numerous open-source deep learning software tools becoming publicly available. Training a deep network is usually a very time-consuming process. To address the huge computational challenge in deep learning, many tools exploit hardware features such […]
Aug, 28

Accelerating finite-rate chemical kinetics with coprocessors: comparing vectorization methods on GPUs, MICs, and CPUs

Efficient ordinary differential equation solvers for chemical kinetics must take into account the available thread and instruction-level parallelism of the underlying hardware, especially on many-core coprocessors, as well as the numerical efficiency. A stiff Rosenbrock and nonstiff Runge-Kutta solver are implemented using the single instruction, multiple thread (SIMT) and single instruction, multiple data (SIMD) paradigms […]
Aug, 28

Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA

Relativistic fluid dynamics is a major component in dynamical simulations of the quark-gluon plasma created in relativistic heavy-ion collisions. Simulations of the full three-dimensional dissipative dynamics of the quark-gluon plasma with fluctuating initial conditions are computationally expensive and typically require some degree of parallelization. In this paper, we present a GPU implementation of the Kurganov-Tadmor […]


HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: