Posts

Aug, 28

DawnCC: a Source-to-Source Automatic Parallelizer of C and C++ Programs

Dedicated graphics processing chips have become a standard component in most modern systems, making their powerful parallel computing capabilities more accessible to developers. Amongst the tools created to aid programmers in the task of parallelizing applications, directive-based standards are some of the most widely used. These standards, such as OpenACC and OpenMP, facilitate the conversion […]
Aug, 28

Benchmarking State-of-the-Art Deep Learning Software Tools

Deep learning has been shown as a successful machine learning method for a variety of tasks, and its popularity results in numerous open-source deep learning software tools coming to public. Training a deep network is usually a very time-consuming process. To address the huge computational challenge in deep learning, many tools exploit hardware features such […]
Aug, 28

Exploring Task Parallelism for Heterogeneous Systems Using Multicore Task Management API

Current trends in multicore platform design indicate that heterogeneous systems are here to stay. Such systems include processors with specialized accelerators supporting different instruction sets and different types of memory spaces among several other features. Unfortunately, these features increase the effort for programming and porting applications to different target platforms. To solve this problem, effective […]
Aug, 28

Accelerating finite-rate chemical kinetics with coprocessors: comparing vectorization methods on GPUs, MICs, and CPUs

Efficient ordinary differential equation solvers for chemical kinetics must take into account the available thread and instruction-level parallelism of the underlying hardware, especially on many-core coprocessors, as well as the numerical efficiency. A stiff Rosenbrock and nonstiff Runge-Kutta solver are implemented using the single instruction, multiple thread (SIMT) and single instruction, multiple data (SIMD) paradigms […]
Aug, 28

Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA

Relativistic fluid dynamics is a major component in dynamical simulations of the quark-gluon plasma created in relativistic heavy-ion collisions. Simulations of the full three-dimensional dissipative dynamics of the quark-gluon plasma with fluctuating initial conditions are computationally expensive and typically require some degree of parallelization. In this paper, we present a GPU implementation of the Kurganov-Tadmor […]
Aug, 23

Fast Multidimensional Image Processing with OpenCL

Multidimensional image data, i.e., images with three or more dimensions, are used in many areas of science. Multidimensional image processing is supported in Python and MATLAB. VisionGL is an open source library that provides a set of image processing functions and can help the programmer by automatically generating code. The objective of this work is […]
Aug, 23

Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs

Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic variants such as Bayesian networks. Inference-based algorithms are powerful techniques for solving discrete optimization problems, which can be used […]
Aug, 23

MetaMorph: A Library Framework for Interoperable Kernels on Multi- and Many-core Clusters

To attain scalable performance efficiently, the HPC community expects future exascale systems to consist of multiple nodes, each with different types of hardware accelerators. In addition to GPUs and Intel MICs, additional candidate accelerators include embedded multiprocessors and FPGAs. End users need appropriate tools to efficiently use the available compute resources in such systems, both […]
Aug, 23

MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs

A particularly challenging class of problems arising in many applications, called batched problems, involves linear algebra operations on many small-sized matrices. We proposed and designed batched BLAS (Basic Linear Algebra Subroutines), Level-2 GEMV and Level-3 GEMM, to solve them. We illustrate how to optimize batched GEMV and GEMM to assist batched advance factorization (e.g. bi-diagonalization) […]
Aug, 23

Hybrid CPU-GPU Framework for Network Motifs

Massively parallel architectures such as the GPU are becoming increasingly important due to the recent proliferation of data. In this paper, we propose a key class of hybrid parallel graphlet algorithms that leverages multiple CPUs and GPUs simultaneously for computing k-vertex induced subgraph statistics (called graphlets). In addition to the hybrid multi-core CPU-GPU framework, we […]
Aug, 18

Streaming Applications on Heterogeneous Platforms

Using multiple streams can improve the overall system performance by mitigating the data transfer overhead on heterogeneous systems. Currently, very few cases have been streamed to demonstrate the streaming performance impact and a systematic investigation of streaming necessity and how-to over a large number of test cases remains a gap. In this paper, we use […]
Aug, 18

GPU-Acceleration of In-Memory Data Analytics

Hardware advances strongly influence the database system design. The flattening speed of CPU cores makes many-core accelerators, such as GPUs, a vital alternative to explore for processing the ever-increasing amounts of data. GPUs have a significantly higher degree of parallelism than multi-core CPUs but their cores are simpler. As a result, they do not face […]


HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors
