high performance computing on graphics processing units: hgpu.org

Posts

Sep, 17

gSLICr: SLIC superpixels at over 250Hz

We introduce a parallel GPU implementation of the Simple Linear Iterative Clustering (SLIC) superpixel segmentation. Using a single graphics card, our implementation achieves speedups of up to 83x over the standard sequential implementation. Our implementation is fully compatible with the standard sequential implementation, and the software is now available online as open source.

CUDA

Sep, 17

Scalable Metropolis Monte Carlo for simulation of hard shapes

We design and implement HPMC, a scalable hard particle Monte Carlo simulation toolkit, and release it open source as part of HOOMD-blue. HPMC runs in parallel on many CPUs and many GPUs using domain decomposition. We employ BVH trees instead of cell lists on the CPU for fast performance, especially with large particle size disparity, […]

CUDA

Sep, 15

linalg: Matrix Computations in Apache Spark

We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark comes with the mllib.linalg library, which provides abstractions and implementations for distributed matrices. Using these abstractions, we highlight the computations that were more challenging to distribute. When translating single-node algorithms to run on a distributed cluster, we observe […]

CUDA

Sep, 15

Refinements in Syntactic Parsing

Syntactic parsing is one of the core tasks of natural language processing, with many applications in downstream NLP tasks, from machine translation and summarization to relation extraction and coreference resolution. Parsing performance on English texts, particularly well-edited newswire text, is generally regarded as quite good. However, state-of-the-art constituency parsers produce incorrect parses for more […]

CUDA

Sep, 15

PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming

Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99, enriched with additional language constructs, that enables compilers to exploit […]

OpenCL

Sep, 15

Efficient Convolutional Neural Networks for Pixelwise Classification on Heterogeneous Hardware Systems

This work presents and analyzes three convolutional neural network (CNN) models for efficient pixelwise classification of images. When convolutional neural networks classify single pixels in patches of a whole image, sliding-window networks carry out many redundant computations. This set of new architectures solves this issue by either […]

CUDA • OpenCL

Sep, 15

A GPU-based Parallel Ant Colony Algorithm for Scientific Workflow Scheduling

The scientific workflow scheduling problem is a combinatorial optimization problem. In real applications, a scientific workflow generally has thousands of task nodes, so scheduling a large-scale workflow incurs a huge computational overhead. In this paper, a parallel algorithm for scientific workflow scheduling is proposed so that the computing speed can be improved greatly. Our method used ant colony […]

CUDA

Sep, 10

5th International Conference on Industrial Technology and Management (ICITM), 2016

Topics: Decision Analysis and Methods; E-Business and E-Commerce; Engineering Economy and Cost Analysis; Engineering Education and Training; Facilities Planning and Management; Global Manufacturing and Management; Human Factors; Information Processing and Engineering; Intelligent Systems; Manufacturing Systems; Operations Research; Production Planning and Control; Project Management; Quality Control and Management; Reliability and Maintenance Engineering; Safety, Security and Risk […]

Sep, 10

5th International Conference on Educational and Information Technology (ICEIT), 2016

Topics: Database Technology; Artificial Intelligence; Computer architecture; Software Engineering; Computer Graphics; Computer Application; Control Technology; Systems Engineering; Service learning; Learning models; Faculty development; Distance Education for Computers; Life-long education; Computer Education for Particular Group; Other Computer Education; Active learning; Computer Education for Graduates; Computer Education for Undergraduates; Network Technology; Communication Technology; Other Advanced Technology; Undergraduate […]

Sep, 10

2nd International Conference on Knowledge (ICK), 2016

Topics: T1 • Novel Algorithms; T2 • Association Rules; T3 • Knowledge engineering and management; T4 • Classification and T5 • Clustering; T6 • Text analysis and text understanding; T7 • Machine Learning; T8 • Privacy Preserving Data Mining; T9 • Statistical Methods; T10 • Parallel and Distributed Data Mining; T11 • Interactive and Online […]

Sep, 10

International Conference on Advances in Mechanical Design (ICAMD), 2016

Submission Methods: Please log in to the Electronic Submission System (.pdf). http://www.easychair.org/conferences/?conf=icamd2016 Paper Publication: Papers accepted by ICAMD 2016 will be published in one of the following publications after the review process. * International Journal of Mechanical Engineering and Robotics Research (ISSN: 2278-0149) Indexing: Index Copernicus, ProQuest, UDL, Google Scholar, Open J-Gate, etc. Call for Papers: Actuator Systems […]

Sep, 10

7th International Conference on Mechatronics and Manufacturing (ICMM), 2016

Submission Methods: Please log in to the Electronic Submission System (.pdf). http://www.easychair.org/conferences/?conf=icmm2016 Paper Publication: Papers accepted by ICMM 2016 will be published in one of the following publications after the review process. * Applied Mechanics and Materials Journal (ISSN: 1660-9336) Indexing: Volumes are submitted for indexing to Elsevier: SCOPUS and Ei Compendex (CPX). Cambridge Scientific Abstracts (CSA), Chemical Abstracts […]