Posts

Jul, 17

DeepFont: Identify Your Font from An Image

As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem, and advance the state-of-the-art remarkably by developing the DeepFont system. First of all, we build up the […]
Jul, 17

Overhauling SC atomics in C11 and OpenCL

Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A […]
Jul, 15

Performance analysis of GPGPU and CPU On AES Encryption

Advancements in computing have led to a tremendous increase in the amount of data generated every minute, which must be stored or transferred while maintaining a high level of security. The military and armed forces today rely heavily on computers to store huge amounts of important and secret data, that holds a big deal for […]
Jul, 15

Automatic Optimization of Thread Mapping for a GPGPU Programming Framework

Although General Purpose computation on Graphics Processing Units (GPGPU) is widely used for high-performance computing, standard programming frameworks such as CUDA and OpenCL are still difficult to use. They require low-level specifications, and hand-optimization is a large burden. Therefore, we are developing an easier framework named MESI-CUDA. Based on a virtual shared memory model, […]
Jul, 15

Concurrent Manipulation of Dynamic Data Structures in OpenCL

With the emergence of general purpose GPU (GPGPU) programming, concurrent data processing of large arrays of data has gained a significant boost in performance. However, due to the memory architecture between the host and GPU device and other limitations in the instructions available on GPUs, the implementation of dynamic data structures, like linked list and […]
Jul, 15

Recovering Historical Climate Records using Artificial Neural Networks in GPU

This article presents a parallel implementation of Artificial Neural Networks over Graphics Processing Units, and its application for recovering historical climate records from the Digi-Clima project. Several strategies are introduced to handle large volumes of historical pluviometer records, and the parallel deployment is described. The experimental evaluation demonstrates that the proposed approach is useful for […]
Jul, 15

Adaptive parallelism mapping in dynamic environments using machine learning

Modern-day hardware platforms are parallel and diverse, ranging from mobiles to data centers. Mainstream parallel applications execute in the same system, competing for resources. This resource contention may lead to a drastic degradation in a program’s performance. In addition, the execution environment, composed of workloads and hardware resources, is dynamic and unpredictable. Efficient matching […]
Jul, 15

6th International Conference on Applied Physics and Mathematics (ICAPM 2016), 2016

Submission Methods: Please log in to the Electronic Submission System (.pdf): http://www.easychair.org/conferences/?conf=icapm2016 Paper Publication: Papers accepted by ICAPM 2016 will be published in one of the following publications after the review process. * International Journal of Engineering and Technology (IJET, ISSN: 1793-8236) Indexing: Chemical Abstracts Services (CAS), DOAJ, Engineering & Technology Digital Library, Google Scholar, Ulrich Periodicals Directory, […]
Jul, 13

GPU-accelerated discontinuous Galerkin methods on hybrid meshes

We present a time-explicit discontinuous Galerkin (DG) solver for the time-domain acoustic wave equation on hybrid meshes containing vertex-mapped hexahedral, wedge, pyramidal and tetrahedral elements. Discretely energy-stable formulations are presented for both Gauss-Legendre and Gauss-Legendre-Lobatto (Spectral Element) nodal bases for the hexahedron. Stable timestep restrictions for hybrid meshes are derived by bounding the spectral radius […]
Jul, 13

A Study of Data Partitioning on OpenCL-based FPGAs

Much research effort has been devoted to accelerating relational database applications on FPGAs, due to their high energy efficiency and high throughput. Most of the existing studies are based on hardware description languages (HDLs). Recently, FPGA vendors have started to develop OpenCL SDKs for much better programmability. In this paper, we investigate the […]
Jul, 13

Evaluating the capabilities of the Xeon Phi platform in the context of software-only, thread-level speculation

Intel Xeon Phi accelerators are one of the newest devices used in the field of parallel computing. However, there are comparatively few studies concerning their performance when using most of the existing parallelization techniques. One of them is thread-level speculation, a technique that optimistically tries to extract parallelism of loops without the need of a […]
Jul, 13

Optimization, Specification and Verification of the Prefix Sum Program in an OpenCL Environment

The Prefix Sum is an algorithm used as a building block for various other algorithms, for example radix sort, quicksort and lexically comparing strings. Implementing the Prefix Sum algorithm on the CPU is trivial, but a parallel approach with OpenCL is more complicated. An implementation in OpenCL has been made, and optimized to minimize branch […]
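
To illustrate why the excerpt above calls the CPU version trivial and the parallel version harder, here is a minimal Python sketch. It is not the authors' OpenCL implementation, which the excerpt does not show. The function names are hypothetical, and the second function uses a Hillis-Steele style scan as an assumed stand-in for a data-parallel approach: each pass of its inner loop corresponds to work that separate GPU work-items could do at the same time.

```python
def inclusive_scan_sequential(values):
    """Sequential inclusive prefix sum: one running total, one pass."""
    total = 0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


def inclusive_scan_stepwise(values):
    """Hillis-Steele style inclusive scan, simulated on the CPU.

    Every pass adds the element `offset` positions to the left, and the
    offset doubles each pass, so about log2(n) passes are needed. Within
    one pass all updates are independent, which is what a parallel
    device could exploit.
    """
    out = list(values)
    offset = 1
    while offset < len(out):
        # Read from a snapshot so no update sees a value written in this
        # same pass (a GPU version would need a barrier or double buffer).
        prev = out[:]
        for i in range(offset, len(out)):
            out[i] = prev[i] + prev[i - offset]
        offset *= 2
    return out


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(inclusive_scan_sequential(data))
    print(inclusive_scan_stepwise(data))
```

The stepwise version performs more additions in total than the sequential one. That extra work, together with the synchronization between passes, is one reason a parallel scan needs more care than the single-loop CPU version.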

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org