Posts

Sep, 19

GPU-based NSEC3 Hash Breaking

When a client queries for a non-existent name in the Domain Name System (DNS), the server responds with a negative answer. With the DNS Security Extensions (DNSSEC), the server can use either NSEC or NSEC3 for authenticated negative answers. NSEC3 claims to protect DNSSEC servers against domain enumeration, but incurs significant CPU and bandwidth overhead. […]
Sep, 19

Automated Software Testing of Memory Performance in Embedded GPUs

Embedded and real-time software is often constrained by several temporal requirements. Therefore, it is important to design embedded software that meets the required performance goal. The inception of embedded graphics processing units (GPUs) brings fresh hope for developing high-performance embedded software that was previously not suitable for embedded platforms. Whereas GPUs use massive parallelism to […]
Sep, 19

GPU accelerated Monte Carlo simulation of Brownian motors dynamics with CUDA

This work presents an updated and extended guide to methods for properly accelerating the Monte Carlo integration of stochastic differential equations on commonly available NVIDIA Graphics Processing Units using the CUDA programming environment. We outline the general aspects of scientific computing on graphics cards and demonstrate them with two models of […]
Sep, 17

Smart Multi-Task Scheduling for OpenCL Programs on CPU/GPU Heterogeneous Platforms

Heterogeneous systems consisting of multiple CPUs and GPUs are increasingly attractive as platforms for high performance computing. Such platforms are usually programmed using OpenCL, which provides program portability by allowing the same program to execute on different types of device. As such systems become more mainstream, they will move from application-dedicated devices to platforms […]
Sep, 17

GPU Accelerated Parallel Iris Segmentation

A biometric system provides automatic identification of an individual based on a unique feature or characteristic possessed by the person. Iris recognition systems are the most definitive biometric systems, since complex random iris patterns are unique to each individual and do not change with time. Iris recognition is basically divided into three steps, namely, Iris […]
Sep, 17

High Performance Client-Side Web Programming with SPOC and Js_of_ocaml

We present WebSpoc, an OCaml GPGPU library targeting web applications that is built upon SPOC and js_of_ocaml. SPOC is an OCaml GPGPU library focusing on abstracting memory transfers, handling GPGPU computations and offering easy portability. Js_of_ocaml is the OCaml byte-code to JavaScript compiler. Thus, WebSpoc provides high performance computations from the web browser while benefiting […]
Sep, 17

GPU Implementation of a Deep Learning Network for Financial Prediction

Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. Neural networks are a well-known branch of machine learning and have been used extensively by researchers for data prediction, and the prediction accuracy depends upon fine-tuning for the particular financial data. In this paper […]
Sep, 17

Using hybrid GPU/CPU kernel splitting to accelerate spherical convolutions

We present a general method for accelerating by more than an order of magnitude the convolution of pixelated functions on the sphere with a radially-symmetric kernel. Our method splits the kernel into a compact real-space component and a compact spherical harmonic space component. These components can then be convolved in parallel using an inexpensive commodity […]
Sep, 16

The GPUVerify Method: a Tutorial Overview

I present a tutorial overview demonstrating the key technique used by GPUVerify, a static verification tool for graphics processing unit (GPU) kernels. The technique is a method for translating a massively parallel GPU kernel into a sequential program such that correctness of the sequential program implies data race-freedom of the parallel kernel.
Sep, 16

Distance Threshold Similarity Searches on Spatiotemporal Trajectories using GPGPU

The processing of moving object trajectories arises in many application domains. We focus on a trajectory similarity search, the distance threshold search, which finds all trajectories within a given distance of a query trajectory over a time interval. A multithreaded CPU implementation that makes use of an in-memory R-tree index can achieve high parallel efficiency. […]
Sep, 16

High-accuracy Optimization by Parallel Iterative Discrete Approximation and GPU Cluster Computing

High-accuracy optimization is the key component of time-sensitive applications in computer science, such as machine learning, and we developed single-GPU Iterative Discrete Approximation Monte Carlo Optimization (IDA-MCS) and multi-GPU IDA-MCS in our previous research. However, because of the memory capacity constraints of GPUs in a workstation, single-GPU IDA-MCS and multi-GPU IDA-MCS may be in low […]
Sep, 16

Performance analysis of a 240 thread tournament level MCTS Go program on the Intel Xeon Phi

In 2013 Intel introduced the Xeon Phi, a new parallel co-processor board. The Xeon Phi is a cache-coherent many-core shared memory architecture claiming CPU-like versatility, programmability, high performance, and power efficiency. The first published micro-benchmark studies indicate that many of Intel’s claims appear to be true. The current paper is the first study on the […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org