
Posts

Jul, 6

LTTng CLUST: A system-wide unified CPU and GPU tracing tool for OpenCL applications

As computation schemes evolve and new tools become available to help programmers improve the performance of their applications, many programmers have started to look towards highly parallel platforms such as graphics processing units (GPUs). Offloading computations that can take advantage of the architecture of the GPU is a technique that has proven fruitful in recent […]
Jul, 6

Hetero-DB: Next Generation High-Performance Database Systems by Best Utilizing Heterogeneous Computing and Storage Resources

With recent advances in hardware technologies, new general-purpose high-performance devices have been widely adopted, such as the graphics processing unit (GPU) and the solid-state drive (SSD). Compared with a multicore CPU, a GPU may offer an order of magnitude higher throughput for applications with massive data parallelism. Moreover, the SSD, a new storage device, is also capable of offering […]
Jul, 6

Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores

In-memory key-value stores play a critical role in data processing by providing high-throughput, low-latency data access. They have several unique properties, including (1) data-intensive operations that demand high memory bandwidth for fast data access, (2) high data parallelism and simple computing operations that demand many slim parallel computing units, and […]
Jul, 3

High Performance Approximate Sort Algorithm Using GPUs

Sorting is a fundamental problem in computer science, and strict sorting usually means arranging elements in strictly ascending or descending order. However, some real-world applications do not require a strict order; an approximately ascending or descending order is sufficient. Graphics processing units (GPUs) have become accelerators for parallel computing. […]
Jul, 3

Modelling the Formation of Ordered Acentrosomal Microtubule Arrays

Acentrosomal microtubules are not bound to a microtubule organising centre yet are still able to form ordered arrays. Two clear examples of this behaviour are the acentrosomal apico-basal (side wall) array in epithelial cells and the parallel organisation of plant cortical microtubules. This research investigates their formation through mathematical modelling and Monte Carlo simulations with […]
Jul, 3

Texture Cache Approximation on GPUs

We present texture cache approximation as a method for using existing hardware on GPUs to eliminate costly global memory accesses. We develop a technique for using a GPU’s texture fetch units to generate approximate values, and argue that this technique is applicable to a wide variety of GPU kernels. Applying texture cache approximation to an […]
Jul, 3

GPU phase-field lattice Boltzmann simulations of growth and motion of a binary alloy dendrite

A GPU code has been developed for a phase-field lattice Boltzmann (PFLB) method that can simulate dendritic growth with motion of solids in a dilute binary alloy melt. The GPU-accelerated PFLB method has been implemented in CUDA C. Equiaxed dendritic growth in a shear flow and under settling conditions has been simulated by […]
Jul, 3

Parallel Sparse Coding for Seafloor Image Analysis

Sparse coding has been a popular learning model in the machine learning field. However, the high computational cost arising from the complexity of the model has seriously hindered its application. To address this, this paper presents a parallel sparse coding method that improves performance by exploiting acceleration technologies such as Intel […]
Jul, 3

2nd International Conference on Artificial Intelligence (ICOAI), 2015

Topics: – Agent-based systems – Ant colony optimization – Approximate reasoning – Artificial immune systems – Artificial Intelligence in modelling and simulation – Artificial Intelligence in scheduling and optimization – Artificial life – Bioinformatics and computational biology – Brain-machine interfaces – Cognitive systems and applications – Collective intelligence – Computer vision – Data mining – […]
Jul, 3

8th International Conference on Computer Science and Information Technology (ICCSIT), 2015

Topics: – Algorithms – Artificial Intelligence – Automated Software Engineering – Bio-informatics – Bioinformatics and Scientific Computing – Biomedical Engineering – Compilers and Interpreters – Computational Intelligence – Computer Animation – Computer Architecture & VLSI – Computer Architecture and Embedded Systems – Computer Based Education – Computer Games – Computer Graphics & Virtual Reality – Computer Graphics and Multimedia – Computer Modeling – Computer Networks – Computer Networks and Data Communication – Computer Security […]
Jul, 1

Buffer overflow vulnerabilities in CUDA: a preliminary analysis

We present a preliminary study of buffer overflow vulnerabilities in CUDA software running on GPUs. We show how an attacker can overrun a buffer to corrupt sensitive data or steer the execution flow by overwriting function pointers, e.g., manipulating the virtual table of a C++ object. In view of a potential mass market diffusion of […]
Jul, 1

Spectral Ewald Acceleration of Stokesian Dynamics for polydisperse suspensions

In this work we develop the Spectral Ewald Accelerated Stokesian Dynamics (SEASD), a novel computational method for dynamic simulations of polydisperse colloidal suspensions with full hydrodynamic interactions. SEASD is based on the framework of Stokesian Dynamics (SD) with extension to compressible solvents, and uses the Spectral Ewald (SE) method [Lindbo & Tornberg, J. Comput. Phys. […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
