
Posts

May, 29

Optimization of solver for gas flow modeling

The main purpose of this work is to optimize a solver for rarefied gas flow modeling based on the Boltzmann equation. The optimization method relies on SIMD extensions for x86 processors. The computational code is profiled and manually optimized with SSE instructions. Heat flow, shock waves, and a Knudsen pump are modeled with the optimized solver. Dependencies of […]
May, 29

Simulating of query processing on multiprocessor database systems with modern coprocessors

Modern manycore coprocessors and GPUs demonstrate very high performance on certain problems. Recent research has shown that these coprocessors can be used to accelerate database operations. However, to the best of our knowledge, there is little prior work on using coprocessors in multiprocessor database systems. This paper focuses on evaluation of database […]
May, 29

Performance Evaluation of Quicksort with GPU Dynamic Parallelism for Gene-Expression Quantile Normalization

High-density oligonucleotide microarrays allow several million genetic markers to be observed in a single experiment. Current bioinformatics tools for gene-expression quantile data normalization are unable to process such huge data sets. At the same time, the huge volume of molecular data produced by current high-throughput technologies in modern molecular biology has increased at […]
May, 29

Heterogeneous Computing for Solving System of the Linear Equations by the Conjugate Gradient Method

The main purpose of this work is to show the advantages of various approaches to heterogeneous programming. The results were obtained by solving a system of linear equations with the conjugate gradient method. High-level and low-level technologies (OpenACC and CUDA, respectively) were used to accelerate computations on the GPU. The […]
May, 29

SMCGen: Generating Reconfigurable Design for Sequential Monte Carlo Applications

The Sequential Monte Carlo (SMC) method is a simulation-based approach to compute posterior distributions. SMC methods often work well on applications considered intractable by other methods due to high dimensionality, but they are computationally demanding. While SMC has been implemented efficiently on FPGAs, design productivity remains a challenge. This paper introduces a design flow for […]
May, 28

3rd International Conference on Image, Vision and Computing ICIVC 2014

Submission Deadline: 2014-06-30. Publication: The proceedings of this conference will be published, included in the SPIE Digital Library, and indexed by Ei Compendex and Thomson ISI. Call for Papers: Image acquisition; Image processing; Medical image processing; Pattern recognition and analysis; Visualization; Image coding and compression; Face recognition; Super-resolution imaging; Image segmentation; Face […]
May, 28

6th International Conference on Software Technology and Engineering ICSTE 2014

Submission Deadline: 2014-06-30. Publication: All accepted papers will be published in the volume of Lecture Notes on Software Engineering (LNSE) (ISSN: 2301-3559) and will be included in the DOAJ, Electronic Journals Library, Engineering & Technology Digital Library, EBSCO, Ulrich’s Periodicals Directory, International Computer Science Digital Library (ICSDL), ProQuest, and Google Scholar. Call for Papers: AI […]
May, 26

GASPP: A GPU-Accelerated Stateful Packet Processing Framework

Graphics processing units (GPUs) are a powerful platform for building high-speed network traffic processing applications using low-cost hardware. Existing systems tap the massively parallel architecture of GPUs to speed up certain computationally intensive tasks, such as cryptographic operations and pattern matching. However, they still suffer from significant overheads due to critical-path operations that are still […]
May, 26

Automatic Termination Analysis for GPU Kernels

We describe a method for proving termination of massively parallel GPU kernels. An implementation in KITTeL is able to show termination of 94% of the 598 kernels in our benchmark suite.
May, 26

Efficient Algorithm for RSA Text Encryption Using CUDA-C

Modern-day computer security relies heavily on cryptography as a means to protect the data that we have become increasingly reliant on. The main research question in the computer security domain is how to improve the speed of the RSA algorithm. The computing capability of the graphics processing unit (GPU), used as a co-processor of the CPU, can be leveraged for massive parallelism. This paper […]
May, 26

A Parallel Jacobi-Type Lattice Basis Reduction Algorithm

This paper describes a parallel Jacobi method for lattice basis reduction and a GPU implementation using CUDA. Our experiments have shown that the parallel implementation is more than fifty times as fast as the serial counterpart, which is twice as fast as the well-known LLL lattice reduction algorithm.
May, 26

GPU Passthrough Performance: A Comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL Applications

As more scientific workloads are moved into the cloud, the need for high performance accelerators increases. Accelerators such as GPUs offer improvements in both performance and power efficiency over traditional multi-core processors; however, their use in the cloud has been limited. Today, several common hypervisors support GPU passthrough, but their performance has not been systematically […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
