Posts

Jul, 1

Increasing Deep Neural Network Acoustic Model Size for Large Vocabulary Continuous Speech Recognition

Deep neural networks (DNNs) are now a central component of nearly all state-of-the-art speech recognition systems. Part of the promise of DNNs is their ability to represent increasingly complex functions as the number of DNN parameters increases. This paper investigates the performance of DNN-based hybrid speech recognition systems as DNN model size and training data […]
Jul, 1

The design and verification of Mumax3

We report on the design, verification and performance of mumax3, an open-source GPU-accelerated micromagnetic simulation program. This software solves the time- and space-dependent magnetization evolution in nano- to microscale magnets using a finite-difference discretization. Its high performance and low memory requirements allow for large-scale simulations to be performed in limited time and on […]
Jul, 1

Speedup of Micromagnetic Simulations with C++ AMP On Graphics Processing Units

A finite-difference micromagnetic solver is presented utilizing C++ Accelerated Massive Parallelism (C++ AMP). The high-speed performance of a single Graphics Processing Unit (GPU) is demonstrated in comparison with a typical CPU-based solver. The speed-up of GPU over CPU is shown to be greater than 100 for larger problem sizes. This solver is based […]
Jun, 28

Performance and Efficiency Analysis of Modern Accelerators: Fine-Grained Parallelism on the Intel Xeon Phi

Supercomputers define the pinnacle of computational power and are an essential tool for solving vast scientific computational problems. They employ increasingly parallel architectures to keep raising their nominal peak performance and to solve larger problems. Exploiting this vast amount of computational power is, however, difficult, and optimising for many-core architectures has become […]
Jun, 28

Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor

In this paper we present a detailed study of implementing double-precision matrix-matrix multiplication (DGEMM) on the Intel Xeon Phi Coprocessor. We discuss a DGEMM algorithm implementation running "natively" on the coprocessor, minimizing communication with the host CPU. We will run DGEMM across a range of matrix sizes natively as well as using Intel Math Kernel […]
Jun, 28

Modified Bloom filter for high performance hybrid NoSQL systems

This article addresses the problems of implementing a modified Bloom filter as an additional module for mass data storage systems in supercomputers with hybrid CPU/GPU architecture. It proposes a modified filter with counters, which makes it possible to track not only data addition but also data removal. A comparative analysis has been […]
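
The idea of a filter with counters can be illustrated with a minimal CPU-only sketch. This is not the article's implementation: the class name `CountingBloomFilter`, its parameters, and the SHA-256-based hashing scheme are all assumptions chosen for illustration. Each slot holds a counter rather than a single bit, so removing an item can decrement the same slots that adding it incremented.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative Bloom filter whose slots are counters instead of bits."""

    def __init__(self, num_counters=1024, num_hashes=4):
        # Both sizes are arbitrary illustrative defaults, not values from the article.
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, item):
        # Derive num_hashes slot indexes by salting one SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, item):
        for i in self._indexes(item):
            self.counters[i] += 1

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.counters[i] > 0 for i in self._indexes(item))

    def remove(self, item):
        # Decrement only when the item appears present, so no counter goes negative.
        if not self.might_contain(item):
            return False
        for i in self._indexes(item):
            self.counters[i] -= 1
        return True


cbf = CountingBloomFilter()
cbf.add("key-1")
print(cbf.might_contain("key-1"))
cbf.remove("key-1")
print(cbf.might_contain("key-1"))
```

A plain bit-array Bloom filter cannot support removal, because clearing a bit might also erase evidence of another item that hashed to the same slot; counters avoid this at the cost of more memory per slot.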
Jun, 28

Implementation of the genetic algorithm by means of CUDA technology involved in travelling salesman problem

The research aimed to solve the travelling salesman problem by means of genetic algorithms. The algorithm was implemented using CUDA technology. The research focused on checking how much the system can improve if, instead of classical CPUs, one uses GPUs capable of performing the operations in parallel. […]
Jun, 28

Various String Matching Algorithms for DNA Sequences to Detect Breast Cancer using CUDA Processors

The main aim of a string matching algorithm is to locate occurrences of a specific pattern in a larger text. String matching algorithms have been used in many applications, such as DNA analysis. This report introduces a new string matching approach to detect the occurrence of several gene patterns in […]
Jun, 27

7th International Conference on Computer and Electrical Engineering, ICCEE 2014

Submission Deadline: 2014-07-20 Publication: All accepted papers will be published in the International Journal of Electrical Energy (IJOEE), which will be indexed by Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library and Electronic Journals Digital Library. Call for Papers: Computer Engineering Algorithm Computer Vision, Graphics and Intelligence Computational and Artificial Intelligence Image Processing […]
Jun, 27

Performance Evaluation of Parallel AES Implementations over CUDA GPU Framework

Given the high computational complexity of the AES encryption algorithm, especially for huge volumes of real-time data, the GPU has recently offered an alternative computational system to a traditional CPU (thread), bringing a significant improvement in speeding up computationally intensive parallel data encryption in various respects: a tremendous number of processing cores and a non-generic computational processing architecture […]
Jun, 27

Industrial Robot Collision Handling in Harsh Environments

The focus in this thesis is on robot collision handling systems, mainly collision detection and collision avoidance for industrial robots operating in harsh environments (e.g. potentially explosive atmospheres found in the oil and gas sector). Collision detection should prevent the robot from colliding and therefore avoid a potential accident. Collision avoidance builds on the concept […]
Jun, 27

Computation on GPU of Eigenvalues and Eigenvectors of a Large Number of Small Hermitian Matrices

This paper presents a GPU implementation of the QR-Householder algorithm, used to find all the eigenvalues and eigenvectors of many small Hermitian matrices (double precision) in a very short time, in order to meet the time constraints of radar applications.

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
