Posts
Mar, 18
Fast Radix Sort for Sparse Linear Algebra on GPU
Fast sorting is an important step in many parallel algorithms that require data ranking, ordering, or partitioning. Parallel sorting is a widely researched subject, and many algorithms have been developed. In this paper, the focus is on implementing highly efficient sorting routines for sparse linear algebra operations, such as parallel sparse matrix […]
Mar, 14
Heterogeneous Acceleration of Volumetric JPEG 2000
We present the implementation of a volumetric JPEG 2000 codec as a real-world use case of software acceleration with GPUs and multi-core CPUs. We present a generic methodology to accelerate existing code written in C with OpenCL. Furthermore, we account for the volumetric nature of the processed data and formulate associated optimization guidelines. The resulting […]
Mar, 14
EmoNets: Multimodal deep learning approaches for emotion recognition in video
The task of the Emotion Recognition in the Wild (EmotiW) Challenge is to assign one of seven emotions to short video clips extracted from Hollywood-style movies. The videos depict acted-out emotions under realistic conditions with a large degree of variation in attributes such as pose and illumination, making it worthwhile to explore approaches which […]
Mar, 14
Parallel Statistical Multi-resolution Estimation
We discuss several strategies to implement Dykstra’s projection algorithm on NVIDIA’s compute unified device architecture (CUDA). Dykstra’s algorithm is the central step in, and the computationally most expensive part of, statistical multi-resolution methods. It projects a given vector onto the intersection of convex sets. Compared with a CPU implementation, our CUDA implementation is one order […]
Mar, 14
HELIOS-K: An Ultrafast, Open-source Opacity Calculator for Radiative Transfer
We present an ultrafast opacity calculator for application to exoplanetary atmospheres, which we name HELIOS-K. It takes a line list as an input, computes the shape of each spectral line (e.g., a Voigt profile) and provides an option for grouping an enormous number of lines into a manageable number of bins. We implement a combination […]
Mar, 14
Accelerating DEM simulations on GPUs by reducing the impact of warp divergences
A way to accelerate DEM calculations on GPUs is developed. We examined how warp divergences arise in contact detection and force calculations, taking the GPU architecture into account. We then showed a strategy to reduce the impact of warp divergences on the runtime of the DEM force calculations.
Mar, 14
2nd International Conference on Multimedia and Communication Technologies (ICMCT2015), 2015
2015 2nd International Conference on Multimedia and Communication Technologies (ICMCT2015). Date: September 19-20, 2015. Venue: Hong Kong. Organized by American Society for Research (ASR). http://www.icmct.org/ Submission Deadline: 2015-06-05. Topics: Hardware & Software for Multimedia Systems; Enabling Technologies for Multimedia; Multimedia Applications; Consumer Systems and Networks; Speech and Audio Processing; Image and Video Processing; Applied Signal Processing; Communication […]
Mar, 14
7th International Conference on Software Technology and Engineering (ICSTE 2015), 2015
2015 7th International Conference on Software Technology and Engineering (ICSTE 2015). Date: September 19-20, 2015. Venue: Hong Kong. Organized by American Society for Research (ASR). http://www.icste.org/ Submission Deadline: 2015-06-05. Topics: AI and Knowledge based software engineering; Object-Oriented Technology; Artificial Intelligence; Parallel and Distributed Computing; Aspect-orientation and feature interaction; Patterns and frameworks; Business Process Reengineering & Science; Process […]
Mar, 14
4th International Conference on Image, Vision and Computing (ICIVC 2015), 2015
2015 4th International Conference on Image, Vision and Computing (ICIVC 2015). http://www.icivc.org/ Date: September 19-20, 2015. Venue: Hong Kong. Submission Deadline: 2015-06-05. Topics: Image acquisition; Detection and Estimation of Signal Parameters; Image processing; Signal Identification; Medical image processing; Nonlinear Signals and Systems; Pattern recognition and analysis; Time-Frequency Signal Analysis; Visualization; Signal Reconstruction; Image coding and […]
Mar, 12
GPGPU Performance and Power Estimation Using Machine Learning
Graphics Processing Units (GPUs) have numerous configuration and design options, including core frequency, number of parallel compute units (CUs), and available memory bandwidth. At many stages of the design process, it is important to estimate how application performance and power are impacted by these options. This paper describes a GPU performance and power estimation model […]
Mar, 12
Implementing Machine Learning Algorithms on GPUs for Real-Time Traffic Sign Classification
This paper investigates traffic sign classification, which is an important problem to solve for autonomous driving. Linear discriminant analysis and convolutional neural networks achieved accuracies of 98.25% and 98.75%, respectively, when classifying eight different types of traffic signs. The CNN was implemented on a GPU for real-time traffic sign classification: testing time for the […]
Mar, 12
CUDA accelerated large scale vehicular area network simulator
Both the size and the computational activity of Vehicular Area Networks (VANETs) are growing. Simulation of VANETs requires not only the simulation of network standards, but also the mobility of nodes. Such a dynamic system involves computation of node distance, routing protocols, application layer, data send, data receive, etc. The simulation model of a VANET requires both hardware and […]