Posts
Sep, 9
Improvements to Physically Based Cloth Simulation
Physically based cloth simulation in computer graphics has come a long way since the 1980s. Although extensive methods have been developed, physically based cloth animation remains challenging in a number of aspects, including the efficient simulation of complex internal dynamics, better performance and the generation of more effects of friction in collisions, to name but […]
Sep, 9
A Validation Testsuite for OpenACC 1.0
Directive-based programming models provide a high level of abstraction, thus hiding complex low-level details of the underlying hardware from the programmer. One such model is OpenACC, a portable programming model that allows programmers to write applications that offload portions of work from a host CPU to an attached accelerator (GPU or a similar device). The […]
Sep, 9
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing
The past years have witnessed many dedicated open-source projects that build and maintain implementations of Support Vector Machines (SVM), parallelized for GPUs, multi-core CPUs and distributed systems. Up to this point, no comparable effort has been made to parallelize the Elastic Net, despite its popularity in many high impact applications, including genetics, neuroscience and systems […]
Sep, 9
Convex Clustering: An Attractive Alternative to Hierarchical Clustering
The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical […]
Sep, 8
Professional CUDA C Programming
Designed for professionals across multiple industrial sectors, Professional CUDA C Programming presents CUDA — a parallel computing platform and programming model designed to ease the development of GPU programming — fundamentals in an easy-to-follow format, and teaches readers how to think in parallel and implement parallel algorithms on GPUs. Each chapter covers a specific topic, […]
Sep, 8
Document Classification Using KNN on GPU
Real-time and archival document data are now increasing as fast as or faster than computing power. Document classification using the k-NN classification algorithm takes a long time searching for nearest neighbors in a large training dataset, as it involves a large number of computations. The time for classification increases in proportion to the number of documents. Therefore it […]
Sep, 8
Sparse array representations and some selected array operations on GPUs
A multi-dimensional data model provides a good conceptual view of the data in data warehousing and On-Line Analytical Processing (OLAP). A typical representation of such a data model is a multi-dimensional array, which is well suited when the array is dense. If the array is sparse, i.e., has only a few non-zero elements […]
Sep, 8
A CUDA Back-End for the Equelle Compiler
As parallel and heterogeneous computing becomes more and more a necessity for implementing high performance simulators, it becomes increasingly hard for scientists and engineers without experience in high performance computing to achieve good performance. Even for those who know how to write efficient code, the process of doing so is time-consuming and error-prone, […]
Sep, 8
Accelerated Combinatorial Optimization using Graphics Processing Units and C++ AMP
In the course of less than a decade, Graphics Processing Units (GPUs) have evolved from narrowly scoped application specific accelerators to general-purpose parallel machines capable of accommodating an ever-growing set of algorithms. At the same time, programming GPUs appears to have become trapped around an attractor characterised by ad-hoc practices, non-portable implementations and inexact, uninformative […]
Sep, 5
Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models
Radiative transfer (RT) calculations are among the most computationally expensive components of global and regional weather and climate models, and radiation codes are therefore ideal candidates for applying techniques to improve the overall efficiency of such models. In many general circulation models (GCMs), a physically based radiation calculation can require as much as 30-50 percent […]
Sep, 5
Acquisition Method of Spread Spectrum Signals Based on GPU Acceleration
To meet the strict requirements on acquisition time under conditions of high dynamics and low CNR, an acquisition method for spread spectrum signals based on GPU acceleration is put forward, combining spread spectrum signal acquisition with GPU parallel computing. Taking the frequency domain parallel acquisition algorithm based on FFT as the […]
Sep, 5
Dymaxion++: A Directive-based API to Optimize Data Layout and Memory Mapping for Heterogeneous Systems
There has been a growing trend in using heterogeneous systems with CPUs and GPUs to solve diverse compute problems. However, high application performance on these platforms relies on efficient memory accesses. For many applications, CPUs and GPUs prefer different memory mappings and data-structure layouts. This in turn requires developers to use device-specific strategies for memory […]

