Posts
Oct, 8
CUTE solutions for two-point correlation functions from large cosmological datasets
With the advent of new large galaxy surveys, which will produce enormous datasets with hundreds of millions of objects, new computational techniques are necessary to extract any two-point statistic from them, since the computational time grows with the square of the number of objects to be correlated. Fortunately technology now provides multiple […]
Oct, 6
GPGPU accelerated optimization method of Interconnection Network Topology
The optimization of the irregular interconnection network of multiprocessor systems with distributed memory is an NP-complete problem and is generally a compute-intensive process. Graphics processing units provide large computational power at a very low price, allowing fine-grained parallelism. This work investigates the use of the GPU in the parallelisation of the […]
Oct, 6
Techniques for Mapping Synthetic Aperture Radar Processing Algorithms to Multi-GPU Clusters
This paper presents a design for parallel processing of synthetic aperture radar (SAR) data using multiple Graphics Processing Units (GPUs). Our approach supports real-time reconstruction of a two-dimensional image from a matrix of echo pulses and their response values. Key to runtime efficiency is a partitioning scheme that divides the output image into tiles and […]
Oct, 6
Architectural explorations for streaming accelerators with customized memory layouts
The basic concept behind the architecture of a general purpose CPU core conforms well to a serial programming model. The integration of more cores on a single chip helped CPUs in running parts of a program in parallel. However, the utilization of huge parallelism available from many high performance applications and the corresponding data is […]
Oct, 6
Visualization of large multidimensional data sets by using multi-core CPU, GPU and MPI cluster
Multidimensional scaling (MDS) is a very popular and reliable method used in feature extraction and visualization of multidimensional data. The role of MDS is to reconstruct the topology of an original N-dimensional feature space consisting of M feature vectors in a target 2-D (3-D) Euclidean space. It can be achieved by minimization of the error – […]
Oct, 6
An Efficient GPU Implementation of Modified Discrete Cosine Transform Using CUDA
A new method is presented in this paper for using general purpose programming tools of graphics processing units. It aims to calculate the modified discrete cosine transform in audio coding and compression algorithms for popular audio formats such as MP3, AAC/AC-3, and WMA. The proposed algorithm consists of matrix multiplications that are performed by the […]
Oct, 5
Accelerated protein structure comparison using TM-score-GPU
MOTIVATION: Accurate comparisons of different protein structures play important roles in structural biology, structure prediction and functional annotation. The root-mean-square-deviation (RMSD) after optimal superposition is the predominant measure of similarity due to the ease and speed of computation. However, global RMSD is dependent on the length of the protein and can be dominated by divergent […]
Oct, 5
Using Commodity Coprocessors for Host Intrusion Detection
The ever-rising importance of communication services and devices emphasizes the significance of intrusion detection. Besides general network attacks, private hosts in particular are within the focus of cyber criminals. Private data theft and the integration of individual hosts into large-scale botnets are two common purposes for which successfully subverted systems are used. In order to detect […]
Oct, 5
Is the game worth the candle? Evaluation of OpenCL for object detection algorithm optimization
In this paper we present our experiences with the implementation of an object detector using OpenCL. With this implementation we fulfill the need for fast and robust object detection, necessary in many applications in multiple domains (surveillance, traffic, image retrieval, …). The algorithm lends itself to be implemented in a parallel way. We exploit this […]
Oct, 5
An ultrasonic imaging system based on a new SAFT approach and a GPU beamformer
The design of newer ultrasonic imaging systems attempts to obtain low-cost, small-sized devices with reduced power consumption that are capable of reaching high frame rates with high image quality. In this regard, synthetic aperture techniques have been very useful. They reduce hardware requirements and accelerate information capture. However, the beamforming process is still very slow, […]
Oct, 5
Parallel Algorithm of IDCT with GPUs and CUDA for Large-scale Video Quality of 3G
When video is transmitted over 3G networks, the video quality might suffer from impairments caused by packet losses. Extracting video quality features involves a set of algorithms, and the inverse discrete cosine transform is an important algorithm in this field. To improve the performance and be suitable for evaluating the 3G video quality in […]
Oct, 4
Redefining the Role of the CPU in the Era of CPU-GPU Integration
GPU computing has emerged as a viable alternative to CPUs for throughput oriented applications or regions of code. Speedups of 10 to 100x over CPU implementations have been reported. This trend is expected to continue in the future with GPU architectural advances, improved programming support, scaling, and tighter CPU-GPU chip integration. However, not all code […]