Posts
Aug, 28
A Parallel Algorithm to Test Chordality of Graphs
We present a simple parallel algorithm to test chordality of graphs which is based on the parallel Lexicographical Breadth-First Search algorithm. In total, the algorithm takes O(N) time on an N-thread machine and performs O(N^2) work, where N is the number of vertices in the graph. Our implementation of the algorithm uses a GPU environment […]
Aug, 27
CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm
This paper presents a practical GPU-accelerated convex hull algorithm and a novel Sorting-based Preprocessing Approach (SPA) for planar point sets. The proposed algorithm consists of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of calculating the expected convex hull on the CPU. We first discard the interior points […]
Aug, 27
gScan: Accelerating Graham Scan on the GPU
This paper presents a fast implementation of the Graham scan on the GPU. The proposed algorithm is composed of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of finding the convex hull on the CPU. We first discard the interior points that lie inside a quadrilateral formed by […]
Aug, 27
Adaptive Multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model
The study of disordered spin systems through Monte Carlo simulations has proven to be a hard task due to the adverse energy landscape present in the low-temperature regime, making it difficult for the simulation to escape from a local minimum. Replica-based algorithms such as the Exchange Monte Carlo (also known as parallel tempering) […]
Aug, 27
Accelerated Deep Learning using Intel Xeon Phi
Deep learning, a sub-topic of machine learning inspired by biology, has attracted wide attention in industry and the research community recently. State-of-the-art applications in the area of computer vision and speech recognition (among others) are built using deep learning algorithms. In contrast to traditional algorithms, where the developer fully instructs the application what to do, […]
Aug, 27
MemcachedGPU: Scaling-up Scale-out Key-value Stores
This paper tackles the challenges of obtaining more efficient data center computing while maintaining low latency, low cost, programmability, and the potential for workload consolidation. We introduce GNoM, a software framework enabling energy-efficient, latency bandwidth optimized UDP network and application processing on GPUs. GNoM handles the data movement and task management to facilitate the development […]
Aug, 24
First International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC’15), 2015
With Exascale systems on the horizon at the same time that conventional von Neumann architectures are suffering from rising power densities, we are facing an era with power, energy-efficiency, and cooling as first-class constraints for scalable HPC. FPGAs can tailor the hardware to the application, avoiding overheads of general-purpose architectures, for example, through customized datapaths and memory […]
Aug, 24
Performance Evaluations of Document-Oriented Databases using GPU and Cache Structure
Document-oriented databases are popular databases in which users can store their documents in a schema-less manner and run search queries over them. They have been widely used for web applications that process large collections of documents because of their high scalability and rich functions. One of the major functions of document-oriented databases is a string […]
Aug, 24
Viability of Feature Detection on Sony Xperia Z3 using OpenCL
CONTEXT: Embedded platform GPUs are reaching a level of performance comparable to desktop hardware. Therefore, it becomes interesting to apply Computer Vision techniques to modern smartphones. The platform holds different challenges, as energy use and heat generation can be an issue depending on load distribution on the device. OBJECTIVES: We evaluate the viability of a feature […]
Aug, 24
Scheduling for new computing platforms with GPUs
More and more computers use hybrid architectures combining multi-core processors (CPUs) and hardware accelerators like GPUs (Graphics Processing Units). These hybrid parallel platforms require new scheduling strategies. This work is devoted to a characterization of this new type of scheduling problem. The most studied objective in this work is the minimization of the makespan, which […]
Aug, 24
Source-to-Source Automatic Program Transformations for GPU-like Hardware Accelerators
Since the beginning of the 2000s, the raw performance of processors has stopped increasing exponentially. Modern graphics processing units (GPUs) have been designed as arrays of hundreds or thousands of compute units. The GPUs’ compute capacity quickly led them to be diverted from their original target to be used as accelerators for general purpose […]
Aug, 24
Semi-Global Filtering of Airborne LiDAR Data for Fast Extraction of Digital Terrain Models
Automatic extraction of ground points, called filtering, is an essential step in producing Digital Terrain Models from airborne LiDAR data. Scene complexity and computational performance are two major problems that should be addressed in filtering, especially when processing large point cloud data with diverse scenes. This paper proposes a fast and intelligent algorithm called Semi-Global […]