Posts
Dec, 28
Improving the speed of neural networks on CPUs
Recent advances in deep learning have made the use of large, deep neural networks with tens of millions of parameters suitable for a number of applications that require real-time processing. The sheer size of these networks can represent a challenging computational burden, even for modern CPUs. For this reason, GPUs are routinely used instead to […]
Dec, 28
Multilevel Tile Load Map on Massive Terrain Visualization
This paper analyzed the efficient architecture features of massive terrain LOD visualization, and found that the CPU can hardly select tiles from massive terrain effectively, which restricts how large the terrain can grow. Yacine Amara presented the Tile Load Map (TLM). This paper presented the Multilevel Tile Load Map (MTLM) algorithm for tile selection, which extends TLM. MTLM uses 2d […]
Dec, 28
Speeding Up Particle Trajectory Simulations under Moving Force Fields using GPUs
In this paper, we introduce a GPU-based framework for simulating particle trajectories under both static and dynamic force fields. By exploiting the highly parallel nature of the problem and making efficient use of the available hardware, our simulator exhibits a significant speedup over its CPU-based analog. We apply our framework to a specific experimental simulation: […]
Dec, 28
BOPM implemented on a GPU-architecture
We used the Binomial Options Pricing Model (BOPM) implemented on a Graphics Processing Unit (GPU) to calculate the value of European and American options, of both put and call type. The advantage of using a GPU over a CPU is that a GPU has many more processing cores than a CPU and can perform more calculations […]
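For readers unfamiliar with the model, here is a minimal CPU-side sketch of binomial option pricing in Python. It assumes the common Cox-Ross-Rubinstein parameterization (up factor `exp(volatility * sqrt(dt))`), which the excerpt does not confirm, and the function and parameter names are hypothetical. It is not the GPU implementation from the paper; it only shows the backward induction that such an implementation would parallelize across the nodes of each tree level.

```python
import math


def binomial_option_price(spot, strike, rate, volatility, maturity, steps,
                          is_call=True, is_american=False):
    """Price an option on a recombining binomial tree (CRR parameters assumed)."""
    dt = maturity / steps
    up = math.exp(volatility * math.sqrt(dt))
    down = 1.0 / up
    growth = math.exp(rate * dt)
    prob_up = (growth - down) / (up - down)  # risk-neutral up probability
    discount = 1.0 / growth
    sign = 1.0 if is_call else -1.0

    def payoff(step, ups):
        # Exercise value at the node reached after `ups` up-moves in `step` steps.
        price = spot * up ** ups * down ** (step - ups)
        return max(sign * (price - strike), 0.0)

    # Option values at maturity, one entry per terminal node.
    values = [payoff(steps, j) for j in range(steps + 1)]

    # Backward induction: the nodes within one level are independent of each other.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            value = discount * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j])
            if is_american:
                value = max(value, payoff(step, j))  # early exercise check
            values[j] = value
    return values[0]
```

A call such as `binomial_option_price(100.0, 95.0, 0.05, 0.2, 1.0, 200, is_call=False, is_american=True)` prices an American put. The inner loop over `j` has no dependencies within a level, which is the part that maps naturally onto many GPU cores.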
Dec, 28
GPU-Based Global Illumination Using Lightcuts
Global Illumination aims to generate high quality images. But due to its high requirements, it is usually quite slow. Research documented in this thesis was intended to offer a combined hardware and software acceleration solution to global illumination. The GPU (using CUDA) was the hardware part of the whole method that applied parallelism to increase […]
Dec, 28
A Fast Algorithm for Constructing Inverted Files on Heterogeneous Platforms
Given a collection of documents residing on a disk, we develop a new strategy for processing these documents and building the inverted files extremely fast. Our approach is tailored for a heterogeneous platform consisting of a multicore CPU and a highly multithreaded GPU. Our algorithm is based on a number of novel techniques including: (i) […]
Dec, 28
Monitoring Multiple Streams with Dynamic Time Warping using Graphic Processors
In this paper, we present an approach for efficiently monitoring multiple data streams using graphic processor units (GPUs). Given reference patterns, similar subsequences in streams are matched under the dynamic time warping (DTW) distance and reported continuously. DTW distance is adopted since it offers scaling and shifting flexibility in the time axis. However, it suffers […]
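As background, the DTW distance mentioned above can be sketched with the textbook dynamic-programming recurrence. This is a plain sequential Python version with an absolute-difference local cost, both of which are assumptions for illustration; it is not the streaming GPU method the paper describes, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    # prev[j] is the best cost of aligning the processed prefix of `a` with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of the three allowed warping moves.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]
```

Because one element may align with several elements of the other sequence, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length, which is the time-axis flexibility the abstract refers to.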
Dec, 28
Precomputed compressive sensing for light transport acquisition
In this article, we propose an efficient and accurate compressive-sensing-based method for estimating the light transport characteristics of real-world scenes. Although compressive sensing allows the efficient estimation of a high-dimensional signal with a sparse or near-to-sparse representation from a small number of samples, the computational cost of the compressive sensing in estimating the light transport […]
Dec, 28
Efficient parallel lists intersection and index compression algorithms using graphics processing units
Major web search engines answer thousands of queries per second requesting information about billions of web pages. The data sizes and query loads are growing at an exponential rate. To manage the heavy workload, we consider techniques for utilizing a Graphics Processing Unit (GPU). We investigate new approaches to improve two important operations of search […]
Dec, 28
Real-Time Rendering of Temporal Volumetric Data on a GPU
Real-time rendering of static volumetric data is generally known to be a memory- and computation-intensive process. With advances in graphics hardware, especially GPUs, it is now possible to do this using desktop computers. However, with the evolution of real-time CT and MRI technologies, volumetric rendering is an even bigger challenge. The first one […]
Dec, 27
PFAC Library: GPU-based string matching algorithm
The PFAC algorithm efficiently exploits the parallelism of the Aho-Corasick algorithm by creating an individual thread for each byte of an input stream to identify any pattern starting at the thread’s starting position. The number of threads created by the PFAC algorithm is equal to the length of an input stream.
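The one-thread-per-byte idea can be sketched sequentially in Python. The sketch below assumes a trie without failure transitions, so each "thread" simply stops at the first missing transition. The function names are hypothetical and do not reflect the PFAC library's actual API; on a GPU, each iteration over `start` would be an independent thread.

```python
def build_trie(patterns):
    """Build a trie with no failure links; terminal[s] holds a pattern id or None."""
    transitions = [{}]
    terminal = [None]
    for pattern_id, pattern in enumerate(patterns):
        state = 0
        for byte in pattern:
            nxt = transitions[state].get(byte)
            if nxt is None:
                nxt = len(transitions)
                transitions[state][byte] = nxt
                transitions.append({})
                terminal.append(None)
            state = nxt
        terminal[state] = pattern_id
    return transitions, terminal


def match_from(data, start, transitions, terminal):
    """Work of one logical thread: report patterns that begin exactly at `start`."""
    state = 0
    found = []
    for pos in range(start, len(data)):
        state = transitions[state].get(data[pos])
        if state is None:  # no transition: this thread terminates early
            break
        if terminal[state] is not None:
            found.append(terminal[state])
    return found


def pfac_match(data, patterns):
    """Simulate one thread per input byte; returns (start, pattern_id) pairs."""
    transitions, terminal = build_trie(patterns)
    return [(start, pattern_id)
            for start in range(len(data))
            for pattern_id in match_from(data, start, transitions, terminal)]
```

For example, `pfac_match(b"abcab", [b"ab", b"bc", b"abc"])` reports each pattern together with the byte offset at which it starts. Most threads stop after a byte or two, which keeps the per-thread work short.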
Dec, 27
A GPU Accelerated High Performance Cloud Computing Infrastructure for Grid Computing Based Virtual Environmental Laboratory
Numerical models play a major role in the earth sciences, filling the gap between experimental and theoretical approaches. Nowadays, the computational approach is widely recognized as the complement to the scientific analysis. Meanwhile, the huge amount of observed/modelled data, and the need to store, process, and refine them, often makes the use of high […]