
Posts

Feb, 12

Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks

Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show how dynamic programming can speed up the process by orders of magnitude, even when max-pooling layers are […]
Feb, 12

Seismic Attributes Extraction Based on GPU

In oil and gas exploration, seismic data provide information about the earth’s subsurface structure and help detect where oil can be found and recovered. Building a geological model of the earth requires complex iterative processing. So, the need for computing power increases with the oil and gas exploration and […]
Feb, 12

Implementing an architecture for efficient network traffic processing on modern graphics hardware

Network traffic processing is necessary in order to develop active components in the infrastructure of the network, such as routers, or passive applications, such as network intrusion detection systems. However, in today’s high-speed network links this has become a very challenging task in terms of computational resources. Custom hardware appliances that can handle high packet […]
Feb, 12

Accelerated Wide Baseline Matching using OpenCL

Wide baseline matching is the state of the art for object recognition and image registration problems in computer vision. Robust feature descriptors can give vast improvements in the quality and speed of subsequent steps, but intensive computation is still required. With the release of general purpose parallel computing interfaces, opportunities for increases in performance arise. […]
Feb, 12

Extending the Computational Application of Reaction-Diffusion Chemistry by Modelling Artificial Neural Networks

There is a huge computational potential in unconventional computing paradigms such as reaction-diffusion chemistry. The main problem with unconventional systems is the inherent difficulty in programming them. By extending the computational application of reaction-diffusion systems, this problem may be alleviated, as every new application allows for another method of approaching problems. With the central nervous […]
Feb, 9

Adaptation of the MapReduce programming framework to compute-intensive data-analytics kernels

Compute-intensive data-analytic (CIDA) applications have become a major component of many different business domains, as well as scientific computing applications. These algorithms stem from domains as diverse as web analysis and social networks, machine learning and data mining, text analysis, bio-informatics, astronomy image analysis, business analytics, large scale graph algorithms, image/video processing and recognition, some […]
Feb, 9

Distributed multi-node, multi-GPU, heterogeneous system for 3D image reconstruction in Electrical Capacitance Tomography – network performance and application analysis

3D ECT poses many challenging computational issues, as image reconstruction requires many basic linear algebra operations, especially when the solutions are based on the Finite Element Method. To reach real-time reconstruction, a 3D ECT computational subsystem has to be able to transform capacitance data into an image in fractions of […]
Feb, 9

A multi-lane traffic simulation model via continuous cellular automata

Traffic models based on cellular automata have high computational efficiency because of their simplicity in describing unrealistic vehicular behavior and the suitability of cellular automata for parallel processing. On the other hand, other microscopic traffic models such as car-following models are computationally more expensive, but they have more realistic driver behaviors […]
Feb, 9

Practical Patient-Specific Cardiac Blood Flow Simulations Using SPH

While recent developments in the field of ventricular blood flow simulations have pushed modeling to increasingly high levels of accuracy, there has been a steep cost in computation time. Current state-of-the-art simulators take days to run, which is impractical for use in a clinical setting. In this paper, we describe novel adaptations of the SPH […]
Feb, 9

Enabling Inter-Machine Parallelism in High-Level Languages with SEJITS and MapReduce

Selective, embedded, just-in-time specialization (SEJITS) is a technique for optimizing embedded domain-specific languages through the use of specializers, or code modules developed by expert programmers that target particular accelerators such as multicore processors and GPUs via just-in-time compilation. We extend SEJITS to exploit inter-machine parallelism by targeting clusters of machines via MapReduce. Our work enables […]
Feb, 9

Fast 3D Wavelet Transform on Multicore and Manycore Computing Platforms

The three-dimensional wavelet transform (3D-DWT) has attracted the attention of the research community, above all in areas such as video watermarking, compression of volumetric medical data, multispectral image coding, 3D model coding and video coding. In this work, we present several strategies to speed up the 3D-DWT computation through multicore processing. An in-depth analysis about […]
Feb, 9

Document Stream Clustering using GPUs

The Web is constantly generating streams of textual information in the form of news articles and tweets. In order for Information Retrieval systems to make sense of all this data, partitional clustering algorithms are used to create groups of similar documents. Traditional clustering algorithms, like K-means, are not well suited for stream processing where the […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
