12963
Elisangela Silva Dias, Diane Castonguay, Humberto Longo, Walid Abdala Rfaei Jradi, Hugo A. D. do Nascimento
Finding chordless cycles is an important theoretical problem in the Graph Theory area. It also can be applied to practical problems such as discover which predators compete for the same food in ecological networks. Motivated by the problem of theoretical interest and also by its significant practical importance, we present in this paper a parallel […]
View View   Download Download (PDF)   
Jesus Robledo, Jonathan Leloux, Eduardo Lorenzo
Shading reduces the power output of a photovoltaic (PV) system. The design engineering of PV systems requires modeling and evaluating shading losses. Some PV systems are affected by complex shading scenes whose resulting PV energy losses are very difficult to evaluate with current modeling tools. Several specialized PV design and simulation software include the possibility […]
View View   Download Download (PDF)   
Bruce Merry
Sorting and scanning are two fundamental primitives for constructing highly parallel algorithms. A number of libraries now provide implementations of these primitives for GPUs, but there is relatively little information about the performance of these implementations. We benchmark seven libraries for 32-bit integer scan and sort, and sorting 32-bit values by 32-bit integer keys.We show […]
Johannes Koster, Sven Rahmann
We present the q-group index, a novel data structure for read mapping tailored towards graphics processing units (GPUs) with a small memory footprint and efficient parallel algorithms for querying and building. On top of the q-group index we introduce PEANUT, a highly parallel GPU-based read mapper. PEANUT provides the possibility to output both the best […]
View View   Download Download (PDF)   
Fabio Miguel Cardoso Soldado
The Graphics Processing Unit (GPU) is present in almost every modern day personal computer. Despite its specific purpose design, they have been increasingly used for general computations with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of devices in everyday computing. However, to fully exploit the […]
View View   Download Download (PDF)   
Mehmet Ufuk Buyuksahin
Galois Field arithmetic has been used very frequently in popular security and error-correction applications. Montgomery multiplication is among the suitable methods used for accelerating modular multiplication, which is the most time consuming basic arithmetic operation. Montgomery multiplication is also suitable to be implemented in parallel. OpenCL, which is a portable, heterogeneous and parallel programming framework, […]
View View   Download Download (PDF)   
Changsheng Huang, Baochang Shi, Zhaoli Guo, Zhenhua Chai
Conducting lattice Boltzmann method on GPU has been proved to be an effective manner to gain a significant performance benefit, thus the GPU or multi-GPU based lattice Boltzmann method is considered as a promising and competent candidate in the study of large-scale complex fluid flows. In this work, a multi-GPU based lattice Boltzmann algorithm coupled […]
View View   Download Download (PDF)   
Akihiko Kasagi, Koji Nakano, Yasuaki Ito
The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of computing on CUDA-enabled GPUs. The summed area table (SAT) of a matrix is a data structure frequently used in the area of computer vision which can be obtained by computing the column-wise prefix-sums and then the rowwise prefix-sums. The […]
View View   Download Download (PDF)   
Hisham Mohamed
This thesis studies the scalability of the similarity search problem in large-scale multidimensional data. Similarity search, translating into the neighbour search problem, finds many applications for information retrieval, visualization, machine learning and data mining. The current exponential growth of data motivates the need for approximate and scalable algorithms. In most of existing algorithms and data-structures, […]
View View   Download Download (PDF)   
Shohei Ando, Fumihiko Ino, Toru Fujiwara, and Kenichi Hagihara
In this paper, we present a parallel algorithm for enumerating joint weight of a binary linear (n, k) code, aiming at accelerating assessment of its decoding error probability for network coding. To reduce the number of pairs of codewords to be investigated, our parallel algorithm reduces dimension k by focusing on the all-one vector included […]
View View   Download Download (PDF)   
Salman Habib, Adrian Pope, Hal Finkel, Nicholas Frontiere, Katrin Heitmann, David Daniel, Patricia Fasel, Vitali Morozov, George Zagaris, Tom Peterka, Venkatram Vishwanath, Zarija Lukic, Saba Sehrish, Wei-keng Liao
Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the ‘Dark Universe’, dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the […]
View View   Download Download (PDF)   
Michael Gowanlock, Henri Casanova
Applications in many domains require processing moving object trajectories. In this work, we focus on a trajectory similarity search that finds all trajectories within a given distance of a query trajectory over a time interval, which we call the distance threshold similarity search. We develop three indexing strategies with spatial, temporal and spatiotemporal selectivity for […]
View View   Download Download (PDF)   
Page 1 of 24212345...102030...Last »

* * *

* * *

Like us on Facebook

HGPU group

167 people like HGPU on Facebook

Follow us on Twitter

HGPU group

1275 peoples are following HGPU @twitter

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: