Posts
Jul, 2
CFD Simulation of Jet Cooling and Implementation of Flow Solvers in GPU
In the rolling of steel into thin sheets, the final step is cooling the finished product on the runout table. In this thesis, the heat transfer into a water jet impinging on a hot flat steel plate was studied as the key cooling process on the runout table. The temperature of the plate was […]
Jul, 2
Preliminary Experiences with the Uintah Framework on Intel Xeon Phi and Stampede
In this work, we describe our preliminary experiences on the Stampede system in the context of the Uintah Computational Framework. Uintah was developed to provide an environment for solving a broad class of fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods, together with a novel asynchronous […]
Jul, 2
MSTg: Cryptographically strong pseudorandom number generator and its realization
Random covers for finite groups have been introduced in [5, 9], and [18], and used for constructing public key cryptosystems. The primer [10] introduces a new approach for constructing pseudorandom number generators, called MSTg, based on random covers for large finite groups. In particular, on the class of elementary abelian 2-groups they allow a very […]
Jul, 2
Efficient linear-scaling quantum transport calculations on graphics processing units and applications on electron transport in graphene
We implement, optimize, and validate the linear-scaling Kubo-Greenwood quantum transport simulation on graphics processing units by examining resonant scattering in graphene. We consider two practical representations of the Kubo-Greenwood formula: a Green-Kubo formula based on the velocity auto-correlation and an Einstein formula based on the mean square displacement. The code is fully implemented on graphics […]
Jul, 2
GPU Accelerated Fluid Flow Computations Using the Lattice Boltzmann Method
We propose a numerical implementation based on a Graphics Processing Unit (GPU) to reduce the execution time of the Lattice Boltzmann Method. The performance analysis is based on three three-dimensional benchmark applications: Poiseuille flow, lid-driven cavity flow and flow in an elbow-shaped domain. Three different, recently released GPU cards are considered for […]
Jul, 1
vSMC: Parallel Sequential Monte Carlo in C++
Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions. Some of these algorithms, such as particle filters, are widely used in physics and signal processing research. More recent developments have established their application in more general inference problems such as Bayesian modeling. These algorithms have attracted considerable attention […]
Jul, 1
A Survey On Parallelization Of Data Mining Techniques
This paper gives an overview of various parallelization techniques that improve the performance of existing data mining algorithms and make them capable of handling large amounts of data. There is a variety of techniques for achieving parallelization in the data mining field; this paper presents a brief introduction to a few of the popular ones. […]
Jul, 1
Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory
Programming for parallel architectures that do not have a shared address space is extremely difficult due to the need for explicit communication between memories of different compute devices. Examples of such systems are a heterogeneous system with CPUs and multiple GPUs, or a distributed-memory cluster. Past works that try to automate data movement for distributed-memory […]
Jul, 1
Exploiting multi-level parallelism in streaming applications for heterogeneous platforms with GPUs
Heterogeneous computing platforms support the traditional types of parallelism, such as instruction-level, data, task, and pipeline parallelism, and provide the opportunity to exploit a combination of different types of parallelism at different platform levels. The architectural diversity of platform components makes tapping into the platform potential a challenging programming task. This thesis makes an […]
Jul, 1
Towards Performance-Portable, Scalable, and Convenient Linear Algebra
The rise of multi- and many-core architectures also gave birth to a plethora of new parallel programming models. Among these, the open industry standard OpenCL addresses this heterogeneity of programming environments by providing a unified programming framework. The price to pay, however, is that OpenCL requires additional low-level boilerplate code, when compared to vendor-specific solutions, […]
Jun, 30
Cropped Quad-Tree Based Solid Object Colouring with CUDA
In this study, surfaces of solid objects are coloured with the Cropped Quad-Tree method using GPU computing optimization. Numerous methods are used in solid object colouring. Studies carried out in different fields show that the quad-tree method stands out in terms of speed and performance. Cropped […]
Jun, 30
Accelerating SELECT WHERE and SELECT JOIN Queries on a GPU
This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Nowadays, the GPU’s parallel architecture gives a high speed-up on certain problems. As a result, the number of non-graphical problems that can be run and sped up on the GPU keeps increasing. In particular, there has been a lot of […]