Posts
Jun, 5
2nd International Conference on Information Management in the Knowledge Economy, IMKE – 2013
The International Conference on Information Management in the Knowledge Economy is a multidisciplinary conference on digital information management, science and technology. The principal aim of this conference is to bring together professionals in academia, research laboratories and industry, and to offer a collaborative platform to address the emerging issues and solutions in digital information science and […]
Jun, 5
Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2013
ASPLOS is the premier forum for multidisciplinary systems research spanning computer architecture and hardware, programming languages and compilers, operating systems and security, as well as applications and human-computer interaction. The importance of such crosscutting systems research has been growing hand in hand with the amount of parallelism in hardware, the scope of distribution in internet-scale […]
Jun, 5
A new parallelisation technique for heterogeneous CPUs
Parallelization has moved into mainstream compilers in recent years, and the demand for parallelizing tools that can do a better job of automatic parallelization is higher than ever. During the last decade considerable attention has been focused on developing programming tools that support both explicit and implicit parallelism to keep up with the power […]
Jun, 5
Using visualization to reveal weak cryptosystems
My thesis explains how we can apply techniques borrowed from the area of visualization to reveal weaknesses in applications and cryptosystems. The first half of the thesis presents how graphics processing units can be used for general-purpose computing. The second half provides an overview of basic techniques and applies these […]
Jun, 5
Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters
A hierarchy of heterogeneity exists in many modern high performance clusters: between computing nodes, and within a node through the addition of specialized accelerators such as GPUs. To achieve high performance of scientific applications on these platforms it is necessary to perform load balancing. In this paper we present a […]
Jun, 5
Shortening design time through multiplatform simulations with a portable OpenCL golden-model: the LDPC decoder case
Hardware designers and engineers typically need to explore a multi-parametric design space in order to find the best configuration for their designs using simulations that can take weeks to months to complete. For example, designers of special purpose chips need to explore parameters such as the optimal bit width and data representation. This is the […]
Jun, 5
Landau Gauge Fixing on GPUs
In this paper we present and explore the performance of Landau gauge fixing on GPUs using CUDA. We consider the steepest descent algorithm with Fourier acceleration, and compare the GPU performance with a parallel CPU implementation. Using $32^4$ lattice volumes, we find that the computational power of a single Tesla C2070 GPU is equivalent to […]
Jun, 4
Platform 2012, a Many-Core Computing Accelerator for Embedded SoCs: Performance Evaluation of Visual Analytics Applications
P2012 is an area- and power-efficient many-core computing accelerator based on multiple globally asynchronous, locally synchronous processor clusters. Each cluster features up to 16 processors with independent instruction streams sharing a multi-banked one-cycle access L1 data memory, a multi-channel DMA engine and specialized hardware for synchronization and aggressive power management. P2012 is 3D stacking ready […]
Jun, 4
A Compiler and Runtime for Heterogeneous Computing
Heterogeneous systems show a lot of promise for extracting high performance by combining the benefits of conventional architectures with specialized accelerators in the form of graphics processors (GPUs) and reconfigurable hardware (FPGAs). Extracting this performance often entails programming in disparate languages and models, making it hard for a programmer to work equally well on all aspects […]
Jun, 4
Finite Element Matrix Generation on a GPU
This paper presents an efficient technique for fast generation of sparse systems of linear equations arising in computational electromagnetics in a finite element method using higher order elements. The proposed approach employs a graphics processing unit (GPU) for both numerical integration and matrix assembly. The performance results obtained on a test platform consisting of a […]
Jun, 4
Pipelining the Fast Multipole Method over a Runtime System
Fast Multipole Methods (FMM) are a fundamental operation for the simulation of many physical problems. The high performance design of such methods usually requires carefully tuning the algorithm for both the targeted physics and the hardware. In this paper, we propose a new approach that achieves high performance across architectures. Our method consists of […]
Jun, 4
High Accuracy Gravitational Waveforms from Black Hole Binary Inspirals Using OpenCL
There is a strong need for high-accuracy and efficient modeling of extreme-mass-ratio binary black hole systems because these are strong sources of gravitational waves that would be detected by future observatories. In this article, we present sample results from our Teukolsky EMRI code: a time-domain Teukolsky equation solver (a linear, hyperbolic, partial differential equation solver […]