Posts
Apr, 8
Benchmarking the cost of thread divergence in CUDA
All modern processors include a set of vector instructions. While this gives a tremendous performance boost, it requires vectorized code that can take advantage of such instructions. As ideal vectorization is hard to achieve in practice, one has to decide when different instructions may be applied to different elements of the […]
Apr, 8
Early Experiences Running the 3D Stencil Jacobi Method in Intel Xeon Phi
Iterative stencil computations are an important pattern of computation in various computational fields, such as physics or chemistry simulations. A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. As the demand for compute power grows rapidly across different fields of research, […]
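The stencil update described in this excerpt can be illustrated with a minimal sketch. It assumes a 7-point averaging stencil on a 3D grid with fixed boundary values, in plain Python; the function names `jacobi_step` and `jacobi` are hypothetical, and this is not the implementation from the post.

```python
def jacobi_step(grid):
    """Return a new 3D grid after one Jacobi sweep.

    Assumption: each interior point becomes the average of itself and its
    six face neighbors; boundary points are copied unchanged.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Jacobi reads only the old grid and writes into a separate copy.
    new = [[row[:] for row in plane] for plane in grid]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                new[i][j][k] = (
                    grid[i][j][k]
                    + grid[i - 1][j][k] + grid[i + 1][j][k]
                    + grid[i][j - 1][k] + grid[i][j + 1][k]
                    + grid[i][j][k - 1] + grid[i][j][k + 1]
                ) / 7.0
    return new


def jacobi(grid, iterations):
    """Apply the sweep repeatedly, as an iterative stencil computation does."""
    for _ in range(iterations):
        grid = jacobi_step(grid)
    return grid
```

Writing into a separate output grid is what distinguishes a Jacobi sweep from an in-place (Gauss-Seidel style) update, and it is also what makes every point update independent and therefore easy to parallelize.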
Apr, 8
3rd International Conference on Control, Robotics and Cybernetics (ICCRC), 2015
ICCRC 2015: 3rd International Conference on Control, Robotics and Cybernetics. Berlin, Germany, August 13-14, 2015. http://www.iccrc.org/ Submission Deadline: 2015-06-05. Publication: All accepted papers of ICCRC 2015 will be published by the International Journal of Mechanical Engineering and Robotics Research (IJMERR) (ISSN: 2278-0149), which will be indexed by Copernicus, ProQuest (USA), Open J-Gate, Indian Science and Google Scholar. […]
Apr, 8
8th International Conference on Advanced Computer Theory and Engineering (ICACTE), 2015
ICACTE 2015: 8th International Conference on Advanced Computer Theory and Engineering. Berlin, Germany, August 13-14, 2015. http://www.icacte.org/ Submission Deadline: 2015-06-05. Publication: Submitted papers can be selected and published in one of the following journals: * WIT Transactions on Engineering Sciences (ISSN: 1743-3533), indexed by EI Compendex and ISI * International Journal of Computer Theory and Engineering […]
Apr, 8
7th International Conference on Education Technology and Computer (ICETC), 2015
ICETC 2015: 7th International Conference on Education Technology and Computer. Berlin, Germany, August 13-14, 2015. http://www.icetc.org/ Submission Deadline: 2015-06-05. Publication: * International Journal of Information and Education Technology (IJIET), ISSN: 2010-3689. Abstracting/Indexing: EI (INSPEC, IET), Cabell’s Directories, DOAJ, Electronic Journals Library, Engineering & Technology Digital Library, EBSCO, Google Scholar, Crossref and ProQuest * Lecture Notes on Information […]
Apr, 7
clRNG: A Random Number API with Multiple Streams for OpenCL
We present clRNG, a library for uniform random number generation in OpenCL. Streams of random numbers act as virtual random number generators. They can be created on the host computer in unlimited numbers, and then used either on the host or on other computing devices by work items to generate random numbers. Each stream also […]
Apr, 7
State Lattice-based Motion Planning for Autonomous On-Road Driving
Since the DARPA Urban Challenge 2007 (DUC), the development of autonomous vehicles has attracted increasing attention from both academic institutes and the automotive industry. It is believed that sufficiently sophisticated and reliable autonomous vehicles would redefine mobility. The motion planner and sensor simulation presented in this thesis are intended to contribute to this prospect. The task […]
Apr, 7
Multi-Lingual Speech Recognition with Low-Rank Multi-Task Deep Neural Networks
Multi-task learning (MTL) for deep neural network (DNN) multilingual acoustic models has been shown to be effective for learning parameters that are common or shared between multiple languages [1, 2]. In the MTL paradigm, the number of parameters in the output layer is large and scales with the number of languages used in training. This output […]
Apr, 7
Flexible, Fast and Accurate Sequence Alignment Profiling on GPGPU with PaSWAS
MOTIVATION: Obtaining large-scale sequence alignments in a fast and flexible way is an important step in the analysis of next-generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks, or not sufficiently accurate due to statistical issues. Current SW implementations that run on […]
Apr, 7
On Password Guessing with GPUs and FPGAs
Passwords are still by far the most widely used form of user authentication, for applications ranging from online banking or corporate network access to storage encryption. Password guessing thus poses a serious threat for a multitude of applications. Modern password hashes are specifically designed to slow down guessing attacks. However, having exact measures for the […]
Apr, 4
OmpSs task offload
Exascale performance requires a level of energy efficiency only achievable with specialized hardware. Hence, to build a general-purpose HPC system with exascale performance, different types of processors, memory technologies and interconnection networks will be necessary. Heterogeneous hardware is already present on some top supercomputer systems that are composed of different compute nodes, which at […]
Apr, 4
Reduction of a Symmetrical Matrix to Tridiagonal Form on GPUs
Many eigenvalue and eigenvector algorithms begin by reducing the input matrix to tridiagonal form. A tridiagonal matrix is a matrix that has non-zero elements only on its main diagonal and the two diagonals directly adjacent to it. Reducing a matrix to tridiagonal form is an iterative process which uses Jacobi rotations to reduce […]
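The definition of a tridiagonal matrix given in this excerpt can be written as a small check. This is a minimal sketch in plain Python, assuming a square matrix stored as a list of rows; the name `is_tridiagonal` and the `tol` parameter are hypothetical and do not come from the post.

```python
def is_tridiagonal(matrix, tol=0.0):
    """Return True if every entry more than one diagonal away from the
    main diagonal has magnitude at most tol (zero by default)."""
    n = len(matrix)
    return all(
        abs(matrix[i][j]) <= tol
        for i in range(n)
        for j in range(n)
        if abs(i - j) > 1  # entries outside the three central diagonals
    )


# The 1D Laplacian stencil matrix is a classic tridiagonal example.
laplacian = [
    [2, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 2],
]
```

A nonzero `tol` is useful when the matrix comes out of a floating-point reduction, where entries that should vanish are only approximately zero.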