Posts
Jan, 22
Parallel Explicit FEM Algorithms Using GPU’s
The Explicit Finite Element Method is a powerful tool in nonlinear dynamic finite element analysis. Recent major developments in computational devices, in particular General Purpose Graphical Processing Units (GPGPUs), now make it possible to increase the performance of the explicit FEM. This dissertation investigates existing explicit finite element method algorithms which are then redesigned for […]
Jan, 22
Heterogeneous (CPU+GPU) Working-set Hash Tables
In this paper, we propose heterogeneous (CPU+GPU) hash tables that optimize operations for frequently accessed keys. The idea is to maintain a dynamic set of the most frequently accessed keys in the GPU memory and the rest of the keys in the CPU main memory. Further, queries are processed in batches of fixed size. We measured […]
Jan, 22
Exploring LLVM Infrastructure for Simplified Multi-GPU Programming
GPUs have established themselves in the computing landscape, convincing users and designers by their excellent performance and energy efficiency. They differ in many aspects from general-purpose CPUs, for instance their highly parallel architecture, their thread-collective bulk-synchronous execution model, and their programming model. In particular, languages like CUDA or OpenCL require users to express parallelism very […]
Jan, 22
Basker: A Threaded Sparse LU Factorization Utilizing Hierarchical Parallelism and Data Layouts
Scalable sparse LU factorization is critical for efficient numerical simulation of circuits and electrical power grids. In this work, we present a new scalable sparse direct solver called Basker. Basker introduces a new algorithm to parallelize the Gilbert-Peierls algorithm for sparse LU factorization. As architectures evolve, there exists a need for algorithms that are hierarchical […]
Jan, 22
GPU Multisplit
Multisplit is a broadly useful parallel primitive that permutes its input data into contiguous buckets or bins, where the function that categorizes an element into a bucket is provided by the programmer. Due to the lack of an efficient multisplit on GPUs, programmers often choose to implement multisplit with a sort. However, sort does more […]
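To make the definition above concrete, here is a minimal sequential sketch of a multisplit in Python. It is not the GPU algorithm from the paper; it only illustrates the primitive itself using a histogram, an exclusive prefix sum, and a scatter. The names `multisplit`, `bucket_of`, and `num_buckets` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def multisplit(items, bucket_of, num_buckets):
    """Permute items into contiguous buckets (sequential illustration only).

    bucket_of is the programmer-supplied function mapping an element to a
    bucket index in range(num_buckets). Returns the permuted list and the
    starting offset of each bucket.
    """
    # Step 1: histogram, count how many elements fall into each bucket.
    counts = [0] * num_buckets
    for x in items:
        counts[bucket_of(x)] += 1

    # Step 2: exclusive prefix sum gives each bucket's starting offset.
    offsets = [0] + list(accumulate(counts))[:-1]

    # Step 3: scatter each element to the next free slot of its bucket.
    # Elements keep their relative input order inside a bucket (stable).
    out = [None] * len(items)
    cursor = offsets.copy()
    for x in items:
        b = bucket_of(x)
        out[cursor[b]] = x
        cursor[b] += 1
    return out, offsets


if __name__ == "__main__":
    data = [5, 2, 8, 3, 9, 4]
    result, starts = multisplit(data, lambda x: x % 3, 3)
    print(result, starts)
```

Unlike a full sort, this only groups elements by bucket and never orders them within a bucket, which is the extra work the excerpt alludes to.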
Jan, 22
The 9th International Conference on Machine Vision (SPIE-ICMV), 2016
All accepted papers of ICMV 2016 will be published by SPIE and indexed by EI and Scopus. Previous proceedings from 2007 to 2015 (http://www.icmv.org/pro.html) have all been published and indexed by EI and Scopus. For conference photos, see 2015 (http://www.icmv.org/photo2015.html); 2014 (http://www.icmv.org/photo2014.html); 2013 (http://www.icmv.org/photo2013.html). Keynote & Plenary Speakers: Prof. Antanas Verikas, Halmstad University, Sweden; Prof. Petia Radeva, University of Barcelona, Spain; […]
Jan, 19
Adaptive and Hybrid Machine Learning Approaches Utilizing General Purpose Computing on Graphical Processing Units
Unlike machines, humans and animals have very complex reasoning capabilities that allow them to adapt to changes in the natural world, while computers tend to be very limited in that same aspect. What limits machines from becoming adaptable can span many topics, but among the attributes that would limit a machine’s ability to adapt is […]
Jan, 19
A framework for efficient execution on GPU and CPU+GPU systems
Technological limitations faced by semiconductor manufacturers in the early 2000s restricted the increase in performance of sequential computation units. Nowadays, the trend is to increase the number of processor cores per socket and to progressively use GPU cards for highly parallel computations. The complexity of recent architectures makes it difficult to statically […]
Jan, 18
The 2nd International Conference on Control, Automation and Robotics (ICCAR), 2016
ICCAR 2016 conference proceedings will be published by IEEE Conference Publication, which would be indexed by. ★ ICCAR 2016 is in the IEEE conference list. http://www.ieee.org/conferences_events/conferences/conferencedetails/index.html?Conf_ID=38085 ★ Publication and Indexing History of ICCAR: ICCAR 2015, Singapore, May 20-22, 2015. Publication: IEEE Conference Proceedings Online: http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?reload=true&punumber=7153096 ★ Keynote & Plenary Speakers: Prof. Wei-Hsin Liao, The Chinese University of Hong […]
Jan, 16
Homomorphic Autocomplete
With the rapid progress in fully homomorphic encryption (FHE) and somewhat homomorphic encryption (SHE) schemes, we are witnessing renewed efforts to revisit privacy-preserving protocols. Several works have already appeared in the literature that provide solutions to these problems by employing FHE or SHE techniques. These applications range from cloud computing to computation over confidential […]
Jan, 16
Performance Analysis of Roberts Edge Detection Using CUDA and OpenGL
The evolution of high-performance and programmable graphics processing units (GPUs) has generated considerable advancements in graphics and parallel computing. In this paper, we present an edge detection algorithm based on the Roberts filter, using the CUDA and OpenGL architectures. The basic idea is to use the Pixel Buffer Object (PBO) to create images with CUDA on a […]
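For readers unfamiliar with the Roberts filter mentioned above, the following is a minimal CPU-side sketch of the standard Roberts cross operator in plain Python. It is not the paper's CUDA/OpenGL implementation and involves no Pixel Buffer Object; the function name `roberts_cross` and the list-of-lists image layout are assumptions made for illustration.

```python
import math


def roberts_cross(image):
    """Apply the Roberts cross operator to a 2D grayscale image.

    image is a list of equally sized rows of numbers. The result has one
    row and one column fewer than the input, because each output pixel is
    computed from a 2x2 neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    edges = [[0.0] * (cols - 1) for _ in range(rows - 1)]
    for y in range(rows - 1):
        for x in range(cols - 1):
            # Differences along the two diagonals of the 2x2 window.
            gx = image[y][x] - image[y + 1][x + 1]
            gy = image[y][x + 1] - image[y + 1][x]
            # Gradient magnitude at this pixel.
            edges[y][x] = math.hypot(gx, gy)
    return edges


if __name__ == "__main__":
    # A vertical step edge between a dark and a bright column.
    picture = [[0, 0, 10, 10],
               [0, 0, 10, 10],
               [0, 0, 10, 10]]
    for row in roberts_cross(picture):
        print(row)
```

Each output pixel depends only on its own 2x2 window, so the computation for different pixels is independent, which is what makes the operator a natural fit for one-thread-per-pixel GPU execution.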
Jan, 16
LHCb GPU acceleration project
The LHCb detector is due to be upgraded for processing high-luminosity collisions, which will increase data bandwidth to the event filter farm from 100 GB/s to 4 TB/s, encouraging us to look for new ways of accelerating Online reconstruction. The Coprocessor Manager is a new framework for integrating LHCb’s existing computation pipelines with massively parallel […]

