Posts
Sep, 9
Dissecting GPU Memory Hierarchy through Microbenchmarking
Memory access efficiency is a key factor in fully utilizing the computational power of graphics processing units (GPUs). However, many details of the GPU memory hierarchy are not released by GPU vendors. In this paper, we propose a novel fine-grained microbenchmarking approach and apply it to three generations of NVIDIA GPUs, namely Fermi, Kepler and […]
Sep, 8
Accelerating Multiple Compound Comparison Using LINGO-based Load-Balancing Strategies on Multi-GPUs
Compound comparison is an important task in computational chemistry. From the comparison results, potential inhibitors can be identified and then used in pharmaceutical experiments. The time complexity of a pairwise compound comparison is O(n^2), where n is the maximal length of compounds. In general, the length of compounds is tens to hundreds, and […]
Sep, 8
Assessing the hardness of SVP algorithms in the presence of CPUs and GPUs
Lattice-based cryptography has been a hot topic in the past decade, since it is believed that lattice-based cryptosystems are immune to attacks mounted by quantum computers. The security of this type of cryptography is based on the hardness of algorithms that solve lattice-based problems, namely the Shortest Vector Problem (SVP). Therefore, it is important to […]
Sep, 8
Contributions to the Efficient Use of General Purpose Coprocessors: Kernel Density Estimation as Case Study
The high performance computing landscape is shifting from assemblies of homogeneous nodes towards heterogeneous systems, in which nodes consist of a combination of traditional out-of-order execution cores and accelerator devices. Accelerators, built around GPUs, many-core chips, or FPGAs, are used to offload compute-intensive tasks. These devices provide superior theoretical performance compared to traditional multi-core CPUs, […]
Sep, 8
Accelerating Web Search using GPUs
The amount of content on the Internet is growing rapidly, as is the number of online Internet users. As a consequence, web search engines need to increase their computing capabilities and data continually while maintaining low search latency and without a significant rise in the cost per query. To serve these larger numbers […]
Sep, 7
A Survey Of Architectural Techniques for Near-Threshold Computing
Energy efficiency has now become the primary obstacle in scaling the performance of all classes of computing systems. Low-voltage computing, and specifically near-threshold voltage computing (NTC), which involves operating the transistor very close to and yet above its threshold voltage, holds the promise of providing many-fold improvement in energy efficiency. However, use of NTC also […]
Sep, 5
Waste Not, Want Not! Managing relational data in asymmetric memories
In this thesis, we study the management of relational data in modern, i.e., asymmetric, computer systems. We explore different strategies to identify asymmetries in persistent data, map them to asymmetries in the memory landscape and, eventually, exploit them to increase query processing performance. To this end, we study memory-conscious decomposition and storage of data […]
Sep, 5
Virtualizing Data Parallel Systems for Portability, Productivity, and Performance
Computer systems equipped with graphics processing units (GPUs) have become increasingly common over the last decade. In order to utilize the highly data parallel architecture of GPUs for general purpose applications, new programming models such as OpenCL and CUDA were introduced, showing that data parallel kernels on GPUs can achieve speedups by several orders of […]
Sep, 5
Parallel Execution of the ASP Computation – an Investigation on GPUs
This paper illustrates the design and implementation of a conflict-driven ASP solver that is capable of exploiting the Single-Instruction Multiple-Thread parallelism offered by General Purpose Graphical Processing Units (GPUs). Modern GPUs are multi-core platforms, providing access to a large number of cores at a very low cost, but at the price of a complex architecture with […]
Sep, 5
Convolutional Neural Network for Sentence Classification
The goal of a Knowledge Base-supported Question Answering (KB-supported QA) system is to answer a natural language query by obtaining the answer from a knowledge database, which stores knowledge in the form of (entity, relation, value) triples. QA systems understand questions by extracting entity and relation pairs. This thesis aims at recognizing the relation candidates […]
Sep, 5
On GPU-Accelerated Fast Direct Solvers and Their Applications in Image Denoising
This dissertation focuses on block cyclic reduction (BCR) type fast direct solvers, graphics processing unit (GPU) computation, and image denoising. The fast direct solvers are specialized methods for solving certain types of linear systems. They take into account specific characteristics of the system and are therefore able to solve the system much more efficiently than […]
Sep, 3
Advanced Simulation Library: Expanding software ecosystem for the DSP/FPGA/GPU market
Advanced Simulation Library is a free and open source multiphysics simulation software package and a tool for solving Partial Differential Equations. It has a significant user base across many areas of engineering and science, from both industrial and academic organizations. ASL utilizes only the methods that allow efficient parallelization: Lattice Boltzmann Methods, Explicit Finite Difference, Matrix […]

