Posts

Dec, 20

Parsing in Parallel on Multiple Cores and GPUs

This paper examines the ways in which parallelism can be used to speed the parsing of dense PCFGs. We focus on two kinds of parallelism here: Symmetric Multi-Processing (SMP) parallelism on shared-memory multicore CPUs, and Single-Instruction Multiple-Thread (SIMT) parallelism on GPUs. We describe how to achieve speed-ups over an already very efficient baseline parser using […]
Dec, 20

The Future in Mobile Multicore Computing

Mobile computers are an essential part of consumer technology, and we are fast approaching a future where all mobile computers have general purpose GPUs (GPGPUs) and multicore CPUs in them. We describe this development as Mobile Multicore Computing (MMC). In this paper, we discuss the importance of MMC, as well as three critical issues associated […]
Dec, 20

Floating-point Mixed-radix FFT Core Generation for FPGA and Comparison with GPU and CPU

Over the past decades, FPGA technologies have advanced enormously. The topic of floating-point accelerators on FPGAs has gained renewed interest due to increased device sizes and the emergence of fast hardware floating-point libraries. The popularity of the FFT makes it easier to justify spending a lot of effort on detailed optimization. However, the ever […]
Dec, 19

Fast Random Graph Generation

Today, several database applications call for the generation of random graphs. A fundamental, versatile random graph model adopted for that purpose is the Erdős-Rényi Γ_{v,p} model. This model can be used for directed, undirected, and multipartite graphs, with and without self-loops; it induces algorithms for both graph generation and sampling, hence is useful not only […]
Dec, 19

GPU-Accelerated Preconditioned Iterative Linear Solvers

This work is an overview of our preliminary experience in developing high-performance iterative linear solvers accelerated by GPU co-processors. Our goal is to illustrate the advantages and difficulties encountered when deploying GPU technology to perform sparse linear algebra computations. Techniques for speeding up sparse matrix-vector product (SpMV) kernels and finding suitable preconditioning methods are discussed. […]
Dec, 19

3D Recursive Gaussian IIR on GPU and FPGAs: A Case Study for Accelerating Bandwidth-Bounded Applications

GPU devices typically have a higher off-chip bandwidth than FPGA-based systems. Thus, GPUs should typically perform better for bandwidth-bounded massively parallel applications. In this paper we present our implementations of a 3D recursive Gaussian IIR on multicore CPU, many-core GPU and multi-FPGA platforms. Our baseline implementation on the CPU features the smallest arithmetic computation (2 […]
Dec, 19

An Efficient Simulation Environment for Modeling Large-Scale Cortical Processing

We have developed a spiking neural network simulator, which is both easy to use and computationally efficient, for the generation of large-scale computational neuroscience models. The simulator implements current- or conductance-based Izhikevich neuron networks, with spike-timing-dependent plasticity and short-term plasticity. It uses a standard network construction interface. The simulator allows for execution on […]
Dec, 19

Optimization of mapped functions sequences using fusions on GPU

When implementing a function mapping on a contemporary GPU, several contradictory performance factors have to be balanced. Previously, a decomposition-fusion scheme was devised to guide such an implementation, and this work elaborates on it further. To ease this process, an automatic source-to-source compiler is presented, while the main subject of this thesis is the core […]
Dec, 19

GPU-based Implementation of the Variational Path Integral Method

Every physical system consists of particles such as electrons. To analyze the behavior of these systems, the behavior of these particles must be predicted. The ground state energy is the most important information about a molecule and can be calculated by solving the Schrödinger equation. But as the number of atoms increases, the […]
Dec, 19

Towards Automatic C Programs Optimization and Parallelization using the PIPS-PoCC Integration

This paper explains how the PIPS source-to-source compilation framework integrates the Polyhedral Compiler Collection (PoCC) as one of PIPS's many program transformations. The integration between PIPS and PoCC automatically extracts the static control parts of the source code, which can be optimized independently by PoCC, and then transparently reintegrates them into the user source code. […]
Dec, 19

Experiences Developing the OpenUH Compiler and Runtime Infrastructure

The OpenUH compiler is a branch of the open source Open64 compiler suite for C, C++, Fortran 95/2003, with support for a variety of targets including x86_64, IA-64, and IA-32. For the past several years, we have used OpenUH to conduct research in parallel programming models and their implementation, static and dynamic analysis of parallel […]
Dec, 19

Acceleration of grammatical evolution using graphics processing units: computational intelligence on consumer games and graphics hardware

Several papers show that symbolic regression is suitable for data analysis and prediction in financial markets. Grammatical Evolution (GE), a grammar-based form of Genetic Programming (GP), has been successfully applied in solving various tasks including symbolic regression. However, the computational effort to calculate the fitness of a solution in GP can often limit the area […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
