Posts

Oct, 8

Analysis of 3-dimensional electromagnetic fields in dispersive media using CUDA

This research presents the implementation of the Finite-Difference Time-Domain (FDTD) method for the solution of 3-dimensional electromagnetic problems in dispersive media using Graphics Processor Units (GPUs). By using the newly introduced CUDA technology, we illustrate the efficacy of GPUs in accelerating the FDTD computations by achieving appreciable speedup factors with great ease and at no […]
Oct, 8

Performance improvements for iterative electron tomography reconstruction using graphics processing units (GPUs)

Iterative reconstruction algorithms are becoming increasingly important in electron tomography of biological samples. These algorithms, however, impose major computational demands. Parallelization must be employed to maintain acceptable running times. Graphics Processing Units (GPUs) have been demonstrated to be highly cost-effective for carrying out these computations with a high degree of parallelism. In a recent paper […]
Oct, 8

Programming framework for clusters with heterogeneous accelerators

We describe a programming framework for high performance clusters with various hardware accelerators. In this framework, users can utilize the available heterogeneous resources productively and efficiently. The distributed application is highly modularized to support dynamic system configuration with changing types and numbers of accelerators. Multiple layers of communication interface are introduced to reduce the […]
Oct, 8

Astrophysical particle simulations with large custom GPU clusters on three continents

We present direct astrophysical N-body simulations with up to six million bodies using our parallel MPI-CUDA code on large GPU clusters in Beijing, Berkeley, and Heidelberg, with different kinds of GPU hardware. The clusters are linked in the cooperation of ICCS (International Center for Computational Science). We reach about one third of the peak performance […]
Oct, 8

Efficient reconfigurable design for pricing Asian options

Arithmetic Asian options are financial derivatives which have the feature of path-dependency: they depend on the entire price path of the underlying asset, rather than just the instantaneous price. This path-dependency makes them difficult to price, as only computationally intensive Monte-Carlo methods can provide accurate prices. This paper proposes an FPGA-accelerated Asian option pricing solution, […]
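
The path dependency described in this abstract can be illustrated in software. The following is a minimal Python sketch of a Monte Carlo estimate for an arithmetic-average Asian call option. It is not the FPGA design proposed in the paper. It assumes the underlying asset follows geometric Brownian motion and that the average is taken over equally spaced dates; the function name `asian_call_mc` and all of its parameters are hypothetical and chosen for illustration only.

```python
import math
import random


def asian_call_mc(s0, strike, rate, sigma, maturity, steps, paths, seed=0):
    """Estimate the price of an arithmetic-average Asian call by Monte Carlo.

    Assumptions (not taken from the paper): geometric Brownian motion,
    constant rate and volatility, equally spaced averaging dates.
    """
    rng = random.Random(seed)
    dt = maturity / steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)

    payoff_sum = 0.0
    for _ in range(paths):
        price = s0
        running_sum = 0.0
        # The payoff depends on every price along the path, not just the last.
        for _ in range(steps):
            price *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            running_sum += price
        average = running_sum / steps
        payoff_sum += max(average - strike, 0.0)

    # Discount the mean payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / paths
```

Each simulated path has to be walked step by step before its payoff is known, which is why the cost grows with both the number of paths and the number of averaging dates. The paths are independent of one another, which is what makes the problem a natural fit for parallel hardware.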
Oct, 7

Multifrontal computations on GPUs and their multi-core hosts

The use of GPUs to accelerate the factoring of large sparse symmetric matrices shows the potential of yielding important benefits to a large group of widely used applications. This paper examines how a multifrontal sparse solver performs when exploiting both the GPU and its multi-core host. It demonstrates that the GPU can dramatically accelerate the […]
Oct, 7

Non-recursive beam search on GPU for formal concept analysis

We document a parallel non-recursive beam search GPGPU FCA CbO-like algorithm written in nVidia CUDA C and test it on software module dependency graphs. Despite removing repeated calculations and optimising data structures and kernels, we do not yet see major speedups. Instead, GeForce 295 GTX and Tesla C2050 report 141072 concepts (maximal rectangles, […]
Oct, 7

Investigation on the Use of GPGPU for Fast Sparse Matrix Factorization

Solving network equations is a task frequently encountered by power system researchers. As system sizes grow, the time consumed by the network solution is becoming a dominant factor in the overall time cost. One distinct and important feature of the network admittance matrix is that it is highly sparse, which needs to be addressed by specialized computation […]
Oct, 7

GPGPU-assisted prediction of ion binding sites in proteins

Prediction of binding sites for different types of ions in the context of protein 3D structure is a complex challenge for biophysical computational methods. One possible approach involves using empirical, also called knowledge-based, potentials. In the current study, we present a new GPGPU program complex, PIONCA (Protein-ION CAlculator) for efficient generation of empirical potentials for protein-ion […]
Oct, 7

Heterogeneous NPACI-Rocks/MPI/CUDA distributed multi-GPGPU application for seeking counterexamples to Beal’s Conjecture: MPI/CUDA integration component

Beal’s Conjecture asserts that if A^x + B^y = C^z for integers A, B, C > 0 and integers x, y, z > 2, then A, B, and C share a common prime factor. While empirical computational studies by several researchers have established that Beal’s Conjecture holds for all A, B, C, x, y, z < 1000, the truth of the general conjecture remains […]
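
To make the search problem concrete, here is a small single-threaded Python sketch of a brute-force counterexample search over bounded bases and exponents. It is only an illustration of the conjecture's statement, not the MPI/CUDA application described in the paper; the function names `beal_solutions` and `beal_counterexamples` and the bounds `max_base` and `max_exp` are hypothetical.

```python
from math import gcd


def beal_solutions(max_base, max_exp):
    """Return all (A, x, B, y, C, z) with A^x + B^y = C^z within the bounds.

    Bases range over 1..max_base and exponents over 3..max_exp,
    with A <= B to skip mirrored duplicates.
    """
    # Index every perfect power C^z so each sum needs only one lookup.
    powers = {}
    for c in range(1, max_base + 1):
        for z in range(3, max_exp + 1):
            powers.setdefault(c ** z, []).append((c, z))

    solutions = []
    for a in range(1, max_base + 1):
        for x in range(3, max_exp + 1):
            a_pow = a ** x
            for b in range(a, max_base + 1):
                for y in range(3, max_exp + 1):
                    for c, z in powers.get(a_pow + b ** y, ()):
                        solutions.append((a, x, b, y, c, z))
    return solutions


def beal_counterexamples(max_base, max_exp):
    """Keep only solutions where A, B, C share no common prime factor."""
    return [
        s for s in beal_solutions(max_base, max_exp)
        if gcd(gcd(s[0], s[2]), s[4]) == 1
    ]
```

Within small bounds the search finds solutions such as 2^3 + 2^3 = 2^4, where all three bases share the factor 2, and no counterexamples. The work grows roughly with the square of the base bound times the square of the exponent bound, and the candidate pairs can be checked independently of one another.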
Oct, 7

Hybrid coherence for scalable multicore architectures

This work describes a cache architecture and memory model for 1000+ core microprocessors. Our approach exploits workload characteristics and programming model assumptions to build a hybrid memory model that incorporates features from both software-managed coherence schemes and hardware cache coherence. The goal is to achieve the scalability found in compute accelerators, which support relaxed ordering […]
Oct, 7

Intel’s Array Building Blocks: A retargetable, dynamic compiler and embedded language

Our ability to create systems with large amounts of hardware parallelism is exceeding the average software developer’s ability to effectively program them. This is a problem that plagues our industry. Since the vast majority of the world’s software developers are not parallel programming experts, making it easy to write, port, and debug applications with sufficient […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors