
Posts

Jan, 27

The Saga of Landau-Gauge Propagators: Gathering New Ammo

Compelling evidence has recently emerged from lattice simulations in favor of the massive solution of the Schwinger-Dyson equations of Landau-gauge QCD. The main objections to these lattice results are based on possible Gribov-copy effects. We recently installed at IFSC-USP a new GPU cluster dedicated to the study of Green’s functions. We present here our point […]
Jan, 26

Assessing Accelerator-Based HPC Reverse Time Migration

Oil and gas companies trust Reverse Time Migration (RTM), the most advanced seismic imaging technique, with crucial decisions on drilling investments. The economic value of the oil reserves that require RTM to be localized is on the order of 10^13 dollars. But RTM requires vast computational power, which has somewhat hindered its practical success. Although, accelerator-based […]
Jan, 26

Texturing and Modeling, Third Edition: A Procedural Approach (The Morgan Kaufmann Series in Computer Graphics)

The third edition of this classic tutorial and reference on procedural texturing and modeling is thoroughly updated to meet the needs of today’s 3D graphics professionals and students. New for this edition are chapters devoted to real-time issues, cellular texturing, geometric instancing, hardware acceleration, futuristic environments, and virtual universes. In addition, the familiar authoritative chapters […]
Jan, 26

Realisation of a holographic microlaser scalpel using a digital micromirror device

Modern spatial light modulators (SLMs) enable the generation of more or less arbitrary light fields in three dimensions. Such light fields can be used for different future applications in the field of biomedical optics. One example is the processing/cutting of biological material on a microscopic scale. By displaying computer-generated holograms on suitable SLMs it […]
Jan, 26

Weak execution ordering – exploiting iterative methods on many-core GPUs

On NVIDIA’s many-core GPUs, there is no synchronization function among parallel thread blocks. When fine-grained data communication and synchronization are required for large-scale parallel programs executed by multiple thread blocks, frequent host synchronizations are necessary, and they incur a significant overhead. In this paper, we investigate a class of applications which uses a chaotic […]
Jan, 26

A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations

Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture, there are usually halo regions that need to be updated and exchanged among different processing elements (PEs). In addition, synchronization is often used to signal the completion of […]
Jan, 26

Architectural Support for the Stream Execution Model on General-Purpose Processors

There has recently been much interest in stream processing, both in industry (e.g., Cell, NVIDIA G80, ATI R580) and academia (e.g., Stanford Merrimac, MIT RAW), with stream programs becoming increasingly popular for both media and more general-purpose computing. Although a special style of programming called stream programming is needed to target these stream architectures, huge […]
Jan, 26

Correlating Radio Astronomy Signals with Many-Core Hardware

A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is […]
Jan, 26

Adaptable particle-in-cell algorithms for graphical processing units

We developed new parameterized Particle-in-Cell algorithms and data structures for emerging multi-core and many-core architectures. Four parameters allow tuning of this PIC code to different hardware configurations. Particles are kept ordered at each time step. The first application of these algorithms is to NVIDIA Graphical Processing Units, where speedups of about 15-25 compared to an […]
Jan, 26

Hierarchical Agglomerative Clustering Using Graphics Processor with Compute Unified Device Architecture

We explore the use of today’s high-end graphics processing units on desktops to perform hierarchical agglomerative clustering with NVIDIA’s Compute Unified Device Architecture (CUDA). Although the advancement in graphics cards has made the gaming industry flourish, there is a lot more to be gained in the field of scientific computing, high performance computing […]
Jan, 26

Simulating flows of incompressible and weakly compressible fluids on multicore hybrid computer systems

A logically simple algorithm based on explicit schemes for modeling flows of incompressible and weakly compressible fluids is considered. The hyperbolic variant of the quasi-gas dynamic system of equations is used as a mathematical model. An ingenious computer cluster based on NVIDIA GPUs is used for the computations.
Jan, 26

Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem

Modern video cards and game consoles typically have much better performance-to-price ratios than general-purpose CPUs. The parallel processing capabilities of game hardware are well suited for high-throughput biomedical data analysis. Our initial results suggest that game hardware is a cost-effective platform for some computationally demanding bioinformatics problems.

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
