Posts
Sep, 19
A small-world network model for distributed storage of semantic metadata
The growing uptake of semantic web and grid ideas is raising the importance of optimising distribution algorithms for semantic metadata. While it is not yet clear how real-world metadata distribution patterns ought to evolve, practical experience of social and technical networks suggests that a small-world pattern is desirable and practical. We explore simulated small-world networks […]
Sep, 19
Using many-core hardware to correlate radio astronomy signals
A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is […]
Sep, 19
An adaptative game loop architecture with automatic distribution of tasks between CPU and GPU
This article presents a new architecture to implement all game loop models for games and real-time applications that use the GPU as a mathematics and physics coprocessor, working in parallel processing mode with the CPU. The presented model applies automatic task distribution concepts. The architecture can apply a set of heuristics defined in Lua scripts […]
Sep, 19
cuIBM — A GPU-accelerated Immersed Boundary Method
A projection-based immersed boundary method is dominated by sparse linear algebra routines. Using the open-source Cusp library, we observe a speedup (with respect to a single CPU core) which reflects the constraints of a bandwidth-dominated problem on the GPU. Nevertheless, GPUs offer the capacity to solve large problems on commodity hardware. This work includes validation […]
Sep, 17
GPU Technology Conference, GTC 2012
GTC advances awareness of high performance computing, and connects the scientists, engineers, researchers, and developers who use GPUs to tackle enormous computational challenges. GTC 2012 will feature the latest breakthroughs and the most amazing content in GPU-enabled computing. Spanning 4 full days of world-class education delivered by some of the greatest minds in GPU computing, […]
Sep, 17
39th International Symposium on Computer Architecture, ISCA 2012
The International Symposium on Computer Architecture is the premier forum for new ideas and experimental results in computer architecture. Novel papers are solicited on a broad range of topics, including (but not limited to): * Processor, memory, and storage systems architecture * Parallel and multi-core systems * Interconnection networks * Instruction, thread, and data-level parallelism […]
Sep, 17
26th IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012
IPDPS is an international forum for engineers and scientists from around the world to present their latest research findings in all aspects of parallel computation. In addition to technical sessions of submitted paper presentations, the meeting offers workshops, tutorials, and commercial presentations & exhibits. IPDPS represents a unique international gathering of computer scientists from around […]
Sep, 16
Returning control to the programmer: SIMD intrinsics for virtual machines
Exposing SIMD units within interpreted languages could simplify programs and unleash floods of untapped processor power. Server and workstation hardware architecture is continually improving, yet interpreted languages, most importantly Java, have failed to keep pace with the proper utilization of modern processors. SIMD (single instruction, multiple data) units are available in nearly every current desktop and server […]
Sep, 16
NIMBLE: a toolkit for the implementation of parallel data mining and machine learning algorithms on mapreduce
In the last decade, advances in data collection and storage technologies have led to an increased interest in designing and implementing large-scale parallel algorithms for machine learning and data mining (ML-DM). Existing programming paradigms for expressing large-scale parallelism such as MapReduce (MR) and the Message Passing Interface (MPI) have been the de facto choices for […]
Sep, 16
Programmable and Scalable Architecture for Graphics Processing Units
Graphics processing is an application area with a high level of parallelism at the data level and at the task level. Therefore, graphics processing units (GPUs) are often implemented as multiprocessing systems with high-performance floating-point processing and application-specific hardware stages for maximizing the graphics throughput. In this paper we evaluate the suitability of […]
Sep, 16
Searching for Concurrent Design Patterns in Video Games
The transition to multicore architectures has dramatically underscored the necessity for parallelism in software. In particular, while new gaming consoles are by and large multicore, most existing video game engines are essentially sequential and thus cannot easily take advantage of this hardware. In this paper we describe techniques derived from our experience parallelizing an open-source […]
Sep, 16
A Light-Weight Approach to Dynamical Runtime Linking Supporting Heterogenous, Parallel, and Reconfigurable Architectures
When targeting hardware accelerators and reconfigurable processing units, the question of programmability arises, i.e. how different implementations of individual, configuration-specific functions are provided. Conventionally, this is resolved either at compilation time with a specific hardware environment being targeted, by initialization routines at program start, or by decision trees at run-time. Such techniques are, however, hardly applicable […]