Posts
Sep, 17
GPU Technology Conference, GTC 2012
GTC advances awareness of high performance computing, and connects the scientists, engineers, researchers, and developers who use GPUs to tackle enormous computational challenges. GTC 2012 will feature the latest breakthroughs and the most amazing content in GPU-enabled computing. Spanning 4 full days of world-class education delivered by some of the greatest minds in GPU computing, […]
Sep, 17
39th International Symposium on Computer Architecture, ISCA 2012
The International Symposium on Computer Architecture is the premier forum for new ideas and experimental results in computer architecture. Novel papers are solicited on a broad range of topics, including (but not limited to): * Processor, memory, and storage systems architecture * Parallel and multi-core systems * Interconnection networks * Instruction, thread, and data-level parallelism […]
Sep, 17
26th IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012
IPDPS is an international forum for engineers and scientists from around the world to present their latest research findings in all aspects of parallel computation. In addition to technical sessions of submitted paper presentations, the meeting offers workshops, tutorials, and commercial presentations & exhibits. IPDPS represents a unique international gathering of computer scientists from around […]
Sep, 16
Returning control to the programmer: SIMD intrinsics for virtual machines
Exposing SIMD units within interpreted languages could simplify programs and unleash floods of untapped processor power. Server and workstation hardware architecture is continually improving, yet interpreted languages, most importantly Java, have failed to keep pace with the proper utilization of modern processors. SIMD (single instruction, multiple data) units are available in nearly every current desktop and server […]
Sep, 16
NIMBLE: a toolkit for the implementation of parallel data mining and machine learning algorithms on mapreduce
In the last decade, advances in data collection and storage technologies have led to an increased interest in designing and implementing large-scale parallel algorithms for machine learning and data mining (ML-DM). Existing programming paradigms for expressing large-scale parallelism such as MapReduce (MR) and the Message Passing Interface (MPI) have been the de facto choices for […]
Sep, 16
Programmable and Scalable Architecture for Graphics Processing Units
Graphics processing is an application area with a high level of parallelism at the data level and at the task level. Therefore, graphics processing units (GPUs) are often implemented as multiprocessing systems with high performance floating point processing and application specific hardware stages for maximizing the graphics throughput. In this paper we evaluate the suitability of […]
Sep, 16
Searching for Concurrent Design Patterns in Video Games
The transition to multicore architectures has dramatically underscored the necessity for parallelism in software. In particular, while new gaming consoles are by and large multicore, most existing video game engines are essentially sequential and thus cannot easily take advantage of this hardware. In this paper we describe techniques derived from our experience parallelizing an open-source […]
Sep, 16
A Light-Weight Approach to Dynamical Runtime Linking Supporting Heterogenous, Parallel, and Reconfigurable Architectures
When targeting hardware accelerators and reconfigurable processing units, the question of programmability arises, i.e. how different implementations of individual, configuration-specific functions are provided. Conventionally, this is resolved either at compilation time with a specific hardware environment being targeted, by initialization routines at program start, or by decision trees at run-time. Such techniques are, however, hardly applicable […]
Sep, 16
Optimizations and Performance of a Robotics Grasping Algorithm Described in Geometric Algebra
The use of Conformal Geometric Algebra leads to algorithms that can be formulated in a very clear and easy-to-grasp way. But it can also increase the performance of an implementation, because such algorithms lend themselves to parallel computation. In this paper we show how a grasping algorithm for a robotic arm is […]
Sep, 16
Parallel Medical Image Reconstruction: From Graphics Processors to Grids
We present a variety of possible parallelization approaches for a real-world case study using several modern parallel and distributed computer architectures. Our case study is a production-quality, time-intensive algorithm for medical image reconstruction used in computer tomography. We describe how this algorithm can be parallelized for the main kinds of contemporary parallel architectures: shared-memory multiprocessors, […]
Sep, 16
A Generic Approach to Topic Models
This article contributes a generic model of topic models. To define the problem space, general characteristics for this class of models are derived, which give rise to a representation of topic models as "mixture networks", a domain-specific compact alternative to Bayesian networks. Besides illustrating the interconnection of mixtures in topic models, the benefit of this […]
Sep, 16
Implicit and dynamic trees for high performance rendering
Recent advances in GPU architecture and programmability have enabled the computation of ray-cast or ray-traced images at interactive frame rates. However, the rapid performance gains of the hardware cannot by themselves address the challenge posed by the steady growth in the geometric and temporal complexity of computer graphics datasets. In this paper we […]