Posts

Nov, 3

Profiling of Data-Parallel Processors

Profiling data can help to improve an application with respect to various objectives such as execution time, energy consumption, or even thermal sensor placement for an upcoming device. This survey reviews state-of-the-art profiling tools for data-parallel processors, such as Nsight, PAPI, TAU, and Lynx. Additionally, the attained knowledge is utilized to detect the bottleneck […]
Nov, 3

A Fast Poisson Solver with Periodic Boundary Conditions for GPU Clusters in Various Configurations

Fast Poisson solvers using the Fast Fourier Transform on uniform grids are especially suited to parallel implementation, making them good candidates for porting to graphics processing unit (GPU) devices. The goal of the following work was to implement, test, and evaluate a fast Poisson solver for periodic boundary conditions for use on a variety of GPU […]
Nov, 3

Bounds on the Energy Consumption of Computational Kernels

As computing devices evolve with successive technology generations, many machines target either the mobile or high-performance computing/datacenter environments. In both of these form factors, energy consumption often represents the limiting factor on hardware and software efficiency. On mobile devices, limitations in battery technology may reduce possible hardware capability due to a tight energy budget. On […]
Nov, 3

A GPU-based Framework for Real-time Free Viewpoint Television

This thesis addresses two main problems of Free Viewpoint TV: generating arbitrary viewpoints in real time and delivering them to the end user. For the first problem, a GPU-based algorithm capable of generating free viewpoints from a network of fixed HD video cameras was developed. We used a space-sweep algorithm to estimate depth information. The view generation sub-system […]
Nov, 3

Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs

This paper presents a sparse matrix partitioning strategy to improve the performance of SpMV on GPUs and multicore CPUs. This method adapts well to different types of sparse matrices, unlike existing methods, which suit only some particular sparse matrices. In addition, our partitioning method can obtain dense blocks by analyzing […]
Nov, 3

International Workshop on Pattern Recognition, ICOPR 2015

Submission Deadline: 2015-03-01. Publication: All papers for ICOPR 2015 will be published in the IJSPS (ISSN: 2315-4535) as one volume, and will be indexed by Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library, etc. Topics: Track 1, Computer Vision: vision sensors; early vision; low-level vision; biologically motivated vision; illumination and reflectance […]
Nov, 3

4th International Conference on Software and Information Engineering, ICSIE 2015

Submission Deadline: 2015-03-01. Publication: Selected submitted papers will be recommended for publication in one of the journals below: * Journal of Lecture Notes on Software Engineering (LNSE, ISSN: 2301-3559). Abstracting/Indexing: EI (INSPEC, IET), DOAJ, Electronic Journals Library, Engineering & Technology Digital Library, Ulrich’s Periodicals Directory, International Computer Science Digital Library (ICSDL), ProQuest and Google […]
Oct, 31

Compoundly weighted Voronoi: a sequential and parallel implementation

SMARAD is the Smart and Novel Radios research unit at Aalto University in Helsinki. In the context of their smart radio research, the area of influence of existing television transmitters is important data for the placement of experimental transmitters. Currently, these areas are calculated with a regular Voronoi tessellation, ignoring variation in transmitter characteristics. This […]
Oct, 31

Hierarchical Mapping Techniques for Signal Processing Systems on Parallel Platforms

Dataflow models are widely used for expressing the functionality of digital signal processing (DSP) applications due to their useful features, such as providing formal mechanisms for description of application functionality, imposing minimal data-dependency constraints in specifications, and exposing task and data level parallelism effectively. Due to the increased complexity of dynamics in modern DSP applications, […]
Oct, 31

Finite Element Numerical Integration on Xeon Phi coprocessor

In the present article we describe the implementation of the finite element numerical integration algorithm for the Xeon Phi coprocessor. The coprocessor is an extension of the idea of the many-core specialized unit for calculations and, by assumption, its performance has to be competitive with the current families of GPUs. Its main advantage is the […]
Oct, 31

Extended Dynamic Programming and Fast Multidimensional Search Algorithm for Energy Minimization in Stereo and Motion

This paper presents a novel extended dynamic programming approach for energy minimization (EDP) to solve the correspondence problem for stereo and motion. A significant speedup is achieved using a recursive minimum search strategy (RMS). This speedup is particularly important if the disparity space is 2D as well as 3D. The proposed RMS can also […]
Oct, 31

Using an OpenCL Framework to Evaluate Interconnect Implementations on FPGAs

Field Programmable Gate Arrays (FPGAs) are an ideal platform for building systems with custom hardware accelerators; however, managing these systems is still a major challenge. The OpenCL standard has become accepted as a good programming model for managing heterogeneous platforms due to its rich constructs. Although commercial OpenCL frameworks are now emerging, there is a […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
