Posts

Aug, 6

Optimizing an OpenCL Application for Video Watermarking in FPGAs

Video streaming and downloading account for the majority of consumer Internet traffic and are a driving force behind cloud computing. The continually growing demand for this type of content is pushing video-processing applications out of specialized systems and into the data center. This shift in the deployment paradigm allows for the rapid scaling of computation […]
Aug, 5

Exploiting two-level parallelism by aggregating computing resources in task-based applications over accelerator-based machines

Computing platforms are now extremely complex, providing an increasing number of CPUs and accelerators. This trend makes balancing computations between these heterogeneous resources performance-critical. In this paper we tackle the task granularity problem and propose aggregating several CPUs in order to execute larger parallel tasks and thus find a better equilibrium between the […]
Aug, 5

Semantic Pose using Deep Networks Trained on Synthetic RGB-D

In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location […]
Aug, 5

Flexible Linear Algebra Development and Scheduling with Cholesky Factorization

Modern high performance computing environments are composed of networks of compute nodes that often contain a variety of heterogeneous compute resources, such as multicore-CPUs, GPUs, and coprocessors. One challenge faced by domain scientists is how to efficiently use all these distributed, heterogeneous resources. In order to use the GPUs effectively, the workload parallelism needs to […]
Aug, 5

Target Marker: A Visual Marker for Long Distances and Detection in Realtime on Mobile Devices

This paper deals with the design, detection and recognition of a visual marker which can be detected and recognized over a long distance in realtime on a mobile device. The solution is based on a rotationally symmetric marker, a detection process based on estimated circle and ellipse parameters and a Hidden Markov Model based approach […]
Aug, 5

Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures

We study the performance of dense symmetric indefinite factorizations (Bunch-Kaufman and Aasen’s algorithms) on multicore CPUs with a Graphics Processing Unit (GPU). Though such algorithms are needed in many scientific and engineering simulations, obtaining high performance of the factorization on the GPU is difficult because the pivoting, required to ensure the numerical stability of the […]
Aug, 4

4th International Conference on Intelligent Mechatronics and Automation (ICIMA), 2016

Publication: Selected submitted papers will be recommended for publication in the journals below: International Journal of Mechanical Engineering and Robotics Research ISSN: 2278-0149 Abstracting/Indexing: Index Copernicus, ProQuest, UDL, Google Scholar, Open J-Gate, etc. Call for Papers: 1. AI, intelligent control, neuro-control, fuzzy control and their applications 2. Biomedical and rehabilitation engineering, prosthetics and artificial organs; […]
Aug, 4

5th International Conference on Information and Industrial Electronics (ICIIE 2016), 2016

Publication: Selected submitted papers will be recommended for publication in one of the journals below: * Journal of Advances in Information Technology (ISSN: 1798-2340) Abstracting/Indexing: INSPEC; EBSCO; ULRICH’s Periodicals Directory; WorldCat; CrossRef; Genamics JournalSeek; Google Scholar; Ovid LinkSolver; etc. * Journal of Industrial and Intelligent Information (ISSN: 2301-3745) Abstracting/Indexing: EI (INSPEC, IET), Google Scholar, EBSCO, Engineering & […]
Aug, 3

Face Search at Scale: 80 Million Gallery

Due to the prevalence of social media websites, one challenge facing computer vision researchers is to devise methods to process and search for persons of interest among the billions of shared photos on these websites. Facebook revealed in a 2013 white paper that its users have uploaded more than 250 billion photos, and are uploading […]
Aug, 3

Real-Time Scheduling for GPUs with Applications in Advanced Automotive Systems

Self-driving cars, once constrained to closed test tracks, are beginning to drive alongside human drivers on public roads. Loss of life or property may result if the computing systems of automated vehicles fail to respond to events at the right moment. We call such systems that must satisfy precise timing constraints "real-time systems." Since the […]
Aug, 3

Optimized Event-Driven Runtime Systems for Programmability and Performance

Modern parallel programming models perform their best under the particular patterns they are tuned to express and execute, such as OpenMP for fork/join and Cilk for divide-and-conquer patterns. In cases where the model does not fit the problem, shoehorning of the problem to the model leads to performance bottlenecks, for example by introducing unnecessary dependences. […]
Aug, 3

Multi-GPU Parallel Computing and Task Scheduling under Virtualization

General-purpose graphics processing units (GPGPUs) have seen a tremendous rise in scientific computing applications. To fully utilize the powerful parallel computing ability of GPUs while retaining the isolation provided by virtualization, a GPU virtualization method that supports dynamic scheduling and multi-user concurrency is proposed. For multi-task GPU general-purpose computing programs in a virtualized environment, the […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org