Posts

Feb, 19

LN-Annote: An Alternative Approach to Information Extraction from Emails using Locally-Customized Named-Entity Recognition

Personal mobile devices offer a growing variety of personalized services that considerably enrich the user experience. This is made possible by increased access to personal information, which to a large extent is extracted from user email messages and archives. There are, however, two main issues. First, currently these services can be offered only by large […]
Feb, 19

Gravitational wave astrophysics, data analysis and multimessenger astronomy

This paper reviews gravitational wave sources and their detection. Among the most exciting potential sources of gravitational waves are coalescing binary black hole systems. They can occur on all mass scales and be formed in numerous ways, many of which are not understood. They are generally invisible in electromagnetic waves, and they provide opportunities […]
Feb, 19

The 4th International Symposium on Computing and Networking

Following the success of the past ICNC conferences (2010 in Hiroshima, 2011 in Osaka, and 2012 in Okinawa) and the CANDAR symposiums (2013 in Matsuyama, 2014 in Shizuoka, and 2015 in Sapporo), CANDAR 2016 will be held in Hiroshima, Japan. CANDAR 2016 will serve as a forum for exchanging the latest findings and experiences ranging from theoretical research […]
Feb, 18

Deep Feature-based Face Detection on Mobile Devices

We propose a deep feature-based face detector for mobile devices to detect the user’s face as captured by the front-facing camera. The proposed method is able to detect faces in images containing extreme pose and illumination variations as well as partial faces. The main challenge in developing deep feature-based algorithms for mobile devices is the constrained […]
Feb, 18

Speeding Up Reinforcement Learning with Graphics Processing Units

Conventionally programmed systems (e.g. robots) are not able to adapt to unforeseen changes in their task or environment. Reinforcement learning (RL), a machine learning approach, could grant this flexibility. Many fields of work could greatly benefit from this, be it in terms of cost, time or some other parameter. With RL, a learning agent tries […]
Feb, 17

Deep Learning on FPGAs: Past, Present, and Future

The rapid growth of data size and accessibility in recent years has instigated a shift of philosophy in algorithm design for artificial intelligence. Instead of engineering algorithms by hand, the ability to learn composable systems automatically from massive amounts of data has led to ground-breaking performance in important domains such as computer vision, speech recognition, […]
Feb, 16

Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems

Computer systems have become increasingly diverse and specialized in recent years. This complexity supports a wide range of new computing uses and users, but is not without cost: it has become difficult to maintain the efficiency of contemporary general purpose computing systems. Computing inefficiencies, which include nonoptimal runtimes, excessive energy use, and limits to scalability, […]
Feb, 16

SABER: Window-Based Hybrid Stream Processing for Heterogeneous Architectures

Modern servers have become heterogeneous, often combining multicore CPUs with many-core GPGPUs. Such heterogeneous architectures have the potential to improve the performance of data-intensive stream processing applications, but they are not supported by current relational stream processing engines. For an engine to exploit a heterogeneous architecture, it must execute streaming SQL queries with sufficient data-parallelism […]
Feb, 16

CaffeLink: Mathematica binding for Caffe Deep Learning Framework

In this paper we present CaffeLink, an open-source library for Mathematica that provides a binding to the well-established Caffe deep learning framework. Caffe is a highly optimized, CUDA-accelerated library with a focus on convolutional neural networks, written in C++ with Python and Matlab bindings. CaffeLink is based upon Mathematica’s LibraryLink. It makes accessible most features of […]
Feb, 16

Towards Improving Programmability of Heterogeneous Parallel Architectures

Parallel computing has long been considered an effective approach to combining performance and power efficiency. From High Performance Computing (HPC) to modern embedded systems, the use of heterogeneous parallel architectures is becoming the common case, since they provide a good tradeoff in terms of power efficiency. The exascale objective for the […]
Feb, 16

Parallel and Scalable Sparse Basic Linear Algebra Subprograms

Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for numerous scientific computations and graph applications. Compared with Dense BLAS, parallelization of Sparse BLAS routines entails extra challenges due to the irregularity of sparse data structures. This thesis proposes new fundamental algorithms and data structures that accelerate Sparse BLAS routines on modern massively parallel […]
Feb, 11

Writing a performance-portable matrix multiplication

There are several frameworks that, while providing functional portability of code across different platforms, do not automatically provide performance portability. As a consequence, programmers have to hand-tune the kernel codes for each device. The Heterogeneous Programming Library (HPL) is one of these libraries, but it has the interesting feature that the kernel codes, which implement […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
