Posts
Dec, 9
Theano-based Large-Scale Visual Recognition with Multiple GPUs
In this report, we describe a Theano-based AlexNet (Krizhevsky et al., 2012) implementation and its naive data parallelism on multiple GPUs. Our performance on 2 GPUs is comparable with the state-of-the-art Caffe library (Jia et al., 2014) run on 1 GPU. To the best of our knowledge, this is the first open-source Python-based AlexNet implementation […]
Dec, 9
Lattice QCD with Domain Decomposition on Intel Xeon Phi Co-Processors
The gap between the cost of moving data and the cost of computing continues to grow, making it ever harder to design iterative solvers on extreme-scale architectures. This problem can be alleviated by alternative algorithms that reduce the amount of data movement. We investigate this in the context of Lattice Quantum Chromodynamics and implement such […]
Dec, 9
MLitB: Machine Learning in the Browser
With few exceptions, the field of Machine Learning (ML) research has largely ignored the browser as a computational engine. Beyond an educational resource for ML, the browser has vast potential to not only improve the state-of-the-art in ML research, but also, inexpensively and on a massive scale, to bring sophisticated ML learning and prediction to […]
Dec, 9
Risk Estimation Without Using Stein’s Lemma — Application to Image Denoising
Image denoising is a classical problem in image processing and has applications in areas ranging from photography to medical imaging. In this paper, we examine the denoising performance of an optimized spatially-varying Gaussian filter. The parameters of the Gaussian filter are tuned by optimizing a mean squared error estimate which is similar to Stein’s Unbiased Risk […]
Dec, 9
Portable OpenCL Out-of-Order Execution Framework for Heterogeneous Platforms
Heterogeneous computing has become a viable option in seeking computing performance, alongside conventional homogeneous multi-/single-processor approaches. The advantage of heterogeneity is the possibility to choose the best device on the platform for each distinct workload in the application to gain performance and/or to lower power consumption. The drawback of heterogeneity is the […]
Dec, 8
XIII International Conference on Parallel Processing, ICPP 2015
ICPP 2015: XIII International Conference on Parallel Processing is the premier interdisciplinary forum for the presentation of new advances and research results in the fields of Parallel Processing. The conference will bring together leading academic scientists, researchers and scholars in the domain of interest from around the world. Topics of interest for submission […]
Dec, 8
The Sixth International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, HEART 2015
The HEART symposium is an international forum on state-of-the-art research in high-performance and power-efficient computing using accelerator technologies such as FPGAs, GPGPUs, and/or specialized accelerators. The sixth edition of HEART will take place in Boston MA, USA. The Sixth International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) is a forum to present and […]
Dec, 8
Computer Graphics International, CGI’15
Computer Graphics International is one of the oldest truly international conferences in Computer Graphics and one of the five most important ones worldwide. It is an essential yearly meeting where academics present their latest models and technologies, and explore new trends and ideas. In previous years, it has been held in many different places […]
Dec, 8
19th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, KES 2015
The conference encompasses a broad spectrum of subjects related to intelligent systems. The following list provides examples of applicable topics; however, the list is not meant to exclude other applicable areas. Generic Topics of Interest: Automated Design and Configuration of Sensory Systems, Self-x principles in Intelligent Engineering Systems, Knowledge-Based Systems, Expert Systems, Cognitive Systems, Neural Networks, […]
Dec, 8
International Conference on Parallel Computing 2015, ParCo2015
Section 1: Algorithms. Design, analysis, and implementation of parallel algorithms in science and engineering, focusing on issues such as scalability and speedup; efficient utilization of the memory hierarchy; communication and synchronization; data management and exploration; energy efficiency. The parallel computing aspects should be emphasized. Section 2: Software and Architectures. Software engineering for developing and maintaining […]
Dec, 7
Massively Parallel A* Search on a GPU
A* search is a fundamental topic in artificial intelligence. Recently, general-purpose computation on graphics processing units (GPGPU) has been widely used to accelerate numerous computational tasks. In this paper, we propose the first parallel variant of the A* search algorithm such that the search process of an agent can be accelerated by a […]
Dec, 7
Big Integer Multiplication with CUDA FFT (cuFFT) Library
It is well recognized in the computer algebra theory and systems communities that the Fast Fourier Transform (FFT) can be used for multiplying polynomials. Theory predicts that it is fast for "large enough" polynomials. The basic idea is to use fast polynomial multiplication to perform fast integer multiplication. We can achieve really fast FFT multiplication […]
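The idea of multiplying integers through polynomial multiplication can be illustrated with a small CPU-only sketch. This is not the cuFFT-based implementation from the post, only a minimal pure-Python illustration under stated assumptions: each integer is treated as a polynomial whose coefficients are its decimal digits, and a simple recursive radix-2 FFT stands in for a tuned library. The function names `fft` and `multiply_big_integers` are hypothetical and chosen for this example.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def multiply_big_integers(a, b):
    """Multiply two non-negative integers via FFT-based polynomial multiplication."""
    if a == 0 or b == 0:
        return 0
    # Assumption for this sketch: decimal digits (least significant first)
    # serve as the polynomial coefficients.
    digits_a = [int(ch) for ch in reversed(str(a))]
    digits_b = [int(ch) for ch in reversed(str(b))]
    # Pad to a power of two large enough to hold the full product polynomial.
    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    fa = fft(digits_a + [0] * (size - len(digits_a)))
    fb = fft(digits_b + [0] * (size - len(digits_b)))
    # A pointwise product in the frequency domain corresponds to a
    # convolution of the two digit sequences.
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The inverse transform above is unnormalized, so divide by size here.
    coefficients = [round(value.real / size) for value in product]
    # Evaluate the product polynomial at 10 (Horner's rule); Python's
    # arbitrary-precision integers absorb the carries.
    result = 0
    for coefficient in reversed(coefficients):
        result = result * 10 + coefficient
    return result
```

Floating-point rounding limits how large the inputs can be before the rounded coefficients become unreliable, so this sketch is only suitable for modest operand sizes.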