Posts
Aug, 15
2nd International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC’16), 2016
With Exascale systems on the horizon at the same time that conventional von Neumann architectures are suffering from rising power densities, we are facing an era with power, energy efficiency, and cooling as first-class constraints for scalable HPC. FPGAs can tailor the hardware to the application, avoiding overheads of general-purpose architectures, for example through customized datapaths and memory […]
Aug, 11
A Comparison of Potential Interfaces for Batched BLAS Computations
One trend in modern high performance computing (HPC) is to decompose a large linear algebra problem into thousands of small problems which can be solved independently. There is a clear need for a batched BLAS standard, allowing users to perform thousands of small BLAS operations in parallel and making efficient use of their hardware. There […]
Aug, 11
CaffePresso: An Optimized Library for Deep Learning on Embedded Accelerator-based platforms
Off-the-shelf accelerator-based embedded platforms offer a competitive energy-efficient solution for lightweight deep learning computations over CPU-based systems. Low-complexity classifiers used in power-constrained and performance-limited scenarios are characterized by operations on small image maps with 2-3 deep layers and few class labels. For these use cases, we consider a range of embedded systems with 5-20 W […]
Aug, 11
Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL
Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an […]
Aug, 11
9th International Conference on Machine Learning and Computing (ICMLC), 2017
Paper Publication: ICMLC 2017 proceedings will be published in the International Conference Proceedings Series by ACM, archived in the ACM Digital Library, indexed by Ei Compendex and Scopus, and submitted for review to the Thomson Reuters Conference Proceedings Citation Index (ISI Web of Science). Proceedings ISBN: 978-1-4503-4783-9. Submission Methods: You can […]
Aug, 11
International Conference on Bioinformatics and Computing Technologies (ICBCT), 2017
Publication: All papers accepted by this conference will be published in the International Journal of Bioscience, Biochemistry and Bioinformatics (IJBBB) or the International Journal of Machine Learning and Computing (IJMLC), and will be submitted to EI (INSPEC) for indexing. Submission: Please submit your full paper to icbct@saise.org
Aug, 11
9th International Conference on Computer and Automation Engineering (ICCAE), 2017
Publication: All accepted papers of ICCAE 2017 will be published in the International Conference Proceedings Series by ACM (ISBN: 978-1-4503-4791-4), archived in the ACM Digital Library, indexed by Ei Compendex and Scopus, and submitted for review to the Thomson Reuters Conference Proceedings Citation Index (ISI Web of Science). ICCAE 2016 conference […]
Aug, 8
Co-design of a particle-in-cell plasma simulation code for Intel Xeon Phi: a first look at Knights Landing
Three-dimensional particle-in-cell laser-plasma simulation is an important area of computational physics. Solving state-of-the-art problems requires large-scale simulation on a supercomputer using specialized codes. A growing demand for computational resources inspires research into improving efficiency and co-design for supercomputers based on many-core architectures. This paper presents first performance results of the particle-in-cell plasma simulation code […]
Aug, 8
Accelerating Computational Finance Simulations with OpenCL
Computational finance is a domain where performance is in high demand. Therefore, we investigate the suitability of two families of accelerators for computational finance simulations. Specifically, we use a scenario-based ALM (Asset Liability Management) model and design a suitable OpenCL implementation. We further improve the performance of the application by applying several typical optimization techniques […]
Aug, 8
A Comprehensive Performance Analysis of HSA and OpenCL 2.0
Heterogeneous systems that marry CPUs and GPUs together in a range of configurations are quickly becoming the design paradigm for today’s platforms because of their impressive parallel processing capabilities. However, in many existing heterogeneous systems, the GPU is only treated as an accelerator by the CPU, working as a slave to the CPU master. But […]
Aug, 8
Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies
A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume that subjects are unrelated and collected at random and that trait values are normally distributed or transformed to normality. Over […]
Aug, 8
OpenCL-accelerated object classification in video streams using Spatial Pooler of Hierarchical Temporal Memory
We present a method to classify objects in video streams using a brain-inspired Hierarchical Temporal Memory (HTM) algorithm. Object classification is a challenging task where humans still significantly outperform machine learning algorithms due to their unique capabilities. We have implemented a system which achieves very promising performance in terms of recognition accuracy. Unfortunately, conducting more […]