Posts
Nov, 8
High Level Synthesis and Evaluation of the Secure Hash Standard for FPGAs
Secure hash algorithms (SHAs) are important components of cryptographic applications. SHA performance on central processing units (CPUs) is slow; therefore, acceleration must be done using hardware such as Field Programmable Gate Arrays (FPGAs). Considerable work has been done in academia using FPGAs to accelerate SHAs. These designs were implemented using Hardware Description Language (HDL) based […]
Nov, 8
On the Effectiveness of OpenMP teams for Programming Embedded Manycore Accelerators
With the introduction of more powerful and massively parallel embedded processors, embedded systems are becoming HPC capable. In particular, heterogeneous on-chip systems (SoCs) that couple a general-purpose host processor to a many-core accelerator are becoming more and more widespread, and provide tremendous peak performance per watt, well suited to executing HPC-class programs. The increased computation potential is […]
Nov, 8
Data Stream Classification using Random Feature Functions and Novel Method Combinations
Big Data streams are becoming faster, bigger, and more commonplace. In this scenario, Hoeffding Trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. Also, $k$-nearest neighbors is a popular choice, with most extensions dealing with the inherent performance limitations over a […]
Nov, 8
Deep Learning for Computer Vision: A comparison between Convolutional Neural Networks and Hierarchical Temporal Memories on object recognition tasks
In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems both in Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks [1] [2] [3]. The rise of deep learning is also revolutionizing the entire field of Machine Learning and […]
Nov, 4
Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Storage Format
The performance of sparse matrix vector multiplication (SpMV) is important to computational scientists. Compressed sparse row (CSR) is the most frequently used format to store sparse matrices. However, CSR-based SpMV on graphics processing units (GPUs) has poor performance due to irregular memory access patterns, load imbalance, and reduced parallelism. This has led researchers to propose […]
Nov, 4
Accelerating Twisted Mass LQCD with QPhiX
We present the implementation of twisted mass fermion operators for the QPhiX library. We analyze the performance on the Intel Xeon Phi (Knights Corner) coprocessor as well as on Intel Xeon Haswell CPUs. In particular, we demonstrate that on the Xeon Phi 7120P the Dslash kernel is able to reach 80% of the theoretical peak […]
Nov, 4
Exact diagonalization of quantum lattice models on coprocessors
We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a […]
Nov, 4
Heterogeneous CPU/(GP) GPU Memory Hierarchy Analysis and Optimization
Heterogeneous systems, more specifically CPU-GPGPU platforms, have gained a lot of attention due to the excellent speedups GPUs can achieve with so little energy consumption. However, the picture is not entirely positive: the complex programming models needed to fully exploit the devices and the data movement overheads are some […]
Nov, 4
Performance of GTX Titan X GPUs and Code Optimization
Nvidia recently released a new GPU model, the GTX Titan X (TX), in the lineage of the Maxwell architecture. We use our conjugate gradient code and non-perturbative renormalization code to measure the performance of TX. The results are compared with those of GTX Titan Black (TB), in the lineage of the Kepler architecture. We observe […]
Nov, 4
3rd International Conference on Mechanical, Electronics and Computer Engineering (CMECE), 2016
Dear Scholars and Researchers, Warmest Greetings from CMECE 2016! This is the conference committee of the 2016 3rd International Conference on Mechanical, Electronics and Computer Engineering (CMECE 2016). We are very pleased to tell you that CMECE 2016 will be held in New York, USA, on January 7-9, 2016. CMECE 2014 and 2015 were held in Sanya, China […]
Nov, 4
4th International Conference on Nano and Materials Science (ICNMS), 2016
Dear Scholars and Researchers, Warmest Greetings from ICNMS 2016! This is the conference committee of the 2016 4th International Conference on Nano and Materials Science (ICNMS 2016). We are very pleased to tell you that ICNMS 2016 will be held in New York, USA, on January 7-9, 2016. Publication: All papers, both invited and contributed, will be reviewed […]
Nov, 3
Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices
Sparse matrix vector multiplication (SpMV) is an important linear algebra primitive. Recent research has focused on improving the performance of SpMV on GPUs when using compressed sparse row (CSR), the most frequently used matrix storage format on CPUs. Efficient CSR-based SpMV obviates the need for other GPU-specific storage formats, thereby saving runtime and storage overheads. […]

