Posts
Feb, 28
An Efficient Multiway Mergesort for GPU Architectures
Sorting is a primitive operation that is a building block for countless algorithms. As such, it is important to design sorting algorithms that approach peak performance on a range of hardware architectures. Graphics Processing Units (GPUs) are particularly attractive architectures as they provide massive parallelism and computing power. However, the intricacies of their compute and […]
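The core of any multiway mergesort is the k-way merge of presorted runs. As a rough, CPU-only sketch (the GPU mapping is the paper's actual contribution and is not reproduced here), a heap-based k-way merge looks like this:

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Merge k sorted runs using a min-heap keyed on the current head of each run.
// This is only the serial core of a multiway merge; the GPU mapping discussed
// in the paper is not shown.
std::vector<int> multiwayMerge(const std::vector<std::vector<int>>& runs) {
    using Entry = std::pair<int, std::size_t>;          // (value, run index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<std::size_t> pos(runs.size(), 0);       // next element of each run
    for (std::size_t r = 0; r < runs.size(); ++r)
        if (!runs[r].empty()) heap.push({runs[r][0], r});

    std::vector<int> out;
    while (!heap.empty()) {
        auto [value, r] = heap.top();
        heap.pop();
        out.push_back(value);
        if (++pos[r] < runs[r].size()) heap.push({runs[r][pos[r]], r});
    }
    return out;
}

int main() {
    std::vector<std::vector<int>> runs = {{1, 4, 9}, {2, 3, 8}, {5, 6, 7}};
    for (int v : multiwayMerge(runs)) std::cout << v << ' ';  // prints 1 2 3 ... 9
    std::cout << '\n';
}
```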
Feb, 28
Key Reconciliation with Low-Density Parity-Check Codes for Long-Distance Quantum Cryptography
The speed at which two remote parties can exchange secret keys over a fixed-length fiber-optic cable in continuous-variable quantum key distribution (CV-QKD) is currently limited by the computational complexity of post-processing algorithms for key reconciliation. Multi-edge low-density parity-check (LDPC) codes with low code rates and long block lengths were proposed for CV-QKD, in order to […]
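For context, reconciliation with LDPC codes is typically syndrome-based: one party sends the syndrome of its key bits under a parity-check matrix, and the other uses it to correct errors in its noisy copy. A minimal sketch of the syndrome step, with a toy matrix rather than a real multi-edge CV-QKD code:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Syndrome computation s = H*x (mod 2) for a sparse parity-check matrix H,
// stored as the list of variable-node indices attached to each check node.
// The iterative decoder that uses this syndrome is not shown, and the toy
// matrix below is illustrative only.
std::vector<std::uint8_t> syndrome(const std::vector<std::vector<int>>& checks,
                                   const std::vector<std::uint8_t>& bits) {
    std::vector<std::uint8_t> s(checks.size(), 0);
    for (std::size_t c = 0; c < checks.size(); ++c)
        for (int v : checks[c])
            s[c] ^= bits[v];                        // XOR is addition in GF(2)
    return s;
}

int main() {
    std::vector<std::vector<int>> H = {{0, 1, 2}, {1, 2, 3}};  // 2 checks, 4 key bits
    std::vector<std::uint8_t> key = {1, 0, 1, 1};
    for (std::uint8_t b : syndrome(H, key)) std::cout << int(b) << ' ';  // prints 0 0
    std::cout << '\n';
}
```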
Feb, 28
Deep Voice: Real-time Neural Text-to-Speech
We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end neural speech synthesis. The system comprises five major building blocks: a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency prediction model, and an […]
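As a rough illustration of how such building blocks could be chained at inference time (all names and values below are invented placeholders, not Deep Voice's actual interfaces):

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical skeleton of the text-to-speech pipeline sketched in the abstract.
// Every function is a stand-in for a trained neural network; names and return
// values are invented for illustration.
struct Phone { std::string symbol; double duration; double f0; };

std::vector<std::string> graphemeToPhoneme(const std::string&) {
    return {"DH", "IH", "S"};                           // stub: "this" -> phonemes
}
double predictDuration(const std::string&) { return 0.08; }   // seconds (stub)
double predictF0(const std::string&)       { return 120.0; }  // Hz (stub)
// The segmentation model locates phoneme boundaries at training time, so it
// has no inference-time stub here.

int main() {
    std::vector<Phone> phones;
    for (const std::string& s : graphemeToPhoneme("this"))
        phones.push_back({s, predictDuration(s), predictF0(s)});
    for (const Phone& p : phones)                       // features a synthesis stage would consume
        std::cout << p.symbol << ": " << p.duration << " s, " << p.f0 << " Hz\n";
}
```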
Feb, 28
CHAOS: A Parallelization Scheme for Training Convolutional Neural Networks on Intel Xeon Phi
Deep learning is an important component of big-data analytic tools and intelligent applications such as self-driving cars, computer vision, speech recognition, or precision medicine. However, the training process is computationally intensive and often requires a large amount of time if performed sequentially. Modern parallel computing systems provide the capability to reduce the required training time […]
Feb, 27
International Conference on Bioinformatics and Computational Intelligence (ICBCI), 2017
Publication: ICBCI 2017 will be published in the conference proceedings. Submission: via the electronic submission system (.pdf) at http://www.easychair.org/conferences/?conf=icbci2017. Contact: Ms. Ada R. L. Wei, email icbci@zhconf.ac.cn, tel. +86-28-8625-6789, 10 am-12 pm and 2 pm-6 pm, Monday to Friday.
Feb, 27
The 2nd International Conference on Network Security (ICNS), 2017
The 2nd International Conference on Network Security (ICNS 2017) will be held in Kunming, China, during December 8-10, 2017. ICNS 2017 will bring together professors, researchers, and students in the field of network security, making the conference a platform to share experience, foster collaborations across industry and academia, and […]
Feb, 26
Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network
OpenCL-based FPGA computing has recently gained great popularity with emerging needs for workload acceleration such as the Convolutional Neural Network (CNN), the most popular deep learning architecture in the domain of computer vision. While OpenCL enhances the code portability and programmability of FPGAs, it comes at the expense of performance. The key challenge is to […]
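For reference, the computation such an accelerator targets is the 2D convolution inside a CNN layer. A naive, unoptimized version (no tiling or unrolling, which is where the FPGA-specific tuning would happen) looks like this:

```cpp
#include <iostream>
#include <vector>

// Naive reference for the 2D convolution at the heart of a CNN layer: the
// computation an OpenCL kernel would implement on the FPGA. Loop reordering,
// tiling, and unrolling are deliberately omitted.
std::vector<float> conv2d(const std::vector<float>& in, int H, int W,
                          const std::vector<float>& k, int K) {
    int outH = H - K + 1, outW = W - K + 1;
    std::vector<float> out(outH * outW, 0.0f);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
            for (int ky = 0; ky < K; ++ky)
                for (int kx = 0; kx < K; ++kx)
                    out[y * outW + x] += in[(y + ky) * W + (x + kx)] * k[ky * K + kx];
    return out;
}

int main() {
    std::vector<float> image(16, 1.0f);          // 4x4 image of ones
    std::vector<float> kernel(9, 1.0f);          // 3x3 kernel of ones
    for (float v : conv2d(image, 4, 4, kernel, 3)) std::cout << v << ' ';  // 9 9 9 9
    std::cout << '\n';
}
```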
Feb, 26
liquidSVM: A Fast and Versatile SVM package
liquidSVM is a package written in C++ that provides SVM-type solvers for various classification and regression tasks. Because of a fully integrated hyper-parameter selection, very carefully implemented solvers, multi-threading and GPU support, and several built-in data decomposition strategies, it provides unprecedented speed for small training sizes as well as for data sets of tens of […]
Feb, 26
Is GPGPU CCL worth it? A performance comparison between some GPU and CPU algorithms for solving connected components labeling on binary images
Connected component labeling (CCL) is a traditionally sequential problem that is hard to parallelize. This report aims to test the performance of solving CCL using massively parallel hardware through GPGPU. To achieve this, several CCL algorithms were researched and implemented using C++ and OpenCL. The results showed an improvement of up to a factor of […]
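For comparison, a minimal sequential CPU baseline for CCL on a binary image, using union-find with 4-connectivity (not one of the report's OpenCL implementations), might look like this:

```cpp
#include <iostream>
#include <numeric>
#include <vector>

// Sequential connected component labeling with union-find, 4-connectivity.
// The GPU/OpenCL variants compared in the report parallelize this labeling
// step; this sketch is only the CPU reference.
struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int a) { return parent[a] == a ? a : parent[a] = find(parent[a]); }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

std::vector<int> label(const std::vector<int>& img, int H, int W) {
    UnionFind uf(H * W);
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            int i = y * W + x;
            if (!img[i]) continue;
            if (x > 0 && img[i - 1]) uf.unite(i, i - 1);   // merge with left neighbour
            if (y > 0 && img[i - W]) uf.unite(i, i - W);   // merge with top neighbour
        }
    std::vector<int> labels(H * W, 0);
    for (int i = 0; i < H * W; ++i)
        if (img[i]) labels[i] = uf.find(i) + 1;            // 0 marks background
    return labels;
}

int main() {
    std::vector<int> img = {1, 1, 0,
                            0, 0, 1,
                            1, 0, 1};
    std::vector<int> L = label(img, 3, 3);
    for (int y = 0; y < 3; ++y) {
        for (int x = 0; x < 3; ++x) std::cout << L[y * 3 + x] << ' ';
        std::cout << '\n';
    }
}
```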
Feb, 26
Large-Scale Stochastic Learning using GPUs
In this work we propose an accelerated stochastic learning system for very large-scale applications. Acceleration is achieved by mapping the training algorithm onto massively parallel processors: we demonstrate a parallel, asynchronous GPU implementation of the widely used stochastic coordinate descent/ascent algorithm that can provide up to 35x speed-up over a sequential CPU implementation. In order […]
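The serial algorithm being accelerated is easy to state. A minimal sketch of stochastic coordinate descent for ridge regression (only the sequential baseline, nothing of the paper's asynchronous GPU implementation) is:

```cpp
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

// Stochastic coordinate descent for ridge regression, minimizing
// 0.5*||A*x - b||^2 + 0.5*lambda*||x||^2 by repeatedly solving one random
// coordinate exactly while keeping the residual r = b - A*x up to date.
int main() {
    // Tiny problem: 3 samples, 2 features, stored column-major (one vector per feature).
    std::vector<std::vector<double>> A = {{1.0, 2.0, 0.0},    // feature 0
                                          {0.0, 1.0, 3.0}};   // feature 1
    std::vector<double> b = {1.0, 3.0, 3.0};
    double lambda = 0.1;

    std::vector<double> x(A.size(), 0.0);
    std::vector<double> r = b;                                // residual b - A*x
    std::mt19937 rng(0);
    std::uniform_int_distribution<std::size_t> pick(0, A.size() - 1);

    for (int it = 0; it < 1000; ++it) {
        std::size_t j = pick(rng);                            // random coordinate
        const std::vector<double>& aj = A[j];
        double sq = 0.0, dot = 0.0;
        for (std::size_t i = 0; i < aj.size(); ++i) { sq += aj[i] * aj[i]; dot += aj[i] * r[i]; }
        double xj_new = (dot + sq * x[j]) / (sq + lambda);    // closed-form 1-D minimizer
        for (std::size_t i = 0; i < aj.size(); ++i) r[i] += aj[i] * (x[j] - xj_new);
        x[j] = xj_new;
    }
    std::cout << "x = [" << x[0] << ", " << x[1] << "]\n";    // close to [1, 1]
}
```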
Feb, 26
First Experiences Optimizing Smith-Waterman on Intel’s Knights Landing Processor
The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignment. However, SW is very computationally demanding for large protein databases. There exist several implementations that take advantage of parallel computing on many-core processors, FPGAs, or GPUs in order to increase the alignment throughput. In this paper, we have explored SW acceleration […]
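For reference, the dynamic program at the heart of SW with linear gap penalties fits in a few lines; real protein search would use a substitution matrix such as BLOSUM62 and affine gaps, neither of which is shown here:

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Textbook Smith-Waterman scoring with linear gap penalties: the dynamic
// program whose throughput many-core implementations accelerate. The
// match/mismatch/gap scores below are illustrative.
int smithWaterman(const std::string& a, const std::string& b,
                  int match = 2, int mismatch = -1, int gap = -2) {
    std::vector<std::vector<int>> H(a.size() + 1, std::vector<int>(b.size() + 1, 0));
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i)
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int diag = H[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            H[i][j] = std::max({0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap});
            best = std::max(best, H[i][j]);        // local alignment: track the best cell
        }
    return best;
}

int main() {
    std::cout << smithWaterman("GGTTGACTA", "TGTTACGG") << '\n';   // best local score
}
```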
Feb, 22
Dynamic Buffer Overflow Detection for GPGPUs
Buffer overflows are a common source of program crashes, data corruption, and security problems. In this work, we demonstrate that GPU-based workloads can also cause buffer overflows, a problem that was traditionally ignored because CPUs and GPUs had separate memory spaces. Modern GPUs share virtual, and sometimes physical, memory with CPUs, meaning that GPU-based buffer […]
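One common way to detect such overflows is to surround each allocation with canary words and verify them after the workload finishes. The sketch below shows the idea on the host side only and is not the paper's GPU mechanism:

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Canary-based overflow detection: pad each allocation with known sentinel
// words and check them after use. Applying this to GPU buffers involves
// allocator and kernel details not shown here.
constexpr std::uint32_t CANARY = 0xDEADBEEF;

struct GuardedBuffer {
    std::vector<std::uint32_t> storage;            // [canary][user data][canary]
    explicit GuardedBuffer(std::size_t words) : storage(words + 2, 0) {
        storage.front() = CANARY;
        storage.back()  = CANARY;
    }
    std::uint32_t* data() { return storage.data() + 1; }
    bool intact() const { return storage.front() == CANARY && storage.back() == CANARY; }
};

int main() {
    GuardedBuffer buf(4);
    for (std::size_t i = 0; i <= 4; ++i)           // off-by-one write: one word too far
        buf.data()[i] = 7;
    std::cout << (buf.intact() ? "no overflow detected\n" : "overflow detected\n");
}
```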