
Posts

Sep, 3

Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes

This paper considers and compares the basic features of some of the most versatile and popular open source machine learning frameworks (TensorFlow, Deeplearning4j, and H2O). A comparative analysis was performed, and conclusions were drawn about the advantages and disadvantages of these platforms. The performance tests for the de facto standard MNIST data set […]
Aug, 26

Accelerate Local Tone Mapping for High Dynamic Range Images Using OpenCL with GPU

Tone mapping is used to convert HDR (high dynamic range) images to low dynamic range. This paper describes an algorithm to display high dynamic range images. Although local tone-mapping operators reproduce images with better detail and contrast than global operators, local tone-mapping algorithms usually require a huge amount of […]
Aug, 26

Vulnerability Analysis and Attacks on Intel Xeon Phi Coprocessor

The Intel Xeon Phi coprocessor is a PCIe-based add-in card. Though it is prone to simple attacks, many high performance computing systems are constructed by combining CPUs and coprocessors. This paper describes two attacks that exploit vulnerabilities related to the boot process of the coprocessor and the ownership of the offload user. Proof-of-concept codes are […]
Aug, 26

Large Integer Arithmetic in GPU for Cryptography

Most computers nowadays support 32-bit or 64-bit data types in various programming languages, and these are sufficient for most use cases. However, cryptography requires range and precision beyond 64 bits, which is computationally expensive on CPUs. In this report, we present our design and implementation of […]
Aug, 26

Dynamic Parallelism in GPU Optimized Barnes Hut Trees for Molecular Dynamics Simulations

Since the beginning of the modern computing era, high performance computing has been pushing the boundaries of the types of problems that can be solved in many different disciplines. One of the leading fields is computational biophysics where molecular dynamics (MD) simulations provide microscopic resolution details of how biomolecules move, fold, and assemble into intricate […]
Aug, 26

eccCL: parallelized GPU implementation of Ensemble Classifier Chains

BACKGROUND: Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational […]
Aug, 17

Simulating the Cardinal Movements of Childbirth Using Finite Element Analysis on the Graphics Processing Unit

Many problems can occur during childbirth which may lead to instant or future morbidity and even mortality. Therefore the computer-based simulation of the mechanisms and biomechanics of human childbirth is becoming an increasingly important area of study, to avoid potential trauma to the baby and the mother throughout, and immediately following, the childbirth process. Computer-based […]
Aug, 17

Scaling Deep Learning on GPU and Knights Landing clusters

The speed of deep neural network training has become a major bottleneck in deep learning research and development. For example, training GoogleNet on the ImageNet dataset on one Nvidia K20 GPU takes 21 days. To speed up the training process, current deep learning systems rely heavily on hardware accelerators. However, these accelerators have limited […]
Aug, 17

Gpufit: An open-source toolkit for GPU-accelerated curve fitting

We present a general purpose, open-source software library for estimation of non-linear parameters by the Levenberg-Marquardt algorithm. The software, Gpufit, runs on a Graphics Processing Unit (GPU) and executes computations in parallel, resulting in a significant gain in performance. We measured a speed increase of up to 42 times when comparing Gpufit with an identical […]
Aug, 17

Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture

Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of terascale integration. Among emerging killer applications, parallel graph processing has been a critical technique to analyze connected data. In this paper, we empirically evaluate various computing platforms including an Intel Xeon E5 CPU, an Nvidia GeForce GTX 1070 GPU and a Xeon Phi 7210 […]
Aug, 17

GARDENIA: A Domain-specific Benchmark Suite for Next-generation Accelerators

This paper presents the Graph Analytics Repository for Designing Next-generation Accelerators (GARDENIA), a benchmark suite for studying irregular algorithms on massively parallel accelerators. Existing generic benchmarks for accelerators have mainly focused on high performance computing (HPC) applications with limited control and data irregularity, while available graph analytics benchmarks do not apply state-of-the-art algorithms and/or optimization […]
Aug, 14

Accelerating digital forensic searching through GPGPU parallel processing techniques

BACKGROUND: String searching within a large corpus of data is a critical component of digital forensic (DF) analysis techniques such as file carving. The continuing increase in capacity of consumer storage devices requires similar improvements to the performance of string searching techniques employed by DF tools used to analyse forensic data. As string searching is […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
