8191

Posts

Aug, 27

A Unified Optimizing Compiler Framework for Different GPGPU Architectures

This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and judicious management of parallelism. The input to our compiler is a naive GPU kernel function, which is functionally correct but without […]
Aug, 27

Low-Latency Elliptic Curve Scalar Multiplication

This paper presents a low-latency algorithm designed for parallel computer architectures to compute the scalar multiplication of elliptic curve points based on approaches from cryptographic side-channel analysis. A graphics processing unit implementation using a standardized elliptic curve over a 224-bit prime field, complying with the new 112-bit security level, computes the scalar multiplication in 1.9 […]
Aug, 27

An Implementation of Coincidence Algorithm on Graphic Processing Units

Genetic Algorithms (GAs) are powerful search techniques. However when they are applied to complex problems, they consume large computation power. One of the choices to make them faster is to use a parallel implementation. This paper presents a parallel implementation of Combinatorial Optimisation with Coincidence Algorithm (COIN) on Graphic Processing Units. COIN is a modern […]
Aug, 27

Perceptually Optimized Real-Time Computer Graphics

Perceptual optimization, the application of human visual perception models to remove imperceptible components in a graphics system, has been proven effective in achieving significant computational speedup. Previous implementations of this technique have focused on spatial level of detail reduction, which typically results in noticeable degradation of image quality. This thesis introduces refresh rate modulation (RRM), […]
Aug, 27

A Novel Approach to Visualizing Dark Matter Simulations

In the last decades cosmological N-body dark matter simulations have enabled ab initio studies of the formation of structure in the Universe. Gravity amplified small density fluctuations generated shortly after the Big Bang, leading to the formation of galaxies in the cosmic web. These calculations have led to a growing demand for methods to analyze […]
Aug, 26

GPU Accelerated Nonlinear Optimization in Radio Interferometric Calibration

We present the GPU based acceleration of two well known nonlinear optimization routines: Levenberg-Marquardt (LM) and Limited Memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) in radio interferometric calibration. Radio interferometric calibration is a heavily compute intensive operation where the same nonlinear optimization problem has to be solved over many time intervals, with different data. We achieve a speedup of […]
Aug, 26

Efficient Dynamic Program Monitoring on Multi-Core Platforms

Software security and reliability have become increasingly important in the modern world. An effective approach to enforcing software security and reliability is to monitor a program’s execution at run time. However, instrumentation-based implementation of a dynamic program monitor on single-core systems suffers significant performance overhead. As multi-core architecture becomes more mainstream, implementing efficient dynamic program […]
Aug, 26

Scalable Clustering for Vision using GPUs

Clustering algorithms have wide applications in Computer Vision, Data mining, Data Visualization, etc. Clustering is an important step for indexing and searching of documents, images, video, etc. Clustering large numbers of high-dimensional vectors is very computation intensive. CPUs are unable to handle such load and consume sometimes days and even weeks to cluster large data. […]
Aug, 26

Designing a Unified Programming Model for Heterogeneous Machines

While high-efficiency machines are increasingly embracing heterogeneous architectures and massive multithreading, contemporary mainstream programming languages reflect a mental model in which processing elements are homogeneous, concurrency is limited, and memory is a flat undifferentiated pool of storage. Moreover, the current state of the art in programming heterogeneous machines tends towards using separate programming models, such […]
Aug, 26

Is OpenCL a suitable platform for algorithm development in health care systems?

This thesis reviews if OpenCL is a suitable and cost effective platform for algorithm development in health care systems. Aspects such as maintainability, performance, portability and integration with high-level languages (in this case Python) are analyzed. The review is done by implementing one part of a dose calculation algorithm that is complex enough to provide […]
Aug, 26

The nonequispaced FFT on graphics processing units

Without doubt, the fast Fourier transform (FFT) belongs to the algorithms with large impact on science and engineering. By appropriate approximations, this scheme has been generalized for arbitrary spatial sampling points. This so called nonequispaced FFT is the core of the sequential NFFT3 library and we discuss its computational costs in detail. On the other […]
Aug, 26

Password Recovery Using MPI and CUDA

Using passwords to verify a user’s identity is the most widely deployed method for electronic authentication. When system administrators need to recover lost passwords or test accounts for easily guessable passwords, it can require millions of hash function and string comparison operations. These operations can be computationally expensive but are easily parallelizable because each password […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org