Posts
Jun, 22
Concurrent Task Execution on the Intel Xeon Phi
The Intel Xeon Phi coprocessor is a new option for the high-performance computing industry, and it needs to be tested. In this thesis, we compared the performance of the Xeon Phi with that of the GPU. The Smith-Waterman algorithm is widely used for solving the sequence alignment problem. We implemented two versions […]
Jun, 22
Comparative Study of the Parallelization of the Smith-Waterman Algorithm on OpenMP and Cuda C
In this paper, we present parallel programming approaches to calculate the values of the cells in the scoring matrix used in the Smith-Waterman algorithm for sequence alignment. This algorithm, well known in bioinformatics for its applications, is unfortunately time-consuming on a serial computer. We use a formulation based on the anti-diagonal structure of the data. This representation focuses on […]
Jun, 22
Optimization and Parallelization Methods for the Design of Next-Generation Radio Networks
The complexity of the design of radio networks has grown with the adoption of modern standards. Therefore, the role of the computer for the faster delivery of accurate results has become increasingly important. In this thesis, novel methods for the planning and automatic optimization of radio networks are developed and discussed. The state-of-the-art metaheuristic algorithms, […]
Jun, 22
GPU accelerated spectral finite elements on all-hex meshes
This paper presents a spectral element finite element scheme that efficiently solves elliptic problems on unstructured hexahedral meshes. The discrete equations are solved using a matrix-free preconditioned conjugate gradient algorithm. An additive Schwarz two-scale preconditioner is employed that allows h-independent convergence. An extensible multi-threading programming API is used as a common kernel language that allows […]
Jun, 22
Generating Efficient Tensor Contractions for GPUs
Many scientific and numerical applications, including quantum chemistry modeling and fluid dynamics simulation, require tensor product and tensor contraction evaluation. Tensor computations are characterized by arrays with numerous dimensions, inherent parallelism, moderate data reuse and many degrees of freedom in the order in which to perform the computation. The best-performing implementation is heavily dependent on […]
Jun, 19
Autotuning Tensor Contraction Computations on GPUs
We describe a framework for generating optimized GPU code for computing tensor contractions, a multidimensional generalization of matrix-matrix multiplication that arises frequently in computational science applications. Typical performance optimization strategies for such computations transform the tensors into sequences of matrix-matrix multiplications to take advantage of an optimized BLAS library, but this approach is not appropriate […]
Jun, 19
Study of Sparse-Matrix Vector Multiplication (SpMV) on Different Architectures and Libraries
With the advent of parallel processing architectures and a steep increase in the parallelism found in recent applications, GPGPUs have gained attention for their importance in executing these applications. In this document, we specifically analyze Sparse-Matrix Vector Multiplication (SpMV) across different architectures, libraries and matrix formats. The experimental platforms include but are […]
Jun, 19
Bulk GCD Computation Using a GPU to Break Weak RSA Keys
RSA is one of the most well-known public-key cryptosystems and is widely used for secure data transfer. An RSA encryption key includes a modulus n, which is the product of two large prime numbers p and q. If an RSA modulus n can be decomposed into p and q, the corresponding decryption key can be computed easily from […]
Jun, 19
Parallel BTF Compression with Multi-Level Vector Quantization in OpenCL
The Bidirectional Texture Function (BTF), an effective and visually faithful representation of surface appearance, is becoming more and more widely used. In this paper we report on contributions to BTF data compression for multi-level vector quantization. We describe novel decompositions that improve the compression ratio by 15% in comparison with the original method, without loss of […]
Jun, 19
Accelerated dimension-independent adaptive Metropolis
This work considers black-box Bayesian inference over high-dimensional parameter spaces. The well-known adaptive Metropolis (AM) algorithm of Haario et al. (2001) is extended herein to scale asymptotically uniformly with respect to the underlying parameter dimension for Gaussian targets, by respecting the variance of the target. The resulting algorithm, referred to as the dimension-independent adaptive Metropolis (DIAM) […]
Jun, 19
Visualization of OpenCL Application Execution on CPU-GPU Systems
Evaluating the performance of parallel and heterogeneous programs and architectures can be challenging. An emulator or simulator can be used to aid the programmer. To provide guidance and feedback to the programmer, the simulator needs to present traces, reports, and debugging information in a coherent and unambiguous format. Although these outputs contain a lot of […]
Jun, 17
2nd International Conference on Mechanical, Aeronautical and Automotive Engineering (ICMAA), 2015
Topics: Mechanical Engineering, Applied Mechanics, Automation, Biomechanics, Computational Fluid Dynamics, Design and Manufacturing, Energy Management, Fluid Dynamics, Fuels and Combustion, Green Manufacturing, Heat and Mass Transfer, Industrial Tribology, Instrumentation and Control, Internal Combustion Engines, Mechatronics, Micro-Machining, Modeling of Processes, Nanotechnology, Optimization of Systems, Renewable and Non-Renewable Energies, Reverse Engineering, Robotics, Solid Mechanics, Oil and […]

