Posts
Jan, 12
Importance-Driven Isosurface Decimation for Visualization of Large Simulation Data Based on OpenCL
For large simulation data, the Parallel Marching Cubes algorithm is efficient and commonly used to extract isosurfaces from 3D scalar fields. However, the isosurface meshes are sometimes too dense, and it is difficult for scientists to specify the areas they are interested in. In this paper, we provide them with a new way to define mesh importance […]
Jan, 12
A tool for mapping Single Nucleotide Polymorphisms using Graphics Processing Units
BACKGROUND: Single Nucleotide Polymorphism (SNP) genotyping analysis is very susceptible to errors in SNP chromosomal positions. As is known, SNP mapping data are provided with the SNP arrays without the information necessary to assess their accuracy in advance. Moreover, these mapping data are related to a given build of a genome and need to be […]
Jan, 12
Warp-Level Divergence in GPUs: Characterization, Impact, and Mitigation
High-throughput architectures rely on high thread-level parallelism (TLP) to hide execution latencies. In state-of-the-art graphics processing units (GPUs), threads are organized in a grid of thread blocks (TBs) and each TB contains tens to hundreds of threads. With a TB-level resource management scheme, all the resources required by a TB are allocated/released when it […]
Jan, 12
GPU-Accelerated parallel FDTD on Distributed Heterogeneous Platform
This paper introduces a Finite-Difference Time-Domain (FDTD) code written in Fortran and CUDA for realistic electromagnetic calculations, parallelized with the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). Since both Central Processing Unit (CPU) and Graphics Processing Unit (GPU) resources are utilized, a faster execution speed can be reached compared to a traditional pure […]
Jan, 11
Implementations of the Hough Transform on the Embedded Multicore Processors
Embedded multicore processors, represented by FPGAs and GPUs, have lately attracted considerable attention for their potential computational ability and power consumption. Recent FPGAs have hundreds of embedded DSP slices and block RAMs. For example, Xilinx Virtex-6 Family FPGAs have a DSP48E1 slice, which is a configurable logic block equipped with fast multipliers, adders, pipeline registers, […]
Jan, 11
Maximal Information Coefficient Analysis
In the domain of side-channel attacks, various statistical tools, such as the Pearson coefficient or Mutual Information, have succeeded in retrieving a secret key. In this paper we propose to study the Maximal Information Coefficient (MIC), which is a non-parametric method introduced by Reshef et al. [13] to compare two random variables. The […]
Jan, 11
Mining Rare Features in Fingerprints Using Core Points and Triplet-based Features
A fingerprint matching algorithm with a novel set of matching parameters based on core points and triangular descriptors is proposed to discover rarity in fingerprints. The algorithm uses a mathematical and statistical approach to discover rare features in fingerprints, which provides scientific validation for both ten-print and latent fingerprint evidence. A feature is considered rare […]
Jan, 11
Framework for utilizing computational devices within simulation
Several frameworks now exist for utilizing the computational power of graphics cards and other computational devices such as FPGAs, ARM and multi-core processors. The best known are either low-level and need a lot of control code, or are bound to specific graphics cards. Furthermore, there exist more specialized frameworks, mainly aimed at the […]
Jan, 11
Toward a Generic Hybrid CPU-GPU Parallelization of Divide-and-Conquer Algorithms
In the last few years, the development of programming languages for general-purpose computing on Graphics Processing Units (GPUs) has led to the design and implementation of fast parallel algorithms for this architecture for a large spectrum of applications. Given the streaming-processing characteristics of GPUs, most practical applications consist of tasks that admit highly data-parallel […]
Jan, 10
An octree-based proxy for collision detection in large-scale particle systems
Particle systems are an important building block for simulating vivid and detail-rich effects in virtual worlds. One of the most difficult aspects of particle systems has been detecting collisions between particles and mesh surfaces. Due to the huge computational cost, a variety of proxy-based approaches have been proposed recently to perform visually correct simulation. However, all either limit […]
Jan, 10
Integrating Occlusion Culling with Parallel LOD for Rendering Complex 3D Environments on GPU
Real-time rendering of complex 3D models is still a very challenging task. Recently, many GPU-based level-of-detail (LOD) algorithms have been proposed to decrease the complexity of 3D models in a parallel fashion. However, LOD approaches alone are not sufficient to reduce the amount of geometry data for interactive rendering of massive scale models. Visibility-based culling, […]
Jan, 10
Saddle Vertex Graph (SVG): A Novel Solution to the Discrete Geodesic Problem
This paper presents the Saddle Vertex Graph (SVG), a novel solution to the discrete geodesic problem. The SVG is a sparse undirected graph that encodes complete geodesic distance information: a geodesic path on the mesh is equivalent to a shortest path on the SVG, which can be solved efficiently using a shortest path algorithm (e.g., […]

