Papers on hgpu.org (.txt-file)
Automated Architecture Design for Deep Neural Networks
Automated architecture-aware mapping of streaming applications onto GPUs
Automated Buffer Sizing of Dataflow Applications in a High-Level Synthesis Workflow
Automated development of applications for graphical processing units using rewriting rules
Automated Dynamic Analysis of CUDA Programs
Automated Enhanced Parallelization of Sequential C to Parallel OpenMP
Automated Generation of OpenCL Programs Based on Algebra-Algorithmic Approach
Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications
Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline
Automated Long-Term Monitoring of Parallel Microfluidic Operations Applying a Machine Vision-Assisted Positioning Method
Automated Partitioning of Data-Parallel Kernels using Polyhedral Compilation
Automated pose estimation in 3D point clouds applying annealing particle filters and inverse kinematics on a GPU
Automated Runtime Analysis and Adaptation for Scalable Heterogeneous Computing
Automated Software Testing of Memory Performance in Embedded GPUs
Automated Techniques for Enabling Efficient MPI Application Migration
Automated test generation for OpenCL kernels using fuzzing and constraint solving
Automated Testing of Graphics Shader Compilers
Automated Tool to Generate Parallel CUDA code from a Serial C Code
Automatic abstraction and fault tolerance in cortical microarchitectures
Automatic acceleration of Numpy applications on GPUs and multicore CPUs
Automatic and Explicit Parallelization Approaches for Mathematical Simulation Models
Automatic and portable mapping of data parallel programs to OpenCL for GPU-based heterogeneous systems
Automatic bi-layer video segmentation based on sensor fusion
Automatic C-to-CUDA Code Generation for Affine Programs
Automatic classification of object code using machine learning
Automatic Code Generation and Adaptive Grid Scheduling for GPU Cluster Computing
Automatic code generation and tuning for stencil kernels on modern shared memory architectures
Automatic code generation for solvers of cardiac cellular membrane dynamics in GPUs
Automatic Code Generation for Stencil Computations on GPU Architectures
Automatic code generation methods applied to numerical linear algebra in high performance computing
Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL
Automatic Compilation for Heterogeneous Architectures with Single Assignment C
Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors
Automatic Compiler Based FPGA Accelerator for CNN Training
Automatic contention detection and amelioration for data-intensive operations
Automatic CPU-GPU communication management and optimization
Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures
Automatic Data Layout Generation and Kernel Mapping for CPU+GPU Architectures
Automatic Data Layout Optimizations for GPUs
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
Automatic Detection and Denoising of Signals in Large Geophysical Datasets
Automatic Discovery of Algorithms for Multi-Agent Systems
Automatic Dynamic Task Distribution between CPU and GPU for Real-Time Systems
Automatic efficient data layout for multithreaded stencil codes on CPUs and GPUs
Automatic fitting of spiking neuron models to electrophysiological recordings
Automatic Fusions of CUDA-GPU Kernels for Parallel Map
Automatic Generation Of Application-Specific Accelerators for FPGAs from Python Loop Nests
Automatic generation of CUDA code performing tensor manipulations using C++ expression templates
Automatic Generation of FFT Libraries for GPU Platforms
Automatic generation of heterogeneous spectrometers for radio astronomy
Automatic Generation of Multicore Chemical Kernels
Automatic Generation of OpenCL Code for ARM Architectures
Automatic generation of software pipelines for heterogeneous parallel systems
Automatic generation of warp-level primitives and atomic instructions for fast and portable parallel reduction on GPUs
Automatic GPU optimization through higher-order functions in functional languages
Automatic Hepatic Vessel Segmentation Using Graphics Hardware
Automatic Implementation of Evolutionary Algorithms on GPUs using ESDL
Automatic Kernel Generation for Volta Tensor Cores
Automatic library generation for BLAS3 on GPUs
Automatic Loop Partitioning for Heterogeneous Systems
Automatic Mapping for OpenCL-Programs on CPU/GPU Heterogeneous Platforms
Automatic Mapping of Stream Programs on Multicore Architectures
Automatic Multi-Camera Setup Optimization for Optical Tracking
Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines
Automatic NUMA Characterization using Cbench
Automatic Online Tuning (AutoTune): Fully Extended Analysis
Automatic OpenCL code generation for multi-device heterogeneous architectures
Automatic OpenCL Device Characterization: Guiding Optimized Kernel Design
Automatic OpenCL Task Adaptation for Heterogeneous Architectures
Automatic Optimization of In-Flight Memory Transactions for GPU Accelerators based on a Domain-Specific Language for Medical Imaging
Automatic Optimization of OpenCL-Based Stencil Codes for FPGAs and Its Evaluation
Automatic Optimization of Thread Mapping for a GPGPU Programming Framework
Automatic Parallelization for GPUs
Automatic parallelization for graphics processing units
Automatic Parallelization for Heterogeneous Embedded Systems
Automatic Parallelization of a Gap Model using Java and OpenCL
Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs
Automatic Parallelization of Tiled Stencil Loop Nests on GPUs
Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime
Automatic Performance Optimisation of Parallel Programs for GPUs via Rewrite Rules
Automatic Performance Optimization in ViennaCL for GPUs
Automatic Performance Optimization on Heterogeneous Computer Systems using Manycore Coprocessors
Automatic Performance Tuning of Pipeline Patterns for Heterogeneous Parallel Architectures
Automatic Performance Tuning of Stencil Computations on Graphics Processing Units
Automatic Point Target Detection for Interactive Visual Analysis of SAR Images
Automatic Pose Estimation for Range Images on the GPU
Automatic program analysis for data parallel kernels
Automatic program parallelization for multicore processors
Automatic Resource-Constrained Static Task Parallelization
Automatic run-time mapping of polyhedral computations to heterogeneous devices with memory-size restrictions
Automatic safety proofs for asynchronous memory operations
Automatic Scan Parallelization in OpenMP
Automatic scanning of nuclear emulsions with wide-angle acceptance for nuclear fragment detection
Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures
Automatic Selection of Sparse Matrix Representation on GPUs
Automatic shader level of detail
Automatic SIMD Code Generation
Automatic Skeleton-Based Compilation through Integration with an Algorithm Classification
Automatic Software Synthesis from High-Level ForSyDe Models Targeting Massively Parallel Processors
Automatic source code adaptation for heterogeneous platforms
Titles: 100
open PDFs: 95
packages: 13