Papers on hgpu.org (.txt-file)

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning Download Package

tntorch: Tensor Network Learning with PyTorch Download Package

To Co-Run, or Not To Co-Run: A Performance Study on Integrated Architectures Download Package

To GPU Synchronize or Not GPU Synchronize? Download

To Use or Not to Use: Graphics Processing Units for Pattern Matching Algorithms Download

Togpu: Automatic Source Transformation from C++ to CUDA using Clang/LLVM Download

TonY: An Orchestrator for Distributed Machine Learning Jobs Download Package

Toolchain for programming, simulating and studying the XMT many-core architecture Download Package

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions Download

Tools for GPU Computing – Debugging and Performance Analysis of Heterogenous HPC Applications Download

Tools for GPU Computing–Debugging and Performance Analysis of Heterogenous HPC Applications Download

Tools for Reduced Precision Computation: A Survey Download

Top ten ways to make formal methods for HPC practical Download

Top-k Queries Processing With Uncertain Data on Graphics Processing Units Download

Top-Performance Tokenization and Small-Ruleset Regular Expression Matching: A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor

Topical perspective on massive threading and parallelism

TopicBERT for Energy Efficient Document Classification Download

Topology optimization design of 3D electrothermomechanical actuators by using GPU as a co-processor Download

Topology Optimization with Unstructured Meshes on Graphics Processing Units (GPUs) Download

Torch7: A Matlab-like Environment for Machine Learning Download

TorchAudio: Building Blocks for Audio and Speech Processing Download Package

TorchBench: Benchmarking PyTorch with High API Surface Coverage Download Package

Torchnet: An Open-Source Platform for (Deep) Learning Research Download Package

torchode: A Parallel ODE Solver for PyTorch Download Package

TorchOpt: An Efficient Library for Differentiable Optimization Download Package

Toward a Generic Hybrid CPU-GPU Parallelization of Divide-and-Conquer Algorithms Download

Toward a GPU-Accelerated Immersed Boundary Method for Wind Forecasting Over Complex Terrain Download

Toward a Multi-level Parallel Framework on GPU Cluster with PetSC-CUDA for PDE-based Optical Flow Computation Download

Toward a multicore architecture for real-time ray-tracing Download

Toward a Practical Implementation of Exemplar-Based Noise Robust ASR Download

Toward Accelerating the Matrix Inversion Computation of Symmetric Positive-Definite Matrices on Heterogeneous GPU-Based Systems Download

Toward Acceleration of RSA Using 3D Graphics Hardware Download

Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks Download

Toward Auto-tuned Krylov Basis Computations with minimized Communication on Clusters of Accelerators Download

Toward Automatic Translation: From OpenACC to OpenMP 4 Download

Toward Better Computation Models for Modern Machines Download

Toward efficient GPU-accelerated N-body simulations Download

Toward GPU Accelerated Data Stream Processing Download

Toward GPU-accelerated Traffic Simulation and Its Real-Time Challenge Download

Toward GPUs being mainstream in analytic processing: An initial argument using simple scan-aggregate queries Download

Toward Harnessing DOACROSS Parallelism for Multi-GPGPUs Download

Toward improved aeromechanics simulations using recent advancements in scientific computing Download

Toward large-scale Hybrid Monte Carlo simulations of the Hubbard model on graphics processing units Download

Toward OpenCL Automatic Multi-Device Support Download

Toward optimised skeletons for heterogeneous parallel architecture with performance cost model Download

Toward Performance Portability for CPUs and GPUs Through Algorithmic Compositions Download

Toward Practical Real-Time Photon Mapping: Efficient GPU Density Estimation Download

Toward Real-Time Dense 3d Reconstruction using Stereo Vision Download

Toward real-time kernel density estimate display for instrumentation

Towards a Benchmarking Suite for Kernel Tuners Download Package

Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers Download

Towards a Distributed GPU-Accelerated Matrix Inversion Download

Towards a functional run-time for dense NLA domain Download

Towards a GPU-based Implementation of Interaction Nets Download Package

Towards a GPU-Based Simulation Framework for Deformable Surface Meshes

Towards a GPU-Parallelization of the neXtSIM-DG Dynamical Core Download Package

Towards a More Efficient Use of GPUs Download

Towards a Performance-Portable FFT Library for Heterogeneous Computing Download

Towards a Portable and Future-proof Particle-in-Cell Plasma Physics Code Download

Towards a robust, real-time face processing system using CUDA-enabled GPUs Download

Towards a Software Transactional Memory for Graphics Processors Download

Towards a Tunable Multi-Backend Skeleton Programming Framework for Multi-GPU Systems Download

Towards a Unified CPU-GPU code hybridization: A GPU Based Optimization Strategy Efficient on Other Modern Architectures Download

Towards a unified framework for rapid 3D computed tomography on commodity GPUs Download

Towards a Unified Sentiment Lexicon (USL) based on Graphics Processing Units (GPUs) Download

Towards a Unified Sentiment Lexicon Based on Graphics Processing Units Download

Towards Accelerated Computation of Atmospheric Equations Using CUDA

Towards accelerating molecular modeling via multi-scale approximation on a GPU

Towards accelerating Smoothed Particle Hydrodynamics simulations for free-surface flows on multi-GPU clusters Download

Towards acceleration of fault simulation using graphics processing units Download

Towards ad-hoc GPU acceleration of parallel eigensystem computations Download

Towards Adaptive GPU Resource Management for Embedded Real-Time Systems Download

Towards Alignment of Parallelism in SYCL and ISO C++ Download

Towards an automatic generation of dense linear algebra solvers on parallel architectures Download

Towards an Effective Unified Programming Model for Many-Cores Download

Towards an embedded biologically-inspired machine vision processor Download

Towards an interactive and automated script feature analysis of 3D scanned cuneiform tablets Download

Towards automated kernel selection in machine learning systems: A SYCL case study Download Package

Towards Automated Learning of Object Detectors Download

Towards Automatic C Programs Optimization and Parallelization using the PIPS-PoCC Integration Download Package

Towards automatic Digital Surface Model generation using a Graphics Processing Unit

Towards Automatic Learning of Heuristics for Mechanical Transformations of Procedural Code Download

Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs Download Package

Towards Automating Multi-dimensional Data Decomposition for Executing a Single-GPU Code on a Multi-GPU System Download

Towards Building Error Resilient GPGPU Applications Download

Towards Chip-on-Chip Neuroscience: Fast Mining of Frequent Episodes Using Graphics Processors Download

Towards chip-on-chip neuroscience: fast mining of neuronal spike streams using graphics hardware Download

Towards Co-execution on Commodity Heterogeneous Systems: Optimizations for Time-Constrained Scenarios Download Package

Towards Code Generation from Design Models for Embedded Systems on Heterogeneous CPU-GPU Platforms Download

Towards Comprehensive Parametric Code Generation Targeting Graphics Processing Units in Support of Scientific Computation Download

Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems Download

Towards Distortion-Predictable Embedding of Neural Networks Download Package

Towards Distributed Heterogenous High-Performance Computing with ViennaCL Download Package

Towards Domain-specific Computing for Stencil Codes in HPC Download Package

Towards dynamic reconfigurable load-balancing for hybrid desktop platforms Download

Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA Download

Towards Efficient GPU Sharing on Multicore Processors Download

Towards Efficient Indexing of Spatiotemporal Trajectories on the GPU for Distance Threshold Similarity Searches Download

Towards Efficient Large-Scale Graph Neural Network Computing Download

Towards Efficient Risk Quantification-Using GPUs and Variance Reduction Technique Download

 

Brief statistics for this page

Titles: 100

Open PDFs available for download: 93

Packages: 20

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
