high performance computing on graphics processing units: hgpu.org


Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code

Bin Lei, Caiwen Ding, Le Chen, Pei-Hung Lin, Chunhua Liao

Dept. Computer Science and Engineering, University of Connecticut, Storrs, USA

arXiv:2307.07686 [cs.SE], (15 Jul 2023)

DOI:10.48550/arXiv.2307.07686


Package: Fortran-CPP-HPC-code-translation-dataset


In this study, we present a novel dataset for training machine learning models to translate between OpenMP Fortran and C++ code. To ensure reliability and applicability, the dataset is first refined using a code similarity test. Its effectiveness is assessed using both quantitative (CodeBLEU) and qualitative (human evaluation) methods. We demonstrate that this dataset can significantly improve the translation capabilities of large-scale language models, by a factor of 5.1 for models with no prior coding knowledge and by a factor of 9.9 for models with some coding familiarity. Our work highlights the potential of this dataset to advance the field of code translation for high-performance computing.
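To make the idea of an OpenMP Fortran and C++ translation pair concrete, here is a minimal sketch of what one such pair might look like. The Fortran loop in the comment and the function name `dot_product` are hypothetical illustrations and are not taken from the dataset; the C++ function is one plausible counterpart that uses the same OpenMP reduction pattern.

```cpp
// Hypothetical Fortran side of the pair (illustrative only, not from the dataset):
//
//   total = 0.0d0
//   !$omp parallel do reduction(+:total)
//   do i = 1, n
//      total = total + a(i) * b(i)
//   end do
//   !$omp end parallel do

#include <algorithm>
#include <vector>

// Hypothetical C++ side of the pair: the same dot product, with the
// OpenMP reduction expressed as a pragma on the loop.
// If the compiler is not given an OpenMP flag, the pragma is ignored
// and the loop simply runs serially.
double dot_product(const std::vector<double>& a, const std::vector<double>& b) {
    double total = 0.0;
    // Use the shorter length so the loop never reads out of bounds.
    const long n = static_cast<long>(std::min(a.size(), b.size()));

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += a[i] * b[i];  // Fortran a(i) is 1-based; C++ a[i] is 0-based
    }
    return total;
}
```

Even in a pair this small, a translation model has to handle the shift from 1-based to 0-based indexing and the change from directive comments (`!$omp`) to `#pragma omp`, which is the kind of mapping such a dataset is meant to teach.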

Tags: Computer science, CUDA, Fortran, Machine learning, nVidia, nVidia RTX 6000 Ada, OpenMP, Package

July 24, 2023 by hgpu


