high performance computing on graphics processing units: hgpu.org

Deep Learning for Obfuscated Code Analysis

Alexander Shroyer

Indiana University

School of Informatics, Computing, and Engineering, Indiana University

Package: Function Classification for Obfuscated Binary Code

Modern software development relies increasingly on third-party code dependencies, which enables rapid development but also increases the risk of introducing bugs, malware, or unauthorized intellectual property. The goal of this dissertation is to reduce these risks by making them easier to detect. Determining the meaning of an arbitrary program reduces to solving the halting problem, which is provably impossible in general. Instead, this work focuses on a narrower problem: assigning a similarity metric between a known program and an unknown one. Quantifying the distance between two programs requires accounting for slight variations introduced by diverse compilation, stripped debug symbols, or even intentional obfuscation. We address this variation by adding diversity to our training data through diverse compilation and deliberate obfuscation. These methods preserve the syntactic and structural qualities of valid code and permit augmentation of sparse datasets on a large scale. We train a variety of models to classify programs in the augmented training data. The trained models can then predict which parts of unknown programs are most similar to the training programs. In this work we train on standard library functions from the musl library, originally implemented in the C programming language. This forms the basis of a novel method that can be applied to other codebases to quickly scan unfamiliar code for similar examples.
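The augmentation idea in the abstract (one source function, many compiled variants, all sharing one class label) can be illustrated with a minimal sketch. This is not the dissertation's pipeline: the compiler list, the flag set, the `compile_variants` helper, and the output naming scheme are all assumptions chosen for illustration, and the obfuscation step is omitted entirely. The sketch only builds the command lines and does not run any compiler.

```python
from itertools import product
from pathlib import Path

# Assumed variation axes; the compilers and flags actually used in the
# dissertation may differ.
COMPILERS = ("gcc", "clang")
OPT_LEVELS = ("-O0", "-O1", "-O2", "-O3", "-Os")
DEBUG_INFO = (True, False)  # emit debug info (-g) or not


def compile_variants(source, out_dir="build"):
    """Return (label, argv) pairs, one per compiler/flag combination.

    The label is the source file's stem, so every variant built from
    strlen.c becomes a training sample for the class "strlen".
    """
    src = Path(source)
    label = src.stem
    variants = []
    for cc, opt, debug in product(COMPILERS, OPT_LEVELS, DEBUG_INFO):
        tag = "debug" if debug else "nodebug"
        obj = Path(out_dir) / f"{label}.{cc}.{opt.lstrip('-')}.{tag}.o"
        argv = [cc, "-c", opt]
        if debug:
            argv.append("-g")
        argv += [str(src), "-o", str(obj)]
        variants.append((label, argv))
    return variants


if __name__ == "__main__":
    # Illustrative path; adjust to wherever the sources actually live.
    for label, argv in compile_variants("src/string/strlen.c"):
        print(label, " ".join(argv))
```

Each `argv` list could be passed to `subprocess.run` to produce the object file. With two compilers, five optimization levels, and two debug settings, a single function yields 20 labeled samples, which is the sense in which diverse compilation enlarges a sparse dataset.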

Tags: Computer science, CUDA, Deep learning, nVidia, nVidia GeForce RTX 2060, OpenCL, Package, PyTorch, Thesis

January 7, 2024 by hgpu
