high performance computing on graphics processing units: hgpu.org

hgpu.org » Applications » Computer science » Security » Android Malware Classification Using Parallelized Machine Learning Methods

Android Malware Classification Using Parallelized Machine Learning Methods

Lifan Xu

University of Delaware, Department of Computer and Information Sciences

University of Delaware, 2016

@phdthesis{xu2016android,

title={Android Malware Classification Using Parallelized Machine Learning Methods},

author={Xu, Lifan},

year={2016},

school={University of Delaware}

}

Download (PDF)

View

Source

2968

views

Android is the most popular mobile operating system with a market share of over 80%. Due to its popularity and also its open source nature, Android is now the platform most targeted by malware, creating an urgent need for effective defense mechanisms to protect Android-enabled devices. In this dissertation, we present a novel characterization and machine learning method for Android malware classification. We first present a method of dynamically analyzing and classifying Android applications as either malicious or benign based on their execution behaviors. We invent novel graph-based methods of characterizing an application’s execution behavior that are inspired by traditional vector-based characterization methods. We show evidence that our graph-based techniques are superior to vector-based techniques for the problem of classifying malicious and benign applications. We also augment our dynamic analysis characterization method with a static analysis method which we call HADM, Hybrid Analysis for Detection of Malware. We first extract static and dynamic information, and convert this information into vector-based representations. It has been shown that combining advanced features derived by deep learning with the original features provides significant gains. Therefore, we feed each of the original dynamic and static feature vector sets to a Deep Neural Network (DNN) which outputs a new set of features. These features are then concatenated with the original features to construct DNN vector sets. Different kernels are then applied onto the DNN vector sets. We also convert the dynamic information into graph-based representations and apply graph kernels onto the graph sets. Learning results from various vector and graph feature sets are combined using hierarchical Multiple Kernel Learning (MKL) to build a final hybrid classifier. Graph-based characterization methods and their associated machine learning algorithm tend to yield better accuracy for the problem of malware detection. However, the graph-based machine learning techniques we use, i.e., graph kernels, are computationally expensive. Therefore, we also study the parallelization of graph kernels in this dissertation. We first present a fast sequential implementation of the graph kernel. Then, we explore two different parallelization schemes on the CPU and four different implementations on the GPU. After analyzing the advantages of each, we present a hybrid parallel scheme, which dynamically chooses the best parallel implementation to use based on characteristics of the problem. In the last chapter of this dissertation, we explore parallelizing deep learning on a novel architecture design, which may be prevalent in the future. Parallelization of deep learning methods has been studied on traditional CPU and GPU clusters. However, the emergence of Processing In Memory (PIM) with die-stacking technology presents an opportunity to speed up deep learning computation and reduce energy consumption by providing low-cost high-bandwidth memory accesses. PIM uses 3D die stacking to move computations closer to memory and therefore reduce data movement overheads. In this dissertation, we study the parallelization of deep learning methods on a system with multiple PIM devices. We select three representative deep learning neural network layers: the convolutional, pooling, and fully connected layers, and parallelize them using different schemes targeted to PIM devices.

Tags: ATI, ATI Radeon HD 7970, Computer science, Deep learning, Machine learning, Neural networks, nVidia, OpenCL, OpenMP, Security, Tesla C2050, Thesis

December 31, 2016 by hgpu

No votes yet.

Please wait...

Your response

You must be logged in to post a comment.

* * *

high performance computing on graphics processing units: hgpu.org

Android Malware Classification Using Parallelized Machine Learning Methods

Your response

Recent source codes

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

CuFuzz: An API-Knowledge-Graph Coverage-Driven Fuzzing Framework for CUDA Libraries

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

Probe-and-Refine Tuning of Repository Guidance for AI Coding Agents

CUDAnalyst (CUDA + Analyst)

CodegenBench

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

CUDA Kernel Fusion Benchmarks

IntelliKit: Agent-first tooling for AMD hardware

DITRON: Distributed Compiler based on Triton for Parallel Systems

Most viewed papers (last 30 days)

Android Malware Classification Using Parallelized Machine Learning Methods

Share this:

Your response

Recent source codes

Most viewed papers (last 30 days)