Exploratory Data Analysis of Software Repositories via GPU Processing
Universidade Federal Fluminense, Niteroi, Brazil
2014 International Conference on Software Engineering & Knowledge Engineering, 2014
@inproceedings{da2014exploratory,
title={Exploratory Data Analysis of Software Repositories via GPU Processing},
author={da Silva Junior, Jose Ricardo and Clua, Esteban and Murta, Leonardo and Sarma, Anita},
booktitle={2014 International Conference on Software Engineering \& Knowledge Engineering},
year={2014}
}
Analyzing software repositories with thousands of artifacts is data intensive, which makes interactive exploratory analysis of such data infeasible. We introduce a novel approach, Dominoes, that supports automated exploration of relationships among project elements and gives users the flexibility to explore the numerous types of project relationships on the fly. Dominoes organizes data extracted from software repositories into multiple matrices that can be treated as domino pieces (e.g., [commit|method]). It allows these pieces to be connected through a set of matrix operations to derive additional domino pieces. The derived pieces capture the semantics of relationships between project entities (e.g., the number of commits in which two methods co-occurred) and can be used for further exploration. Because domino pieces can be combined iteratively, this opens up a wide range of possible analyses. The proposed matrix representation and operations allow fast and efficient processing of large volumes of data on highly parallel architectures such as GPUs.
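As a rough illustration of how two domino pieces might be connected, the sketch below derives a [method|method] piece from a [commit|method] piece by multiplying the transposed matrix with the original. The abstract does not spell out the exact operations, so the choice of transpose-and-multiply, the toy data, and the helper names `transpose` and `matmul` are all assumptions made for illustration. The code is plain Python on the CPU; it shows only the matrix semantics, not the GPU processing the paper describes.

```python
# Hypothetical [commit|method] piece: rows are commits, columns are methods.
# A 1 means the commit changed that method. The data is made up.
commit_method = [
    [1, 1, 0],  # commit c1 changed methods m1 and m2
    [1, 1, 1],  # commit c2 changed methods m1, m2 and m3
    [0, 0, 1],  # commit c3 changed method m3
]


def transpose(matrix):
    """Flip a piece, e.g. [commit|method] becomes [method|commit]."""
    return [list(column) for column in zip(*matrix)]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    right_columns = list(zip(*right))
    return [
        [sum(a * b for a, b in zip(row, column)) for column in right_columns]
        for row in left
    ]


# Connect [method|commit] with [commit|method] on the shared commit side.
method_commit = transpose(commit_method)
method_method = matmul(method_commit, commit_method)

# Entry (i, j) counts the commits in which methods i and j co-occurred;
# the diagonal counts the commits that changed each method.
for row in method_method:
    print(row)
```

Each output entry is an independent dot product, which is the kind of workload that maps naturally onto a highly parallel architecture such as a GPU.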
September 15, 2014 by hgpu