Empirical Study

We conducted an empirical experiment to evaluate Mk-Fault's accuracy and how much it reduces the effort of localizing faults in Make build code. We first collected several open-source subject projects from sourceforge.net that use Make as their build language and have a long development history.
Data collection: We wrote a simple tool that selects, for each subject project, the revisions in which at least one Makefile was modified. We randomly selected one revision as a starting point. We then manually examined the changes to the Makefiles and the commit logs to determine whether they were bug fixes for build crashes in those Makefiles. When we found such a build error, we compared the current (fixed) revision with the previous (buggy) revision and took the location of the fixing change in the buggy revision as the root cause of the error. We recorded that root-cause location together with the buggy Makefile. We skipped the errors in the evaluation phase, non-crashing faults, and faults involving multiple fixing locations. We repeated this process on the following revisions until we had 15-20 faults for each project, and used them as an oracle (table below).
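
The two mechanical parts of this procedure, filtering revisions and locating the fixing change in the buggy revision, can be sketched as follows. This is a minimal illustration and not the tool used in the study. The file-name rule in `is_makefile`, the `(revision_id, changed_paths)` input shape, and all function names are assumptions made for the example. Deciding whether a change actually fixes a build crash remained a manual step.

```python
import difflib
import os

# Assumption: which file names count as Makefiles. The study does not list them.
MAKEFILE_NAMES = {"makefile", "gnumakefile"}


def is_makefile(path):
    """Return True if the path looks like a Makefile (hypothetical naming rule)."""
    name = os.path.basename(path).lower()
    return name in MAKEFILE_NAMES or name.endswith(".mk")


def revisions_touching_makefiles(revisions):
    """Keep the ids of revisions that modify at least one Makefile.

    `revisions` is a list of (revision_id, changed_paths) pairs, as one
    might extract from a version-control log (assumed input format).
    """
    return [rev_id for rev_id, paths in revisions
            if any(is_makefile(p) for p in paths)]


def fix_locations(buggy_lines, fixed_lines):
    """Return 1-based (start, end) line ranges in the buggy file touched by the fix."""
    matcher = difflib.SequenceMatcher(a=buggy_lines, b=fixed_lines, autojunk=False)
    locations = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # A pure insertion has an empty range in the buggy file (i1 == i2),
        # so it is reported as the single position where the new text goes.
        locations.append((i1 + 1, max(i2, i1 + 1)))
    return locations


def single_root_cause(buggy_lines, fixed_lines):
    """Return the root-cause range, or None when the fix is not a single location.

    Mirrors the rule of skipping faults with multiple fixing locations.
    """
    locations = fix_locations(buggy_lines, fixed_lines)
    return locations[0] if len(locations) == 1 else None
```

Given the lines of a buggy Makefile and of its fixed successor, `single_root_cause` yields the line range to record as the oracle entry, or `None` for a fault that would be skipped because the fix touches more than one place.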

Name	Bugs	Makefiles	LOC	Variables	Rules
Actiongame	19	70	691	107	91
Blood Frontier	17	60	769	111	103
Dream Toolbox	15	34	400	32	31
GMod	20	20	430	10	23
X10	20	26	262	19	17
Totals	91	210	2,552	279	265

The Bugs and Makefiles columns show the total number of bugs collected for each project and the number of Makefiles involved. The last three columns show, for each Makefile, the average number of lines of code (LOC), variables, and rules.

Empirical study results

Sensitivity analysis

Fault Localization for Build Crashes

Jafar Al-Kofahi, Hung Viet Nguyen, Tien N. Nguyen
