Linking with Code Features
Figure 1 shows a bug record in the bug tracking repository of the ZXing project, an open-source barcode image-processing library for Android phones. A bug record/report usually consists of:
We also checked the version control repository of the ZXing project and found the corresponding fix for issue #86. Figure 2 shows its commit log, a textual description of the set of code changes committed to the version control repository. Because that set of code changes (called a change set) was made to fix the issue, we call it a fixing change set, or simply a fix. In addition to fixing changes, developers may make other types of code changes, e.g., for enhancement or improvement. A change set or a fix can involve multiple changes to different source files.
Figure 3 shows the change set at revision #599 that fixes issue #86 in Figure 1. In general, the issue(s) in a bug report can be fixed by multiple fixing change sets (i.e., fixes) committed in different transactions at different times. Conversely, a change set committed at a given time can fix one or multiple issues in different bug reports. That is, a bug report can be linked to one or multiple fixes, and a fix can be linked to one or multiple bug reports. In cases that involve multiple bug reports or multiple fixing change sets, we separate them and consider a link to be a connection between one bug report and one change set. We use the terms ‘change set’ and ‘commit’ interchangeably. In addition to the commit log, a commit also has associated metadata such as the committer, the commit date, and the list of changed files.
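As a minimal sketch of the link model described above, the following Python fragment represents bug reports, change sets, and links, and splits a many-to-many fix relation into one link per (bug report, change set) pair. All class, field, and function names are illustrative assumptions rather than part of any tool discussed here. Only issue #86 and revision #599 come from the text; the second revision and all metadata values are placeholders.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BugReport:
    issue_id: int
    summary: str


@dataclass(frozen=True)
class ChangeSet:
    revision: int
    commit_log: str
    committer: str             # metadata: who committed
    commit_date: str           # metadata: when it was committed
    changed_files: tuple = ()  # metadata: list of changed files


@dataclass(frozen=True)
class Link:
    """A link connects exactly one bug report to exactly one change set."""
    issue_id: int
    revision: int


def pairwise_links(reports, change_sets):
    """Split a many-to-many fix relation into one Link per pair."""
    return [Link(r.issue_id, c.revision) for r, c in product(reports, change_sets)]


# Placeholder data: revision 600 and all metadata values are hypothetical.
report = BugReport(86, "placeholder summary")
fixes = [
    ChangeSet(599, "placeholder log", "placeholder-committer", "placeholder-date"),
    ChangeSet(600, "placeholder log", "placeholder-committer", "placeholder-date"),
]
links = pairwise_links([report], fixes)
```

Under this representation, a bug report fixed by two change sets simply yields two separate links, so each link can be recovered and evaluated on its own.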
1. The traditional pattern-based approach, which relies on hints that developers leave in commit logs about bug fixing, does not work in this case because the commit log does not contain any of the common patterns.
2. The bug report and its corresponding commit log are not textually similar. The bug reporter describes the issue from the user's perspective (e.g., the error message in the output). In contrast, the bug fixer in this case describes the fixing changes from the development view (e.g., adding abstract method declarations), writing more about how (s)he fixed the issue than about what bug (s)he fixed. This is reasonable because the commit log is designed as part of the version control repository to help developers record notes on any change set. This implies that automatic link recovery for bug reports and fixes should not rely solely on textual patterns or on textual similarity between the reports and the commit logs of the fixes.
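The two observations above can be illustrated with a small sketch of both heuristics. The regular expression is only one plausible form of a "common pattern" (a keyword such as "fix", "bug", or "issue" followed by a number), and the similarity measure is a plain bag-of-words cosine. Both are assumptions made for illustration, not the exact heuristics of any particular tool. The report and commit log strings are invented to mimic the user view versus the development view; they are not the actual ZXing texts.

```python
import math
import re
from collections import Counter

# Assumed pattern: a fix-related keyword followed by an issue number.
FIX_PATTERN = re.compile(r"\b(?:fix(?:es|ed)?|bug|issue)\s*#?\s*(\d+)", re.IGNORECASE)


def ids_from_commit_log(log):
    """Return the issue ids that the pattern finds in a commit log."""
    return {int(m) for m in FIX_PATTERN.findall(log)}


def tokens(text):
    """Lowercase alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


# Invented texts: user-facing symptom vs. developer-facing description.
report_text = "Scanning throws an error message on startup"
commit_log = "Added abstract method declarations to the base class"

pattern_hits = ids_from_commit_log(commit_log)           # no keyword, no id
similarity = cosine_similarity(report_text, commit_log)  # no shared tokens
```

With these inputs the pattern finds no issue id and the two texts share no tokens, so neither heuristic alone would recover the link, whereas a log such as "Fixed issue #86 in decoder" would be matched by the pattern.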
Aiming to find an additional mechanism for such link recovery, we further explored the corresponding commit #599 for bug report #86. Instead of examining only the commit log (Figure 2), we also investigated the fix itself (Figure 3), i.e., the fixing changes that developers made to the source files in order to fix issue #86. From that fixing code, we observe that