Multi-layered Approach for Recovering Links between
Bug Reports and Fixes

Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Tien N. Nguyen

 


Motivating Examples

Linking with Code Features

 

Figure 1. Bug Report #86 in the ZXing Project

 

Figure 1 shows a bug record in the bug tracking repository of the ZXing project, an open-source barcode image-processing software library for Android phones. A bug record (or bug report) usually consists of:

  1. A short summary of the issue(s)

  2. A textual description of the issue(s)

  3. A list of bug comments. The bug tracking system allows the project's developers and users to discuss the reported issue(s)

  4. Associated meta-data such as type (defect), status (fixed, etc.), priority, bug reporter, fixer, commenters, report time, closed time, etc.

 

Figure 2. Commit Log #599 to Fix Bug #86

 

We also checked the version control repository of the ZXing project and found the corresponding fix for issue #86. Figure 2 shows its commit log, a textual description of the set of code changes committed to the version control repository. Because that set of code changes (called a change set) was made to fix the issue, we call it a fixing change set, or a fix. In addition to fixing changes, developers might make other types of code changes, e.g. for enhancement or improvement. A change set or a fix can involve multiple changes to different source files.

Figure 3. Committed Change #599 to Fix Bug #86

 

Figure 3 shows the change set at revision #599 that fixes issue #86 in Figure 1. In general, the issue(s) in a bug report can be fixed by multiple fixing change sets (i.e. fixes) committed in different transactions at different times. Conversely, each change set committed at a certain time can fix one or multiple issues in different bug reports. That is, a bug report can be linked to one or multiple fixes, and a fix can be linked to one or multiple bug reports. In cases that involve multiple bug reports or multiple fixing change sets, we separate them and consider a link to be a connection between one bug report and one change set. We use the terms ‘change set’ and ‘commit’ interchangeably. In addition to the commit log, a commit has associated meta-data such as the committer, the commit date, and a list of changed files.
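The link model described above can be sketched as a small data structure. This is only an illustration: the class and field names are assumptions chosen to mirror the fields listed in the text, and revision 600 and report 90 in the usage note are hypothetical values, not data from the ZXing repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BugReport:
    report_id: int
    summary: str
    description: str
    comments: tuple = ()       # discussion entries, in posting order


@dataclass(frozen=True)
class Commit:
    revision: int
    log: str                   # the commit log message
    committer: str = ""
    changed_files: tuple = ()


@dataclass(frozen=True)
class Link:
    """One link connects exactly one bug report to one change set."""
    report_id: int
    revision: int


def to_links(fixed_by):
    """Flatten {revision: [report ids it fixes]} into one Link per pair."""
    pairs = {(rid, rev) for rev, ids in fixed_by.items() for rid in ids}
    return [Link(rid, rev) for rid, rev in sorted(pairs)]
```

For example, `to_links({599: [86], 600: [86, 90]})` yields three links, `(86, 599)`, `(86, 600)`, and `(90, 600)`, so a many-to-many relation is always handled as a set of one-to-one pairs.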

 

Observations

 

1. The traditional pattern-based approach, which relies on hints about bug fixing that developers leave in commit logs, does not work in this case because the commit log does not contain any of the common patterns.

2. The bug report and its corresponding commit log are not textually similar. The bug reporter describes the issue from the user's perspective (e.g. the output error message). In contrast, the bug fixer in this case describes the fixing changes from the development view (e.g. adding abstract method declarations), writing more about how (s)he fixed the issue than about which bug (s)he fixed. This is reasonable because the commit log is designed as part of the version control repository to help developers record notes on any change set. This implies that automatic link recovery for bug reports and fixes should not rely solely on textual patterns or on textual similarity between the reports and the commit logs of the fixes.
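A minimal sketch can make these two observations concrete. The regular expression below is a hypothetical stand-in for the project-specific patterns that pattern-based tools use, the Jaccard measure is just one simple choice of textual similarity, and the sample strings in the usage note are invented for illustration rather than quoted from Figures 1 and 2.

```python
import re

# Hypothetical pattern: a keyword such as "bug", "issue" or "fix",
# optionally followed by "#", then the numeric report id.
BUG_ID_PATTERN = re.compile(
    r"(?:bug|issue|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE
)


def bug_ids_in_log(log):
    """Return the set of report ids explicitly referenced in a commit log."""
    return {int(number) for number in BUG_ID_PATTERN.findall(log)}


def tokens(text):
    """Lower-cased alphabetic words of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(text_a, text_b):
    """Word-level Jaccard similarity between two texts, in [0, 1]."""
    a, b = tokens(text_a), tokens(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

With a log such as `"Fixed issue #86 in decoder"`, `bug_ids_in_log` returns `{86}`, but with a log such as `"Added abstract method declarations"` it returns an empty set. Likewise, `jaccard("Exception thrown when decoding on Nokia phone", "Added abstract method declarations")` is `0.0`, since a user-facing report and a developer-facing log may share no words at all.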

 

Code Features

 

Aiming to find an additional mechanism for such link recovery, we further explored commit #599, which corresponds to bug report #86. Instead of examining only the commit log (Figure 2), we also investigated the fix itself (Figure 3), i.e. the fixing changes that developers made to the source files in order to fix issue #86. From that fixing code, we observe that:

  1. In addition to the commit log, the fixing changes and the corresponding changed source files are in fact a crucial part of a fix.

  2. The fixing changes (Figure 3) contain the program entity getHeight, which is mentioned in the bug report. The program entities (e.g. getWidth, getHeight) in the changed file (BaseMonochromeBitmapSource) were discussed in the comments of the bug record (see comment 1, Figure 1), as well as in the summary and description of the bug report.

  3. The changed source file implements certain components/functions of the system. One of those is erroneously implemented, leading to the bug report about the issue(s) in that component. Thus, the terms describing those components (e.g. Nokia, MonochromeBitmapSource) can appear in both the textual comments of the bug report and the changed source code.

  4. In comment 1 of Figure 1, a commenter suggests a potential patch (public abstract int getHeight();...), and that code fragment was actually used in the fix (Figure 3). Using such a patch, we could recover the link between them.
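The code-feature observations above can be sketched as two simple checks: shared program entities between the report text and the changed code, and a suggested patch reappearing in the fix. This is an illustrative sketch, not the authors' implementation. The camelCase heuristic for spotting entity names and all function names are assumptions, and the strings in the usage note are invented stand-ins for the contents of Figures 1 and 3.

```python
import re

IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def is_code_like(token):
    """Heuristic (assumption): a lower-to-upper case change inside a word,
    as in getHeight, marks the token as a program entity name."""
    return re.search(r"[a-z][A-Z]", token) is not None


def code_entities(text):
    """Entity-like identifiers found in free text or in source code."""
    return {t for t in IDENTIFIER.findall(text) if is_code_like(t)}


def shared_entities(report_text, changed_code):
    """Entities mentioned in the report that also occur in the changed code."""
    return code_entities(report_text) & code_entities(changed_code)


def normalize(line):
    """Collapse runs of whitespace so indentation differences do not matter."""
    return " ".join(line.split())


def patch_in_fix(patch, changed_code):
    """True if every non-blank line of a suggested patch occurs in the fix."""
    fix_lines = {normalize(l) for l in changed_code.splitlines() if l.strip()}
    patch_lines = [normalize(l) for l in patch.splitlines() if l.strip()]
    return bool(patch_lines) and all(l in fix_lines for l in patch_lines)
```

Given a comment such as `"Adding getHeight() and getWidth() to MonochromeBitmapSource fixes it on my Nokia."` and changed code that declares `public abstract int getHeight();` and `public abstract int getWidth();` in a class implementing `MonochromeBitmapSource`, `shared_entities` returns `getHeight`, `getWidth`, and `MonochromeBitmapSource`, and `patch_in_fix("public abstract int getHeight();", changed_code)` is true. Plain words such as "Nokia" are not caught by the camelCase heuristic and would need ordinary term matching instead.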