|
Algorithm
Here, we describe the key algorithm in Bugzie. With two adjustable parameters x
(for fixer candidates) and k (for selected term lists), Bugzie operates in three
main phases:
-
Initializing, i.e. building the fuzzy sets for the technical terms
collected from the initially available information (e.g. already-fixed bug
reports):
In this phase, Bugzie uses a collection of already-fixed bug reports to build
its initial internal data, including:
1) the fuzzy sets of capable fixers for the available technical terms,
2) the fixer candidate list F(x),
3) the individual term lists Td(k), and
4) the system-wide term list T(k).
-
Recommending, i.e. producing a ranked list of developers capable of
fixing an unfixed bug report.
In this phase, Bugzie recommends the most capable developers for a given unfixed
bug report B.
1) It extracts all terms from B and keeps only
terms belonging to the selected term list T(k).
2) It computes the membership scores of all developers in the candidate list
F(x).
3) Bugzie ranks those membership scores and recommends the top-n developers as
the most capable fixers for the bug(s) reported in B.
-
Updating, i.e. updating the fuzzy sets as new information is available
(i.e. newly fixed bug reports).
In this phase, Bugzie incrementally updates its internal data with newly
available information (i.e. new bug reports are fixed by some developers).
1) It updates the counting values nd, nt, and nd;t using
newly available fixed bug reports by adding new corresponding counts for the new
data.
2) After updating the counting numbers, Bugzie updates the list F(x), Td(k), and
T(k). Instead of re-sorting all available developers and terms to update those
lists, Bugzie uses a caching strategy: it stores F(x) as a cache (called
developer cache). Thus, for each fixed bug report in the updating data, if the
fixer does not belong to the cache, Bugzie will add it to the cache, and if the
cache is full, it will remove from the cache the developer(s) having the least
recent fixing activity. Similarly, Bugzie also stores Td(k) as caches (called
term cache), and updates them based on the membership scores. Td(k) is stored as
a descendingly sorted list. During updating, if a term t does not belong to the
cache and its score t(d) is larger than that of some term currently in the
cache, Bugzie will insert it to the cache, and if the cache is full, it will
remove the least-scored term.
This updating and caching strategy makes our incremental updating very
efficient. Importantly, it fits well with software evolution nature.
|