Data and Tool
tool.zip contains the following files:
java -Xmx2000m -jar BugDup.jar -c config_file input_data_file.txt
We used the data given by the authors of "Towards More Accurate Retrieval of Duplicate Bug Reports".
The structure of an input_data_file.txt file is as follows:
Each [BugReportDescription i] has the following structure (each field forms one line).
ID=[Number] // ID number of the bug report
PS=[Text] // summary
DID=[Number] // ID number of one duplicate bug report of i (if any), or leave it empty
COMP=[Text] // Name of the component of the system relevant to the bug report
SUB_COMP=[Text] // Name of the sub-component of the system relevant to the bug report
VER=[Text] // Version
PRIO=[Number] // Priority
ISSUE_TYPE=[Text] // Issue type
The plain text (e.g. ID=) is a required field name and must appear in the description. The bracketed text (e.g. [Number]) is to be filled in and indicates the field's type. The text after // is a note about the field.
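
The field layout above can be produced with a short script. The following Python sketch is only an illustration, not part of tool.zip: the helper names format_report and parse_report and the sample values are made up, and it assumes that fields are written in the order listed above and that an absent duplicate is written as an empty "DID=" line. It handles a single description only, since the way several descriptions are separated in the file is not shown here.

```python
# Field names in the order listed above; relying on this order is an assumption.
FIELDS = ["ID", "PS", "DID", "COMP", "SUB_COMP", "VER", "PRIO", "ISSUE_TYPE"]


def format_report(report):
    """Return one bug report description as text, one FIELD=value line per field."""
    lines = []
    for name in FIELDS:
        value = report.get(name, "")
        # A missing value (e.g. no known duplicate) becomes an empty field.
        value = "" if value is None else str(value)
        # Each field must fit on one line, so collapse runs of whitespace,
        # including line breaks, into single spaces.
        value = " ".join(value.split())
        lines.append(f"{name}={value}")
    return "\n".join(lines)


def parse_report(text):
    """Parse the lines of one description back into a dict of strings."""
    report = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or unrecognized lines
        name, value = line.split("=", 1)
        if name in FIELDS:
            report[name] = value
    return report


# Made-up sample values, for illustration only.
example = {
    "ID": 101,
    "PS": "Crash when opening\nthe preferences dialog",
    "DID": None,
    "COMP": "UI",
    "SUB_COMP": "Dialogs",
    "VER": "3.1",
    "PRIO": 2,
    "ISSUE_TYPE": "DEFECT",
}
text = format_report(example)
print(text)
```

Collapsing line breaks in the summary keeps the "each field forms one line" rule intact even when the source bug tracker stores multi-line text.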