Topic Model for Duplicate Bug Reports
Figure 1. Topic Model for Duplicate Bug Reports
To support the detection of duplicate bug reports, we develop a novel topic model, called T-Model, based on the topic modeling mechanism of LDA. Figure 1 shows the graphical notation of T-Model. Our idea is as follows. Each bug report b_i is modeled by an LDA model, which is represented via three parameters: the topic proportion θ_bi, the topic assignment z_bi, and the selected terms w_bi. While θ_bi and z_bi are latent, the terms w_bi are observable and are determined by the topic assignment z_bi and the word selection ϕ.
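The LDA mechanism that T-Model builds on can be illustrated with a minimal sketch of the standard LDA generative process for a single bug report. This is not the full T-Model, only the per-report part described above. The Dirichlet prior `alpha`, the toy vocabulary, the values in `phi`, and the function names are all assumptions introduced for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_report(alpha, phi, vocab, n_words, seed=0):
    """Generate one bug report under the standard LDA generative process.

    alpha : Dirichlet prior over topics (hypothetical hyperparameter)
    phi   : phi[k][v] is the probability that topic k selects vocab[v]
    Returns (theta, z, w): the topic proportion, the per-word topic
    assignments, and the selected terms.
    """
    rng = random.Random(seed)
    theta = sample_dirichlet(alpha, rng)  # latent topic proportion
    topics = list(range(len(alpha)))
    z, w = [], []
    for _ in range(n_words):
        k = rng.choices(topics, weights=theta)[0]    # latent topic assignment
        term = rng.choices(vocab, weights=phi[k])[0]  # observed term
        z.append(k)
        w.append(term)
    return theta, z, w


# Toy setup (illustrative values only): 2 topics over a 4-term vocabulary.
vocab = ["crash", "null", "button", "layout"]
phi = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 favors crash-related terms
    [0.05, 0.05, 0.45, 0.45],  # topic 1 favors UI-related terms
]
theta, z, w = generate_report([0.5, 0.5], phi, vocab, n_words=8)
```

In this sketch only `w` would be visible in a real bug report; `theta` and `z` are the latent quantities that inference would have to recover from the observed terms.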