Incremental Latent Semantic Indexing

Our Approach for Managing Traceability Links in Evolving Software

We propose a technique that automatically manages the evolution of traceability links and updates them as the software evolves. Our automatic link-updating technique relies on a
novel incremental version of the well-known Latent Semantic Indexing (LSI) algorithm, which has been used for TLR. This technique, called Incremental LSI (iLSI), allows
fast, low-cost computation of traceability links by reusing the results of the previous LSI computation. iLSI avoids the full cost of LSI computation for TLR by analyzing the
changes to documentation and source code between versions of the system and then deriving the changes to the set of documentation-to-source-code traceability links.
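
The idea of deriving link changes from artifact changes, rather than recomputing every link, can be illustrated with a minimal sketch. This is not the iLSI algorithm itself: plain term-frequency cosine similarity stands in for similarity in the LSI latent space, and the function names, the threshold of 0.3, and the dictionary-based artifact representation are all assumptions made for illustration. In real LSI the similarities depend on the whole corpus through the decomposition, which is the part an incremental algorithm has to handle and which this sketch ignores.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (a simple stand-in for an LSI vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recover_links(docs, code, threshold=0.3):
    """Full recomputation: score every documentation/source pair."""
    return {(d, c) for d in docs for c in code
            if cosine(term_vector(docs[d]), term_vector(code[c])) >= threshold}


def update_links(old_links, docs, code, changed_docs, changed_code,
                 threshold=0.3):
    """Incremental update: rescore only pairs involving a changed artifact.

    `docs` and `code` describe the new version; `changed_docs` and
    `changed_code` name the artifacts that were added or modified.
    """
    # Keep old links whose two endpoints still exist and did not change.
    links = {(d, c) for (d, c) in old_links
             if d in docs and c in code
             and d not in changed_docs and c not in changed_code}
    # Rescore only the pairs touched by a change.
    for d in docs:
        for c in code:
            if d in changed_docs or c in changed_code:
                score = cosine(term_vector(docs[d]), term_vector(code[c]))
                if score >= threshold:
                    links.add((d, c))
    return links
```

With this stand-in similarity, each pair is scored independently, so the incremental result coincides with a full recomputation on the new version while scoring far fewer pairs when only a few artifacts change.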

We developed a complete, automatic tool for managing traceability link evolution, called TLEM, which is integrated into the Eclipse environment. The tool operates in connection
with the Eclipse editor and with a back-end software configuration management (SCM) tool named Molhado [24]. Molhado is used to store information on traceability links
and software artifacts at different versions. Initially, TLEM uses the traditional LSI technique to derive traceability links and stores them in the SCM repository. For a
subsequent version of the software, before a check-in or on the user's command, TLEM analyzes the changes and the new version of the software artifacts and uses the incremental
LSI algorithm to update the links. Because the links are stored in the Molhado repository, TLEM can track traceability links for developers across different versions of the
software system. Additionally, traceability links can be inspected at any version to reason about their properties, for example, to support consistency checking among artifacts.
In brief, TLEM provides a method to update traceability links without paying the high cost of re-running a TLR tool every time the software artifacts change. We also conducted
an empirical evaluation, which showed a great reduction in time complexity while maintaining high precision and recall for the recovery of traceability links.
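
To make the idea of inspecting links at any version concrete, the following is a minimal sketch of a versioned link store. It is an in-memory stand-in only: the class `LinkHistory` and its methods are hypothetical names chosen for illustration and do not reflect Molhado's actual interface or storage model.

```python
class LinkHistory:
    """In-memory stand-in for a versioned link repository (hypothetical API)."""

    def __init__(self):
        # Index i holds the immutable link set recorded for version i + 1.
        self._versions = []

    def check_in(self, links):
        """Record the link set of a new version and return its version number."""
        self._versions.append(frozenset(links))
        return len(self._versions)

    def links_at(self, version):
        """Return the link set recorded at the given 1-based version."""
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"unknown version: {version}")
        return self._versions[version - 1]

    def diff(self, old, new):
        """Return (added, removed) links between two recorded versions."""
        before, after = self.links_at(old), self.links_at(new)
        return after - before, before - after
```

A caller would check in the initial link set produced by the full LSI run, check in each incrementally updated set afterwards, and use `diff` to see which links appeared or disappeared between two versions, for instance as input to a consistency check.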

With this great reduction in time complexity, iLSI allows TLEM to be used interactively in the Eclipse editor during software development to automatically maintain and update the
links. Our key point of departure from existing traceability approaches is that we leverage an advanced TLR technique to support automatic traceability link evolution
management that can cope with software evolution.