Our implementation realises each step of the approach with mature, existing technologies:
- The first step, Identifying Candidate Terms, is implemented using the GATE NLP workbench and GATE's scripting language.
- The second step, Calculating Similarity between Candidate Terms, is implemented using:
  - Syntactic similarity calculation: SimPack, a Java package for string similarity calculation.
  - Semantic similarity calculation: the approach followed in [1], using the WordNet::Similarity package.
[1] S. Nejati, M. Sabetzadeh, M. Chechik, S. Easterbrook, and P. Zave, "Matching and Merging of Variant Feature Specifications," IEEE Transactions on Software Engineering, vol. 38, no. 6, pp. 1355-1375, Nov.-Dec. 2012.
- The third step, Clustering, is implemented using the EM algorithm from the mclust package for the R statistical programming language.
Fig: Overview of the implementation of our approach
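To make the second step more concrete, the following is a minimal sketch of a pairwise syntactic similarity calculation over candidate terms. It is not SimPack and does not use its API. It assumes a normalised Levenshtein (edit-distance) measure purely for illustration, since the text above does not say which string measure is used. The function names and the example terms are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def syntactic_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical (case-insensitive).

    Assumption: normalised Levenshtein stands in for whichever
    string measure the real implementation takes from SimPack.
    """
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def similarity_matrix(terms: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise similarities between candidate terms."""
    n = len(terms)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = syntactic_similarity(terms[i], terms[j])
        matrix[i][j] = matrix[j][i] = score
    return matrix


if __name__ == "__main__":
    # Hypothetical candidate terms, as they might come out of the first step.
    candidate_terms = ["user account", "user accounts", "payment gateway"]
    for row in similarity_matrix(candidate_terms):
        print(["%.2f" % value for value in row])
```

A matrix of this kind can be combined with the semantic scores before the candidate terms are passed to the clustering step. How the two scores are combined is not specified here, so the sketch leaves that out.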