On the Effectiveness of Clone Detection by String Matching

Ducasse, Stéphane; Nierstrasz, Oscar Marius; Rieger, Matthias (2006). On the Effectiveness of Clone Detection by String Matching. Journal of software maintenance and evolution - research and practice, 18(1), pp. 37-58. John Wiley & Sons, Ltd. 10.1002/smr.317

Duca06iDuplocJSMEPaper.pdf - Published Version
Restricted to registered users only
Available under License Publisher holds Copyright.

Although duplicated code is known to pose severe problems for software maintenance, it is difficult to identify in large systems. Many different techniques have been developed to detect software clones, some of which are very sophisticated, but are also expensive to implement and adapt. Lightweight techniques based on simple string matching are easy to implement, but how effective are they? We present a simple string-based approach which we have successfully applied to a number of different languages such as COBOL, Java, C++, Pascal, Python, Smalltalk, C and PDP-11 assembler. In each case the maximum time to adapt the approach to a new language was less than 45 minutes. In this article we investigate a number of simple variants of string-based clone detection that abstract away from common editing operations, and assess the quality of clone detection for very different case studies. Our results confirm that this inexpensive clone detection technique generally achieves high recall and acceptable precision. Over-zealous normalization of the code before comparison, however, can result in unacceptable numbers of false positives.
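
To make the idea of string-based clone detection concrete, the following is a minimal sketch and not the implementation evaluated in the article. It assumes a line-based comparison in which each line is normalized by removing all whitespace, and it reports runs of consecutive lines that are equal after normalization. The names `normalize` and `find_clones` and the `min_length` threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def normalize(line: str) -> str:
    """Drop all whitespace so re-indented copies compare equal (assumed normalization)."""
    return "".join(line.split())


def find_clones(lines_a, lines_b, min_length=3):
    """Return (start_a, start_b, length) for maximal runs of equal normalized lines."""
    norm_a = [normalize(line) for line in lines_a]
    norm_b = [normalize(line) for line in lines_b]

    # Index the non-empty normalized lines of b by their text.
    index = defaultdict(list)
    for j, text in enumerate(norm_b):
        if text:
            index[text].append(j)

    # Every pair (i, j) whose normalized lines are identical strings.
    matches = {(i, j) for i, text in enumerate(norm_a) for j in index.get(text, ())}

    clones = []
    for i, j in sorted(matches):
        if (i - 1, j - 1) in matches:
            continue  # this pair continues a run that started earlier
        length = 1
        while (i + length, j + length) in matches:
            length += 1
        if length >= min_length:
            clones.append((i, j, length))
    return clones
```

In this sketch, a more aggressive `normalize` (for example, one that also replaces every identifier with a placeholder) would make more lines compare equal. That illustrates how heavier normalization can trade precision for recall, the effect the abstract warns about.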

Item Type:

Journal Article (Original Article)

Division/Institute:

08 Faculty of Science > Institute of Computer Science (INF)
08 Faculty of Science > Institute of Computer Science (INF) > Software Composition Group (SCG) [discontinued]

UniBE Contributor:

Nierstrasz, Oscar

Subjects:

000 Computer science, knowledge & systems
500 Science > 510 Mathematics

ISSN:

1532-060X

Publisher:

John Wiley & Sons, Ltd.

Language:

English

Submitter:

Anja Ebeling

Date Deposited:

16 Oct 2017 13:14

Last Modified:

11 Apr 2024 16:11

Publisher DOI:

10.1002/smr.317

BORIS DOI:

10.7892/boris.104509

URI:

https://boris.unibe.ch/id/eprint/104509
