Automatic Labeling of Software Components and their Evolution using Log-Likelihood Ratio of Word Frequencies in Source Code

Kuhn, Adrian (2009). Automatic Labeling of Software Components and their Evolution using Log-Likelihood Ratio of Word Frequencies in Source Code. In: Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories, 16-17 May 2009, Vancouver BC (pp. 175-178). Washington, DC: IEEE Computer Society. doi:10.1109/MSR.2009.5069499

As more and more open-source software components become available on the internet, we need automatic ways to label and compare them. For example, a developer who searches for reusable software must be able to quickly gain an understanding of retrieved components. This understanding cannot be gained at the level of source code due to the semantic gap between source code and the domain model. In this paper, we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components. We present a prototype implementation of our labeling/comparison algorithm and provide examples of its application. In particular, we apply the approach to detect trends in the evolution of a software system.
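
The idea of labeling by log-likelihood ratio can be illustrated with a minimal sketch. It is not the paper's prototype: it assumes the binomial form of the statistic commonly attributed to Dunning, and the names `llr` and `label`, the sign convention, and the toy word lists are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def _log_l(p, k, n):
    # Binomial log-likelihood of k hits in n trials, omitting the constant
    # binomial coefficient. p is clamped to avoid log(0).
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def llr(k1, n1, k2, n2):
    """Signed log-likelihood ratio for a word seen k1 times in n1 tokens
    (the component) and k2 times in n2 tokens (the reference corpus).
    Positive means over-represented in the component (assumed convention)."""
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)
    g2 = 2.0 * (_log_l(p1, k1, n1) + _log_l(p2, k2, n2)
                - _log_l(p, k1, n1) - _log_l(p, k2, n2))
    return g2 if p1 >= p2 else -g2


def label(component_words, reference_words, top=3):
    """Return the `top` words most specific to the component."""
    comp, ref = Counter(component_words), Counter(reference_words)
    n1, n2 = sum(comp.values()), sum(ref.values())
    scores = {w: llr(comp[w], n1, ref[w], n2) for w in comp}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Toy data: words that might be extracted from identifiers and comments.
component = ["parse", "parse", "token", "parse", "get"]
reference = ["get", "set", "get", "list", "set", "get", "parse", "list"]
labels = label(component, reference, top=2)
```

Under these assumptions, words that are frequent in the component but rare in the reference corpus score highest and serve as labels. Comparing two releases of the same system in the same way, with one release as the reference, would be one way to surface terms that gain or lose prominence over time.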

Item Type: Conference or Workshop Item (Paper)
Division/Institute: 08 Faculty of Science > Institute of Computer Science (INF)
UniBE Contributor: Kuhn, Adrian
Publisher: IEEE Computer Society
Language: English
Submitter: Factscience Import
Date Deposited: 04 Oct 2013 15:22
Last Modified: 06 Dec 2013 14:05
Publisher DOI: 10.1109/MSR.2009.5069499
Web of Science ID: 000272225900025
URI: http://boris.unibe.ch/id/eprint/37152 (FactScience: 206977)
