A Study of Repetitiveness of Code Changes in Software Evolution
By: Hoan Anh Nguyen, Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, and Hridesh Rajan
Download PaperAbstract
In this paper, we present a study of repetitiveness of code changes in software evolution. Repetitiveness is defined as the ratio of repeated changes over total changes. Focusing on fine-grained code changes, we model a change as a pair of old and new AST sub-trees within a method. A change is considered repeated within or cross-project if it matches another change having occurred in the history of the project or another project, respectively. We report the following important findings. First, repetitiveness of changes could be as high as 70-100% at small sizes and decreases exponentially as size increases. Second, repetitiveness is higher and more stable in cross-project setting than in within-project one. Third, fixing changes repeat similarly to general changes. Importantly, learning code changes and recommending them in software evolution is beneficial with accuracy for top-1 recommendation of over 30% and top-3 of nearly 35%. Repeated fixing changes could also be useful for automatic program repair.
ACM Reference
Nguyen, H.A. et al. 2013. A Study of Repetitiveness of Code Changes in Software Evolution. Proceedings of the 28th International Conference on Automated Software Engineering (2013).
BibTeX Reference
@inproceedings{nguyen2013study,
author = {Hoan Anh Nguyen and Anh Tuan Nguyen and Tung Thanh Nguyen and Tien N. Nguyen and Hridesh Rajan},
title = {A Study of Repetitiveness of Code Changes in Software Evolution},
booktitle = {Proceedings of the 28th International Conference on Automated Software Engineering},
series = {ASE},
year = {2013},
location = {Silicon Valley, CA},
entrysubtype = {conference},
abstract = {
In this paper, we present a study of repetitiveness of code changes in
software evolution. Repetitiveness is defined as the ratio of repeated changes
over total changes. Focusing on fine-grained code changes, we model a change
as a pair of old and new AST sub-trees within a method. A change is considered
repeated within or cross-project if it matches another change having occurred
in the history of the project or another project, respectively. We report the
following important findings. First, repetitiveness of changes could be as
high as 70-100% at small sizes and decreases exponentially as size increases.
Second, repetitiveness is higher and more stable in cross-project setting than
in within-project one. Third, fixing changes repeat similarly to general
changes. Importantly, learning code changes and recommending them in software
evolution is beneficial with accuracy for top-1 recommendation of over 30% and
top-3 of nearly 35%. Repeated fixing changes could also be useful for
automatic program repair.
}
}