A Study of Repetitiveness of Code Changes in Software Evolution


In this paper, we present a study of the repetitiveness of code changes in software evolution. Repetitiveness is defined as the ratio of repeated changes to total changes. Focusing on fine-grained code changes, we model a change as a pair of old and new AST sub-trees within a method. A change is considered repeated within a project if it matches another change in the history of that project, and repeated across projects if it matches a change in the history of another project. We report the following important findings. First, the repetitiveness of changes can be as high as 70-100% at small sizes and decreases exponentially as size increases. Second, repetitiveness is higher and more stable in the cross-project setting than in the within-project setting. Third, fixing changes repeat at rates similar to those of general changes. Importantly, learning code changes and recommending them during software evolution is beneficial, with accuracy of over 30% for top-1 recommendation and nearly 35% for top-3. Repeated fixing changes could also be useful for automatic program repair.
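The definition of repetitiveness above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes each change has already been reduced to a hashable pair of normalized old and new sub-tree strings, that matching means exact equality of those pairs, and that "within-project" compares a change only against earlier changes in the same list. The function names and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple

# A change is modeled as (old sub-tree, new sub-tree). Plain strings stand in
# for normalized AST sub-trees here; this representation is an assumption.
Change = Tuple[str, str]


def within_project_repetitiveness(history: List[Change]) -> float:
    """Share of changes that match an earlier change in the same history."""
    seen = set()
    repeated = 0
    for change in history:
        if change in seen:
            repeated += 1
        seen.add(change)
    return repeated / len(history) if history else 0.0


def cross_project_repetitiveness(history: List[Change],
                                 other_projects: Iterable[Change]) -> float:
    """Share of changes that match some change recorded in other projects."""
    known = set(other_projects)
    repeated = sum(1 for change in history if change in known)
    return repeated / len(history) if history else 0.0


# Hypothetical change history of one project, in chronological order.
history = [
    ("x == null", "x != null"),
    ("i++", "i += 2"),
    ("x == null", "x != null"),   # repeats the first change
    ("foo(a)", "foo(a, b)"),
]
# Hypothetical changes collected from other projects.
others = [("x == null", "x != null"), ("foo(a)", "foo(a, b)")]

print(within_project_repetitiveness(history))         # 0.25
print(cross_project_repetitiveness(history, others))  # 0.75
```

In this toy data one of four changes repeats an earlier one, giving a within-project ratio of 0.25, while three of four have a match in the other projects, giving a cross-project ratio of 0.75. A real analysis would need AST differencing and normalization before any such comparison.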

Bib Info

  author = {Hoan Anh Nguyen and Anh Tuan Nguyen and Tung Thanh Nguyen and Tien N. Nguyen and Hridesh Rajan},
  title = {A Study of Repetitiveness of Code Changes in Software Evolution},
  booktitle = {Proceedings of the 28th International Conference on Automated Software Engineering},
  series = {ASE},
  year = {2013},
  location = {Silicon Valley, CA},