A Large-scale Empirical Study of Java Language Feature Usage

By: Robert Dyer, Hridesh Rajan, Hoan Anh Nguyen, and Tien N. Nguyen

Abstract

Programming languages evolve over time, adding additional language features to simplify common tasks and make the language easier to use. For example, the Java Language Specification has four editions and is currently drafting a fifth. While the addition of language features is driven by an assumed need by the community (often with direct requests for such features), there is little empirical evidence demonstrating how these new features are adopted by developers once released. In this paper, we analyze over 23k open-source Java projects representing over 7 million Java files, which when parsed contain over 14 billion AST nodes. We analyze this corpus to find uses of new Java language features over time. Our study gives interesting insights, such as: the fact that while all features are used, there are still millions of more places they could potentially be used; all features are used before release; and features tend to be adopted by committers on an individual basis rather than as a team.

ACM Reference

Dyer, R. et al. 2013. A large-scale empirical study of Java language feature usage. Technical Report #13-02. Department of Computer Science, Iowa State University.

BibTeX Reference

@techreport{dyer2013large,
  title = {A large-scale empirical study of Java language feature usage},
  author = {Dyer, Robert and Rajan, Hridesh and Nguyen, Hoan Anh and Nguyen, Tien N.},
  year = {2013},
  month = {June},
  institution = {Department of Computer Science, Iowa State University},
  number = {13-02},
  abstract = {
    Programming languages evolve over time, adding additional language features to
    simplify common tasks and make the language easier to use. For example, the Java
    Language Specification has four editions and is currently drafting a fifth.
    While the addition of language features is driven by an assumed need by the
    community (often with direct requests for such features), there is little
    empirical evidence demonstrating how these new features are adopted by
    developers once released. In this paper, we analyze over 23k open-source Java
    projects representing over 7 million Java files, which when parsed contain over
    14 billion AST nodes. We analyze this corpus to find uses of new Java language
    features over time. Our study gives interesting insights, such as: the fact that
    while all features are used, there are still millions of more places they could
    potentially be used; all features are used before release; and features tend to
    be adopted by committers on an individual basis rather than as a team.
  }
}