Inferring Behavioral Specifications from Large-scale Repositories by Leveraging Collective Intelligence



Despite their proven benefits, useful, comprehensible, and efficiently checkable specifications are not widely available. This is primarily because writing useful, non-trivial specifications from scratch is too hard, time-consuming, and requires expertise that is not broadly available. Furthermore, the lack of specifications for widely-used libraries and frameworks, caused by the high cost of writing specifications, tends to have a snowball effect. Core libraries lack specifications, which makes specifying applications that use them expensive. To contain the skyrocketing development and maintenance costs of high assurance systems, this self-perpetuating cycle must be broken. The labor cost of specifying programs can be significantly decreased via advances in specification inference and synthesis, and this has been attempted several times, but with limited success. We believe that practical specification inference and synthesis is an idea whose time has come. Fundamental breakthroughs in this area can be achieved by leveraging the collective intelligence available in software artifacts from millions of open source projects. Fine-grained access to such data sets has been unprecedented, but is now easily available. We identify research directions and report our preliminary results on advances in specification inference that can be had by using such data sets to infer specifications.

Bib Info

  author = {Hridesh Rajan and Tien N. Nguyen and Gary T. Leavens and Robert Dyer},
  title = {Inferring Behavioral Specifications from Large-scale Repositories by Leveraging Collective Intelligence},
  booktitle = {ICSE'15: The 37th International Conference on Software Engineering: NIER Track},
  location = {Florence, Italy},
  month = {May},
  year = {2015},