
Davy Landman

PhD student

SWAT, CWI, Netherlands

Data for ICSM 2013

Publication

  • Paul Klint, Davy Landman and Jurgen Vinju, Exploring the Limits of Domain Model Recovery, 29th IEEE International Conference on Software Maintenance, ICSM 2013, 2013.

    We are interested in re-engineering families of legacy applications towards using Domain-Specific Languages (DSLs). Is it worth to invest in harvesting domain knowledge from the source code of legacy applications?

    Reverse engineering domain knowledge from source code is sometimes considered very hard or even impossible. Is it also difficult for "modern legacy systems"? In this paper we select two open-source applications and answer the following research questions: which parts of the domain are implemented by the application, and how much can we manually recover from the source code? To explore these questions, we compare manually recovered domain models to a reference model extracted from domain literature, and measured precision and recall.

    The recovered models are accurate: they cover a significant part of the reference model and they do not contain much junk. We conclude that domain knowledge is recoverable from "modern legacy" code and therefore domain model recovery can be a valuable component of a domain re-engineering process.

    @INPROCEEDINGS{Klint2013,
      author = { Paul Klint and Davy Landman and Jurgen Vinju },
      title = { {Exploring the Limits of Domain Model Recovery} },
      booktitle = { 29th IEEE International Conference on Software Maintenance, ICSM
      2013 },
      year = { 2013 },
      datalink = { http://homepages.cwi.nl/~landman/icsm2013/ },
      abstract = { We are interested in re-engineering families of legacy applications towards
        using Domain-Specific Languages (DSLs). Is it worth to invest in harvesting
        domain knowledge from the source code of legacy applications?
    
        Reverse engineering domain knowledge from source code is sometimes
        considered very hard or even impossible. Is it also difficult for "modern
        legacy systems"? In this paper we select two open-source applications and
        answer the following research questions: which parts of the domain are
        implemented by the application, and how much can we manually recover from
        the source code? To explore these questions, we compare manually recovered
        domain models to a reference model extracted from domain literature, and
        measured precision and recall.
    
        The recovered models are accurate: they cover a significant part of the
        reference model and they do not contain much junk. We conclude that domain
        knowledge is recoverable from "modern legacy" code and therefore domain
        model recovery can be a valuable component of a domain re-engineering
        process.
       }}

Data

This site contains the data and scripts mentioned in the IEEE International Conference on Software Maintenance 2013 (ICSM2013) submission.

ICSM2013-data.zip contains all the data. We used Rascal to analyse the systems, and the archive also includes a Rascal (Eclipse) project.
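The precision and recall mentioned in the abstract can be read as set comparisons between a recovered model and the reference model. The following is a minimal Python sketch of that idea only; it is not the Rascal code shipped in the archive, and the function name `precision_recall` and the entity names are hypothetical, chosen purely for illustration.

```python
def precision_recall(recovered, reference):
    """Return (precision, recall) of a recovered set against a reference set."""
    recovered, reference = set(recovered), set(reference)
    matched = recovered & reference  # entities present in both models
    # Precision: fraction of recovered entities that are in the reference model.
    precision = len(matched) / len(recovered) if recovered else 0.0
    # Recall: fraction of reference entities that were recovered.
    recall = len(matched) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical entity names, for illustration only (not taken from the data set).
reference_model = {"Project", "Task", "Resource", "Milestone"}
recovered_model = {"Project", "Task", "Resource", "Logger"}

precision, recall = precision_recall(recovered_model, reference_model)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch, low precision would correspond to a recovered model containing "junk" that is absent from the reference model, and low recall to a recovered model covering only a small part of it. Treating model elements as exactly matching names is an assumption of the sketch.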

The following structure is used:

Make sure to install Rascal from the unstable-updates site, since that version contains features used in this project.