SCAM SCAM SCAM
Check out SCAM - the international working conference on Source Code Analysis and Manipulation 2013 in Eindhoven. It has a tool paper track with deadlines at the end of June!
WASDETT?
At ECOOP 2013 we organize another workshop on academic tools. Tools are important artifacts of research projects, and we promote reporting on these high-investment, high-return deliverables.
SLE and Parsing@SLE
SLE is a great conference where programming language people, modeling people and ontology people come together to report on recent advances in Software Language Engineering. SLE 2013 is organized with SPLASH 2013 in Indianapolis.
This year we organize a special informal workshop, called Parsing@SLE, together with SLE. This workshop invites all people working on parsing and parsing technology to come together and discuss the field and its future!
About
I am a researcher in the field of Software Engineering. My academic position is group leader of SEN1 - Software Analysis & Transformation at CWI, and I teach courses in the Master Software Engineering at Universiteit van Amsterdam.
Research interests
My research interests are in understanding and improving source code.
In theory source code is written text which can be changed at any time. In reality the source code of real software systems is mostly too complex to read and understand. The source code of normal software systems is actually quite difficult to manipulate and adapt to changing circumstances and requirements. Perhaps it should not have been called "software" after all... To make matters more interesting, the older systems get, the more complex they become.
My personal goals are to:
- help software engineers analyze source code so that they can efficiently acquire a better understanding of it.
- help software engineers effectively improve source code by code generation, refactoring and source-to-source transformation.
- understand which design decisions influence the flexibility and understandability of source code.
- enable the construction of software tools for source code generation, analysis, transformation, and visualization by a larger group of software engineers.
Affiliation
I work for these three institutes:
- Centrum Wiskunde & Informatica, SEN1 - Software Analysis & Transformation
- INRIA Lille Nord Europe, ATEAMS - Analysis and Transformation based on rEliAble tool coMpositionS
- Universiteit van Amsterdam, Master Software Engineering
In the past I have visited INRIA Nancy, Lucent Bell Labs and IBM TJ Watson research center.
I currently enjoy collaborating closely with Paul Klint, Tijs van der Storm, Mark Hills, Anastasia Izmaylova, Davy Landman, Atze van der Ploeg, Vadim Zaytsev, Jeroen van den Bos and all group members of SEN1, Michael W. Godfrey who is visiting CWI, and Robert M. Fuhrer.
Previously, Mark van den Brand supervised my PhD thesis project with Paul Klint. I enjoyed working with Pierre-Etienne Moreau in INRIA Nancy, Dennis Dams at Murray Hill (Bell Labs) and Robert Fuhrer at IBM TJ Watson. The first PhD student I supervised is Bas Basten.
Thanks (in no particular order) to Taeke Kooiker, Rob Economopoulos, Hayco de Jong, Pieter Olivier, Merijn de Jonge, Tobias Kuipers, Leon Moonen, Ralf Lämmel, Anthony Cleve, Diego Ordóñez Camacho, James R. Cordy, Martin Bravenboer, Eelco Visser, Arie van Deursen, Jan Heering, Magiel Bruntink, Jørgen Iversen, Niels Veermans, Steven Klusener, Peter Mosses, Slinger Jansen, Philippe Charles, Frank Tip, Stan Sutton, and Claude Kirchner for either working together or your inspiration or both!
Keywords
- meta programming
- context-free general parsing
- scannerless parsing
- disambiguation
- concrete syntax
- static analysis
- reverse engineering
- domain specific languages (DSL)
- term rewriting
- relational calculus
- interactive development environments (IDE)
Committees
I'm a member of the steering committees of:
- International Conference on Software Language Engineering (SLE)
- International Working Conference on Source Code Analysis and Manipulation (SCAM)
This year I serve on the following program committees:
- International Conference on Software Maintenance (ICSM)
- International Working Conference on Source Code Analysis and Manipulation (SCAM)
- Compiler Construction (CC)
- K workshop
In the past I have served on instances of the program committees of SLE, SCAM, CC, SAC PL-track, LDTA, WASDETT, WAPL, GTTSE, GPCE, ICMT, ESEC/FSE, and I co-chaired the PCs of LDTA and SCAM.
I'm currently editing two special issues for Elsevier's Science of Computer Programming: one for Language Descriptions Tools & Applications (LDTA) and one for Source Code Analysis and Manipulation (SCAM).
Contact details
| Email | Jurgen.Vinju@cwi.nl, jurgen@vinju.org |
| Snailmail | Science Park 123, P.O. Box 95079, NL-1090 GB AMSTERDAM |
| Visit | Science Park 123, 1098 XG AMSTERDAM, Room L221 |
| Phone | +31205924102 |
| Skype | skype://jurgen.vinju |
| LinkedIn | http://nl.linkedin.com/in/jurgenvinju |
| Twitter | http://www.twitter.com/jurgenvinju |
| Facebook | http://www.facebook.com/jurgen.vinju |
| Researchr | http://researchr.org/profile/jurgenjvinju |
Publications
2013
Anastasia Izmaylova and Jurgen J. Vinju. A Modular Language Parametric Framework for Type Constraint Based Refactorings. (DRAFT).
Refactoring tools are among the most desirable in the programmer's toolbox. Any refactoring tool (specific for a particular language and for a specific kind of refactoring) represents a considerable investment. At an increasing rate new languages are introduced, and new features are introduced to existing languages. The development of refactoring tools is forced to keep up with this evolution. The extension of a general purpose language like Java with generics is a good example that requires both adaptations to existing refactoring tools, as well as the introduction of new refactoring tools specific for generics. We propose a modular language-parametric framework, called "TyMoRe" (TYpe-related MOdular REfactoring), for constraint-based type refactorings. It enables reuse between languages and reuse between different refactorings for the same language. The framework uses functional monadic composition to achieve the desired modularity and compositionality. The effectiveness of TyMoRe is demonstrated by our prototype of the "Infer Generic Type Arguments" refactoring for a large subset of Java.
This article is an unpublished draft.
Mark Hills, Paul Klint and Jurgen J. Vinju. An empirical study of PHP feature usage. Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), July 2013. Lugano, Switzerland.
PHP is one of the most popular languages for server-side application development. The language is highly dynamic, providing programmers with a large amount of flexibility. However, these dynamic features also have a cost, making it difficult to apply traditional static analysis techniques used in standard code analysis and transformation tools. As part of our work on creating analysis tools for PHP, we have conducted a study over a significant corpus of open-source PHP systems, looking at the sizes of actual PHP programs, which features of PHP are actually used, how often dynamic features appear, and how distributed these features are across the files that make up a PHP website. We have also looked at whether uses of these dynamic features are truly dynamic or are, in some cases, statically understandable, allowing us to identify specific patterns of use which can then be taken into account to build more precise tools. We believe this work will be of interest to creators of analysis tools for PHP, and that the methodology we present can be leveraged for other dynamic languages with similar features.
To appear
2012
Mark Hills, Paul Klint and Jurgen J. Vinju. Scripting a refactoring with Rascal and Eclipse. Proceedings of the Fifth Workshop on Refactoring Tools.
@inproceedings{WRT2012,
author = {Hills, Mark and Klint, Paul and Vinju, Jurgen J.},
title = {Scripting a refactoring with Rascal and Eclipse},
booktitle = {Proceedings of the Fifth Workshop on Refactoring Tools},
series = {WRT '12},
year = {2012},
pages = {40--49},
publisher = {ACM},
}
Mark Hills, Paul Klint and Jurgen Vinju. Program Analysis Scenarios in Rascal. 9th International Workshop on Rewriting Logic and its Applications (WRLA 2012).
Rascal is a meta programming language focused on the implementation of domain-specific languages and on the rapid construction of tools for software analysis and software transformation. In this paper we focus on the use of Rascal for software analysis. We illustrate a range of scenarios for building new software analysis tools through a number of examples, including one showing integration with an existing Maude-based analysis. We then focus on ongoing work on alias analysis and type inference for PHP, showing how Rascal is being used, and sketching a hypothetical solution in Maude. We conclude with a high-level discussion on the commonalities and differences between Rascal and Maude when applied to program analysis.
@inproceedings{wrla12,
title = "Program Analysis Scenarios in Rascal",
author = {Mark Hills and Paul Klint and Jurgen J. Vinju},
booktitle = {9th International Workshop on Rewriting Logic and Its Applications (WRLA 2012)},
note = {Invited Paper},
series = {Lecture Notes in Computer Science},
publisher = {Springer},
year = 2012
}
Mark Hills, Paul Klint and Jurgen J. Vinju. Meta-Language Support for Type-Safe Access to External Resources. International Conference on Software Language Engineering (SLE).
Meta-programming applications often require access to heterogeneous sources of information, often from different technological spaces (grammars, models, ontologies, databases), that have specialized ways of defining their respective data schemas. Without direct language support, obtaining typed access to this external, potentially changing, information is a tedious and error-prone engineering task. The Rascal meta-programming language aims to support the import and manipulation of all of these kinds of data in a type-safe manner. The goal is to lower the engineering effort to build new meta programs that combine information about software in unforeseen ways. In this paper we describe built-in language support, so-called resources, for incorporating external sources of data and their corresponding data-types while maintaining type safety. We demonstrate the applicability of Rascal resources by example, showing resources for RSF files, CSV files, JDBC-accessible SQL databases, and SDF2 grammars. For RSF and CSV files this requires a type inference step, allowing the data in the files to be loaded in a type-safe manner without requiring the type to be declared in advance. For SQL and SDF2 a direct translation from their respective schema languages into Rascal is instead constructed, providing a faithful translation of the declared types or sorts into equivalent types in the Rascal type system. An overview of related work and a discussion conclude the paper.
@inproceedings{sle2012,
title = {Meta-Language Support for Type-Safe Access to External Resources},
author = {Mark Hills and Paul Klint and Jurgen J. Vinju},
booktitle = {International Conference on Software Language Engineering (SLE)},
year = 2012,
publisher = {Springer},
series = {LNCS},
}
Jurgen J. Vinju and Michael W. Godfrey. What Does Control Flow Really Look Like? Eyeballing the Cyclomatic Complexity Metric. International Working Conference on Source Code Analysis and Manipulation.
Assessing the understandability of source code remains an elusive yet highly desirable goal for software developers and their managers. While many metrics have been suggested and investigated empirically, the McCabe cyclomatic complexity metric (CC) --- which is based on control flow complexity --- seems to hold enduring fascination within both industry and the research community. However, the CC metric also has obvious limitations. For example, it is easy to produce example code that seems trivial to understand yet has a high CC value; at the same time, one can also produce "spaghetti" code with many GOTOs that has the same CC value as a well-structured alternative. In this work, we explore the causal relationship between CC and understandability through quantitative and qualitative studies, and through thought experiments and discussion. Empirically, we examine eight well-known open source Java systems by grouping the abstract control flow patterns of the methods into equivalence classes and exploring the results. We found several surprising results: first, the number of unique control flow patterns is relatively low; second, CC often does not accurately reflect the intricacies of Java control flow; and third, methods with high CC often have very low entropy, suggesting that they may be relatively easy to understand. These findings appear to challenge the widely-held belief that there is a clear-cut causal relationship between understandability and cyclomatic complexity, and suggest that at the very least CC and similar measures need to be reconsidered and refined if they are to be used as a metric for code understandability.
@inproceedings{cc,
Author = {Jurgen J. Vinju and Michael W. Godfrey},
Title = {What does control flow really look like? Eyeballing the Cyclomatic Complexity Metric},
Booktitle = {Twelfth IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)},
Publisher = {IEEE Computer Society},
Year = {2012},
}
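The abstract's observation that trivially readable code and intricate code can share the same CC value can be illustrated with a minimal sketch. It uses Python rather than the Java systems studied in the paper, and the counting rule (decision points plus one, with the node types listed in `DECISION_NODES`) is a simplifying assumption, not the measurement procedure of the paper. The function name and both sample fragments are hypothetical.

```python
import ast

# Node types treated as decision points in this simplified sketch
# (an assumption, not the exact counting rules used in the paper).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe CC of a code fragment: decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # Every extra operand of and/or adds one more branch.
            decisions += len(node.values) - 1
    return decisions + 1


# A flat chain of independent checks: easy to read.
FLAT = """
def describe(code):
    if code == 1: return "one"
    if code == 2: return "two"
    if code == 3: return "three"
    if code == 4: return "four"
    return "many"
"""

# Deeply nested control flow: harder to follow.
NESTED = """
def search(grid, target):
    for row in grid:
        for cell in row:
            if cell == target:
                while cell > 0:
                    cell -= 1
    return None
"""

flat_cc = cyclomatic_complexity(FLAT)
nested_cc = cyclomatic_complexity(NESTED)
print(flat_cc, nested_cc)
```

Both fragments contain four decision points, so this sketch assigns them the same value even though their control flow shapes differ considerably.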
Mark Hills, Paul Klint, Tijs van der Storm and Jurgen J. Vinju. A one-stop-shop for Software Evolution Tool Construction. ERCIM News 2012-88, 2012.
Real problems in software evolution render impossible a fixed, one-size-fits-all approach, and these problems are usually solved by gluing together various tools and languages. Such ad-hoc integration is cumbersome and costly. With the Rascal meta-programming language the Software Analysis and Transformation research group at CWI explores whether it is feasible to develop an approach that offers all necessary meta-programming and visualization techniques in a completely integrated language environment. We have applied Rascal with success in constructing domain specific languages and experimental refactoring and visualization tools.
@article{ERCIM2012,
author = {Mark Hills and
Paul Klint and
Tijs van der Storm and
Jurgen J. Vinju},
title = {A One-Stop-Shop for Software Evolution Tool Construction},
journal = {ERCIM News},
volume = {2012},
number = {88},
year = {2012},
ee = {http://ercim-news.ercim.eu/en88/special/a-one-stop-shop-for-software-evolution-tool-construction},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
2011
Mark Hills, Paul Klint, and Jurgen J. Vinju. A case of visitor versus interpreter pattern. In Proceedings of the 49th International Conference on Objects, Models, Components and Patterns, TOOLS, 2011.
We compare the Visitor pattern with the Interpreter pattern, investigating a single case in point for the Java language. We have produced and compared two versions of an interpreter for a programming language. The first version makes use of the Visitor pattern. The second version was obtained by using an automated refactoring to transform uses of the Visitor pattern to uses of the Interpreter pattern. We compare these two nearly equivalent versions on their maintenance characteristics and execution efficiency. Using a tailored experimental research method we can highlight differences and the causes thereof. The contributions of this paper are that it isolates the choice between Visitor and Interpreter in a realistic software project and makes the difference experimentally observable.
@inproceedings{TOOLS2011,
title = {A Case of Visitor versus Interpreter Pattern},
author = {Mark Hills and Paul Klint and Jurgen J. Vinju},
year = {2011},
booktitle = {Proceedings of the 49th International Conference on Objects, Models, Components and Patterns},
series = {TOOLS},
}
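For readers unfamiliar with the two designs compared in the abstract above, here is a minimal sketch of both on a toy expression language. It is written in Python instead of Java, and all class and method names (`Num`, `Add`, `EvalVisitor`, `eval`, `accept`) are hypothetical illustrations, not code from the paper. Both styles are placed on the same node classes only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int

    # Interpreter pattern: the node evaluates itself.
    def eval(self) -> int:
        return self.value

    # Visitor pattern: the node only dispatches to the visitor.
    def accept(self, visitor):
        return visitor.visit_num(self)


@dataclass
class Add:
    left: object
    right: object

    def eval(self) -> int:
        return self.left.eval() + self.right.eval()

    def accept(self, visitor):
        return visitor.visit_add(self)


class EvalVisitor:
    """Visitor pattern: evaluation logic lives outside the node classes."""

    def visit_num(self, node: Num) -> int:
        return node.value

    def visit_add(self, node: Add) -> int:
        return node.left.accept(self) + node.right.accept(self)


expr = Add(Num(1), Add(Num(2), Num(3)))
interpreter_result = expr.eval()
visitor_result = expr.accept(EvalVisitor())
print(interpreter_result, visitor_result)
```

With the Interpreter style, adding a new node type touches one class, while adding a new operation touches every node class. With the Visitor style the trade-off is reversed.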
Jeroen van den Bos, Mark Hills, Paul Klint, Tijs van der Storm, and Jurgen J. Vinju. Rascal: From Algebraic Specification to Meta-Programming. AMMSE 2011, EPTCS Volume 56, pp 15-32, 2011.
Algebraic specification has a long tradition in bridging the gap between specification and programming by making specifications executable. Building on extensive experience in designing, implementing and using specification formalisms that are based on algebraic specification and term rewriting (namely Asf and Asf+Sdf), we are now focusing on using the best concepts from algebraic specification and integrating these into a new programming language: Rascal. This language is easy to learn by non-experts but is also scalable to very large meta-programming applications. We explain the algebraic roots of Rascal and its main application areas: software analysis, software transformation, and design and implementation of domain-specific languages. Some example applications in the domain of Model-Driven Engineering (MDE) are described to illustrate this.
@Inproceedings{EPTCS56.2,
author = "van den Bos, Jeroen and Hills, Mark and Klint, Paul and van der Storm, Tijs and Vinju, Jurgen J.",
year = "2011",
title = "Rascal: From Algebraic Specification to Meta-Programming",
editor = "Dur\'an, Francisco and Rusu, Vlad",
booktitle = "Proceedings Second International Workshop on Algebraic Methods in Model-based Software Engineering (AMMSE)",
series = "Electronic Proceedings in Theoretical Computer Science",
volume = "56",
publisher = "Open Publishing Association",
pages = "15-32",
}
Bas Basten, Paul Klint, and Jurgen Vinju. Ambiguity detection: Scaling to scannerless. In International Conference on Software Language Engineering (SLE), LNCS. Springer, 2011.
Static ambiguity detection would be an important aspect of language workbenches for textual software languages. The challenge is that automatic ambiguity detection of context-free grammars is undecidable. Sophisticated approximations and optimizations do exist, but these do not scale to grammars for so-called "scannerless parsers", as of yet. We extend previous work on ambiguity detection for context-free grammars to cover disambiguation techniques that are typical for scannerless parsing, such as longest match and reserved keywords. This paper contributes a new algorithm for ambiguity detection in character-level grammars, a prototype implementation of this algorithm and validation on several real grammars. The total run-time of ambiguity detection for character-level grammars for languages such as C and Java is dramatically reduced by several orders of magnitude, without loss of precision. The result is that ambiguity detection for realistic grammars can be done efficiently and may now become a tool in language workbenches.
@inproceedings{sle2,
title = {Ambiguity Detection: Scaling to Scannerless},
author = {Bas Basten and Paul Klint and Jurgen Vinju},
booktitle = {International Conference on Software Language Engineering (SLE)},
year = 2011,
publisher = {Springer},
series = {LNCS},
}
Bas Basten and Jurgen Vinju. Parse forest diagnostics with Dr. Ambiguity. In International Conference on Software Language Engineering (SLE), LNCS. Springer, 2011.
In this paper we propose and evaluate a method for locating causes of ambiguity in context-free grammars by automatic analysis of parse forests. A parse forest is the set of parse trees of an ambiguous sentence. Deducing causes of ambiguity from observing parse forests is hard for grammar engineers because of (a) the size of the parse forests, (b) the complex shape of parse forests, and (c) the diversity of causes of ambiguity.
We first analyze the diversity of ambiguities in grammars for programming languages and the diversity of solutions to these ambiguities. Then we introduce Dr. Ambiguity: a parse forest diagnostics tool that explains the causes of ambiguity by analyzing differences between parse trees and proposes solutions. We demonstrate its effectiveness using a small experiment with a grammar for Java 5.
@inproceedings{sle3,
title = {Parse Forest Diagnostics with Dr. Ambiguity},
author = {Bas Basten and Jurgen Vinju},
booktitle = {International Conference on Software Language Engineering (SLE)},
year = 2011,
publisher = {Springer},
series = {LNCS},
}
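To give a feel for why the size of parse forests is a problem, as the abstract above notes, the following sketch counts the parse trees of sentences in a deliberately ambiguous toy grammar, `E ::= E '+' E | 'a'`. The grammar and the function name are assumptions made for illustration; this is not the analysis performed by Dr. Ambiguity.

```python
from functools import lru_cache


def count_parse_trees(tokens: tuple) -> int:
    """Count parse trees of `tokens` for the toy grammar E ::= E '+' E | 'a'."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of distinct derivations of tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try every '+' strictly inside the span as the top-level operator.
        for k in range(i + 1, j - 1):
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


# Sentences "a", "a+a", "a+a+a", ... with 0 to 4 operators.
forest_sizes = [count_parse_trees(tuple("a" + "+a" * n)) for n in range(5)]
print(forest_sizes)
```

The number of trees grows quickly with the number of operators, which is why inspecting a forest by hand soon becomes impractical.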
Mark Hills, Paul Klint, and Jurgen Vinju. RLSRunner: Linking Rascal with K for Program Analysis. In International Conference on Software Language Engineering (SLE), LNCS. Springer, 2011.
The Rascal meta-programming language provides a number of features supporting the development of program analysis tools. However, sometimes the analysis to be developed is already implemented by another system. In this case, Rascal can provide a useful front-end for this system, handling the parsing of the input program, any transformation (if needed) of this program into individual analysis tasks, and the display of the results generated by the analysis. In this paper we describe a tool, RLSRunner, which provides this integration with static analysis tools defined using the K framework, a rewriting-based framework for defining the semantics of programming languages.
@inproceedings{sle1,
title = {RLSRunner: Linking Rascal with K for Program Analysis},
author = {Mark Hills and Paul Klint and Jurgen Vinju},
booktitle = {International Conference on Software Language Engineering (SLE)},
year = 2011,
publisher = {Springer},
series = {LNCS},
}
2010
Stijn de Gouw, Frank de Boer, and Jurgen Vinju. Prototyping a tool environment for run-time assertion checking in JML with communication histories. In 12th Workshop on Formal Techniques for Java-like Programs, 2010.
In this paper we present prototype tool-support for the run-time assertion checking of the Java Modeling Language (JML) extended with communication histories specified by attribute grammars. Our tool suite integrates Rascal, a meta programming language, and ANTLR, a popular parser generator. Rascal instantiates a generic model of history updates for a given Java program annotated with history specifications. ANTLR is used for the actual evaluation of history assertions.
@inproceedings{FTfJP2010,
Author = {Stijn de Gouw and Frank de Boer and Jurgen Vinju},
Booktitle = {12th Workshop on Formal Techniques for Java-like Programs},
Title = {Prototyping a tool environment for run-time assertion checking in JML with Communication Histories},
Year = {2010}}
Diego Ordóñez Camacho, Kim Mens, Mark van den Brand, and Jurgen Vinju. Automated Generation of Program Translation and Verification Tools using Annotated Grammars. Science of Computer Programming, 72(1):3-20, jan 2010.
Automatically generating program translators from source and target language specifications is a non-trivial problem. In this paper we focus on the problem of automating the process of building translators between operations languages, a family of DSLs used to program satellite operations procedures. We exploit their similarities to semi-automatically build transformation tools between these DSLs. The input to our method is a collection of annotated context-free grammars. To simplify the overall translation process even more, we also propose an intermediate representation common to all operations languages. Finally, we discuss how to enrich our annotated grammars model with more advanced semantic annotations to provide a verification system for the translation process. We validate our approach by semi-automatically deriving translators between some real world operations languages, using the prototype tool which we implemented for that purpose.
@article{SCP2010,
Title = {Automated Generation of Program Translation and Verification Tools using Annotated Grammars},
Author = {Diego Ord\'o\~nez Camacho and Kim Mens and Mark van den Brand and Jurgen Vinju},
Doi = {http://dx.doi.org/10.1016/j.scico.2009.10.003},
Journal = {Science of Computer Programming},
Publisher = {Elsevier},
Month = {jan},
Number = {1},
Pages = {3-20},
Volume = {72},
Year = {2010},
}
Paul Klint, Tijs van der Storm, and Jurgen Vinju. On the Impact of DSL Tools on the Maintainability of Language Implementations. In Proceedings of the tenth workshop on Language Descriptions Tools and Applications, 2010.
Does the use of DSL tools improve the maintainability of language implementations compared to implementations from scratch? We present empirical results on aspects of maintainability of six implementations of the same DSL using different languages (Java, JavaScript, C#) and DSL tools (ANTLR, OMeta, Microsoft “M”). Our evaluation indicates that the maintainability of language implementations is indeed higher when constructed using DSL tools.
@inproceedings{ldta2010,
Author = {Paul Klint and Tijs van der Storm and Jurgen Vinju},
Booktitle = {Proceedings of the tenth workshop on Language Descriptions Tools and Applications (LDTA)},
Title = {On the Impact of DSL tools on the Maintainability of Language Implementations},
Series = {Electronic Notes in Theoretical Computer Science},
Publisher = {Elsevier},
Year = {2010}
}
Vincent Lussenburg, Tijs van der Storm, Jurgen J. Vinju, and Jos Warmer. Mod4j: A Qualitative Case Study of Model-driven Software Development. In Dorina Petriu, Nicolas Rouquette, and Øystein Haugen, editors, Model Driven Engineering Languages and Systems, 13th International Conference, MODELS 2010, Oslo, Norway, October 3-8, 2010. Proceedings, Lecture Notes in Computer Science. Springer, 2010.
Model-driven software development (MDSD) has been on the rise over the past few years and is becoming more and more mature. However, evaluation in real-life industrial context is still scarce. In this paper, we present a case-study evaluating the applicability of a state-of-the-art MDSD tool, MOD4J, a suite of domain specific languages (DSLs) for developing administrative enterprise applications. MOD4J was used to partially rebuild an industrially representative application. This implementation was then compared to a base implementation based on elicited success criteria. Our evaluation leads to a number of recommendations to improve MOD4J. We conclude that having extension points for hand-written code is a good feature for a model driven software development environment.
@inproceedings{MODELS2010,
Author = {Vincent Lussenburg and Tijs {van der Storm} and Jurgen J. Vinju and Jos Warmer},
Title = {Mod4J: A Qualitative Case Study of Model-Driven Software Development},
Booktitle = {Model Driven Engineering Languages and Systems, 13th International Conference, MODELS 2010, Oslo, Norway, October 3-8, 2010. Proceedings},
Editor = {Dorina Petriu and Nicolas Rouquette and {\O}ystein Haugen},
Publisher = {Springer},
Series = {Lecture Notes in Computer Science},
Year = {2010}
}
Bas Basten and Jurgen Vinju. Faster ambiguity detection by grammar filtering. In Claus Brabrand and Pierre-Etienne Moreau, editors, Proceedings of the tenth workshop on Language Descriptions Tools and Applications, 2010.
Real programming languages are often defined using ambiguous context-free grammars. Some ambiguity is intentional while other ambiguity is accidental. A good grammar development environment should therefore contain a static ambiguity checker to help the grammar engineer. Ambiguity of context-free grammars is an undecidable property. Nevertheless, various imperfect ambiguity checkers exist. Exhaustive methods are accurate, but suffer from non-termination. Termination is guaranteed by approximative methods, at the expense of accuracy. In this paper we combine an approximative method with an exhaustive method. We present an extension to the Noncanonical Unambiguity Test that identifies production rules that do not contribute to the ambiguity of a grammar and show how this information can be used to significantly reduce the search space of exhaustive methods. Our experimental evaluation on a number of real world grammars shows orders of magnitude gains in efficiency in some cases and negligible losses of efficiency in others.
@inproceedings{LDTA2010,
Author = {Bas Basten and Jurgen Vinju},
Title = {Faster Ambiguity Detection by Grammar Filtering},
Booktitle = {Proceedings of the tenth workshop on Language Descriptions Tools and Applications},
Editor = {Claus Brabrand and Pierre-Etienne Moreau},
Publisher = {Elsevier Electronic Notes in Theoretical Computer Science},
Year = {2010}
}
2009
Paul Klint, Tijs van der Storm, and Jurgen Vinju. EASY Meta-programming with Rascal. Leveraging the Extract-Analyze-Synthesize Paradigm for Meta-programming. In Proceedings of the 3rd International Summer School on Generative and Transformational Techniques in Software Engineering (GTTSE'09), LNCS. Springer, 2010.
@inproceedings{RascalGTTSE,
title = {EASY Meta-Programming with Rascal. Leveraging the Extract-Analyze-SYnthesize Paradigm for Meta-Programming},
author = {Paul Klint and Tijs van der Storm and Jurgen J. Vinju},
year = {2010},
booktitle = {Proceedings of the 3rd International Summer School on Generative and Transformational Techniques in Software Engineering (GTTSE'09)},
location = {Braga, Portugal},
series = {LNCS},
publisher = {Springer},
}
Paul Klint, Tijs van der Storm, and Jurgen J. Vinju. Rascal: A Domain Specific Language for Source Code Analysis and Manipulation. In Ninth IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2009, Edmonton, Alberta, Canada, September 20-21, 2009, pages 168-177. IEEE Computer Society, 2009.
Many automated software engineering tools require tight integration of techniques for source code analysis and manipulation. State-of-the-art tools exist for both, but the domains have remained notoriously separate because different computational paradigms fit each domain best. This impedance mismatch hampers the development of each new problem solution since desired functionality and scalability can only be achieved by repeated, ad hoc, integration of different techniques. RASCAL is a domain-specific language that takes away most of this boilerplate by providing high-level integration of source code analysis and manipulation on the conceptual, syntactic, semantic and technical level. We give an overview of the language and assess its merits by implementing a complex refactoring.
@inproceedings{rascal,
Author = {Paul Klint and Tijs van der Storm and Jurgen J. Vinju},
Title = {RASCAL: A Domain Specific Language for Source Code Analysis and Manipulation},
Booktitle = {Ninth IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)},
Doi = {http://doi.ieeecomputersociety.org/10.1109/SCAM.2009.28},
Isbn = {978-0-7695-3793-1},
Pages = {168-177},
Publisher = {IEEE Computer Society},
Year = {2009},
}
Paul Klint, Jurgen J. Vinju, and Tijs van der Storm. Language Design for Meta-programming in the Software Composition Domain. In Alexandre Bergel and Johan Fabry, editors, Software Composition, 8th International Conference, SC 2009, Zurich, Switzerland, July 2-3, 2009. Proceedings, volume 5634 of Lecture Notes in Computer Science, pages 1-4. Springer, 2009.
@inproceedings{SC2009,
Author = {Paul Klint and Jurgen J. Vinju and Tijs van der Storm},
Title = {Language Design for Meta-programming in the Software Composition Domain},
Booktitle = {Software Composition},
Doi = {http://dx.doi.org/10.1007/978-3-642-02655-3_1},
Editor = {Alexandre Bergel and Johan Fabry},
Isbn = {978-3-642-02654-6},
Pages = {1-4},
Publisher = {Springer},
Series = {Lecture Notes in Computer Science},
Volume = {5634},
Year = {2009}
}
Giorgios Economopoulos, Paul Klint, and Jurgen J. Vinju. Faster scannerless GLR parsing. In Oege de Moor and Michael I. Schwartzbach, editors, Compiler Construction, 18th International Conference, CC 2009, York, UK, March 22-29, 2009. Proceedings, volume 5501 of Lecture Notes in Computer Science, pages 126-141. Springer, 2009.
Analysis and renovation of large software portfolios requires syntax analysis of multiple, usually embedded, languages and this is beyond the capabilities of many standard parsing techniques. The traditional separation between lexer and parser falls short due to the limitations of tokenization based on regular expressions when handling multiple lexical grammars. In such cases scannerless parsing provides a viable solution. It uses the power of context-free grammars to be able to deal with a wide variety of issues in parsing lexical syntax. However, it comes at the price of less efficiency. The structure of tokens is obtained using a more powerful but more time and memory intensive parsing algorithm. Scannerless grammars are also more non-deterministic than their tokenized counterparts, increasing the burden on the parsing algorithm even further. In this paper we investigate the application of the Right-Nulled Generalized LR parsing algorithm (RNGLR) to scannerless parsing. We adapt the Scannerless Generalized LR parsing and filtering algorithm (SGLR) to implement the optimizations of RNGLR. We present an updated parsing and filtering algorithm, called SRNGLR, and analyze its performance in comparison to SGLR on ambiguous grammars for the programming languages C, Java, Python, SASL, and C++. Measurements show that SRNGLR is on average 33% faster than SGLR, but is 95% faster on the highly ambiguous SASL grammar. For the mainstream languages C, C++, Java and Python the average speedup is 16%.
@inproceedings{CC2009,
Author = {Giorgios R. Economopoulos and Paul Klint and Jurgen J. Vinju},
Title = {Faster Scannerless {GLR} Parsing},
Booktitle = {Compiler Construction (CC)},
Doi = {http://dx.doi.org/10.1007/978-3-642-00722-4_10},
Editor = {Oege de Moor and Michael I. Schwartzbach},
Isbn = {978-3-642-00721-7},
Pages = {126-141},
Publisher = {Springer},
Series = {Lecture Notes in Computer Science},
Volume = {5501},
Year = {2009},
}
Philippe Charles, Robert M. Fuhrer, Stanley M. Sutton Jr., Evelyn Duesterwald, and Jurgen Vinju. Accelerating the Creation of Customized, Language-specific IDEs in Eclipse. In Shail Arora and Gary T. Leavens, editors, Proceedings of the 24th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2009, October 25-29, 2009, Orlando, Florida, USA, pages 191-206, 2009. abstract bibtex
Full-featured integrated development environments have become critical to the adoption of new programming languages. Key to the success of these IDEs is the provision of services tailored to the languages. However, modern IDEs are large and complex, and the cost of constructing one from scratch can be prohibitive. Generators that work from language specifications reduce costs but produce environments that do not fully reflect distinctive language characteristics. We believe that there is a practical middle ground between these extremes that can be effectively addressed by an open, semi-automated strategy to IDE development. This strategy is to reduce the burden of IDE development as much as possible, especially for internal IDE details, while opening opportunities for significant customizations to IDE services. To reduce the effort needed for customization we provide a combination of frameworks, templates, and generators. We demonstrate an extensible IDE architecture that embodies this strategy, and we show that this architecture can be used to produce customized IDEs, with a moderate amount of effort, for a variety of interesting languages.
@inproceedings{imp,
Author = {Philippe Charles and Robert M. Fuhrer and Stanley M. Sutton Jr. and Evelyn Duesterwald and Jurgen Vinju},
Title = {Accelerating the Creation of Customized, Language-Specific IDEs in Eclipse},
Booktitle = {Proceedings of the 24th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)},
Editor = {Shail Arora and Gary T. Leavens},
Pages = {191-206},
Year = {2009}
}
2008
Paul Klint, Taeke Kooiker, and Jurgen J. Vinju. Language Parametric Module Management for IDEs. Electronic Notes in Theoretical Computer Science, 203(2):3-19, 2008. abstract bibtex
An integrated development environment (IDE) monitors all the changes that a user makes to source code modules and responds accordingly by flagging errors, by re-parsing, by rechecking, or by recompiling modules and by adjusting visualizations or other information derived from a module. A module manager is the central component of the IDE that is responsible for this behavior. Although the overall functionality of a module manager in a given IDE is fixed, its actual behavior strongly depends on the programming languages it has to support. What is a module? How do modules depend on each other? What is the effect of a change to a module? We propose a concise design for a language parametric module manager: a module manager that is parameterized with the module behavior of a specific language. We describe the design of our module manager and discuss some of its properties. We also report on the application of the module manager in the construction of IDEs for the specification language ASF+SDF as well as for Java. Our overall goal is the rapid development (generation) of IDEs for programming languages and domain specific languages. The module manager presented here represents a next step in the creation of such generic language workbenches.
@article{LDTA2008,
title = {Language Parametric Module Management for IDEs},
author = {Paul Klint and Taeke Kooiker and Jurgen J. Vinju},
year = {2008},
doi = {http://dx.doi.org/10.1016/j.entcs.2008.03.041},
tags = {programming languages, SDF, code generation, language design, programming, Meta-Environment, ASF+SDF, Java, IDE, generic programming},
journal = {Electronic Notes in Theoretical Computer Science},
volume = {203},
number = {2},
pages = {3-19},
}
2007
Jurgen J. Vinju. Annotated parse trees for a language parametric IDE. In PLIDE, November 2007. abstract bibtex
M.G.J. van den Brand, M. Bruntink, G.R. Economopoulos, H.A. de Jong, P. Klint, T. Kooiker, T. van der Storm, and Jurgen J. Vinju. Using The Meta-environment for Maintenance and Renovation. In Proceedings of the Conference on Software Maintenance and Reengineering (CSMR'07). IEEE Computer Society Press, 2007. abstract bibtex
2006
Jurgen J. Vinju and J.R. Cordy. How to make a bridge between transformation and analysis technologies? In J.R. Cordy, R. Lämmel, and A. Winter, editors, Transformation Techniques in Software Engineering, number 05161 in Dagstuhl Seminar Proceedings. Internationales Begegnungs- und Forschungszentrum (IBFI), Schloss Dagstuhl, Germany, 2006. abstract bibtex
@inproceedings{dagstuhl,
Author = {J.J. Vinju and J.R. Cordy},
Booktitle = {Transformation Techniques in Software Engineering},
Editor = {J.R. Cordy and R. L{\"a}mmel and A. Winter},
Issn = {1862-4405},
Number = {05161},
Publisher = {Internationales Begegnungs- und Forschungszentrum (IBFI), Schloss Dagstuhl, Germany},
Series = {Dagstuhl Seminar Proceedings},
Title = {How to make a bridge between transformation and analysis technologies?},
Year = {2006}
}
Diego Ordóñez Camacho, Kim Mens, Mark van den Brand, and Jurgen J. Vinju. Automated derivation of translators from annotated grammars. In Language Descriptions Tools and Applications, ENTCS, pages 121-137, 2006. abstract bibtex
M.G.J. van den Brand, A.T. Kooiker, Jurgen J. Vinju, and N.P. Veerman. A Language Independent Framework for Context-sensitive Formatting. In CSMR '06: Proceedings of the Conference on Software Maintenance and Reengineering, pages 103-112, Washington, DC, USA, 2006. IEEE Computer Society Press. abstract bibtex
Jurgen J. Vinju. UPTR: a simple parse tree representation format. In Software Transformation Systems Workshop, October 2006. abstract bibtex
2005
J.J. Vinju. Analysis and Transformation of Source Code by Parsing and Rewriting. PhD thesis, Universiteit van Amsterdam, November 2005. abstract bibtex
In this thesis the subject of study is source code. More precisely, I am interested in tools that help in describing, analyzing and transforming source code. The overall question is how well qualified and versatile the programming language ASF+SDF is when applied to source code analysis and transformation. The main technical issues that are addressed are ambiguity of context-free languages and improving two important quality attributes of analyses and transformations: conciseness and fidelity. The overall result of this research is a version of the language that is better tuned to the domain of source code analysis and transformation, but is still firmly grounded on the original: a hybrid of context-free grammars and term rewriting. The results that are presented have a broad technical spectrum because they cover the entire scope of ASF+SDF. They include disambiguation by filtering parse forests, the type-safe automation of tree traversal for conciseness, improvements in language design resulting in higher resolution and fidelity, and better interfacing with other programming environments. Each solution has been validated in practice, by me and by others, mostly in the context of industrial sized case studies. In this introductory chapter we first set the stage by sketching the objectives and requirements of computer aided software engineering. Then the technological background of this thesis is introduced: generic language technology and ASF+SDF. We zoom in on two particular technologies: parsing and term rewriting. We identify research questions as we go and summarize them at the end of this chapter.
@phdthesis{thesis2005,
Author = {J.J. Vinju},
Month = nov,
Supervisor = {Paul Klint and {Mark van} den Brand},
School = {Universiteit van Amsterdam},
Title = {Analysis and Transformation of Source Code by Parsing and Rewriting},
Year = {2005}}
M.G.J. van den Brand, A.T. Kooiker, N.P. Veerman, and Jurgen J. Vinju. An industrial application of context-sensitive formatting. In International Conference on Software Maintenance, 2005. abstract bibtex
M. Bravenboer, R. Vermaas, Jurgen J. Vinju, and E. Visser. Generalized type-based disambiguation of meta programs with concrete object syntax. In Generative Programming and Component Engineering (GPCE), 2005. abstract bibtex
In meta programming with concrete object syntax, object-level programs are composed from fragments written in concrete syntax. The use of small program fragments in such quotations and the use of meta-level expressions within these fragments (anti-quotation) often leads to ambiguities. This problem is usually solved through explicit disambiguation, resulting in considerable syntactic overhead. A few systems manage to reduce this overhead by using type information during parsing. Since this is hard to achieve with traditional parsing technology, these systems provide specific combinations of meta and object languages, and their implementations are difficult to reuse. In this paper, we generalize these approaches and present a language independent method for introducing concrete object syntax without explicit disambiguation. The method uses scannerless generalized-LR parsing to parse meta programs with embedded object-level fragments, which produces a forest of all possible parses. This forest is reduced to a tree by a disambiguating type checker for the meta language. To validate our method we have developed embeddings of several object languages in Java, including AspectJ and Java itself.
@inproceedings{BVVV05,
Author = {M. Bravenboer and R. Vermaas and J.J. Vinju and E. Visser},
Booktitle = {Generative Programming and Component Engineering (GPCE)},
Title = {Generalized Type-Based Disambiguation of Meta Programs with Concrete Object Syntax},
Year = {2005}
}
M.G.J. van den Brand, B. Cornelissen, P.A. Olivier, and J.J. Vinju. TIDE: a Generic Debugging Framework. In J. Boyland and G. Hedin, editors, Language Design Tools and Applications, June 2005. abstract bibtex
A language specific interactive debugger is one of the tools that we expect in any mature programming environment. We present applications of TIDE: a generic debugging framework that is related to the ASF+SDF Meta-Environment. TIDE can be applied to different levels of debugging that occur in language design. Firstly, TIDE was used to obtain a full-fledged debugger for language specifications based on term rewriting. Secondly, TIDE can be instantiated for any other programming language, including but not limited to domain specific languages that are defined and implemented using ASF+SDF. We demonstrate the common debugging interface, and indicate the amount of effort needed to instantiate new debuggers based on TIDE.
@inproceedings{ldta05,
Author = {Brand, {M.G.J. van den} and B. Cornelissen and Olivier, P.A. and Vinju, J.J},
Booktitle = {Language Design Tools and Applications},
Series = {Electronic Notes in Theoretical Computer Science},
Publisher = {Elsevier},
Editor = {J. Boyland and G. Hedin},
Month = jun,
Title = {{T}{I}{D}{E}: a generic debugging framework},
Year = 2005
}
M.G.J. van den Brand, P.E. Moreau, and Jurgen J. Vinju. A Generator of Efficient Strongly Typed Abstract Syntax Trees in Java. IEE Proceedings-Software, 2005. abstract bibtex
Abstract syntax trees are a very common data-structure in language related tools. For example compilers, interpreters, documentation generators, and syntax-directed editors use them extensively to extract, transform, store and produce information that is key to their functionality. We present a Java back-end for ApiGen, a tool that generates implementations of abstract syntax trees. The generated code is characterized by strong typing combined with a generic interface and maximal sub-term sharing for memory efficiency and fast equality checking. The goal of this tool is to obtain safe and more efficient programming interfaces for abstract syntax trees. The contribution of this work is the combination of generating a strongly typed data-structure with maximal sub-term sharing in Java. Practical experience shows that this approach is beneficial for extremely large as well as smaller data types.
@article{IEE2005,
title = {{A generator of efficient strongly typed abstract syntax trees in Java}},
author = {Van Den Brand, Mark and Moreau, Pierre-Etienne and Vinju, Jurgen},
booktitle = {{IEE Proceedings - Software Engineering}},
publisher = {IEEE},
pages = {70--87},
journal = {IEE Proceedings - Software Engineering},
volume = {152},
number = {2},
year = {2005},
}
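As an illustration of the maximal sub-term sharing mentioned in the abstract above, here is a minimal sketch in Python. It is not ApiGen's generated code: the names `Node`, `Factory`, and `make` are hypothetical, and the strong typing that the paper combines with sharing is not modeled. It only shows the sharing idea, where a factory returns the existing object for any term that was built before, so that equality checking reduces to an identity check.

```python
# Hypothetical sketch of maximal sub-term sharing (hash-consing).
# Node, Factory and make are illustrative names, not ApiGen's API.


class Node:
    """Immutable tree node; create instances only through Factory.make."""

    __slots__ = ("label", "children")

    def __init__(self, label, children):
        self.label = label
        self.children = children


class Factory:
    """Builds nodes so that structurally equal terms are the same object."""

    def __init__(self):
        self._table = {}

    def make(self, label, *children):
        # Children were built by this factory, so they are already shared
        # and their identities fully determine the new term. The table
        # keeps every node alive, so the ids stay valid.
        key = (label, tuple(id(child) for child in children))
        node = self._table.get(key)
        if node is None:
            node = Node(label, children)
            self._table[key] = node
        return node

    def size(self):
        """Number of distinct terms stored."""
        return len(self._table)


factory = Factory()
x = factory.make("var_x")
left = factory.make("plus", x, factory.make("var_x"))
right = factory.make("plus", x, x)

# Structural equality is now a constant-time identity comparison.
print(left is right)
print(factory.size())
```

Both `plus` terms are the same object, and only two distinct nodes are stored, which is where the memory savings and the fast equality check come from.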
Jurgen J. Vinju. Type-driven automatic quotation of concrete object code in meta programs. In N. Guelfi and A. Savidis, editors, Rapid Integration of Software Engineering Techniques, volume 3943 of LNCS, 2005. abstract bibtex
Meta programming can be facilitated by the ability to represent program fragments in concrete syntax instead of abstract syntax. The resulting meta programs are more self-documenting. One caveat in concrete meta programming is the syntactic separation between the meta language and the object language. To solve this problem, many meta programming systems use quoting and anti-quoting to indicate precisely where level switches occur. These “syntactic hedges” can obfuscate the concrete program fragments. This paper describes an algorithm for inferring quotes, such that the meta programmer no longer needs to explicitly indicate transitions between the meta and object languages.
@inproceedings{RISE2005,
title = {Type-Driven Automatic Quotation of Concrete Object Code in Meta Programs},
author = {Jurgen J. Vinju},
year = {2005},
pages = {97-112},
booktitle = {Rapid Integration of Software Engineering Techniques, Second International Workshop, RISE 2005, Heraklion, Crete, Greece, September 8-9, 2005, Revised Selected Papers},
editor = {Nicolas Guelfi and Anthony Savidis},
volume = {3943},
series = {Lecture Notes in Computer Science},
publisher = {Springer},
isbn = {3-540-34063-7},
}
Jurgen J. Vinju, Paul Klint, and Tijs van der Storm. Term Rewriting Meets Aspect Oriented Programming. In Aart Middeldorp, Vincent van Oostrom, Femke van Raamsdonk, and Roel C. de Vrijer, editors, Processes, Terms and Cycles: Steps on the Road to Infinity, Essays Dedicated to Jan Willem Klop, on the Occasion of His 60th Birthday, volume 3838 of Lecture Notes in Computer Science. Springer, 2005. abstract bibtex
2004
M.G.J. van den Brand and J.J. Vinju. Generation by Transformation in ASF+SDF. In GPCE Workshop on Software Transformation Systems (STS), 2004. abstract bibtex
2003
M.G.J. van den Brand, P. Klint, and J.J. Vinju. Term Rewriting with Traversal Functions. ACM Transactions on Software Engineering and Methodology (TOSEM), 12(2):152-190, 2003. abstract bibtex
M.G.J. van den Brand, S. Klusener, L. Moonen, and Jurgen J. Vinju. Generalized Parsing and Term Rewriting - Semantics Directed Disambiguation. In Barret Bryant and João Saraiva, editors, Third Workshop on Language Descriptions Tools and Applications, Electronic Notes in Theoretical Computer Science, 2003. abstract bibtex
Generalized parsing technology provides the power and flexibility to attack real-world parsing applications. However, many programming languages have syntactical ambiguities that can only be solved using semantical analysis. In this paper we propose to apply the paradigm of term rewriting to filter ambiguities based on semantical information. We start with the definition of a representation of ambiguous derivations. Then we extend term rewriting with means to handle such derivations. Finally, we apply these tools to some real world examples, namely C and COBOL. The resulting architecture is simple and efficient as compared to semantic directed parsing.
@inproceedings{BMV03,
Author = {Brand, {M.G.J. van den} and Klusener, S. and Moonen, L. and Vinju, J.J.},
Title = {{G}eneralized {P}arsing and {T}erm {R}ewriting - {S}emantics {D}irected {D}isambiguation},
Booktitle = {Third Workshop on Language Descriptions Tools and Applications},
Editor = {Barret Bryant and Jo{\~a}o Saraiva},
Series = {Electronic Notes in Theoretical Computer Science},
Publisher = {Elsevier},
Year = 2003
}
M.G.J. van den Brand, P.E. Moreau, and Jurgen J. Vinju. Environments for Term Rewriting Engines for Free! In R. Nieuwenhuis, editor, Proceedings of the 14th International Conference on Rewriting Techniques and Applications (RTA'03). Springer-Verlag, 2003. abstract bibtex
2002
M.G.J. van den Brand, P. Klint, and Jurgen J. Vinju. Term Rewriting with Type-safe Traversal Functions. In B. Gramlich and S. Lucas, editors, Second International Workshop on Reduction Strategies in Rewriting and Programming (WRS 2002), volume 70 of Electronic Notes in Theoretical Computer Science. Elsevier Science Publishers, 2002. abstract bibtex
M.G.J. van den Brand, J. Scheerder, Jurgen J. Vinju, and E. Visser. Disambiguation Filters for Scannerless Generalized LR Parsers. In R. Nigel Horspool, editor, Compiler Construction, volume 2304 of LNCS, pages 143-158. Springer-Verlag, 2002. abstract bibtex
In this paper we present the fusion of generalized LR parsing and scannerless parsing. This combination supports syntax definitions in which all aspects (lexical and context-free) of the syntax of a language are defined explicitly in one formalism. Furthermore, there are no restrictions on the class of grammars, thus allowing a natural syntax tree structure. Ambiguities that arise through the use of unrestricted grammars are handled by explicit disambiguation constructs, instead of implicit defaults that are taken by traditional scanner and parser generators. Hence, a syntax definition becomes a full declarative description of a language. Scannerless generalized LR parsing is a viable technique that has been applied in various industrial and academic projects.
2001
M.G.J. van den Brand, A. van Deursen, J. Heering, H.A. de Jong, M. de Jonge, T. Kuipers, P. Klint, L. Moonen, P. A. Olivier, J. Scheerder, Jurgen J. Vinju, E. Visser, and J. Visser. The ASF+SDF Meta-Environment: a Component-Based Language Development Environment. In R. Wilhelm, editor, CC'01, volume 2027 of LNCS, pages 365-370. Springer-Verlag, 2001. abstract bibtex
The ASF+SDF Meta-Environment is an interactive development environment for the automatic generation of interactive systems for constructing language definitions and generating tools for them. Over the years, this system has been used in a variety of academic and commercial projects ranging from formal program manipulation to conversion of COBOL systems. Since the existing implementation of the Meta-Environment started exhibiting more and more characteristics of a legacy system, we decided to build a completely new, component-based, version. We demonstrate this new system and stress its open architecture.
2000
M.G.J. van den Brand and Jurgen J. Vinju. Rewriting with Layout. In Claude Kirchner and Nachum Dershowitz, editors, Proceedings of RULE2000, 2000. abstract bibtex
Rewriting technology has proved to be an adequate and powerful mechanism to perform source code transformations. These transformations can not only be efficiently implemented using rewriting technology, but it also provides a firmer grip on the source code syntax. However, an important shortcoming of rewriting technology is that source code comments and layout are lost during rewriting. We propose "rewriting with layout" to solve this problem. We present a rewriting algorithm that keeps the layout of sub-terms that are not rewritten, and reuses the layout occurring in the right-hand side of the rewrite rules.
1999
J.J. Vinju. Optimizations of List Matching in the ASF+SDF compiler. Master's thesis, University of Amsterdam, September 1999. abstract bibtex
Presentations
This is a selection of presentation slides.
2013
Debugging and all that for Master Software Engineering, May 2nd, Centrum Wiskunde & Informatica, The Netherlands.
Slides on Modularity for Bachelor Computer Science, Jan 13th, Universiteit van Amsterdam, The Netherlands.
Software Analysis and Transformation with Rascal. Jan 11th, 2013, BioAssist Meeting, Utrecht, The Netherlands.
2012
Introduction to Rascal and Eyeballing the Cyclomatic Complexity Metric. May 11th, 2012, INRIA Lille Software Engineering Seminar, Villeneuve d'Ascq (France)
Constructing specialist software tools using Rascal: Metrics. April 24th, 2012, Sogyo, De Bilt (The Netherlands)
The mechanics of building a DSL using Rascal. April 17th, 2012, IPA Spring Days, Gelderen (The Netherlands)
Professional Feedback. March 29th, 2012, CSMR Doctoral Symposium, Szeged (Hungary).
