
Setting the stage

Is there enough legacy software in the world to justify investments in software renovation technology? It turns out that we are living on a software volcano: large numbers of new and old software systems control our lives. We admire the sheer bulk of the magnificent volcano and benefit from the fertile grounds surrounding it, yet at the same time we suffer from frequent eruptions of lava, steam, and poisonous gas, uncertain of what is going on within the volcano and of when the next large eruption will be.

The figures collected by Jones [31] provide insight into the size of the problem. He uses the function point (FP) as the unit of measurement for software. It abstracts from specific programming languages and specific presentation styles of programs. The number of source statements needed per function point differs per programming language, and is summarized in Table 1(a). As another point of reference, the size of Windows 95 is $8.5 \times 10^4$ FP.

The total volume of software is estimated at $7\times10^9$ FP (7 Giga-FP). The distribution of the various programming languages used to implement all these function points is summarized in Table 1(b). Older languages dominate the scene: even today 30% of the 7 Giga-FP is written in COBOL. If we (hypothetically) assume that all software is written in COBOL, we get an estimate (at 107 COBOL statements per FP) of roughly $7.5\times10^{11}$ COBOL statements for the total volume of software.

As a measure of software quality (or rather, the lack of it), Jones has estimated that on average 5 errors occur per function point. This includes errors in requirements, design, coding, and documentation, as well as bad fixes. The result is a frightening figure of $35\times10^9$ programming errors (35 Giga-bugs) waiting for a chance to burst out sooner or later.
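The arithmetic behind these figures is simple enough to reproduce. The following is a minimal Python sketch, assuming only the ratios of Table 1(a) and the numbers quoted in the text; the function names are hypothetical and chosen for illustration. Expressing Windows 95 in a single language is a thought experiment for scale, not a claim about how it was actually written.

```python
# Statements per function point, copied from Table 1(a).
STATEMENTS_PER_FP = {
    "Assembler": 320,
    "C": 128,
    "Fortran77": 107,
    "Cobol85": 107,
    "C++": 64,
    "Visual Basic": 32,
    "Perl": 27,
    "Smalltalk": 21,
    "SQL": 13,
}

ERRORS_PER_FP = 5        # Jones' average, as quoted in the text
TOTAL_FP = 7 * 10**9     # estimated total volume of software (7 Giga-FP)
WINDOWS_95_FP = 85_000   # 8.5 x 10^4 FP, as quoted in the text


def statements_for(function_points, language):
    """Convert a size in function points to source statements."""
    return function_points * STATEMENTS_PER_FP[language]


def expected_errors(function_points):
    """Estimate the number of errors for a given size in function points."""
    return function_points * ERRORS_PER_FP


if __name__ == "__main__":
    # Hypothetical: the size of Windows 95 expressed in different languages.
    for language in ("Assembler", "C", "C++"):
        print(language, statements_for(WINDOWS_95_FP, language))
    # Errors expected in the total volume of software.
    print(expected_errors(TOTAL_FP))
```

The same functionality costs five times as many statements in Assembler as in C++, which is why function points rather than lines of code are used as the unit of comparison.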


Table 1: (a) Function Points versus Lines of Code; (b) Distribution of languages

(a)
Language        Statements/FP
Assembler       320
C               128
Fortran77       107
Cobol85         107
C++              64
Visual Basic     32
Perl             27
Smalltalk        21
SQL              13

(b)
Language               Used in % of total
COBOL                  30
Assembler              10
C                      10
C++                    10
500 other languages    40


Better ways of developing new software will not solve this problem. When an industry approaches 50 years of age, as is the case with computer science, it takes more workers to perform maintenance than to build new products. Based on current data, Table 2 shows extrapolations for the number of programmers working on new projects, enhancements, and repairs. In the current decade, four out of seven programmers are working on enhancement and repair projects. The forecasts predict that by 2020 only one third of all programmers will be working on projects involving the construction of new software.


Table 2: Forecasts for numbers of programmers (worldwide) and distribution of their activities

Year   New projects   Enhancements    Repairs      Total
1950             90              3          7        100
1960           8500            500       1000      10000
1970          65000          15000      20000     100000
1980        1200000         600000     200000    2000000
1990        3000000        3000000    1000000    7000000
2000        4000000        4500000    1500000   10000000
2010        5000000        7000000    2000000   14000000
2020        7000000       11000000    3000000   21000000
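The proportions quoted above can be checked directly against Table 2. The following is a minimal Python sketch, assuming the table rows as given; the helper names are hypothetical. It uses exact fractions so that "four out of seven" and "one third" appear without rounding.

```python
from fractions import Fraction

# Rows of Table 2: year -> (new projects, enhancements, repairs).
FORECAST = {
    1950: (90, 3, 7),
    1960: (8_500, 500, 1_000),
    1970: (65_000, 15_000, 20_000),
    1980: (1_200_000, 600_000, 200_000),
    1990: (3_000_000, 3_000_000, 1_000_000),
    2000: (4_000_000, 4_500_000, 1_500_000),
    2010: (5_000_000, 7_000_000, 2_000_000),
    2020: (7_000_000, 11_000_000, 3_000_000),
}


def new_share(year):
    """Fraction of programmers working on new projects in the given year."""
    new, enhancements, repairs = FORECAST[year]
    return Fraction(new, new + enhancements + repairs)


def maintenance_share(year):
    """Fraction of programmers working on enhancements and repairs."""
    return 1 - new_share(year)


if __name__ == "__main__":
    for year in sorted(FORECAST):
        print(year, new_share(year), maintenance_share(year))
```

Under these figures the maintenance share grows from one tenth in 1950 to four sevenths in 1990 and two thirds in 2020.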


Therefore, we must conclude that the importance of maintenance and gradual improvement of software is ever increasing and deserves more and more attention both in computer science education and research.


Paul Klint 2001-06-10