Researcher at CWI in Amsterdam (the first non-military internet site in Europe, 1988: the whole of Europe connected to the USA over a 64 kb/s link!)
Co-designed the programming language ABC, which was later used as the basis for Python.
Wrote part of GCC.
At the end of the 80's built a system that you would now call a browser.
Organised 2 workshops at the first Web conference in 1994.
Chaired HTML WG for the best part of a decade.
Co-author of HTML4, CSS, XHTML, XForms, RDFa, etc.
Looking 10 years into the future.
Planting seeds.
90%
The American DoD did some research and discovered that 90% of the cost of software production was in debugging.
This means that to reduce programming costs and time, the first task of a programming language design should be to minimise the need for debugging.
What does research tell us?
I have a program that has a rather complicated data structure at its heart.
It is a mapping from pairs of integers to lists of tuples.
Those tuples contain two pairs of integers, a string, and two maps, the first a mapping from integers to a tuple of a string and a pair of integers, and the second which is a mapping from integers to strings.
{[(0,0)]: {(0,0), (0,0), "", {[0]: ("", (0, 0))}, {[0]: ""}}}
In one remote corner of the program, for a case that rarely occurs, I accidentally added a string and a single integer in the first of the mappings in the list.
I tried to run it, and the system (which is a JIT compiler) wouldn't run it, complaining about that bit of code, because, luckily, it is a strongly-typed system, and it saw the type mismatch.
It would have been a hell of a job debugging that.
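As a rough sketch of what declaring such a type buys you (my own analogue using Python type hints and a checker such as mypy, not the system from the story), the mismatched entry is flagged before the program ever runs:

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
MapA = Dict[int, Tuple[str, Pair]]   # int -> (string, pair of integers)
MapB = Dict[int, str]                # int -> string
Entry = Tuple[Pair, Pair, str, MapA, MapB]
Table = Dict[Pair, List[Entry]]      # pair of integers -> list of such tuples

table: Table = {(0, 0): [((0, 0), (0, 0), "", {0: ("", (0, 0))}, {0: ""})]}

# The accidental entry: a string and a single integer where (str, Pair) is expected.
# A type checker rejects the next line; at run time it would slip through silently.
# bad: Entry = ((0, 0), (0, 0), "", {0: ("oops", 1)}, {0: ""})
```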
Amazingly, it is possible to design a programming language that guarantees at compile time that no null pointer will ever be dereferenced.
Similarly, it is possible to design a programming language that guarantees at compile time that no array bounds will ever be overrun.
These are typically big sources of bugs in programming languages, and typically very hard to locate.
Tony Hoare (who introduced null pointers in 1965 "because it was so easy to implement") called it his "Billion Dollar Mistake".
This program reads and sorts a list of numbers. The compiler can guarantee it is pointer-safe.
type tree = structure(int value; ptree le, gt),
     ptree = pointer tree;

insert(int val; ptree p) {
    if (p fits reference tree r) {
        if (val ≤ r→value) insert(val, r→le)
        else insert(val, r→gt)
    } else {
        if (p = malloc(tree) fits reference tree r) {
            r→value = val; r→le = null; r→gt = null;
        } else stop("Memory full")
    }
}

tprint(pointer tree p) {
    if (p fits reference tree r) {
        tprint(r→le); print(r→value); tprint(r→gt);
    }
}

ptree t; int i;
t = null;
while (not eof(input)) { read(i); insert(i, t); }
tprint(t);
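For comparison only (my own sketch in Python, not the hypothetical language above): a checker such as mypy in strict mode plays a similar role for None, refusing any code that touches a field before the None case has been ruled out, so the same tree sort cannot dereference a missing node:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tree:
    value: int
    le: Optional["Tree"] = None   # subtree of values <= value
    gt: Optional["Tree"] = None   # subtree of values > value

def insert(val: int, p: Optional[Tree]) -> Tree:
    if p is None:                 # the checker insists this case is handled
        return Tree(val)
    if val <= p.value:
        p.le = insert(val, p.le)
    else:
        p.gt = insert(val, p.gt)
    return p

def tprint(p: Optional[Tree]) -> None:
    if p is None:
        return
    tprint(p.le)
    print(p.value)
    tprint(p.gt)

t: Optional[Tree] = None
for i in (5, 3, 8, 1):
    t = insert(i, t)
tprint(t)                         # prints 1 3 5 8: read, insert, print sorted
```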
It is known that "if then else" is a very poor construct, particularly when it comes to debugging.
When I was a university lecturer, I gave students three versions of a program much like the one below: one in assembly language, one in "if then else" style, and one in "case" style (see example). I changed the names of the variables so that it wasn't obvious that they were the same program, and then asked them to analyse the programs and tell me under what conditions certain things happened (the same things in all three programs, but with different names).
In assembly language, I stopped them after 17 minutes, because they still didn't have the answer.
In "if then else", it took them a few minutes.
In "case" style, it was almost immediate.
if cheese
then if cold
     then milk
     else if morning
          then water
          else coffee
else if butter
     then if warm
          then milk
          else if evening
               then tea
               else water
     else coffee
Under what conditions do you get 'water', and under what conditions 'coffee'?
if
case cheese:
    if
    case cold: milk
    case morning: water
    case not cold & not morning: coffee
    fi
case butter:
    if
    case warm: milk
    case not warm:
        if
        case evening: tea
        case not evening: water
        fi
    fi
case not cheese and not butter: coffee
fi
In designing ABC, we analysed very many data types, and very many programs, and came to some conclusions:
So we produced a language with a tiny set of very-high level data types, and made sorting and searching a primitive of the language.
Our conclusions:
Fred Brooks in his book "The Mythical Man-Month" reported that the number of bugs in a program is not linear in the length of the program, but a power function:
b ∝ l^1.5
This means that if a program is ten times as long, it has about 30 times as many bugs, and so costs about 30 times as much to make.
Conversely, a program that is 10 times smaller costs only about 3% of what the larger program costs.
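The arithmetic behind those two figures, as a quick sketch using the 1.5 exponent:

```python
# b ∝ l**1.5: scale the length, see how the bugs (and hence the cost) scale.
for ratio in (10, 0.1):
    print(ratio, "x the length ->", round(ratio ** 1.5, 3), "x the bugs/cost")
# 10  x the length -> 31.623 x  (the "about 30 times" above)
# 0.1 x the length ->  0.032 x  (about 3%)
```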
This is a graph of relative program size (the blue line) versus the relative number of bugs to be expected in a program of that size (the red line).
Alternatively, the red line represents the relative cost of producing a program of that size.
Just one of the 21 cabinets making up the Elliott 405 computer
Want to guess whether, and by how much, the Raspberry Pi is faster than the Elliott?
About one million times faster...
|  | Elliott 405 | Raspberry Pi Zero | Factor |
|---|---|---|---|
| Year | 1957 | 2015 | 58 |
| Price | £85,000 (1957); £2M-7M in 2015 money | £4 (~£0.15 in 1957 money) | ½M-2M |
| Instruction cycle time | 10.71-0.918 ms (93-1089 Hz) | 1 ns (1 GHz clock) | 1M |
| Main memory | 1280 bytes (nickel delay lines) | 512 MB LPDDR2 SDRAM | ½M |
| Secondary memory | 16 kB drum store | 8 GB (typical micro SD flash card, not included) | ½M |
| Tertiary memory | 1.2 MB (300,000-word magnetic film) | 1 TB (typical external disk, not included) | 1M |
| Output bandwidth | 25 characters/s | 373 MB/s (HDMI), 60 MB/s (USB) | 2M-10M |
| Weight | 3-6 tons | 9 g | ½M |
| Size | 21 cabinets, each 2 m × 77 cm × 77 cm | 65 mm × 30 mm × 5.4 mm | 2.3M |
| Median UK wage | Around £250 (men), £125 (women) | Around £30,000 (men), £25,000 (women) | 120-200 |
(Updated version of this original)
So, roughly a million times faster at roughly a millionth of the price, the Raspberry Pi is better value by:
A factor of a million million (a biljoen in Dutch, a trillion in English).
A terabyte is a million million bytes: nowadays we talk in terms of very large numbers.
Want to guess how long a million million seconds is?
It is about 30,000 years...
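For the record, the arithmetic:

```python
seconds = 10 ** 12                          # a million million seconds
years = seconds / (60 * 60 * 24 * 365.25)
print(round(years))                         # 31688: roughly 30,000 years
```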
In other words, a really big number...
Funnily enough, a million million is almost exactly what Moore's Law would have us expect for that length of time (even though 1957 is before Moore's Law was thought of, and predates integrated circuits).
Last year Moore's Law turned 50 years old.
Or less prosaically: Moore's Law was 33⅓ iterations of itself old.
The first time I heard that Moore's Law was nearly at an end was in 1977, from no less than Grace Hopper, at Manchester University.
Since then I have heard many times that it was close to its end, or even that it had already ended. There was a burst of such claims last year, which caused a wag to tweet:
"The number of press articles speculating the end of Moore's Law doubles every eighteen months."
As an excellent example, in February last year, almost exactly three years after the announcement of the first version, version 2 of the Raspberry Pi computer was announced.
Since three years is exactly two cycles of Moore's Law, does the new Raspberry Pi deliver a four-fold improvement?
So Moore's Law has been doing its work, and computers have been getting exponentially faster.
In 1988 my laptop had a power of 800. My present one has a power of more than 25M. That is 15 doublings!
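A quick check of that claim (using the 'power' figures above):

```python
import math
print(800 * 2 ** 15)                  # 26,214,400: indeed more than 25M
print(math.log2(25_000_000 / 800))    # about 14.9: roughly 15 doublings
```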
This is November 2006:
Six years later, the cheapest 4GB stick cost €2.99.
Often people don't understand the true effects of exponential growth.
A BBC reporter: "Your current PC is more powerful than the computer they had on board the first flight to the moon". True, but oh so wrong.
Take a piece of paper, divide it in two, and write this year's date in one half:
Now divide the other half in two vertically, and write the date 18 months ago in one half:
Now divide the remaining space in half, and write the date 18 months earlier (or in other words 3 years ago) in one half:
Repeat until your pen is thicker than the space you have to divide in two:
This demonstrates that your current computer is more powerful than all other computers you have ever had put together (and way more powerful than the computer they had on board the first moonshot).
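The same argument as arithmetic (a sketch, assuming one doubling every 18 months): each older machine is half as powerful as the next, and all the halves together never quite reach the whole.

```python
current = 1.0                                     # today's machine
older = [current / 2 ** k for k in range(1, 20)]  # 18 months older, 3 years older, ...
print(sum(older))                                 # about 0.999998: all previous machines
                                                  # together are still less than today's
```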
Since the 50's, computers have become incredibly cheap and powerful, and yet we are still programming them with programming languages that were designed to save the computer work.
In fact most computers spend most of their time idle. Here is my computer as I write these slides. The bump at the end is caused by taking the screen shot.
What we need to do is take the load off the programmer, and put more of it on the computer. Use some of that spare power!
In the 50's, computers cost in the millions. Nearly no one bought them, nearly everyone leased them.
To rent time on a computer then would cost you of the order of $1000 per hour: several times the annual salary of a programmer!
When you leased a computer in those days, you would get programmers for free to go with it. Programmers were essentially free (in comparison with the cost of the computer).
In the 50's the computer's time was expensive.
So a programmer would write the program, copy it to special paper, give it to a typist, who would type it out, then give the result to another typist who would then type it out again to verify that it had been typed correctly the first time.
Why all this extra work? Because it was much cheaper to let 3 people do this work, than to let the computer discover the errors for you.
And so programming languages were designed around the needs of the computer, not the programmer. It was much cheaper to let the programmer spend lots of time producing a program than to let the computer do some of the work for you.
Programming languages were designed so that you tell the computer exactly what to do, in its terms, not what you want to achieve in yours.
It happened slowly, almost unnoticed, but nowadays we have the exact opposite position:
Compared to the cost of a programmer, a computer is almost free.
I call this Moore's Switch.
Relative costs of computers and programmers, 1957-2016
But, we are still programming in programming languages that are direct descendants of the languages designed in the 1950's!
We are still telling the computers what to do.
1960: Algol60
begin
   procedure bottles(n); value n; integer n;
   begin
      if n < 1 then outstring(1, "no more ") else outinteger(1, n);
      if n = 1 then outstring(1, "bottle") else outstring(1, "bottles");
      outstring(1, " of beer");
   end;

   integer i;
   for i := 99 step -1 until 1 do
   begin
      bottles(i); outstring(1, " on the wall, ");
      bottles(i); outstring(1, "\n");
      outstring(1, "take one down and pass it around, ");
      bottles(i - 1); outstring(1, " on the wall.\n");
   end;
end
Now: Python
for quant in range(99, 0, -1):
    if quant > 1:
        print quant, "bottles of beer on the wall,", quant, "bottles of beer."
        if quant > 2:
            suffix = str(quant - 1) + " bottles of beer on the wall."
        else:
            suffix = "1 bottle of beer on the wall."
    elif quant == 1:
        print "1 bottle of beer on the wall, 1 bottle of beer."
        suffix = "no more beer on the wall!"
    print "Take one down, pass it around,", suffix
    print "--"
A new way of programming: declarative programming.
This describes what you want to achieve, but not how to achieve it.
Let me give some examples.
Declarative approaches describe the solution space.
A classic example is when you learn in school that
The square root of a number n is the number r such that r × r = n
This doesn't tell you how to calculate the square root; but no problem, because we have machines to do that for us.
function f a: {
    x ← a
    x' ← (a + 1) ÷ 2
    epsilon ← 1.19209290e-07
    while abs(x − x') > epsilon × x: {
        x ← x'
        x' ← ((a ÷ x') + x') ÷ 2
    }
    return x'
}
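A direct transcription of that pseudocode into Python (my transcription, not part of the slide), just to show that the machine really can do it for us:

```python
def sqrt(a: float) -> float:
    """Newton's method, transcribed from the pseudocode above."""
    x = a
    x_next = (a + 1) / 2
    epsilon = 1.19209290e-07
    while abs(x - x_next) > epsilon * x:
        x = x_next
        x_next = ((a / x_next) + x_next) / 2
    return x_next

print(sqrt(2.0))   # 1.4142135...: the r such that r x r = 2
```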
A typical procedural program to display a clock runs to about 1000 lines, almost all of it administrative. Only 2 or 3 lines have anything to do with telling the time.
And this was the smallest example I could find. The largest was more than 4000 lines.
type clock = (h, m, s)
    displayed as circled(combined(hhand; mhand; shand; decor))

shand = line(slength) rotated (s × 6)
mhand = line(mlength) rotated (m × 6)
hhand = line(hlength) rotated (h × 30 + m ÷ 2)
decor = ...
slength = ...
...

clock c
c.s = system:seconds mod 60
c.m = (system:seconds div 60) mod 60
c.h = (system:seconds div 3600) mod 24
XForms is a declarative system that lets you define applications.
It is a W3C standard, and in worldwide use.
A certain company makes BIG machines (ones you can walk into). The user interface is very demanding: traditionally it took 5 years and 30 people to build.
With XForms this became: 1 year, 10 people.
Do the sums. Assume one person costs 100k a year. Then this has gone from a 15M cost to a 1M cost. They have saved 14 million! (And 4 years)
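The sums, spelled out (using the 100k-per-person assumption above):

```python
cost_per_person_year = 100_000            # assumption from the text
before = 5 * 30 * cost_per_person_year    # 5 years, 30 people: 15,000,000
after  = 1 * 10 * cost_per_person_year    # 1 year, 10 people:   1,000,000
print(before - after)                     # 14,000,000 saved (and 4 years)
```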
Manager: I want you to come back to me in 2 days with estimates of how long it will take your teams to make the application.
(Two days later)
Programming man: I'll need 30 days to work out how long it will take to program it.
XForms man: I've already programmed it!
The British National Health Service started a project for a health records system.
One person then created a system using XForms.
CM Pro (Netherlands)
Inventive Designers (Belgium)
BetterForm* (Germany)
XSLTForms* (France)
Jadu (UK)
Orbeon* (USA)
XForms is also part of OpenOffice* and LibreOffice*
*=open source
For historical reasons, present programming languages are at the wrong level of abstraction: they don't describe the problem, but only one particular solution to the problem, in the computer's terms.
Once project managers realise they can save 90% on programming costs, they will switch to declarative programming.
I believe that eventually everyone will program declaratively: fewer errors, more time, more productivity.
Declarative programming allows programmers to be ten times more productive: what you now write in a week, would take a morning; what now takes a month would take a couple of days.
Advert: XForms Day at CWI in May (watch my homepage).
Slides are online.