Web Content Authoring

The author

Steven Pemberton, CWI, Amsterdam

1988: Internet

November 1988: I was the first user of the open internet in Europe.

The author in the 80s

The CWI, where I work, was the first European open internet node: all of Europe was connected to all of North America with a 64kbps line.

We already used internet protocols in-house, but had no external connection.

We also had international email, a store-and-forward system.

1989

A year later we doubled the speed; in fact, the speed has more or less doubled every year since.

We now have the world's fastest internet connection currently peaking at 14Tbps.

Throughput statistics 2025, peaking at 14Tbps

1991: The Web

TBL and Robert Cailliau

The coming of the internet to Europe enabled the creation of the web.

Tim Berners-Lee and Robert Cailliau at CERN had the first web server running in 1991.

In 1994 I helped organise workshops at the first web conference at CERN; in 1995 I helped start W3C.

The CWI had one of the first 500 web sites in the world.

Pre-web

Post-internet, pre-web, there was ftp, a primitive method of serving documents to others: you downloaded the file to your computer before using it. There were also email and chat programs.

Several research groups were experimenting with hypertext systems, but not yet connecting to the internet.

One of the clever things about the web was including existing methods of delivering documents, like ftp, in the repertoire of access mechanisms.

What this meant was that you could build a website with existing content really quickly.
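As a small illustration of that point: a URL names its access mechanism in its first component, so a document on an ftp server and a document on a web server are addressed in the same way. The sketch below uses Python's standard urllib.parse; the host names and paths are made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical addresses: the same URL syntax covers both protocols.
links = [
    "http://www.example.org/index.html",
    "ftp://ftp.example.org/pub/report.txt",
]

def access_mechanism(url: str) -> str:
    """Return the scheme, i.e. the protocol a browser would use to fetch the URL."""
    return urlsplit(url).scheme

for link in links:
    parts = urlsplit(link)
    print(access_mechanism(link), parts.netloc, parts.path)
```

An existing ftp archive could therefore be linked from a web page without moving or converting a single file.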

The other clever thing of course was making it free.

Early Content Authoring: Characters

The Web was advanced for the time, using the 256 characters of Latin1 instead of just the 128 of ASCII (email and DNS only used ASCII). At last you could use accented letters!

Unicode wasn't there yet: the first draft was published in 1991, with 65,000 code points. It now has more than a million.
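The difference between these character sets is easy to see in practice. The following is a minimal sketch in Python; the sample word is arbitrary, and reading the "65,000" above as the 65,536 values of a 16-bit code is an assumption.

```python
word = "caf\u00e9"  # "café", written with an escape so the source stays pure ASCII

# Latin-1 has a single byte for the accented e (0xE9) ...
latin1_bytes = word.encode("latin-1")

# ... but ASCII has no code for it at all.
try:
    word.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Size of each repertoire or code space.
ascii_size = 2 ** 7            # 128
latin1_size = 2 ** 8           # 256
early_unicode_size = 2 ** 16   # 65,536 (assumed to be the "65,000" above)
unicode_size = 0x10FFFF + 1    # 1,114,112 code points in today's code space

print(latin1_bytes, ascii_ok, unicode_size)
```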

Early Content Authoring: Documents

HTML was also advanced for the time, being principally a document description language: it had little to do with presentation (browser manufacturers didn't understand this).

Bear in mind that the original HTML didn't support embedded images, and there was little to no control over presentation.

Early Content Authoring: Presentation

After the addition of images and tables to HTML, they were (incorrectly) used as methods of presentation, with spacer images, and tables used for positioning.

Browser makers also started incorrectly adding presentation elements to HTML.

So more or less the first effort of W3C was to introduce style sheets, in the form of CSS, in order to protect HTML, and make presentation easier (since you didn't have to touch the content).

Netscape was opposed to style sheets, but Microsoft saw them as a way of getting its foot in the door, after its initial rejection of the internet.

The Server

The first web server

Computers get about 10 times faster per 5 years at constant price, 100 times faster per decade.

So computers today are around a million times faster than they were then.

Typically a computer was reserved for the web server, called "www", which is why so many URLs start with that. People started thinking it was an essential element of a URL (even though the very first two servers at CERN didn't use it).

As computers became faster, people wanted to host more than one server per computer, but HTTP, the web protocol, didn't support that, so it had to be updated first to make it possible.
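A minimal sketch of how several sites can share one computer, assuming the update referred to is the Host request header that HTTP/1.1 made mandatory: the browser states which site it wants, and the server picks the matching set of documents. The site names, directories and function names below are hypothetical, and the parsing is deliberately simplistic (it does not handle IPv6 literals, for instance).

```python
from typing import Optional

# Hypothetical mapping from host name to the directory holding that site.
SITES = {
    "www.example.org": "/srv/example",
    "www.example.net": "/srv/other",
}

def parse_host(raw_request: str) -> Optional[str]:
    """Return the Host header value, lower-cased and without any port, or None."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:          # skip the request line
        if line == "":              # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower().split(":")[0]
    return None

def document_root(raw_request: str) -> Optional[str]:
    """Pick the directory to serve from, based on the requested host name."""
    host = parse_host(raw_request)
    return SITES.get(host) if host else None

request = "GET /index.html HTTP/1.1\r\nHost: www.example.net\r\n\r\n"
print(document_root(request))
```

A request without a Host header, as older clients sent, gives the server nothing to choose by, which is why one computer per site was the norm before.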

Networks

Network speeds double per year at constant price, which is about 1000 times faster per decade, so networks are now about a thousand million times faster.
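These round numbers are compound growth, and can be checked with a few lines of Python. The 30-year span (roughly the mid-nineties to now) is an assumption chosen for the illustration; the growth rates are the ones quoted for networks here and for computers in the previous section.

```python
def growth(factor: float, period_years: float, elapsed_years: float) -> float:
    """Total speed-up after elapsed_years if speed multiplies by factor every period_years."""
    return factor ** (elapsed_years / period_years)

YEARS = 30  # assumption: roughly the mid-nineties to today

networks_per_decade = growth(2, 1, 10)    # 2**10 = 1024, "about 1000"
networks_total = growth(2, 1, YEARS)      # 2**30, a little over a thousand million
computers_total = growth(10, 5, YEARS)    # 10**6, a million

print(f"{networks_per_decade:.0f} {networks_total:.3g} {computers_total:.0f}")
```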

In the mid-nineties networks were slow, and there were complaints that WWW stood for "World-Wide Wait".

This meant that you had to design content for lowest-common-denominator users; typically 1200 baud telephone lines to homes.

This meant: few images, no video.
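A back-of-the-envelope calculation shows why. The sketch below treats the 1200 baud line as 1200 bits per second and ignores protocol overhead; the file sizes are hypothetical.

```python
LINE_BITS_PER_SECOND = 1200  # assumption: the 1200 baud line carries 1200 bit/s

def download_seconds(size_bytes: int, bits_per_second: int = LINE_BITS_PER_SECOND) -> float:
    """Time to transfer size_bytes over the line, ignoring overhead."""
    return size_bytes * 8 / bits_per_second

text_page = 5_000    # hypothetical 5 kB page of HTML
photo = 50_000       # hypothetical 50 kB image

print(f"page:  {download_seconds(text_page):.0f} s")
print(f"photo: {download_seconds(photo) / 60:.1f} min")
```

Under these assumptions a page of text arrives in about half a minute, while a single modest image takes more than five minutes.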

The Browser

The other lowest common denominator you had to design for was the browser.

Unicode, Tables, PNG images: you had to wait until there was a critical mass of people who had a browser that supported them.

For PNG, that was due to an error in the design of HTML.

You didn't have to wait for style sheets though, because they were designed that way: pages would still display without CSS.

The read-write web

The web is designed to be read-write: early browsers could edit as well as display pages.

The idea was also that at home you would serve pages.

The first major popular browser didn't implement that part, so you got server-side solutions instead, and as a result the web has become centralised rather than distributed as intended.

Don't get me started on passwords...

The Webmaster

All early websites had a "web master", the person responsible for maintaining the content, running the site, doing backups, keeping logs.

Every website had an email address webmaster@<site.tld> to reach that person.

There were no dynamic sites: all content was static. The only way to get dynamism and interaction was to use the "Common Gateway Interface" to run programs for you on the server. This of course doesn't scale.
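As a rough sketch of what such a program looks like: the server starts a fresh process for each request, passes the details in environment variables such as QUERY_STRING, and sends whatever the program prints back to the browser. The example is in Python for brevity; the parameter called name and the greeting are invented for the illustration.

```python
import os
from urllib.parse import parse_qs

def respond(environ) -> str:
    """Build a CGI response: header lines, a blank line, then the body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]   # "name" is a hypothetical parameter
    body = f"Hello, {name}!\n"
    return "Content-Type: text/plain\r\n\r\n" + body

if __name__ == "__main__":
    # The web server sets these variables before starting the program.
    print(respond(os.environ), end="")
```

Starting a new process for every single request is the part that stops scaling once a site gets busy.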

Content Management

SIGCHI Bulletin January 1996

There were no web content-management systems: you had to either author everything by hand, or devise something yourself.

For instance, in 1995 I was editor of one of the first magazines to publish simultaneously online. (Still online)

To manage this, I devised a content-management system using the Unix make facility. (See Management of a Large Website with Make)
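The core idea borrowed from make can be sketched in a few lines: a page is regenerated only when it is missing or older than one of the files it depends on. This is an illustration in Python, not the system described in the paper; the function names and the "concatenate the parts" build step are hypothetical.

```python
import os
from typing import List

def is_stale(target: str, sources: List[str]) -> bool:
    """True if target is missing or older than any of its sources (make's rule)."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_time for src in sources)

def build(target: str, sources: List[str]) -> bool:
    """Regenerate target by concatenating its sources; return True if work was done."""
    if not is_stale(target, sources):
        return False
    with open(target, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, encoding="utf-8") as part:
                out.write(part.read())
    return True
```

Calling build("page.html", ["header.html", "body.html"]) twice in a row would do the work only the first time, which is what keeps a large site cheap to regenerate after a small change.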

Present day

The designers of HTML5 made some big design errors.

Amongst them was the principle that if you need new functionality, you can add it with JavaScript.

As a result we now have frameworks.

Two results. Firstly, content is less sustainable in the long term, since it depends on the JavaScript implementation.

Secondly, there are now 20 different versions of HTML, each single-sourced and non-standardised.

This means you are locked in: if you want to use a different framework, you have to rewrite your whole website. There is no standardisation any more.

It also means that most sites you see these days use mostly <div> and <span> and little else.

Conclusion

It's a mixed bag.

The early web was a simpler, easier place, but with less support, and browser makers muddying the water.

Now, much more is possible, but the browser makers have messed it up again.

My dream for the future is the design of the web going back to we the users, not the browsers, and the web becoming a truly distributed space, not owned by big companies.

Content management will always be needed, but it can, with the right design, be easier than what we have today.