The author

HTML5 is the New Flash

Steven Pemberton, CWI, Amsterdam

About me

Co-designed the programming language that Python is based on.

5th person on European Internet

Browser

Web: 1st Dutch web co. One of the first 500 websites.

One of first online journals: CMS

CHI, Interactions

CSS, HTML, ODF

A typical project meeting

A project meeting

Discussing HTML

A snowy Boston

The New Web, by programmers, for programmers

HTML5 has changed the Web.

Some parts are good, but mostly it is based on a lack of proper design, and a lack of understanding of design principles and how to design notation.

I don't believe that HTML5 can lead the Web to its full potential.

Declarative

A Declarative Definition

We learn in school what numbers are, and how to add, subtract, multiply and divide.

However, when we get to square roots, we are only told:

The square root of a number n is the number r such that r × r = n.

This is a declarative definition. It tells you what something is, it tells you how to recognise it, but it doesn't tell you how to calculate it.

Most people know what a square root is, few people leave school knowing how to calculate one.

A Procedural Definition

So take a look at a procedural definition of square root:

function f a:
{
   x ← a
   x' ← (a + 1) ÷ 2
   eps ← 1.19209290e-07
   while abs(x − x') > eps × x: 
   {
      x ← x'
      x' ← ((a ÷ x') + x') ÷ 2
   }
   return x'
}
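
For anyone who wants to run it, the pseudocode above can be transcribed into JavaScript roughly as follows. This is only a sketch: the names squareRoot and isRootOf are mine, it assumes a > 0 (the pseudocode does not say what happens for zero or negative input), and the declarative check needs a tolerance because floating-point arithmetic is inexact.

```javascript
// Procedural: a direct transcription of the pseudocode (Newton's method).
// Assumes a > 0; the name squareRoot is an illustrative choice.
function squareRoot(a) {
  let x = a;
  let next = (a + 1) / 2;          // x' in the pseudocode
  const eps = 1.19209290e-07;      // relative tolerance from the pseudocode
  while (Math.abs(x - next) > eps * x) {
    x = next;
    next = (a / next + next) / 2;
  }
  return next;
}

// Declarative: says only how to recognise a root, not how to find one.
// The tolerance is an assumption added for floating-point numbers.
function isRootOf(r, n, tolerance = 1e-6) {
  return Math.abs(r * r - n) <= tolerance * n;
}
```

The recogniser stays valid whatever algorithm computes the root, so isRootOf(squareRoot(2), 2) checks one implementation against the definition.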

Side by side

Declarative:

root(n) = r such that r × r = n.

Procedural:

function f a:
{
   x ← a
   x' ← (a + 1) ÷ 2
   eps ← 1.19209290e-07
   while abs(x − x') > eps × x: 
   {
      x ← x'
      x' ← ((a ÷ x') + x') ÷ 2
   }
   return x'
}

Advantages of the Declarative Approach

  1. (Much) Shorter
  2. Easier to understand
  3. Independent of implementation
  4. Less likely to contain errors
  5. Easier to see it is correct
  6. Tractable

Declarative numbers

number: optional sign, digit+.
optional sign: "-"?.
digit: "0"; "1"; "2"; "3"; "4"; "5"; "6"; "7"; "8"; "9".
A number has its normal everyday meaning.
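
As a rough illustration of how little is needed, the three grammar rules above map onto a single regular expression. The names NUMBER and isNumber are mine, not part of the grammar.

```javascript
// number: optional sign, digit+.
// optional sign: "-"?.
// digit: "0"; ...; "9".
// Together these amount to: an optional "-", then one or more digits.
const NUMBER = /^-?[0-9]+$/;

// Hypothetical helper: true when the whole string is a number
// according to the grammar.
function isNumber(text) {
  return NUMBER.test(text);
}
```

Here the definition and the recogniser are the same thing, so there is nothing for them to disagree about.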

This is:

Procedural numbers: HTML5

2.4.4 Numbers

2.4.4.1 Signed integers

A string is a valid integer if it consists of one or more ASCII digits, optionally prefixed with a "-" (U+002D) character.

A valid integer without a "-" (U+002D) prefix represents the number that is represented in base ten by that string of digits. A valid integer with a "-" (U+002D) prefix represents the number represented in base ten by the string of digits that follows the U+002D HYPHEN-MINUS, subtracted from zero.

The rules for parsing integers are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will return either an integer or an error.

  1. Let input be the string being parsed.
  2. Let position be a pointer into input, initially pointing at the start of the string.
  3. Let sign have the value "positive".
  4. Skip whitespace.
  5. If position is past the end of input, return an error.
  6. If the character indicated by position (the first character) is a "-" (U+002D) character:
    1. Let sign be "negative".
    2. Advance position to the next character.
    3. If position is past the end of input, return an error.

    Otherwise, if the character indicated by position (the first character) is a "+" (U+002B) character:

    1. Advance position to the next character. (The "+" is ignored, but it is not conforming.)
    2. If position is past the end of input, return an error.
  7. If the character indicated by position is not an ASCII digit, then return an error.
  8. Collect a sequence of characters that are ASCII digits, and interpret the resulting sequence as a base-ten integer. Let value be that integer.
  9. If sign is "positive", return value, otherwise return the result of subtracting value from zero.

2.4.4.2 Non-negative integers

A string is a valid non-negative integer if it consists of one or more ASCII digits.

A valid non-negative integer represents the number that is represented in base ten by that string of digits.

The rules for parsing non-negative integers are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will return either zero, a positive integer, or an error.

  1. Let input be the string being parsed.
  2. Let value be the result of parsing input using the rules for parsing integers.
  3. If value is an error, return an error.
  4. If value is less than zero, return an error.
  5. Return value.
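
To see what these steps do in practice, here is a sketch that follows them one by one. The function names are mine, null stands in for "an error", and "skip whitespace" is assumed to mean space, tab, line feed, form feed and carriage return.

```javascript
// Follows the quoted "rules for parsing integers" step by step.
// Returns a number, or null in place of "an error" (my convention).
function parseHtml5Integer(input) {                // step 1
  let position = 0;                                // step 2
  let sign = "positive";                           // step 3
  const isSpace = (c) => " \t\n\f\r".includes(c);
  const isDigit = (c) => c >= "0" && c <= "9";

  while (position < input.length && isSpace(input[position])) position++; // step 4
  if (position >= input.length) return null;       // step 5

  if (input[position] === "-") {                   // step 6
    sign = "negative";
    position++;
    if (position >= input.length) return null;
  } else if (input[position] === "+") {
    position++;                                    // "+" is ignored, but not conforming
    if (position >= input.length) return null;
  }

  if (!isDigit(input[position])) return null;      // step 7

  let digits = "";                                 // step 8
  while (position < input.length && isDigit(input[position])) {
    digits += input[position++];
  }
  const value = parseInt(digits, 10);

  return sign === "positive" ? value : 0 - value;  // step 9
}

// The non-negative variant, again following the quoted steps.
function parseHtml5NonNegativeInteger(input) {
  const value = parseHtml5Integer(input);
  if (value === null) return null;
  if (value < 0) return null;
  return value;
}
```

Followed literally, the steps return a value for inputs such as "+3" and "12abc", although neither is a valid integer by the definition quoted at the top of the section.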

Inflation

So the HTML5 definition of signed numbers is 16 times longer and has internal inconsistencies.

As a result, it will not surprise you to learn that the HTML5 spec is very large.

HTML5, the Spec

The HTML5 Spec printed

HTML5 is almost, but not quite, entirely not about markup

You can tell by reading the HTML5 spec that it was written by programmers.

"When your only tool is a hammer, all your problems look like nails"

And yet, they're programmers and they forgot about how you use libraries in programs? The HTML5 spec is one huge monolithic program.

Declarative Markup

Declarative methods are not only for specification.

HTML used to be all about being declarative.

The poster-child of HTML declarative markup is the <a> element:

<a href="talk.html" title="..." target="..." class="..." >My Talk</a>

This compactly encapsulates a lot of behaviour including

Doing this in programming would be a lot of work.

CSS

CSS is another example of a successful declarative approach.

When W3C started the CSS activity, Netscape, at the time the leading browser, declined to join, saying that they had a better solution, JSSS, based on Javascript.

Instead of

h1 { font-size: 20pt; }

you could use script to say

document.tags.H1.fontSize = "20pt";

Wikipedia:

"JSSS lacked the various CSS selector features, supporting only simple tag name, class and id selectors. On the other hand, since it is written using a complete programming language, stylesheets can include highly complex dynamic calculations and conditional processing."

Which brings us to Javascript.

Javascript

Javascript- the definitive guide


Javascript: So Good it has Good Parts!

Javascript, the good parts book

Javascript

The good parts vs the rest

Javascript: Where even the Good Parts have Bad Parts

"Javascript: the Good Parts" is peppered with text such as this:

"If the operand [of typeof] is an array or null, then the result is 'object', which is wrong"

and

"The mechanism that Javascript provides to [make a new object] is messy and complex, but it can be significantly simplified"

and

"The best thing about Javascript is its implementation of functions. It got almost everything right. But, as you should expect with Javascript, it didn't get everything right"

So apparently even some of the good parts are bad.
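
The first of those quotes is easy to check in any Javascript console. In the sketch below, describe is a hypothetical helper showing the sort of explicit checks people usually write instead; it is not taken from the book.

```javascript
// typeof cannot tell arrays, null and plain objects apart.
const results = {
  array: typeof [],       // "object"
  nothing: typeof null,   // "object"
  plain: typeof {},       // "object"
};

// Hypothetical workaround: test for the special cases explicitly.
function describe(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}
```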

Javascript Debugging

Javascript debugging is hard, because of misuse of the Robustness Principle.

The Robustness Principle (also known as Postel's Law) states:

"Be conservative in what you do, be liberal in what you accept from others"

Although it has its uses, not everyone thinks this is necessarily a good rule. I don't, and I'm not the only one.

I personally think this principle has had a bad effect on the web: because browsers are liberal and accept all sorts of junk, authors don't know that what they wrote is wrong, and so they can't easily be conservative in what they produce.

It produces "suck it and see" coding. Once it looks OK in the browser, you stop, thus increasing the amount of junk on the web.

This is bad, because the message you think you are sending is not the message being received, but you don't know it. It can eat hours of your time trying to find out why your CSS doesn't work, only to discover it's because the browser thinks the HTML is different to what you think it is.

Robustness Principle

The Robustness Principle was proposed in order to improve interoperability between programs, which has been expressed by one wag as:

"you’ll have to interoperate with the implementations of others with lower standards. Do you really want to deal with those fools? Better to silently fix up their mistakes and move on."

However, for the reasons mentioned earlier, the Robustness Principle should never be applied to programming languages!

Unfortunately it has been to Javascript. Did you know that

++[[]][+[]]+[+[]] evaluates to the string "10" ?

Javascript is very hard to debug, since it silently accepts certain classes of errors that then don't show up until much later.
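
For the curious, here is how that expression comes to be "10", followed by a small example of a silently accepted error. The order object and its misspelt property are made up for illustration.

```javascript
// Evaluated step by step:
//   +[]         is 0       (the empty array is coerced to a number)
//   [[]][+[]]   is [[]][0], the inner empty array
//   ++[[]][0]   is 1       (the array is coerced to 0, then incremented)
//   [+[]]       is [0]
//   1 + [0]     is "10"    (the array is coerced to the string "0")
const surprise = ++[[]][+[]]+[+[]];

// A silently accepted mistake: a misspelt property is just undefined,
// and the error only surfaces later, as NaN.
const order = { quantity: 3, price: 2 };
const total = order.quantity * order.prise;   // typo: "prise"
```

No error is reported at the line with the typo; total is simply NaN, and it is left to whoever uses it later to notice.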

Studies have shown that 90% of the cost of software comes from debugging. Reducing the need for debugging is really important.

(By the way: Mr. Null)

Published recently:

"Hello, I’m Mr. Null. My Name Makes Me Invisible to Computers"

by Christopher Null

http://www.wired.com/2015/11/null/

Programming

One of the problems with using programming as the basis of functionality is that standardisation flies out of the window.

Example: CSS presentation mode (which I am using here). This allows you to specify how any document can be formatted when doing a presentation.

Alas, HTML5 has taken the approach that you can do this better in Javascript. No one supports Presentation Mode any more. And there are now lots of Javascript packages to do presentation.

ALL DIFFERENT!

You can no longer switch in a different presentation package, and use that, because you have to CHANGE THE DOCUMENT.

The programmers are doing the document design, so all the documents become proprietary, and there is no interoperability, which is the whole point of standards.

Elements

This is why there are so few new elements in HTML5: they haven't done any design, and instead said "if you need anything, you can always do it in Javascript".

And they all have.

And they are all different.

Frameworks

So which framework do you use?

Raw bare-metal Javascript?

Angular? Dojo? Bootstrap? Or one of the other 26 listed on Wikipedia?

Are they compatible? No.

What happens when your chosen package dies, or is no longer supported, or doesn't get updated for the latest version of all browsers? YOU HAVE TO CHANGE ALL YOUR DOCUMENTS.

This is why we need standards, not proprietary formats like frameworks.

Example

"What flavor of Javascript are you going to use? Are you gonna use a transpiler? From what language? Grunt? Gulp? Bower? Yeoman? Browserify? Webpack? Babel? Common.js? Amd? Angular? Ember? Linting? What am I talking about? Am I mixing things up? Am I confused? "

"Talking to the community about my “analysis paralysis loop” caused by the excessive amount of available tools to choose from and to investigate resulted in the community suggesting to try out, spend time, learn and investigate four more technologies that I haven’t even considered in the first place. Good job, Javascript!"

Pistaccio

Example incompatibilities

document.getElementById('test-table');

dojo.byId('test-table');

$('test-table')

delete Ext.elCache['test-table']; 
Ext.get('test-table');

$jq('#test-table');

YAHOO.util.Dom.get('test-table');

document.id('test-table');

Availability

"The U.K.’s GDS (Government Digital Service) ran an experiment to determine how many of its users did not receive JavaScript-based enhancements, and it discovered that number to be 1.1 percent, or 1 in every 93 users. For an ecommerce site like Amazon, that’s 1.75 million people a month, which is a huge number."

A List Apart

Bloat

To look at the webpage of one single tweet of 140 characters, you have to download just under a megabyte. It's 5200 lines of HTML before you even get to the five Javascript packages.

The whole of James Joyce's Ulysses is only half as long again.

The Website Obesity Crisis

Accessibility

"Many developers who have grown up only using frameworks have a total lack of understanding about the fundamentals of HTML, such as valid and semantic markup ... This is of great concern as semantic markup is one of the core principles of an accessible web."

Russ Weakley

Programming

Nicole Henninger:

"You know... I feel like I blinked and then all of the sudden what I thought was my job was suddenly not my job but now I'm being told that I need to do this other stuff that I don't even like and people wonder why I'm wielding a stiletto like a weapon and screaming, "I HATE JAVASCRIPT! YOU CAN'T MAKE ME! NO MEANS NO!" and considering a second career in comedy writing."

Example

A fake dropdown

<div class="dropdown">
   <button id="dropdownMenu1" data-toggle="dropdown" aria-haspopup="true" aria-expanded="true">
      Dropdown
      <span class="caret"></span>
   </button>
    <ul class="dropdown-menu" aria-labelledby="dropdownMenu1">
        <li><a href="#">Action</a></li>
        <li><a href="#">Another</a></li>
        <li><a href="#">Something</a></li>
        <li><a href="#">Separated</a></li>
    </ul>
</div>

Design...

And then there are the design techniques they used when they did do some design.

"Paving the cowpaths"

This is a design principle based on architecture: when you build a campus or estate, don't pave the paths, but wait and see where people walk, so you can see where they need paths.

A desire path

Desire paths

The HTML5 design principles document got it wrong:

"When a practice is already widespread among authors, consider adopting it rather than forbidding it or inventing something new."

Paving the cowpaths would be more like noticing that huge numbers of sites have a navigation drop-down, and supporting that natively.

Paving cowpaths as design principle

Cows are not designers.

Cowpaths are data.

If you pave cowpaths, you are setting in stone the behaviours caused by the design decisions in the past.

Cowpaths tell you where the cows want to go, not how they want to get there. If they have to take a path round a swamp to get to the meadow, then maybe it would be a better idea to drain the swamp, not pave the path they take round it.

Paving cowpaths is a bad design principle, at least in the way that they applied it. (In fact it can be a good design principle, but they misunderstood it.)

One example of bad cowpath-based design

The HTML5 group spidered millions of pages, because they could, and then on the basis of that data decided what should be excluded from HTML5.

This is not "paving the cowpaths"! This is putting fences across cowpaths that are used by fewer cows than some other paths, and even goes against their own proclaimed design principles!

For instance: @rev.

<link rel="next" href="chap2.html"/>
<link rev="prev" href="chap2.html"/>

@rel and @rev are complementary attributes, they are a pair, like +/-, up/down, left/right.

The HTML5 people decided that not enough people were using @rev, and so removed it.

  1. This breaks backwards compatibility.
  2. What are the people who did use it supposed to do?? Bad luck for them apparently.

Irritated by Colon Disease

For years, the wider community had agreed to use a colon (:) to separate a name from the identification of where it came from. A colon was a legal name character, and it was chosen to be backwards compatible, but in some environments could be interpreted in a certain way.

eg: xml:lang

But no, they had to develop a new separator, the hyphen.

eg:

<div role="searchbox" aria-labelledby="label" aria-placeholder="MM-DD-YYYY">03-14-1879</div>

Not Invented Here Syndrome

"Four social dynamics appear to inderlie NIH:

  1. Belief that internal capabilities are superior to external ones.
  2. Fear of of losing control.
  3. Desire for credit and status.
  4. Significant emotional and financial investment in internal initiatives."

    Lidwell et al., Universal Principles of Design

Not Invented Here Syndrome

"The amount of “not invented here” mentality that [pervades] the modern HTML5 spec is odious. Accessibility in HTML5 isn’t being decided by experts. Process, when challenged through W3C guidelines, is defended as being “not like the old ways”, in essence slapping the W3C in the face. Ian’s made it clear he won’t play by the rules. When well-meaning experts carefully announce their opposing positions and desire for some form of closing the gaps, Ian and the inner circle constantly express how they don’t understand." http://cssquirrel.com/blog/2009/08/03/behold-leviathan-confused/

Many groups had already produced solutions that HTML5 could have used, but HTML5 decided to reinvent them (usually with worse results, since these were areas in which the HTML5 group were not experts).

Example NIH: RDFa

The question was: How should you represent general metadata in HTML?

2003: Cross-working-group task force of interested parties created.

2004: First working draft of RDFa

2008: RDFa Recommendation

So RDFa represented more than 5 years of work and agreement and consensus on how metadata should be represented in HTML and other technologies.

2009: HTML5 creates microdata out of the blue:

FUD ensues.

2013: Microdata abandoned.

(BTW: Scholarly HTML)

How to use HTML to create scholarly articles, using RDFa

http://scholarly.vernacular.io/

Forward compatibility: Empty elements

If XML did one thing right, it introduced a new notation for empty elements:

<br/>

This one simple change meant that you could parse a document without a DTD or Schema; you could parse any document without knowledge of the elements involved, which made the parser forward-compatible.
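
A toy sketch of why this works: the tree builder below knows nothing about which elements exist, yet it nests an element it has never heard of correctly, because the "/>" alone says the element is empty. The function name, the regular expression and the made-up newthing element are all mine; it ignores text and assumes well-formed input with no ">" inside attribute values.

```javascript
// Toy tree builder with no list of known elements, no DTD, no schema.
// Assumes well-formed markup; text content is ignored.
function buildTree(markup) {
  const root = { name: "#root", children: [] };
  const stack = [root];
  const tag = /<(\/?)([A-Za-z][\w:-]*)[^>]*?(\/?)>/g;
  for (const [, closing, name, selfClosing] of markup.matchAll(tag)) {
    if (closing) {
      if (stack.length > 1) stack.pop();   // end tag: back to the parent
      continue;
    }
    const node = { name, children: [] };
    stack[stack.length - 1].children.push(node);
    if (!selfClosing) stack.push(node);    // only non-empty elements take children
  }
  return root;
}

// "newthing" is a made-up element the parser has never seen.
const tree = buildTree("<p>one<newthing/>two<b>x</b></p>");
```

Without the slash, the same parser would need a built-in list of empty elements to know that newthing takes no children, and that list would be out of date as soon as a new empty element was added.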

HTML5 dropped the requirement of using this notation (probably because of Irritated by Colon Disease), meaning that they can now never add a new empty element without breaking something.

Quotes

Apparently the XHTML rules were too restrictive, requiring every attribute value to be enclosed in quotes. And yet:

https://mathiasbynens.be/notes/unquoted-attribute-values

"Even with these simplified definitions, it’s still a pain to remember all the rules for unquoted attribute values, especially as they differ between HTML and CSS. When in doubt, it’s probably best to just use quotes. If you’re confused, it’s likely to confuse your colleagues too. If you’re using user input in an attribute value, always quote (and escape) it to prevent XSS security vulnerabilities. "

Or as another wag put it:

"You know what would be cool? JSON, but invented by a less obsessive personality -- optional quotes on properties, trailing commas, and comments." @marijnjh

Show source

Well, this has always been a problem, but it seems to be getting worse and worse. This is just a page selected at random. There are far worse examples. There is no 'document' anymore.

<div class="row">            <div class="span4 col pull-right  bTB1S  padTB30">
                                        <div class="padT0 marB10">
                                            <h4 class="franklin-bold size-one-twenty-pc marT0">‘Basically owned the technology in cryptography'</h4>
                                        <div class="franklin-light size-fourteen lh17em">
                                            </div>
                </div>
                                            <div class="video-image-wrapper">
                <div class="video-container not-lead" class="playing">
          <div class="video right-rail-video" style="height: auto; width: auto;" id="wp_69ebfb98-049e-11e5-93f4-f24d4af7f97d" data-show-endscreen="0" data-autoplay="0" data-video-uuid="69ebfb98-049e-11e5-93f4-f24d4af7f97d" data-companion-ad="0">
        <div class="innerWrapper" style="width: 100%;"></div>
        </div>
        </div>
                        <div class="image-button-wrapper">
                                            <div class="spinner">
                            <img class="image" data-right="true" data-original="http://www.obfust.com/sf/wp-content/themes/wapo-blogs/inc/imrs.php?src=http://s3.amazonaws.com/posttv-thumbnails-prod/thumbnails/55660d34e4b0ba0b9fd407d8/CROCKER1.jpg&authapi-mob-redir=0" src="http://www.obfust.com/sf/wp-content/themes/wapo-blogs/inc/imrs.php?src=http://s3.amazonaws.com/posttv-thumbnails-prod/thumbnails/55660d34e4b0ba0b9fd407d8/CROCKER1.jpg&w=1080&authapi-mob-redir=0" />
                        </div>
                                        <div class="imm-video-overlay">
                        <img data-video-id="wp_69ebfb98-049e-11e5-93f4-f24d4af7f97d" class="wp-loading imm-loading-btn-small" src="http://www.obfust.com/posttv/resources/img/loading_wp_white_100.gif" style="display: none;" alt="loading" />
                        <div class="imm-video-play-btn imm-video-play-btn-small">
                            <i class="fa fa-play"></i>
                            <span class="franklin-bold">Play Video</span>
                        </div>
                    </div>
                </div>
                    </div>
                  
                                                    <div class="marT10">
                                            <p class="marTB0 light-grey size-fourteen franklin-light">
                                                    Steve Crocker, worked on early networking technology for DARPA.                                                                        </p>>

HTML5 wasn't a markup group

It was a DOM group.

The old DOM group got closed long ago; it shouldn't have been.

What do we need?

Modularity

We had this. It has been taken away. It needs to be put back.

Extensibility

We had this. It has been taken away. It needs to be put back.

Accessibility

We will all be 80 one day. We will all need to continue using the web.

Accessibility should be in the design from the ground-up. It is not something you can add on later.

Declarative

Easier to specify

No bloat

100 year web

The web is now the way we distribute information. We will need the web pages we create now to be readable in 100 years' time, just as we can still read 100-year-old books.

Requiring a webpage to depend on a particular 100-year-old implementation of Javascript is not exactly evidence of future-thinking.

At least declarative markup is easier to keep alive because (see the 5th slide) it is INDEPENDENT OF THE IMPLEMENTATION!

A Call to Action!

It is time for a new movement, to lead the Web to its full potential.

We should seize back the Declarative Web.

We can still create meaningful declarative documents, and serve them to HTML browsers.

HTML5 can become the assembly language of the web, and we can go back to having a coherent, declarative, author-friendly web.