I originally introduced ixml at Balisage 2013.
We choose which representations of our data to use, JSON, CSV, XML, or whatever, depending on habit, convenience, or the context we want to use that data in.
On the other hand, having an interoperable generic toolchain such as that provided by XML to process data is of immense value.
How do we resolve the conflicting requirements of convenience, habit, and context, and still enable a generic toolchain?
Invisible XML (ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content.
"This is clearly a submission that needs to be shredded, burned, and the ashes buried in multiple locations"
"I think the audience will eat him alive. But I want to be there to hear it."
It was a proposal.
I did a pilot implementation.
I also have a background in usability, and applied usability principles.
The following talk resulted.
Different people have different psychologies.
This seems almost too obvious to be true, but it is surprising how many people don't properly understand it.
My favourite description of how people – particularly programmers – differ is in chapter 15 of Bruce Tognazzini's book Tog on Interface.
When Sensories drive to work, they are aware of the birds, the trees, the hills turning green. They notice a cow lowing in the field. [...]
Intuitives live in their own private universe, depending on an internal model of external events. [...]
When Intuitives drive to work, they watch the tectonic plates, deep in the earth's crust, rubbing together. They run into the cow.
The problem is that the people designing things are usually not the people who will be using those things, and they tend to design for themselves.
So... you have to use HCI techniques:
Usability is about designing things (software/programming languages/cookers) to allow people to do their work:
Efficient, Error-free, Enjoyable
or: Fast, Faultless and Fun
Don't confuse usability with learnability: they are distinct and different.
No one really talks seriously about the usability of notations.
Notations affect what you can do with them.
For instance, Roman numerals:
CXXVIII+CXXVIII
=CCXXXXVVIIIIII
=CCXXXXVVVI
=CCXXXXXVI
=CCLVI
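The steps above (throw the symbols together, sort them, then regroup) can be sketched in Python. This is a minimal sketch that assumes purely additive numerals (IIII rather than IV), as in the example; the name `add_roman` and the rule table are illustrative, not from the talk.

```python
# Symbols from largest to smallest, and the regrouping rules of
# additive Roman numerals (no subtractive forms such as IV or IX).
ORDER = "MDCLXVI"
GROUPS = [("IIIII", "V"), ("VV", "X"), ("XXXXX", "L"),
          ("LL", "C"), ("CCCCC", "D"), ("DD", "M")]

def add_roman(a: str, b: str) -> str:
    """Add two additive Roman numerals by concatenating, sorting and regrouping."""
    # Step 1: throw all the symbols together, largest first.
    total = "".join(sorted(a + b, key=ORDER.index))
    # Step 2: replace each group of small symbols by the next larger symbol,
    # working from the smallest symbol upwards.
    for group, larger in GROUPS:
        while group in total:
            total = total.replace(group, larger, 1)
            total = "".join(sorted(total, key=ORDER.index))
    return total

print(add_roman("CXXVIII", "CXXVIII"))  # CCLVI
```

Addition needs no arithmetic tables at all in this notation, which is the point: the notation determines which operations are easy.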
Imagine, hypothetically, that programmers are humans...despite all evidence to the contrary:
Also pretend, just for a moment, that their chief method of communicating with a computer was with programming languages.
What should you do?
We designed a programming language: ABC
We used the HCI principles:
ABC went on to form the basis of Python.
Looking at
it appears that there is no real rule.
Apparently:
NE: Nevada or Nebraska?
It's Nebraska, but NB would have been a better choice
MI: Mississippi, Missouri, Michigan, or Minnesota?
It's Michigan, but MG would have been a better choice
MS: Mississippi, Missouri, or Minnesota?
It's Mississippi, but MP would have been a better choice.
But solving these problems with reading 2-letter codes would still not solve the problem of writing them.
Winter-school was open in December
Water is warm
Even if we could solve the problems of recognising a state code, it still wouldn't help you with remembering them.
I couldn't believe it wasn't possible to do the 2-letter codes better. So I wrote a program.
The simplest rule I came up with:
It can be done!
My point here is that the 2-letter codes were introduced for automation.
The solution they chose was technically sufficient.
That is no excuse for ignoring the needs of people.
A method for treating any context-free parsable document as XML.
The input
pi×(10+b)
can result in the XML
<prod>
   <id>pi</id>
   <sum>
      <number>10</number>
      <id>b</id>
   </sum>
</prod>
or
<prod>
   <id name='pi'/>
   <sum>
      <number value='10'/>
      <id name='b'/>
   </sum>
</prod>
The input
http://www.w3.org/TR/1999/xhtml.html
can give
<url>
   <scheme name='http'/>
   <authority>
      <host>
         <sub name='www'/>
         <sub name='w3'/>
         <sub name='org'/>
      </host>
   </authority>
   <path>
      <seg sname='TR'/>
      <seg sname='1999'/>
      <seg sname='xhtml.html'/>
   </path>
</url>
{"name": "pi", "value": 3.145926}
can give
<json>
   <object>
      <pair string='name'>
         <string>pi</string>
      </pair>
      <pair string='value'>
         <number>3.145926</number>
      </pair>
   </object>
</json>
<test lang="en" class="test"> This <em>is</em> a test. </test>
gave
<xml>
   <element name='test' close='test'>
      <attribute name='lang' value='en'/>
      <attribute name='class' value='test'/>
      <content> This <element name='em' close='em'> <content>is</content> </element> a test.</content>
   </element>
</xml>
Getting all sorts of other stuff into XForms
Possibly: Creating a non-XML version of XForms.
Already used in at least one Dutch Government project
ixml works by describing the document to be treated in a grammar:
expr: term; sum.
sum: expr, "+", term.
term: factor; prod.
prod: term, "×", factor.
factor: id; number; "(", expr, ")".
id: letter+.
number: digit+.
letter: ["a"-"z"].
digit: ["0"-"9"].
(This is the notation we are interested in).
In the initial design, the document was parsed to a parse-tree, and then the parse-tree was serialised to XML, using marks that you added to the grammar definition rules:
expr: term; ^sum.
sum: expr, "+", term.
term: factor; ^prod.
prod: term, "×", factor.
factor: ^id; ^number; "(", expr, ")".
id: letter+.
number: digit+.
letter: ^["a"-"z"].
digit: ^["0"-"9"].
After user-testing we identified a number of changes that could be made to make ixml more usable.
It is easier to design the data description by starting from the full parse tree, and incrementally pruning the parts that are not needed.
Very many non-terminals are not necessary in the final serialisation at all and it is more sensible to prune these at the definition rather than the use-point.
-expr: term; sum.
Occasionally you want to prune all uses of a nonterminal but one, so it is useful to be able to mark a definition as deleted, but mark it as inserted at a use-point.
-expr: term; sum.
...
factor: id; number; "(", ^expr, ")".
Sometimes you need to say "any character except those in this list is acceptable at this position". As a consequence, a notation for character sets became necessary, something that had been rejected in the initial design.
string: '"', ~["]*, '"'.
It is useful to have an explicit notation for something that is optional.
number: sign?, digit+.
It is useful to be able to use Unicode character classes.
letter: [lc].
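For readers who think in regular expressions, the exclusion and optional notations above have close analogues. The sketch below is only an analogy in Python's `re` module, not ixml itself; it assumes that `sign` means "+" or "-", which the talk does not define, and the names `STRING`, `NUMBER` and `matches` are illustrative.

```python
import re

# string: '"', ~["]*, '"'.
# A quote, then any characters except a quote, then a quote.
STRING = re.compile(r'"[^"]*"')

# number: sign?, digit+.
# An optional sign followed by one or more digits.
# Assumption: `sign` is "+" or "-".
NUMBER = re.compile(r'[+-]?[0-9]+')

def matches(pattern: re.Pattern, text: str) -> bool:
    """True when the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None

print(matches(STRING, '"hello"'))  # True
print(matches(NUMBER, "-42"))      # True
```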
<expr>
   <term>
      <prod>
         <term>
            <factor>
               <id>
                  <letter>p</letter>
                  <letter>i</letter>
               </id>
            </factor>
         </term>×
         <factor>(
            <expr>
               <sum>
                  <expr>
                     <term>
                        <factor>
                           <number>
                              <digit>1</digit>
                              <digit>0</digit>
                           </number>
                        </factor>
                     </term>
                  </expr>+
                  <term>
                     <factor>
                        <id>
                           <letter>b</letter>
                        </id>
                     </factor>
                  </term>
               </sum>
            </expr>)
         </factor>
      </prod>
   </term>
</expr>
expr: term; sum.
sum: expr, "+", term.
-term: factor; prod.
prod: term, "×", factor.
-factor: id; number; "(", expr, ")".
id: letter+.
number: digit+.
letter: ["a"-"z"].
digit: ["0"-"9"].
This removes the element from the serialisation, but not its children.
<expr>
   <prod>
      <id>
         <letter>p</letter>
         <letter>i</letter>
      </id>×(
      <expr>
         <sum>
            <expr>
               <number>
                  <digit>1</digit>
                  <digit>0</digit>
               </number>
            </expr>+
            <id>
               <letter>b</letter>
            </id>
         </sum>
      </expr>)
   </prod>
</expr>
expr: term; sum.
sum: expr, "+", term.
-term: factor; prod.
prod: term, "×", factor.
-factor: id; number; "(", expr, ")".
id: letter+.
number: digit+.
-letter: ["a"-"z"].
-digit: ["0"-"9"].
<expr>
   <prod>
      <id>pi</id>×(
      <expr>
         <sum>
            <expr>
               <number>10</number>
            </expr>+
            <id>b</id>
         </sum>
      </expr>)
   </prod>
</expr>
-expr: term; sum.
sum: expr, "+", term.
-term: factor; prod.
prod: term, "×", factor.
-factor: id; number; "(", expr, ")".
id: letter+.
number: digit+.
-letter: ["a"-"z"].
-digit: ["0"-"9"].
<prod>
   <id>pi</id>×(
   <sum>
      <number>10</number>+
      <id>b</id>
   </sum>)
</prod>
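The pruning steps above can be mimicked on a generic tree. The following Python sketch is not an ixml processor: it starts from a hand-built full parse tree of `pi×(10+b)` and removes every element whose name is in a set of deleted nonterminals, keeping the children in place. The tuple representation and the names `prune`, `serialise` and `TREE` are assumptions made for illustration.

```python
def prune(node, deleted):
    """Return the node as a list, or its pruned children if the node is deleted."""
    if isinstance(node, str):          # text stays as it is
        return [node]
    name, children = node
    kept = []
    for child in children:
        kept.extend(prune(child, deleted))
    # A deleted element vanishes, but its children are spliced into the parent.
    return kept if name in deleted else [(name, kept)]

def serialise(node):
    """Serialise a (name, children) tree as XML text (no escaping, sketch only)."""
    if isinstance(node, str):
        return node
    name, children = node
    return f"<{name}>" + "".join(serialise(c) for c in children) + f"</{name}>"

def letters(word):
    return [("letter", [ch]) for ch in word]

def digits(word):
    return [("digit", [ch]) for ch in word]

# The full parse tree of pi×(10+b), before any pruning.
TREE = ("expr", [("term", [("prod", [
    ("term", [("factor", [("id", letters("pi"))])]),
    "×",
    ("factor", [
        "(",
        ("expr", [("sum", [
            ("expr", [("term", [("factor", [("number", digits("10"))])])]),
            "+",
            ("term", [("factor", [("id", letters("b"))])]),
        ])]),
        ")",
    ]),
])])])

deleted = {"expr", "term", "factor", "letter", "digit"}
print("".join(serialise(n) for n in prune(TREE, deleted)))
# <prod><id>pi</id>×(<sum><number>10</number>+<id>b</id></sum>)</prod>
```

Starting from the full tree and adding names to `deleted` one at a time mirrors the incremental design process described above.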
You can delete the extraneous characters if you wish:
sum: expr, -"+", term.
-factor: id; number; -"(", expr, -")".
Changing
id: letter+.
number: digit+.
to
id: @name.
name: letter+.
number: @value.
value: digit+.
or
id: name.
@name: letter+.
number: value.
@value: digit+.
gives
<prod>
   <id name='pi'/>
   <sum>
      <number value='10'/>
      <id name='b'/>
   </sum>
</prod>
-expr: term; sum.
sum: expr, -"+", term.
-term: factor; prod.
prod: term, -"×", factor.
-factor: id; number; -"(", expr, -")".
id: @name.
name: ["a"-"z"]+.
number: @value.
value: ["0"-"9"]+.
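To make the effect of this final grammar concrete, here is a hand-written recursive-descent sketch in Python that produces the same tree shape for inputs such as `pi×(10+b)`. It is not an ixml implementation: an ixml processor works from the grammar itself, whereas this code hard-wires the rules, and it replaces the left-recursive `sum` and `prod` rules with loops. The function name `parse` and its helpers are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def parse(text: str) -> ET.Element:
    """Parse an expression and return the pruned tree as an XML element."""
    pos = 0

    def peek() -> str:
        return text[pos] if pos < len(text) else ""

    def expect(ch: str) -> None:
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def binary(name: str, left: ET.Element, right: ET.Element) -> ET.Element:
        node = ET.Element(name)
        node.extend([left, right])
        return node

    def scan(first: str, last: str) -> str:
        # Consume a run of characters in the range first..last.
        nonlocal pos
        start = pos
        while first <= peek() <= last:
            pos += 1
        return text[start:pos]

    def expr() -> ET.Element:          # -expr: term; sum.
        node = term()
        while peek() == "+":           # sum: expr, -"+", term.
            expect("+")
            node = binary("sum", node, term())
        return node

    def term() -> ET.Element:          # -term: factor; prod.
        node = factor()
        while peek() == "×":           # prod: term, -"×", factor.
            expect("×")
            node = binary("prod", node, factor())
        return node

    def factor() -> ET.Element:        # -factor: id; number; -"(", expr, -")".
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        if "a" <= peek() <= "z":       # id: @name.
            return ET.Element("id", name=scan("a", "z"))
        if "0" <= peek() <= "9":       # number: @value.
            return ET.Element("number", value=scan("0", "9"))
        raise SyntaxError(f"unexpected input at position {pos}")

    tree = expr()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree

tree = parse("pi×(10+b)")
print(ET.tostring(tree, encoding="unicode"))
```

The deleted terminals and nonterminals simply never create a node here, while the `@` rules become attributes rather than child elements.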
Strictly speaking, there is no reason why the parse tree need be serialised as XML; it could equally well be serialised in some other form, such as JSON.
With
<expr>
   <prod>
      <letter>a</letter>
      <sum>
         <digit>3</digit>
         <letter>b</letter>
      </sum>
   </prod>
</expr>
You might be tempted to say:
{"expr": {"prod": {"letter": "a", "sum": {"digit": "3", "letter": "b"} } } }
But JSON object members are more like XML attributes than child elements:
The solution is to use arrays and single-member objects:
{"expr": [{"prod": [{"letter": "a"}, {"sum": [{"digit": "3"}, {"letter": "b"}]}]}]}
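The mapping to arrays and single-member objects can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: each element becomes an object with exactly one member, whose value is either the element's text or an array of its children; attributes and mixed content are ignored, and the name `to_json_value` is illustrative.

```python
import json
import xml.etree.ElementTree as ET

def to_json_value(element: ET.Element) -> dict:
    """Map one element to a single-member object: {name: content}."""
    children = list(element)
    if children:
        # Child order and repeated names are preserved by using an array.
        content = [to_json_value(child) for child in children]
    else:
        content = element.text or ""
    return {element.tag: content}

XML = ("<expr><prod><letter>a</letter>"
       "<sum><digit>3</digit><letter>b</letter></sum></prod></expr>")

print(json.dumps(to_json_value(ET.fromstring(XML))))
# {"expr": [{"prod": [{"letter": "a"}, {"sum": [{"digit": "3"}, {"letter": "b"}]}]}]}
```

Because every object has a single member, two sibling `letter` elements never collide on the same key, and the array keeps them in document order.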
If a notation is to be human-facing, then it is not enough to make it functionally sufficient.
HCI techniques, although usually applied to interaction, are also applicable to make notations more usable for the people using them.