Treating JSON as a subset of XML

About me

Researcher at CWI in Amsterdam (first non-military internet site in Europe - 1988, whole of Europe connected to USA with 64kb link!)

Co-designed the programming language ABC, which was later used as the basis for Python

At the end of the 80's built a system that you would now call a browser.

Organised 2 workshops at the first Web conference in 1994

Chaired the first style and internationalization workshops at W3C.

Co-author of HTML4, CSS, XHTML, XML Events, XForms, RDFa, etc.

Forms co-chair at W3C

XForms

XForms originally designed as a replacement for HTML Forms.

analysis of HTML features
requirements analysis derived from usage of HTML Forms and other electronic forms systems.

The resultant design

MVC-based
intent-based controls
XML as a first-class data format, both for initialising data from external sources and for submission.

Example

What this concretely means is that the data is physically separated from the controls in the form. The data is placed in the head of the document, and the controls bind to the data.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
   <model xmlns="http://www.w3.org/2002/xforms">
      <instance>
         <data xmlns="">
            <year>2012</year>...
         </data>
      </instance>
   </model>
</head>
<body> ...

Controls and initial values

Controls in the body refer to values in the data instance(s) using XPath expressions:

<input ref="year">...
<input ref="event[1]/title/@language">...

The controls can be initialised by putting values in the data:

<data xmlns=""><year>2001</year>...</data>

but the data can also be initialised from external sources:

<instance src="http://www.example.org/events"/>

Constraints

Relationships between, and restrictions on, values can be specified in the model, allowing dependent values to be calculated automatically and data checking to be performed on the client rather than on the server.

<bind nodeset="year" constraint=". &gt; 1752"/>
<bind nodeset="state" required="../country = 'USA'"/>
<bind nodeset="age" calculate="../thisyear - ../birthdate/year"/>
<bind nodeset="birthdate" type="date"/>

Output

Values can be exposed in the document itself, using an output control:

The result for the year <output ref="year"/> is ...

Intent-based Controls

Controls are intent-based, by expressing what the control should do, rather than how it should look. So a control like this:

<select1 ref="colour">
   <label>Colour:</label>
   <item><label>red</label>
      <value>#ff0000</value></item>
   <item><label>green</label>
       <value>#00ff00</value></item>
   <item><label>blue</label>
      <value>#0000ff</value></item>
</select1>

can be represented in different ways depending purely on styling.

The same control three times, each with a different styling

Initial experience

far more powerful and flexible than the HTML Forms it was replacing
too slavishly followed the HTML design

Particularly in the use of fixed strings rather than (potentially) calculated values for such things as the submission URI.

As a consequence this restricted what was possible with the language.

XForms 1.0 → 1.1

As a consequence, XForms 1.1 addressed these shortcomings.

The resultant language turned out to be far more than a forms language: it is a declarative application language.

Since XForms has input, output, and a processing engine, XForms is Turing-complete, and much more than just forms is now possible with the language.

XForms 1.0 → 1.1

Experience: application production time can be reduced by an order of magnitude

One large project reported a reduction from 5 years with 30 programmers using traditional programming, to 1 year with 10 programmers using XForms.

Data Opacity

XForms treats its data internally as if it is XML
XPath is used both to address data and to calculate new values
Not the intention that external data be only in XML

Just as a photo editor doesn't care about the external format, neither does XForms
However, since the internal form of the data that XForms deals with is XML (since the data is accessed using XPath), there has to be a mapping between the external form and the internal one.

JSON

An obvious data format widely in use on the web is JSON.

There are several mappings defined in both directions between XML and JSON, but largely because JSON can only represent a subset of what XML can represent, many of the mappings are cumbersome, and unnatural.

Example

To take just one example, under the JXON mapping the following XML:

<BOOKS>
  <BOOK id="1">
    <TITLE>My Favorite Book</TITLE>
    <PRICE>1.23</PRICE>
  </BOOK>
  <BOOK id="1a">
    <TITLE>XML for Dummies</TITLE>
    <PRICE>5.25</PRICE>
  </BOOK>
  <BOOK id="3">
    <TITLE>JSON for Dummies</TITLE>
    <PRICE>200.95</PRICE>
  </BOOK>
</BOOKS>

would be transformed into:

{
 "childNodes": [
  {
   "childNodes": [
    {
     "childNodes": ["My Favorite Book"],
     "tagName": "TITLE"
    },
    {
     "childNodes": [1.23],
     "tagName": "PRICE"
    }
   ],
   "id": 1,
   "tagName": "BOOK"
  },
  {
   "childNodes": [
    {
     "childNodes": ["XML for Dummies"],
     "tagName": "TITLE"
    },
    {
     "childNodes": [5.25],
     "tagName": "PRICE"
    }
   ],
   "id": "1a",
   "tagName": "BOOK"
  },
  {
   "childNodes": [
    {
     "childNodes": ["JSON for Dummies"],
     "tagName": "TITLE"
    },
    {
     "childNodes": [200.95],
     "tagName": "PRICE"
    }
   ],
   "id": 3,
   "tagName": "BOOK"
  }
 ],
 "tagName": "BOOKS"
}

JSON in XForms

During the design phase we went through several iterations

Key realisation: since the aim is only to address existing JSON stores, it is not necessary to be able to convert every possible XML representation into an equivalent JSON representation, only the reverse.

This reduces the task considerably, since it means several features of XML do not have to be addressed, such as namespaces, attributes, and mixed content.

Requirements

Some of the requirements for a mapping from JSON to XML for XForms included:

All possible JSON values be representable
Round-trippable, so that you can both read from and submit to a JSON store.
As natural-looking selectors as possible.

Opaque data

Ideally, an XForm processing JSON data shouldn't have to know which data format has been used; so that, for instance, data such as

{"company":"example.com", "locations":[{"city": "Amsterdam"},{"city": "London"}]}

with the right mapping could be selected with XPath selectors like

locations/city[1]

In this way data could be loaded using content negotiation, and will work whether the data comes in as XML or JSON.

Transformation used

The basic mapping is rather simple. Since JSON has no attributes, all content can be represented in elements, and attributes are therefore free to be used to help with the mapping.

Since a JSON object can have several members at the top level, a root element <json> is used. JSON names become XML elements:

{"name": "XForms"}

becomes

<json><name>XForms</name></json>

Types

Strings are the default datatype. In order to allow the processor to distinguish between {"size": 30} and {"size": "30"} when serialising, other types are marked:

"age": 21

becomes

<age type="integer">21</age>

and

"registered": true

becomes:

<registered type="boolean">true</registered>

Nesting

Nested values are obvious:

"name": {"given": "Isaac", "family": "Newton"}

becomes

<name><given>Isaac</given><family>Newton</family></name>

Arrays

Arrays are marked specially:

"colour": ["red", "green", "blue"]

becomes

<colour starts="array">red</colour>
<colour>green</colour>
<colour>blue</colour>

This allows selectors like colour[3] to work, but also makes it possible to distinguish things like single-element arrays:

{"city": ["Amsterdam"]}

from

{"city": "Amsterdam"}

and empty arrays:

{"set": []}

from

{"set": ""}
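The rules for names, types, nesting and arrays can be put together in a small converter. What follows is only a minimal Python sketch of the mapping as described on these slides, not the code of any XForms implementation. The function names are hypothetical, the top-level JSON value is assumed to be an object, and several details are assumptions that the slides do not settle: the type labels used for non-integer numbers and for null, and the representation of an empty array as a single empty element carrying the starts marker. Arrays nested directly inside arrays are not handled.

```python
import json
import xml.etree.ElementTree as ET


def add_member(parent, name, value, starts_array=False):
    """Append one JSON member (name, value) to parent as XML element(s)."""
    if isinstance(value, list):
        if not value:
            # Assumption: an empty array is one empty element with the marker.
            ET.SubElement(parent, name, {"starts": "array"})
        for index, item in enumerate(value):
            # Every array item becomes a sibling element with the same name;
            # only the first one carries starts="array".
            add_member(parent, name, item, starts_array=(index == 0))
        return
    elem = ET.SubElement(parent, name)
    if starts_array:
        elem.set("starts", "array")
    if isinstance(value, dict):
        for key, child in value.items():
            add_member(elem, key, child)
    elif isinstance(value, bool):  # test bool first: bool is a subclass of int
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif isinstance(value, int):
        elem.set("type", "integer")
        elem.text = str(value)
    elif isinstance(value, float):
        elem.set("type", "number")  # assumption: label for non-integer numbers
        elem.text = repr(value)
    elif value is None:
        elem.set("type", "null")  # assumption: label for JSON null
    else:
        elem.text = value  # strings are the default type and stay unmarked


def json_to_xml(text):
    """Parse a JSON object and return the <json> root element."""
    root = ET.Element("json")
    for key, value in json.loads(text).items():
        add_member(root, key, value)
    return root
```

For instance, ET.tostring(json_to_xml('{"name": "XForms"}'), encoding="unicode") gives <json><name>XForms</name></json>, matching the first example of the mapping.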

Example

To take an example from the JSON site:

{"bindings": [
        {"ircEvent": "PRIVMSG",
         "method": "newURI",
         "regex": "^http://.*"},
        {"ircEvent": "PRIVMSG",
         "method": "deleteURI",
         "regex": "^delete.*"},
        {"ircEvent": "PRIVMSG",
         "method": "randomURI",
          "regex": "^random.*"}
    ]
}

would become

<json>
   <bindings starts="array">
      <ircEvent>PRIVMSG</ircEvent>
      <method>newURI</method>
      <regex>^http://.*</regex>
   </bindings>
   <bindings>
      <ircEvent>PRIVMSG</ircEvent>
      <method>deleteURI</method>
      <regex>^delete.*</regex>
   </bindings>
   <bindings>
      <ircEvent>PRIVMSG</ircEvent>
      <method>randomURI</method>
      <regex>^random.*</regex>
   </bindings>
</json>

and a JSON selector like

bindings[0].method

would become in XPath (JSON is 0-based, XPath 1-based):

bindings[1]/method
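The index shift between the two notations is easy to get wrong, so here is a small Python sketch of the translation. It is only an illustration: json_selector_to_xpath is a hypothetical helper, and it assumes simple dotted selectors with numeric indices and names that are already valid XML names. The resulting path is then tried against the first two bindings of the XML above, using the limited XPath support in Python's ElementTree.

```python
import re
import xml.etree.ElementTree as ET


def json_selector_to_xpath(selector):
    """Turn a dotted, 0-based JSON selector into a 1-based XPath path."""
    steps = []
    for part in selector.split("."):
        # XPath positions start at 1, so every [n] becomes [n + 1].
        steps.append(re.sub(r"\[(\d+)\]",
                            lambda m: "[%d]" % (int(m.group(1)) + 1),
                            part))
    return "/".join(steps)


# First two bindings of the example instance, as produced by the mapping.
INSTANCE = """
<json>
   <bindings starts="array">
      <ircEvent>PRIVMSG</ircEvent>
      <method>newURI</method>
   </bindings>
   <bindings>
      <ircEvent>PRIVMSG</ircEvent>
      <method>deleteURI</method>
   </bindings>
</json>
"""

root = ET.fromstring(INSTANCE)
path = json_selector_to_xpath("bindings[0].method")
print(path)                  # bindings[1]/method
print(root.find(path).text)  # newURI
```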

Special Cases

There are a small number of special cases that have to be accounted for:

JSON allows the empty name "", which XML does not allow.
JSON names may contain characters that are not allowed as name characters in XML.
JSON strings may contain any Unicode character; XML disallows most characters below #x20.

Dealing with special cases

Empty names and illegal name characters are easy to deal with: any character that is not allowed in an XML name is replaced with an underscore, and a name attribute is added to the element giving the original name. The empty name is replaced with a single underscore, and an empty name attribute is used.

For example:

"$": "$"

would be transcribed:

<_ name="$">$</_>
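The renaming rule can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a full implementation: the helper names are hypothetical, and the test for legal name characters is deliberately restricted to ASCII letters, digits, underscore, hyphen and full stop, whereas XML also allows many non-ASCII name characters (and the colon, which is avoided here because of namespaces).

```python
import re
import xml.etree.ElementTree as ET


def xml_name(json_name):
    """Return an element name that is safe to use for a JSON name."""
    if json_name == "":
        return "_"
    # Replace every character outside the simplified ASCII subset.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", json_name)
    # A name may not start with a digit, hyphen or full stop.
    if not re.match(r"[A-Za-z_]", safe):
        safe = "_" + safe[1:]
    return safe


def member_element(json_name, text):
    """Build the element for a string member, keeping the original name."""
    safe = xml_name(json_name)
    elem = ET.Element(safe)
    if safe != json_name:
        # The name attribute records what the JSON name really was.
        elem.set("name", json_name)
    elem.text = text
    return elem


print(ET.tostring(member_element("$", "$"), encoding="unicode"))
# <_ name="$">$</_>
```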

Characters

The third case is harder to deal with. For example:

{"backspace": "\b"}

The backspace character is not allowed in XML 1.0 at all, not even as a numeric character reference, so the only option is to leave such illegal characters encoded in JSON notation.
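As a rough illustration, the following Python sketch rewrites only the control characters that XML 1.0 forbids (everything below #x20 except tab, line feed and carriage return) into their JSON escape notation, and leaves all other text alone. The function name is hypothetical. The sketch also ignores one question the slides leave open: a string that already contains a literal backslash followed by b becomes indistinguishable from an escaped backspace unless backslashes are escaped as well.

```python
import json
import re

# Control characters that XML 1.0 does not allow in content.
ILLEGAL_IN_XML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def xml_safe_text(value):
    """Replace XML-illegal control characters by their JSON escapes."""
    # json.dumps yields the quoted JSON form of the single character,
    # e.g. "\b" for backspace; the surrounding quotes are stripped off.
    return ILLEGAL_IN_XML.sub(lambda m: json.dumps(m.group())[1:-1], value)


print(xml_safe_text("\b"))  # prints the two characters \b
```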

Implementation

Implementation of the mapping is relatively trivial:

At the point where an implementation normally receives a document of type application/xml (or similar), either when initialising an instance from an external resource or as the return value of a submission, it checks the media type. If the resource is application/json, it is parsed and transformed into an equivalent XML instance, as described above.

The media type can be recorded as an attribute of the root element, so that it can be reused if the instance is to be resubmitted as JSON.
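The return trip, from an instance back to JSON at submission time, can be sketched in the same spirit. The Python below is a hedged illustration rather than the logic of any real XForms processor. The function names and the attribute name mediatype on the root element are assumptions; empty arrays, null and non-integer numbers are left out; and an element without children is always read back as a string.

```python
import json
import xml.etree.ElementTree as ET


def value_of(elem):
    """Convert one element back to a JSON value, using its markers."""
    if len(elem):  # child elements present: this was a JSON object
        return members(elem)
    text = elem.text or ""
    kind = elem.get("type")
    if kind == "integer":
        return int(text)
    if kind == "boolean":
        return text == "true"
    return text  # unmarked content is a string


def members(parent):
    """Collect the children of parent into a dict, rebuilding arrays."""
    result = {}
    for child in parent:
        # A name attribute, if present, holds the original JSON name.
        key = child.get("name", child.tag)
        if child.get("starts") == "array":
            result[key] = [value_of(child)]
        elif isinstance(result.get(key), list):
            result[key].append(value_of(child))  # later items of the array
        else:
            result[key] = value_of(child)
    return result


def serialise(root):
    """Serialise as JSON if the instance was loaded from JSON, else as XML."""
    if root.get("mediatype") == "application/json":  # hypothetical attribute
        return json.dumps(members(root))
    return ET.tostring(root, encoding="unicode")
```

Calling serialise on a root element that carries mediatype="application/json" yields JSON text; any other instance is written out as XML unchanged.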

Other formats

Clearly this method can be extended to other data formats such as vCard and iCalendar. For instance an iCalendar value such as

BEGIN:VCALENDAR
  METHOD:PUBLISH
  PRODID:-//Example/ExampleCalendarClient//EN
  VERSION:2.0
  BEGIN:VEVENT
    ORGANIZER:mailto:a@example.com
    DTSTART:19970701T200000Z
    DTSTAMP:19970611T190000Z
    SUMMARY:ST. PAUL SAINTS -VS- DULUTH-SUPERIOR DUKES
    UID:0981234-1234234-23@example.com
  END:VEVENT
END:VCALENDAR

can be transformed to

<VCALENDAR>
  <METHOD>PUBLISH</METHOD>
  <PRODID>-//Example/ExampleCalendarClient//EN</PRODID>
  <VERSION>2.0</VERSION>
  <VEVENT>
    <ORGANIZER>mailto:a@example.com</ORGANIZER>
    <DTSTART>19970701T200000Z</DTSTART>
    <DTSTAMP>19970611T190000Z</DTSTAMP>
    <SUMMARY>ST. PAUL SAINTS -VS- DULUTH-SUPERIOR DUKES</SUMMARY>
    <UID>0981234-1234234-23@example.com</UID>
  </VEVENT>
</VCALENDAR>
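A transformation of this kind needs very little code. The Python sketch below is an assumption-laden illustration, with a hypothetical function name: it treats BEGIN and END lines as element boundaries and every other NAME:VALUE line as a child element, and it ignores iCalendar features that the example does not use, such as property parameters and folded lines.

```python
import xml.etree.ElementTree as ET


def ical_to_xml(text):
    """Turn simple iCalendar text into nested XML elements."""
    root = None
    stack = []  # currently open components, innermost last
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split at the first colon only: values may contain colons themselves.
        name, _, value = line.partition(":")
        if name == "BEGIN":
            elem = ET.SubElement(stack[-1], value) if stack else ET.Element(value)
            if root is None:
                root = elem
            stack.append(elem)
        elif name == "END":
            stack.pop()
        else:
            ET.SubElement(stack[-1], name).text = value
    return root


calendar = ical_to_xml("""BEGIN:VCALENDAR
  VERSION:2.0
  BEGIN:VEVENT
    ORGANIZER:mailto:a@example.com
  END:VEVENT
END:VCALENDAR""")
print(calendar.find("VEVENT/ORGANIZER").text)  # mailto:a@example.com
```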

Conclusions

Because there is no need to represent arbitrary XML in JSON, dealing with external JSON values in XForms becomes easy and natural, in most cases without even exposing in the XForm the fact that the external data format is not XML. The approach can be extended to other formats, and thanks to the generality of XML, mostly without restriction.

Future XML: allow all Unicode please; and do something about character entities...

XForms resources

A tutorial: http://www.w3.org/MarkUp/Forms/2010/xforms11-for-html-authors/

For an overview of all features, elements and attributes of XForms 1.1, see the XForms 1.1 Quick Reference.

It's not easy reading, but the final arbiter in questions of doubt is the XForms 1.1 Specification.

XForms 2.0 Draft: http://www.w3.org/MarkUp/Forms/wiki/XForms_2.0

The implementation used for the examples in this talk is XSLTForms.
