Invisible XML: State of Play and Future Directions

Renaming

Renaming lets a rule be serialised under a name other than its own. Without renaming, you get

date: y, m, d.

⇒

<date><y>2025</y><m>08</m><d>06</d></date>

Using renaming gives a different serialisation:

date > d: y, m, d.

⇒

<d><y>2025</y><m>08</m><d>06</d></d>

This work is complete: it is not yet officially published, but it is already widely implemented.

See the modularisation paper [mod] for examples of its use, and the work-in-progress ixml draft [ixml2] for its definition.
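
The effect of the renaming example above can be mimicked with a small serialiser. This is a minimal sketch, not how any ixml processor is implemented: the parse tree is a hypothetical nested tuple of (rule name, content), and the `renames` dictionary stands in for the `date > d` marking in the grammar.

```python
from xml.sax.saxutils import escape

def serialise(node, renames=None):
    """Serialise a (rule name, content) tree, applying any renames.

    `renames` maps a rule name to the element name to emit instead.
    Both the tree shape and the mapping are assumptions for illustration.
    """
    renames = renames or {}
    name, content = node
    tag = renames.get(name, name)  # fall back to the rule name itself
    if isinstance(content, str):
        body = escape(content)     # leaf: matched input text
    else:
        body = "".join(serialise(child, renames) for child in content)
    return f"<{tag}>{body}</{tag}>"

tree = ("date", [("y", "2025"), ("m", "08"), ("d", "06")])

print(serialise(tree))
# <date><y>2025</y><m>08</m><d>06</d></date>
print(serialise(tree, {"date": "d"}))
# <d><y>2025</y><m>08</m><d>06</d></d>
```

Only the serialised name changes; the rule is still called date inside the grammar, so other rules refer to it by that name.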

Types of ambiguity: bad grammars

Some ambiguities are due to badly written grammars. For instance,

expression: number; identifier; expression, op, expression.
op: ["+-×÷"].
identifier: [L]+.
number: ["0"-"9"]+.

For an input like

a-b-c

this will produce two different parses with different meanings: (a-b)-c and a-(b-c).
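
The two readings can be enumerated mechanically. The following is a minimal sketch in Python, not an ixml parser: `parses` tries every operator as the top-level split, mirroring the rule expression, op, expression, while `parses_left` mirrors one conventional left-recursive rewrite (expression: operand; expression, op, operand.), which is an assumption added here for contrast and is not taken from the grammar above. The input is assumed to be well formed, alternating operands and single-character operators.

```python
OPS = set("+-×÷")

def tokenize(text):
    """Split the input into operand and single-character operator tokens."""
    tokens, current = [], ""
    for ch in text:
        if ch in OPS:
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

def parses(tokens):
    """All bracketings allowed by: expression, op, expression."""
    if len(tokens) == 1:
        return [tokens[0]]
    results = []
    for i in range(1, len(tokens), 2):      # every operator may be the root
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                results.append(f"({left}{tokens[i]}{right})")
    return results

def parses_left(tokens):
    """Bracketings allowed by the rewrite: expression, op, operand."""
    if len(tokens) == 1:
        return [tokens[0]]
    if len(tokens) < 3 or tokens[-2] not in OPS:
        return []
    # Only the last operator can be the root, so at most one tree survives.
    return [f"({left}{tokens[-2]}{tokens[-1]})"
            for left in parses_left(tokens[:-2])]

print(parses(tokenize("a-b-c")))       # ['(a-(b-c))', '((a-b)-c)']
print(parses_left(tokenize("a-b-c")))  # ['((a-b)-c)']
```

The second function returns a single tree because the grammar it models describes the input more precisely, not because a parse was picked after the fact.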

The underlying problem is that the input is improperly described; any attempted solution would not guarantee that you get the serialisation you want in all cases. It is a potential source of technical debt.

Namespaces

In designing XML, the group responsible did a clever thing when adding namespaces:

namespace declarations look like attributes, so that XML documents would be syntactically compatible with earlier software,
but they have a different semantic interpretation because they begin with the characters xmlns.

Exactly the same approach could be used for ixml:

things that look like attributes in the serialisation but begin with the characters xmlns should be interpreted as namespace declarations.

For implementations that produce textual output, this adds no extra processing.

For implementations that go directly to an XML internal form, the namespace declarations have to be recognised and handled appropriately, as they are in XML processors.
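
Both cases can be illustrated in Python. This is a sketch under stated assumptions: the example URI and the `format` attribute are invented, `split_namespace_declarations` is a hypothetical helper, and it follows Namespaces in XML in treating only `xmlns` and `xmlns:prefix` as declarations. The first part shows that textual output needs nothing extra, because any namespace-aware XML parser (here the standard library's ElementTree) already interprets the attribute. The second shows the kind of separation a processor building an internal form would have to do itself.

```python
import xml.etree.ElementTree as ET

# Case 1: textual output. The serialisation simply contains something
# that looks like an attribute; a downstream XML parser does the rest.
text = '<date xmlns="http://example.com/dates" format="iso"><y>2025</y></date>'
root = ET.fromstring(text)
print(root.tag)     # {http://example.com/dates}date
print(root.attrib)  # {'format': 'iso'}

# Case 2: direct construction of an internal form. The processor must
# separate namespace declarations from ordinary attributes itself.
def split_namespace_declarations(attributes):
    """Return (declarations, ordinary attributes) for one element."""
    declarations, ordinary = {}, {}
    for name, value in attributes.items():
        if name == "xmlns":
            declarations[""] = value                       # default namespace
        elif name.startswith("xmlns:"):
            declarations[name[len("xmlns:"):]] = value     # prefixed namespace
        else:
            ordinary[name] = value
    return declarations, ordinary

print(split_namespace_declarations(
    {"xmlns": "http://example.com/dates", "format": "iso"}))
# ({'': 'http://example.com/dates'}, {'format': 'iso'})
```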

Versioning

Most software (for instance programming languages) doesn't require its input to specify which version of the processor is required, though some data formats do.

XML does, but it is not obvious what the advantages are; in XML's case it certainly seems to have obstructed adoption of new versions.

A version declaration may be used as a pragma to the processor to require a certain type of processing or checking, but this is only really necessary when the meaning of a particular syntactic structure has changed between versions.

The current method of specifying the ixml version was added in haste shortly before publication of the specification, which was a mistake, because it left no time to implement and try it out beforehand.

A user shouldn't need to know which version they are using: the absence of a version should always be taken to mean "use the most recent version".
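
The defaulting rule suggested here can be sketched as follows. This is an illustration only: the `SUPPORTED` tuple is hypothetical, the function name is invented, and the regular expression is a simplification that recognises a leading declaration of the form `ixml version "1.0".` without handling comments or doubled quotes.

```python
import re

# Hypothetical list of versions a processor understands, oldest first.
SUPPORTED = ("1.0", "1.1")

# Simplified pattern for a leading version declaration.
VERSION_DECL = re.compile(r"""\s*ixml\s+version\s+(["'])(.*?)\1\s*\.""")

def effective_version(grammar_text):
    """Return the declared version, or the most recent one if none is given."""
    match = VERSION_DECL.match(grammar_text)
    return match.group(2) if match else SUPPORTED[-1]

print(effective_version('ixml version "1.0".\ndate: y, m, d.'))  # 1.0
print(effective_version('date: y, m, d.'))                       # 1.1
```

Under this rule a grammar without a declaration keeps working as the processor is upgraded, and a declaration is only needed to pin older behaviour.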

References

[abnf] D. Crocker, Ed., RFC 5234 Augmented BNF for Syntax Specifications: ABNF, ietf.org, 2008, https://datatracker.ietf.org/doc/html/rfc5234

[art] Mary Holstege, “Invisible Fish: API Experimentation with InvisibleXML.” In Proceedings of Balisage: The Markup Conference 2024, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Holstege01

[bank] Steven Pemberton, Banking with ixml and XForms, Proc. Declarative Amsterdam 2024, Amsterdam, The Netherlands. https://declarative.amsterdam/article?doi=da.2024.pemberton.banking

[css] Håkon Wium Lie et al. (eds.), Cascading Style Sheets level 1, W3C, 1996, https://www.w3.org/TR/CSS1/

[ixml2] Steven Pemberton (ed.), Invisible XML Specification Community Group Editorial Draft, Invisible XML Organisation, 2026, https://invisiblexml.org/current/

[ixml] Steven Pemberton (ed.), Invisible XML Specification, Invisible XML Organisation, 2022, https://invisiblexml.org/1.0/

[knit] Bethan Tovey-Walsh, “When women do algorithms: a semi-generative approach to overlay crochet with iXML and XSLT.” In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Tovey-Walsh01

[lit] Steven Pemberton, “The Book of Doublends Jined: Parsing Finnegans Wake with ixml.” In Proceedings of Balisage: The Markup Conference 2025. Balisage Series on Markup Technologies, vol. 30 (2025). https://doi.org/10.4242/BalisageVol30.Pemberton01

[mod2] Norm Tovey-Walsh, An Invisible XML modularity proposal, 2026, https://nineml.org/proposals/2026/modularity/

[mod] Steven Pemberton, Modular ixml, Proc. MarkupUK 2025, pp 6-20, https://markupuk.org/pdf/proceedings-2025-2.pdf

[msg] Ari Nordström, “Adventures in Mainframes, Text-based Messaging, and iXML.” In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Nordstrom01

[peg] Bryan Ford, “Parsing Expression Grammars: A Recognition Based Syntactic Foundation.” In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, ACM, 2004, pp. 111–122. doi:10.1145/964001.964011. ISBN 1-58113-729-X.

[pragma] Tomos Hillman, C. M. Sperberg-McQueen, Bethan Tovey-Walsh and Norm Tovey-Walsh. “Designing for change: Pragmas in Invisible XML as an extensibility mechanism.” In Proceedings of Balisage: The Markup Conference 2022. Balisage Series on Markup Technologies, vol. 27 (2022). https://doi.org/10.4242/BalisageVol27.Sperberg-McQueen01

[prio] E. Shinan, Lark: A parsing toolkit for python (2025), github, https://github.com/lark-parser/lark

[rti] Alain Couthures, Text normalization with Invisible XML round-tripping, Proc Declarative Amsterdam 2025, https://declarative.amsterdam/article?doi=da.2025.couthures.grammix

[rt] Steven Pemberton, Round-tripping Invisible XML, in Proc. XML Prague 2024, Prague, Czechia, 2024, pp 153-164, ISBN 978-80-907787-2-6, https://archive.xmlprague.cz/2024/files/xmlprague-2024-proceedings.pdf#page=163

[sym] Various, The First International Symposium on Invisible XML, invisiblexml.org, 2026, https://invisiblexml.org/events/symposium2026/

[trials] C. M. Sperberg-McQueen, “From Word to XML via iXML: a Word-first XML workflow in the TLRR 2e project.” In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Sperberg-McQueen01

[vdb] M.G.J. van den Brand, et al., Disambiguation Filters for Scannerless Generalized LR Parsers. In: Horspool, R.N. (eds) Compiler Construction CC 2002. Lecture Notes in Computer Science, vol 2304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45937-5_12, https://cwi.nl/~jurgenv/papers/CC-2002.pdf

[vin] Ari Nordström, It's Useful After All — VIN Numbers, DITA, and iXML, Proc XML Prague 2024, pp 295-306 https://archive.xmlprague.cz/2024/files/xmlprague-2024-proceedings.pdf#page=305

[wg] Invisible Markup Community Group, https://www.w3.org/community/ixml/

[xmlns] Tim Bray et al., Namespaces in XML 1.0, W3C, 2009, https://www.w3.org/TR/xml-names/

[xp4] John Lumley, Invisible XML workbench, Github, 2024, https://johnlumley.github.io/jwiXML.xhtml

[zen] Tim Peters, PEP 20 - The Zen of Python, python.org, 2004, https://peps.python.org/pep-0020/
