Banking with ixml and XForms

The author
Steven Pemberton
, CWI, Amsterdam

Contents

ixml: talks per year

The number of talks about ixml each year

2004

Not a talk about ixml, but a keynote I gave on the design of notations.

Where I said, in passing:

"Parsing is quite easy

It would be fairly easy to add a generalised part to the XML pipeline that parsed unmarked-up text, and produced XML as a parse tree: it's just a different sort of transform.

We could have our cake and eat it!"

2013

First paper.

Two best reviews ever!

"This is clearly a submission that needs to be shredded, burned, and the ashes buried in multiple locations"

"I think the audience will eat him alive. But I want to be there to hear it."

That notwithstanding, it did get accepted:

We will also ask you to:

Timeline

2016, 2018: Iterations, based on user experience

2020: Working group formed

2022: Formal specification published

2024: 5 or 6 implementations, with others in development

What is ixml?

Most textual data has an implicit structure.

Dates: 8/11/2024

URLs: http://cwi.nl/~steven/Talks/2024/11-08-banking/

Bibliographies: Steven Pemberton, Banking with ixml and XForms, Declarative Amsterdam 2024.

ixml turns data with implicit structure into data with explicit structure.

<date>
   <day>8</day>
   <month>11</month>
   <year>2024</year>
</date>

How does it work?

You describe the format.

date: day, -"/", month, -"/", year.
day: digit, digit?.
month: digit, digit?.
year: digit, digit, digit, digit.
-digit: ["0"-"9"].

Feed the format and the data through the ixml processor.

Get structured data!

How does it work?

Feed the format and the data through the ixml processor:

ixml process

Aims included

Get more data into XForms:

XForms expects data in XML, and many sources of data aren't in XML

Define XForms programs in text:

It can be faster, easier, and more human-oriented to write programs in text, and let the computer convert it to XML.

How ixml is being used

Banking

My bank has stopped sending paper statements.

It offers online statements instead, either in PDF for printing yourself, or CSV.

It does offer some online search facilities, but is slow, and inconvenient.

I wanted something similar to how Quicken used to work.

So I wrote an application in XForms.

Demo

XForms

XForms is a declarative, XML-based, programming language

It is a declarative rather than a procedural language: this makes life much easier for the programmer. It is like spreadsheets but generalised.

Large projects have shown huge productivity gains:

Data

To write the app in XForms I first had to get the data into a usable form.

As I said, the bank supplied PDF, or CSV, so CSV it had to be.

CSV

Use sed?

"Date","Name / Description","Account","Counterparty","Code","Debit/credit","Amount (EUR)","Transaction type","Notifications"
"20161230","The Movies Art House AMSTERDAM","NL80INGB1234567890","","BA","Debit","11,00","Payment terminal","Card sequence no.: 009 29/12/2016 18:21 Transaction: 90B7V1 Term: FWG77T Value date: 30/12/2016"
"20161229","DIJKMAN B.V. MUZIEK AMSTERDAM","NL80INGB1234567890","","BA","Debit","99,00","Payment terminal","Card sequence no.: 009 28/12/2016 15:59 Transaction: 48X1S3 Term: 465101 Value date: 29/12/2016"
"20161227","CCV*FOODHALLEN AMSTERD AMSTERDAM","NL80INGB1234567890","","BA","Debit","6,25","Payment terminal","Card sequence no.: 009 24/12/2016 17:56 Transaction: 84P101 Term: CT449077 Value date: 27/12/2016"

Sed

The sed was largely opaque:

head -n 2 $1 | tail -1 | sed 's/"\(....\)[^,]*,[^,]*,"NL..INGB\([^"]*\)".*/<bank year="\1" acct="\2">/'

tail --lines=+2  $1 | sed '
     s/^/<entry>/
     s/&/&amp;/g
     s/"\(....\)\(..\)\(..\)",/<date>\1-\2-\3<\/date>/
     s/"\([^"]*\)",/<type>other<\/type><name>\1<\/name>/
     s/"\([^"]*\)",/<from>\1<\/from>/
     s/"\([^"]*\)",/<to>\1<\/to>/
     s/"\([^"]*\)",/<code>\1<\/code>/
     s/"Af","\([^,]*\),\([^"]*\)",/<amount>-\1.\2<\/amount>/
     s/"Bij","\([^,]*\),\([^"]*\)",/<amount>\1.\2<\/amount>/
     s/"\([^"]*\)",/<sort>\1<\/sort>/
     s/"\([^"]*\)"/<description>\1<\/description>/
     s/$/<\/entry>/
    '

echo '</bank>'

ixml

The ixml version at least is more readable.

   bank: -head, entry*.
   head: -'"Date","Name / Description","Account","Counterparty","Code","Debit/credit","Amount (EUR)","Transaction type","Notifications"', nl.
  entry: date, type, name, from, to, code, amount, sort, description, nl.
   type: +"other".
   date: -'"', y, +"-", m, +"-", d, -'"', -",".
     -y: digit, digit, digit, digit.
     -m: digit, digit.
     -d: digit, digit.
   name: field, -",".
   from: field, -",".
     to: field, -",".
   code: field, -",".
   sort: field, -",".
description: field.
 -field: -'"', c*, -'"'.
     -c: ~['"'; #a; #d].
 amount: neg; pos.
   -neg: -'"Af",',    +"-", number, -",";
 -'"Debit",', +"-", number, -",".
   -pos: -'"Bij",',         number, -",";
 -'"Credit",',      number, -",".
 -number: -'"', euros, -",", +".", cents, -'"'.
  -euros: digit+.
  -cents: digit, digit.
  -digit: ["0"-"9"].
     -nl: (-#a; -#d)+.

ixml detail

A bank statement consists of a head (the labels) that is completely ignored, followed by any number of entries:

bank: -head, entry*.
head: -'"Date","Name / Description","Account","Counterparty","Code","Debit/credit","Amount (EUR)","Transaction type","Notifications"', nl.

By including the labels literally, if the bank ever changes the format, I will get an immediate error.

ixml detail: entries

Each entry has a number of fields:

entry: date, type, name, from, to, code, amount, sort, description, nl.

I have added an extra field, type, that my application will use. In the input it is empty, and will appear in the output as a preset field.

type: +"other".

will always appear as

<type>other</type>

which will later get changed in the application.

ixml detail: dates

A date is a field with a string of numbers in the input, like "20161230" so I add hyphens at the right places:

date: -'"', y, +"-", m, +"-", d, -'"', -",".
  -y: digit, digit, digit, digit.
  -m: digit, digit.
  -d: digit, digit.

which would give

<date>2016-12-30</date>

I don't need date structures to be any more refined than that.

ixml detail: character fields

A number of the fields just contain unstructured character data:

       name: field, -",".
       from: field, -",".
         to: field, -",".
       code: field, -",".
       sort: field, -",".
description: field.
     -field: -'"', c*, -'"'.
         -c: ~['"'; #a; #d].

A field is just zero or more characters surrounded by quotes, where a character is anything except a quote or an end-of-line character.

<name>The Movies Art House AMSTERDAM</name>

ixml detail: amounts

Finally amounts are recorded in an odd way, using two fields. Banks don't believe in negative numbers, just positive amounts of debit or credit:

"Credit", "11,00"
"Debit", "6,25"

Note the European style of number representation.

amount: neg; pos.
  -neg: -'"Debit",', +"-", number, -",".
  -pos: -'"Credit",',      number, -",".
-number: -'"', euros, -",", +".", cents, -'"'.
 -euros: digit+.
 -cents: digit, digit.
 -digit: ["0"-"9"].

Which gives

<amount>11.00</amount>

or

<amount>-6.25</amount>

The XML

<bank>
   <entry>
      <date>2016-12-30</date>
      <type>other</type>
      <name>The Movies Art House AMSTERDAM</name>
      <from>NL80INGB1234567890</from>
      <to/>
      <code>BA</code>
      <amount>-11.00</amount>
      <sort>Payment terminal</sort>
      <description>Card sequence no.: 009 29/12/2016 18:21 Transaction: 90B7V1 Term: FWG77T Value date: 30/12/2016</description>
   </entry>
   <entry>
      <date>2016-12-29</date>
      <type>other</type>
      <name>DIJKMAN B.V. MUZIEK AMSTERDAM</name>
      <from>NL80INGB1234567890</from>
      <to/>
      <code>BA</code>
      <amount>-99.00</amount>
      <sort>Payment terminal</sort>
      <description>Card sequence no.: 009 28/12/2016 15:59 Transaction: 48X1S3 Term: 465101 Value date: 29/12/2016</description>
   </entry>
   <entry>
      <date>2016-12-27</date>
      <type>other</type>
      <name>CCV*FOODHALLEN AMSTERD AMSTERDAM</name>
      <from>NL80INGB1234567890</from>
      <to/>
      <code>BA</code>
      <amount>-6.25</amount>
      <sort>Payment terminal</sort>
      <description>Card sequence no.: 009 24/12/2016 17:56 Transaction: 84P101 Term: CT449077 Value date: 27/12/2016</description>
   </entry>
</bank>

Credit card account

While my bank will gladly give me downloadable statements for my current account, for some reason it doesn't for my credit card, even though it doesn't send paper statements for that either.

All it does is display my transactions on the screen.

The Display

The Credit Card Display

The solution: ixml

I can select all this text from the screen, and copy it as text to a file.

Then I can use ixml to transform it to structured data.

Text

Transactions current period
2024
Today
EUR

Kobo Software Ireland D02T380 IE
−5.49
22 October
EUR

THEATRE ROYAL HAYMARKE LONDON GBR
−214.62
10 October
EUR

SP THEPHONESHOPBE JETTE BEL
−1,199.00
Period of 5 Sept 2024 to 4 Oct 2024
Opening balance for this period:
0.00
Transaction total:
−289.33
Monthly repayment:
289.33
Closing balance for this period:
0.00
2024
4 October
EUR

AFLOSSING
289.33
3 October
EUR

OTT* NT AT HOME LONDON GBR
−11.39
2 October
EUR

Google Payment IE LTD Dublin IRL
−14.99
24 September
EUR

TEAPIGS BRENTFORD GBR
−140.56
23 September
EUR

The Pilgrim Hotel London GBR
−4.38
22 September
EUR

The Pilgrim Hotel London GBR
−9.18
18 September
EUR

TRIPIT REDMOND USA
−44.99

The ixml: top level

So let's write some ixml for this. The top level is the current period, followed by a number of earlier periods.

Both contain a number of days.

     cc: current, period*.

current: -"Transactions current period", -#a,
         day*.

 period: -"Period of ", from, -" to ", to, -#a,
         opening, total, repayment, closing,
         day*.

   from: -date.
     to: -date.

The ixml: detail

This is mostly uninteresting stuff, since the opening and closing balances are always zero, and the repayment amount is in the transactions anyway.

  opening: -"Opening balance for this period:", -#a, -amount, -#a.
  closing: -"Closing balance for this period:", -#a, -amount, -#a.
repayment: -"Monthly repayment:", -#a,               -amount, -#a.
    total: -"Transaction total:", -#a,               -amount, -#a.

The ixml: days

The interesting stuff is in the days. Each day has the date and a number of transactions (the currency is always EUR):

        day: date, -#a, (-"EUR", -#a)?, transaction+.
transaction: -#a, payee, -#a, amount, -#a.
      payee: ~[#a]*.

The ixml: amounts

About the only impressive thing about this mess is that they do acknowledge that negative numbers exist, and (correctly) use #2212 as minus sign, which I replace with hyphen.

They use commas to separate thousands, and point for the decimal separator:

amount: (-#2212, +"-")?, (digit; -",")+, ".", digit, digit.

The ixml: dates

Ugh. Sometimes the year is before, sometimes after, sometimes not at all. Sometimes it's the year and the word "Today":

date: y, -#a, (d, -" ", m; "Today"); 
      d, -" ", m, (-" ", y)?.
   d: digit, digit?.

Month names are either written in full, or as 3 letters, with the exception of Sept...

m: "January"; "February"; "March"; "April";
   "May"; "June"; "July"; "August"; "September";
   "October"; "November"; "December";
   "Jan"; "Feb"; "Mar"; "Apr"; "May"; "Jun";
   "Jul"; "Aug"; "Sept"; "Oct"; "Nov"; "Dec".
y: digit, digit, digit, digit.

-digit: ["0"-"9"].

The XML

<cc>
   <current>
      <day>
         <date>
            <y>2024</y>Today</date>
         <transaction>
            <payee>Kobo Software Ireland D02T380 IE</payee>
            <amount>-5.49</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>22</d>
            <m>October</m>
         </date>
         <transaction>
            <payee>THEATRE ROYAL HAYMARKE LONDON GBR</payee>
            <amount>-214.62</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>10</d>
            <m>October</m>
         </date>
         <transaction>
            <payee>SP THEPHONESHOPBE JETTE BEL</payee>
            <amount>-1199.00</amount>
         </transaction>
      </day>
   </current>
   <period>
      <from>
         <d>5</d>
         <m>Sept</m>
         <y>2024</y>
      </from>
      <to>
         <d>4</d>
         <m>Oct</m>
         <y>2024</y>
      </to>
      <opening>0.00</opening>
      <total>-289.33</total>
      <repayment>289.33</repayment>
      <closing>0.00</closing>
      <day>
         <date>
            <y>2024</y>
            <d>4</d>
            <m>October</m>
         </date>
         <transaction>
            <payee>AFLOSSING</payee>
            <amount>289.33</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>3</d>
            <m>October</m>
         </date>
         <transaction>
            <payee>OTT* NT AT HOME LONDON GBR</payee>
            <amount>-11.39</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>2</d>
            <m>October</m>
         </date>
         <transaction>
            <payee>Google Payment IE LTD Dublin IRL</payee>
            <amount>-14.99</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>24</d>
            <m>September</m>
         </date>
         <transaction>
            <payee>TEAPIGS BRENTFORD GBR</payee>
            <amount>-140.56</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>23</d>
            <m>September</m>
         </date>
         <transaction>
            <payee>The Pilgrim Hotel London GBR</payee>
            <amount>-4.38</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>22</d>
            <m>September</m>
         </date>
         <transaction>
            <payee>The Pilgrim Hotel London GBR</payee>
            <amount>-9.18</amount>
         </transaction>
      </day>
      <day>
         <date>
            <d>18</d>
            <m>September</m>
         </date>
         <transaction>
            <payee>TRIPIT REDMOND USA</payee>
            <amount>-44.99</amount>
         </transaction>
      </day>
   </period>
</cc>

XForms Code

The code is simple: we repeat over the top level elements (either current or period), provide a heading, then repeat over the days, and in the days, repeat over the transactions:

<repeat ref="*">
   <label class="period">
      <output value="if(from, concat(from/y, '-', from/m, '-', from/d, ' to ', to/y, '-', to/m, '-', to/d), 'Current')"/>
   </label>
   <repeat ref="day">
      <output ref="d"/> <output ref="m"/>
      <repeat ref="transaction">
         <output class="amount" ref="amount"/> <output class="payee" ref="payee"/>
      </repeat>
   </repeat>
</repeat>

Demo

Conclusions

ixml makes textual data available to the XML pipeline as structured data

ixml is being added to XPath as a function

Which means that ixml will shortly automatically be available to XForms