Lynda's view of the current state of our research
25 Jan 2001, edited 19 Feb 2001, edited 8 May 2001

This piece is meant to give an insight into where I think the group's research is and where it is going. It is meant to be broad and not go into any particular topic in detail. It is a snapshot of where we are, and it will be interesting in a year or so to see where we went.

The sections are based on the figure of 5 layers (left-hand side; the layers are processing steps) and 4 experts (right-hand side; each expert is a knowledge repository). Layers: communication means, communicative devices, qualitative constraints, quantitative constraints, final presentation. Experts: application, user, design, context. Background information can be found in the WWW10 paper, the HT 00 paper and Jacco's HT 01 Semantic Web submission (correction, rejection :-( ).

Discourse model
---------------

We need to understand what the discourse model layer is, both in terms of what the structures of the layer should be (e.g. is RST a good idea or not), and in terms of how it fits in with the annotated database items (see Application Expert).

We need a (currently manually created) connection between the domain-specific ontology and the communicative devices. (There is a parallel here with style sheets between an XML document and a final presentation, where the ontology is the "XML document", the communicative device is the "final presentation" and the rhetorical mapping is the "style sheet". One question is: can we ever expect the mapping from the domain-specific ontology to the communicative devices to be done automatically?)

The information about how to communicate the content is something like the pedagogical style in learning environments. It may be a domain-independent (i.e. not domain-specific) ontology (or several domain-independent ontologies), or it may not. If it isn't an ontology, then could it be an XML schema? I think the question I am asking is whether this "rhetorical mapping" layer should be information rich, and processable in its own right, or whether it should be somewhat lighter, such as a style sheet, and just describe a simple mapping. I think we need to try things out before we can really answer this.

An article by Boll et al. (see leesklub 7th February 2001) assumes a consistent storyline and replaces media items in a document with semantically similar ones (where "semantically similar" can also depend on the context). They do not take into account, nor even ask the question, whether the "discourse roles" of the substitute media items are equivalent.

I think the TUE group have an RMM notion, with database slices. We have instead a fragment of the ontology, and the "presentation rules" (is this the correct RMM term?) are not just hypertext pages but include space, time and links. I think they go from the domain-specific ontology directly to the presentation and miss out the rhetorical part.

Communicative devices
---------------------

The processing layer needs to select appropriate communicative devices, but a lot has to happen before we get that far. A lot of communication needs to happen between this process and the Design Expert (including the Network Expert) and the User Expert (including the Context Expert), and then a decision is made as to which communicative devices are going to be used. Here perhaps we need to know what the options are for the individual communicative devices and how to take into account the information in the experts.
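As a toy illustration of both points - the "style sheet" reading of the rhetorical mapping, and a device choice that consults the experts - here is a minimal sketch in Python. Every name in it is invented and nothing like this is implemented; it deliberately sits at the "simple mapping" end of the spectrum, where the information-rich alternative would replace the table with something processable in its own right.

    # A toy "rhetorical style sheet": RST-like discourse relations (invented
    # names) mapped to communicative devices, themselves placeholders for
    # combinations of links, spatial and temporal layout.
    RHETORICAL_MAPPING = {
        "elaboration": "spatial-side-by-side",  # show both items together
        "sequence":    "temporal-slideshow",    # present items one after another
        "contrast":    "spatial-side-by-side",
        "background":  "link-to-secondary",     # put the detail behind a link
    }

    def select_device(relation, design_expert, user_expert):
        """Pick a device for a discourse relation, letting the experts veto
        the default. That each expert reduces to an allows() test is pure
        invention for the sake of the sketch."""
        device = RHETORICAL_MAPPING.get(relation, "link-to-secondary")
        if not (design_expert.allows(device) and user_expert.allows(device)):
            device = "link-to-secondary"  # fall back to the weakest device
        return device

    class PermissiveExpert:
        """Stand-in expert that accepts any device, just to run the sketch."""
        def allows(self, device):
            return True

    print(select_device("sequence", PermissiveExpert(), PermissiveExpert()))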
We think we know what the communicative devices are - combinations of links, spatial and temporal layout. They seem suspiciously like hypermedia design patterns. Hopefully Susanne and I can do some investigation in this direction. Franca Garzotto is already involved with some work in this direction, and has colleagues in Lugano. She has already given me a pointer to a design pattern collection: http://www.designpattern.lu.unisi.ch

Research questions here: to what extent are existing design patterns suitable for our time-based world (link-based and spatial ones should already exist and be applicable)? Can we then encode them and have them usefully incorporated in the generation system? What sort of encoding? What sort of decision making is needed to select among them? Do we need to find other, more time-based ones (looking to the worlds of film, TV, or even radio)? Can we already take some existing things and add them to the software framework? (No programming time?) Should I first try to write some things down and then look at how to express them?

Sources for ideas on potential communicative devices are the Scott McCloud books and the Tufte book. I'd like to see if I can at least write down some ideas in English. (Maybe they won't have the status of patterns, since they are not common solutions to a problem, but it should be interesting trying to write them down.) (I am currently working on this.)

Qualitative (and quantitative) constraints
------------------------------------------

I'm not sure what to say about this, other than: read the WWW10 paper. Hopefully this is Joost's master's work, with joint supervision by Krzysztof Apt. Does he need other help? CWI? NL? International?

Generating final form presentation
----------------------------------

So we generate SMIL. We should also be able to generate WML, or something else that is different. We can also generate annotated SMIL, i.e. SMIL+RDF. MPEG-4 has also been suggested.

An example of where adding semantics to the generated presentation is useful is the following. When generating presentation-level links there are two sorts - one sort where the information is less related, and another sort where the items should have been on the screen together, but there was no space. This is where adding annotation to the generated document can be very useful for later processing - e.g. all the hub/authority algorithms need to know why a link was there, although they would need to understand the annotation (which can be looked up at the URL of the ontology, right?). There are (at least) two types of knowledge that can be included - the content-based ontology and the communicative means ontology (if it is an ontology). This then gets fairly close to the Boll et al. paper on including annotated media elements in a document. See Mike Uschold's paper on a categorisation of ontologies and what they can be used for (the KAW99 paper).

Application expert
------------------

This is where we get to the ontology work. There are 4 separate things: a domain-specific ontology; annotations from this ontology attached to the media items; a multimedia-specific ontology (video has scenes, scenes have sequences); and some way of finding relevant media items.

We need a domain-specific ontology (we don't want to create this ourselves). The media items in the database need to be annotated with terms from it. Again, we don't want to do this ourselves. It would also be preferable if the annotations were attached within the media, and not just to complete objects.
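To fix ideas, a minimal sketch of annotated media items and concept-based selection (Python; the items, terms and structure are all invented, and real annotations would presumably be RDF statements from the domain-specific ontology rather than bare sets of terms):

    # Toy annotated media database: each item carries concept terms from a
    # domain-specific ontology (the example terms are made up).
    MEDIA_DB = [
        {"id": "img-042", "type": "image", "concepts": {"lion", "savanna"}},
        {"id": "vid-007", "type": "video", "concepts": {"lion", "hunting"}},
        {"id": "txt-118", "type": "text",  "concepts": {"lion", "taxonomy"}},
    ]

    def find_relevant(query_concepts, media_db=MEDIA_DB):
        """Return items whose annotations overlap the user's query, which is
        itself phrased in terms of the domain-specific ontology. How items
        are actually found is hidden behind this interface - we only care
        what they are about."""
        return [item for item in media_db
                if item["concepts"] & set(query_concepts)]

    print(find_relevant({"hunting"}))  # just vid-007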
As well as the domain-specific ontology we may also need a multimedia ontology, describing the composition of the individual media. The annotated items are going to be used for information retrieval as well as for presentation generation. (A WWW10 article by Ronny Lempel and Aya Soffer, "PicASHOW: Pictorial Authority Search by Hyperlinks on the Web", shows that you don't need ontologies to do image retrieval, and that successful tools can be built on top of text-search engines using link analysis. I guess we don't mind - we need to know what the items are about to start with, but _how_ they are found we shouldn't have to care about.)

I think the user query should be defined in terms of the domain-specific ontology. How that is done may be through a sort of selection interface (I have an animal classification CD where the user can browse through the categories), or via a typed-in query, or some combination in the background of following the user's browsing. Having found the relevant images, texts, audio(?) and video(?), we want to combine them in a presentation. We need the domain-specific ontology.

More details on multimedia and ontologies are in the blue note [[cross-reference to be created]] "Relationship between ontologies and multimedia". How does the application expert relate to all the ontologies mentioned there? (There is a one-sheet talk for the VU which sort of shows the application expert and its relationship with the annotations and the MM DB: http://www.cwi.nl/~media/semantics/GenerationVU.pdf)

Question - do we need a multimedia-specific ontology? Or do we just think we do? Can we write a sensible argument either way? Note that Jane Hunter's paper at WWW10 is extremely relevant for this (leesklub 9 May 2001). She suggests combining RDF and XML Schema, using each for representing what it is best at.

User expert
-----------

I don't think we really want to research this, but we do need to get some user models from somewhere. Can Susanne help? We need to know what needs to be in the user model, and how it is used in the various layers of the generation process. I presume the variance will take place in the "upper" processing levels, such as using different communication means or communicative devices. It would be nice if some user characteristics applied more to one layer than another.

Design expert
-------------

The Design Expert needs to know about different users, different hardware and different network conditions. It also needs to be a repository for design information, such as different genres and styles. It may be the place to store the collection of communicative devices we should be developing. Perhaps they need to be stored here, since we may want to use different collections of communicative devices for different hardware, different users or different available network bandwidths.

Perhaps we also need a Network Expert - an interface to the network that can be used by different parts of the processing chain. For example, we may want to build large parts of the presentation knowing that there will be an approximately consistent bandwidth available, but at the final stage small variations are needed to send the video at the most appropriate bandwidth.

We need some sets of design rules, based on hypermedia design patterns, and more. Also layout techniques, and even film techniques. Since we don't have many years of time-based hypermedia experience, we'll just have to try to capture what we can find.
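For the possible Network Expert, a minimal sketch of the "one interface, consulted at different stages of the chain" idea (Python; the class, method names and numbers are all invented):

    # Toy Network Expert: the early stages ask a coarse question, the final
    # stage a fine-grained one, against the same measured state.
    class NetworkExpert:
        def __init__(self, measured_kbps):
            self.measured_kbps = measured_kbps

        def bandwidth_class(self):
            """Coarse answer for building the bulk of the presentation
            against an approximately consistent bandwidth."""
            return "low" if self.measured_kbps < 128 else "high"

        def pick_video_variant(self, variants_kbps):
            """Final-stage answer: the highest-bandwidth video variant that
            still fits what is currently measured."""
            fitting = [v for v in variants_kbps if v <= self.measured_kbps]
            return max(fitting) if fitting else min(variants_kbps)

    net = NetworkExpert(measured_kbps=300)
    print(net.bandwidth_class())                   # "high"
    print(net.pick_video_variant([64, 256, 512]))  # 256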
Perhaps I need to draw some diagrams of the information flow between the communicative device layer and the design and user experts (and the network expert?).

Context expert
--------------

I'm not too interested in this. It should be a history box; we will find out whether we need it or not, and once we know that, we can condense knowledge out of it. No doubt this will feed the development of the user's online persona... Similar to the User Expert - we want to use it but not build it. Fits in nicely with work going on at the TUE - Hong Ying and Alexandra Cristea starting in May.

-----

The work of the group

So where do people fit in here?

Michèle is looking at integrating multimedia and semantic markup formats. Not part of the processing per se, but a tool to be used within the process. He is also looking at integrating multiple MM information sources within the environment. Definitely part of the Application Expert.

FrankN is looking at trying to extract low-level characteristics of (video) data and then combine them into higher-level descriptions of the material. This is partly to do with annotating the media (Application Expert) and partly to do with the Design Expert (what are the things we can talk about in the Design Expert?). Stephane is interested in the consequences of such an approach for high-level authoring issues.

Jacco - RDF/XML Schema issues
Joost - qualitative constraints
Lloyd - XML fragments for MMM 01
Lynda - make design knowledge explicit
Saurabh - network adaptivity
Susanne - user profiles

-----

Miscellaneous notes

Don't forget that what we are doing (to some extent) is automating the stages of HDM.

The Optima system (Lambert Schomaker) follows the user's browsing and uses it to build up the user expert and context expert. Would be nice to see it working.

---***---