	      Relationship between ontologies and multimedia
	      ----------------------------------------------

				   Lynda and Jacco

This version 5 June 2001
Original version, 29 January 2001
Jacco's notes (cvs: papers/semweb-agenda/draft.txt) from 16 January 2001


This is based on the talk I gave at the Luxembourg Semantic Web workshop.
The trip report for the workshop is at
http://www.cwi.nl/~media/trip-reports/semweb00/semweb-11-00.txt

The slides for the talk are in
~lynda/Talks/SemWeb00/SemWeb00.pdf

This is based on the red boxes in the talk - basically how multimedia
introduces problems in the "standard" world of text-based documents which
will soon include semantic mark-up, which in turn needs ontologies.


Multimedia on the Semantic Web: existing infrastructure
=======================================================

Multimedia and XML

 SMIL is basically about creating marked-up multimedia documents for
 the Web.  XML is the underlying syntax definition, SMIL specifies
 multimedia specifics on top of this.

Multimedia and XPointer/XPath

 XPointer and XPath allow parts of XML documents to be referred to.
 In general, models and syntaxes for addressing document fragments are
 functions of the document's MIME type.  A way is needed for
 addressing fragments of multimedia documents marked up in XML
 (e.g. SMIL) and non-XML formats.  While a pointer language for
 XML-based documents could be based on XPointer, it will need
 extension to support the multimedia-specific aspects of the target
 documents, such as time-sensitivity.  (See Lloyd's MMM01
 submission, http://www.cwi.nl/~media/blue_book/LinkConstr.html.)
 For non-XML formats (e.g. MPEG and other streamed media formats), new
 pointer languages need to be developed.

 Pointing languages are not only useful for hyperlinking and building
 documents out of pieces of multimedia (e.g. SMIL referring to video
 clips), but also for relating ontology instances to multimedia data.
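
 As a rough illustration of what a time-aware extension might look
 like, the sketch below pairs an ordinary XPath selection with a time
 interval.  The TemporalPointer class, the element ids and the file
 names are all hypothetical; this is not XPointer syntax, only one
 possible reading of "XPointer plus time".

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# A tiny SMIL-like document; element ids and file names are made up.
SMIL = """
<smil>
  <body>
    <par>
      <video id="interview" src="interview.mpg" begin="0s" dur="60s"/>
      <audio id="music" src="music.mp3" begin="5s" dur="30s"/>
    </par>
  </body>
</smil>
"""


@dataclass(frozen=True)
class TemporalPointer:
    """Hypothetical pointer: which element, plus which part of its timeline."""
    element_id: str
    start: float  # seconds, relative to the start of the element itself
    end: float


def seconds(value: str) -> float:
    """Parse the simple '12s' clock values used in this sketch only."""
    return float(value.rstrip("s"))


def resolve(root: ET.Element, ptr: TemporalPointer):
    """Return (src, clip_start, clip_end), clamped to the element's duration."""
    # The structural step is ordinary XPath, as supported by ElementTree.
    elem = root.find(f".//*[@id='{ptr.element_id}']")
    if elem is None:
        raise LookupError(f"no element with id {ptr.element_id!r}")
    dur = seconds(elem.get("dur"))
    # The temporal step is the assumed extension: clamp to [0, dur].
    start = max(0.0, min(ptr.start, dur))
    end = max(start, min(ptr.end, dur))
    return elem.get("src"), start, end


root = ET.fromstring(SMIL)
clip = resolve(root, TemporalPointer("music", 10.0, 45.0))
print(clip)  # ('music.mp3', 10.0, 30.0)
```

 The requested interval runs past the end of the audio element, so
 the resolved clip is cut off at 30 seconds.  A real pointer language
 would also have to decide whether times are relative to the element
 or to the presentation as a whole.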

Multimedia and RDF(S)

 RDF(S) defines a number of primitives which allow domain-specific
 information to be created.  RDFS, while more expressive than RDF, is
 still insufficient for specifying many standard aspects of
 ontologies.  It also lacks the formal underlying semantics that is
 needed to build the necessary tools (e.g. inference engines and
 automated checkers).
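
 To make the RDF(S) primitives concrete, here is a minimal sketch
 that holds statements as plain (subject, predicate, object) tuples
 and follows rdfs:subClassOf by hand.  All ex: names are invented for
 illustration, and the inference rule is hard-coded in the program
 rather than derived from a formal semantics.

```python
# Triples as plain (subject, predicate, object) tuples; ex: names are made up.
TRIPLES = {
    ("ex:Video", "rdfs:subClassOf", "ex:MediaItem"),
    ("ex:NewsClip", "rdfs:subClassOf", "ex:Video"),
    ("ex:clip42", "rdf:type", "ex:NewsClip"),
}


def superclasses(cls, triples):
    """All classes reachable from cls via rdfs:subClassOf (transitively)."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(resource, triples):
    """Asserted types of a resource plus those implied by the class hierarchy."""
    direct = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return direct | inferred


print(sorted(types_of("ex:clip42", TRIPLES)))
# ['ex:MediaItem', 'ex:NewsClip', 'ex:Video']
```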

Multimedia and DAML+OIL

 DAML+OIL is built on top of RDF schema.  It provides a number of
 extra primitives for expressing ontologies, while maintaining the
 formal semantics that allow "cheap" reasoning services.

 Within the world of SMIL, DAML+OIL can already be used to annotate
 SMIL 2.0 multimedia documents (see the example at
 http://www.cwi.nl/~media/semantics/smilexample.html), but a number of
 fundamental problems remain.  These are listed below.


Multimedia and ontologies
=========================

We want to combine multimedia and ontologies in a number of steps.
First, we assume that an appropriate domain-specific ontology
already exists;
we then want to associate parts of it with (parts of) media items;
we also want to develop multimedia-specific ontologies (video has
scenes, scenes have sequences);
and finally we want to use all this information to find relevant media
items and combine them into integrated presentations.
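
The steps above can be sketched in a few lines.  The Segment class,
the ex: concept names and the example video are all assumptions made
for illustration; the structure follows the video/scene/sequence
example, and domain concepts are attached to parts of the media item.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A part of a media item; 'kind' follows the video/scene/sequence example."""
    kind: str
    name: str
    concepts: set = field(default_factory=set)  # terms from a domain ontology
    parts: list = field(default_factory=list)   # nested sub-segments


def find(segment, concept):
    """Yield every (sub)segment annotated with the given domain concept."""
    if concept in segment.concepts:
        yield segment
    for part in segment.parts:
        yield from find(part, concept)


video = Segment("video", "news-2001-06-05", parts=[
    Segment("scene", "studio", {"ex:Anchor"}, parts=[
        Segment("sequence", "intro", {"ex:Anchor", "ex:Headline"}),
    ]),
    Segment("scene", "report", {"ex:Election"}),
])

hits = [s.name for s in find(video, "ex:Anchor")]
print(hits)  # ['studio', 'intro']
```

The list of hits is the raw material for a presentation: the segments
found can then be ordered and combined.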


Multimedia presentation ontologies

 In addition to the domain ontologies that are needed to specify the
 subject of a specific multimedia document, we also need ontologies
 that describe the fundamental properties of the multimedia
 presentation itself, independently of the subject-matter domain.  In
 order to be able to create ontologies suitable for (time-varying)
 multimedia, we need to add concepts such as time to the underlying
 ontology language.  If we include a "continuous space", would that
 be sufficient?  (I.e. if you can describe axes with quantitative
 values, is that enough - probably, since HyTime already stopped at
 finite coordinate spaces.)  For most other purposes, current
 "symbolic" features would probably suffice.  Note that this is also
 closely related to the goals of the MPEG-7 effort.

 Jane Hunter
 Adding Multimedia to the Semantic Web - Building an MPEG-7 Ontology
 Submitted to Semantic Web workshop, August 2001
 http://archive.dstc.edu.au/RDU/staff/jane-hunter/semweb/paper.html
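
 The question of whether axes with quantitative values are enough
 can be explored with a small sketch.  Here a temporal or spatial
 extent is simply an interval on a named axis, and relations such as
 "overlaps" and "before" are derived from the numbers.  The Extent
 class and the axis names are hypothetical and are not taken from
 HyTime or MPEG-7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A half-open interval [start, end) on one named quantitative axis."""
    axis: str     # e.g. "time" in seconds, or "x" in pixels
    start: float
    end: float

    def overlaps(self, other: "Extent") -> bool:
        """True if both extents lie on the same axis and share some range."""
        return (self.axis == other.axis
                and self.start < other.end
                and other.start < self.end)

    def before(self, other: "Extent") -> bool:
        """True if this extent ends no later than the other begins."""
        return self.axis == other.axis and self.end <= other.start


speech = Extent("time", 0.0, 12.5)
caption = Extent("time", 10.0, 15.0)
credits = Extent("time", 15.0, 20.0)

print(speech.overlaps(caption))  # True
print(caption.before(credits))   # True
```

 Symbolic relations of this kind could then be stated in the ontology
 language, while the numeric comparison stays outside it.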

Annotation techniques for multimedia items and documents

 How do we attach the instances of an ontology to the media items in a
 stored multimedia document?  How do we attach instances to objects in
 live media streams?  How do we make sure transferring and processing
 of metadata does not degrade the quality of service of multimedia
 applications?  Can we stream ontologies and their instances?  This
 would require development of new, streamable ontology languages and
 inference tools that support the required incremental reasoning.
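
 A very small sketch of the incremental style of processing this
 would need: annotation events arrive one at a time, and the set of
 concepts that currently apply is updated after each event, without
 waiting for the whole stream.  The event format (time, action,
 concept) and the ex: names are assumptions made for this example.

```python
def active_annotations(events):
    """Consume (time, action, concept) events one by one.

    Yields (time, active concepts) after each event, so a consumer
    never needs the complete stream in memory.
    """
    active = set()
    for time, action, concept in events:
        if action == "start":
            active.add(concept)
        elif action == "end":
            active.discard(concept)
        yield time, frozenset(active)


stream = iter([
    (0.0, "start", "ex:Anchor"),
    (4.0, "start", "ex:Headline"),
    (9.0, "end", "ex:Headline"),
])

for t, concepts in active_annotations(stream):
    print(t, sorted(concepts))
```

 Real incremental reasoning would have to do much more than maintain
 a set, but the shape of the interface (one event in, one updated
 state out) would be similar.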

Annotating current ("legacy") multimedia documents

 To what extent can and should we include annotations in the
 multimedia file itself?  For example, it is possible to include RDF
 in a JPEG 2000, a SMIL or an SVG file.  The first seems like a
 hack, but the other two already seem more sensible.  When does it pay
 off to integrate annotations into the delivery format and when is it
 better to keep the two separate?

 Note that this is in essence the same problem as with hypertext links:
 should they be stored within the document itself, separately in a
 central server, or separately in a more scalable, but more complex,
 distributed fashion?  The same applies to ontologies used in
 text-based documents.
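
 The two options can be compared in a small sketch.  The SVG
 fragment below carries RDF inside its metadata element, and the
 dictionary EXTERNAL holds the same statement outside the file.  The
 file name logo.svg, the id and the title are invented, and the
 external store is only a stand-in for a separate annotation server.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Option 1: the annotation travels inside the delivery format.
SVG = f"""
<svg xmlns="{SVG_NS}">
  <metadata>
    <rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
      <rdf:Description rdf:about="#logo">
        <dc:title>Institute logo</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
  <circle id="logo" r="10"/>
</svg>
"""

# Option 2: the same statement kept apart from the file (hypothetical store).
EXTERNAL = {"logo.svg#logo": "Institute logo"}


def embedded_titles(svg_text):
    """Map each rdf:about value to its dc:title, read from the SVG metadata."""
    root = ET.fromstring(svg_text)
    path = f"{{{SVG_NS}}}metadata/{{{RDF_NS}}}RDF/{{{RDF_NS}}}Description"
    titles = {}
    for desc in root.iterfind(path):
        about = desc.get(f"{{{RDF_NS}}}about")
        titles[about] = desc.findtext(f"{{{DC_NS}}}title")
    return titles


print(embedded_titles(SVG))  # {'#logo': 'Institute logo'}
```

 Note the difference in keys: the embedded annotation can use a
 document-relative reference, while the external one has to name the
 file as well as the fragment.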

Merging and combining ontology fragments

 This applies to the use of ontologies on the Web in general, and
 seems, to a large extent, still to be an unsolved problem.  In
 multimedia, however, the problem is encountered almost immediately.
 We need to be able to annotate a single media item while referring to
 different ontologies.  When needed, we also want to include only the
 necessary parts of the ontology in the description.
 There are two problems:
 extracting a fragment of an ontology - there is as yet no way of
 specifying ontology "modules".  You can use either a single
 expression, or refer to the complete ontology;
 combining multiple ontologies - this is being worked on by Jane Hunter.

 Jane Hunter, Carl Lagoze
 Combining RDF and XML Schemas to Enhance Interoperability Between Metadata
 Application Profiles
 http://www10.org/cdrom/papers/572/index.html
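
 The two problems can be illustrated with a naive sketch.  It treats
 an ontology "module" as everything reachable from a seed term, which
 is only one possible definition and an assumption made here; it is
 not the approach of the paper cited above.  The art: and mm:
 ontologies and the ex: names are invented.

```python
def extract_fragment(triples, seed):
    """Triples reachable from seed by following subject -> object links."""
    fragment, todo, seen = set(), [seed], {seed}
    while todo:
        node = todo.pop()
        for s, p, o in triples:
            if s == node:
                fragment.add((s, p, o))
                if o not in seen:
                    seen.add(o)
                    todo.append(o)
    return fragment


# Two independent ontologies, kept apart by their (made-up) prefixes.
ART = {
    ("art:Portrait", "rdfs:subClassOf", "art:Painting"),
    ("art:Painting", "rdfs:subClassOf", "art:Artwork"),
    ("art:Sculpture", "rdfs:subClassOf", "art:Artwork"),
}
MEDIA = {
    ("mm:Scene", "mm:partOf", "mm:Video"),
}

# Annotate one media item with terms from both ontologies,
# including only the parts of each ontology that are needed.
description = (
    {("ex:scene7", "rdf:type", "mm:Scene"),
     ("ex:scene7", "ex:depicts", "art:Portrait")}
    | extract_fragment(ART, "art:Portrait")
    | extract_fragment(MEDIA, "mm:Scene")
)

print(len(description))  # 5
```

 The statement about art:Sculpture is left out, because it cannot be
 reached from art:Portrait.  Combining the fragments is trivial here
 only because the prefixes keep the terms apart; relating terms that
 overlap in meaning across ontologies is the hard part.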