Old/New SampLe

Author: Katya
Date: 12 November 2004

The upcoming research will iteratively refine the SampLe knowledge structures and processes until I can show, in practice or at least in theory, that they can model and realize all the processes presented in the overall SampLe vision.

Content:

The overall SampLe vision

SampLe vs. DISC

Presentation structure life cycle in SampLe

Old model rationale

New model rationale

Common data structures issue

Switching between steps in SampLe

Storage and reuse of new presentation structures

How to fit collected material into a genre?

How to define correspondence of the presentation structure to a particular genre?

Context annotations in SampLe

Extension schema (a la DISC)?

1. The overall SampLe vision

SampLe (Semi-Automatic Multimedia Presentation generation Environment) is an authoring system for creating multimedia presentations. The idea behind it is that the system facilitates time-consuming actions and supports creative ones. The SampLe methodology consists of 3 major stages: stage 1 Topic selection, stage 2 Genre selection and adaptation, stage 3 Material collection and arrangement. A presentation creator (an author) should be able to approach the process of presentation building in the way most suitable to his/her needs. Since we cannot predict what type of user is currently working with the system, we try to provide all possible options. An analysis of the possible workflows among these 3 stages showed that only 2 workflows are possible, since choosing or manipulating a certain genre assumes its application to a certain topic/character.
The first type of workflow is the one suggested by the current demo, i.e. the one where:
workflow 1:

stage 1 an author first chooses a topic of the presentation during the exploration process
stage 2 then s/he selects a genre, one of available variations of it [and then adapts the chosen variation to his/her needs]
stage 3 finally an author (a) collects media items and (b) orders them

The second type of workflow is the one where an author marks material s/he wants to use in the presentation while browsing, and then tries to fit it into a certain genre structure:
workflow 2:

stage 3(a) an author explores a knowledge space to get an idea about what kind of presentation s/he wants to build; while exploring the space the author marks media items s/he would like to use in his/her presentation
stage 1 s/he decides on the most appropriate topic of the presentation
stage 2 the system suggests a genre that fits the chosen material
stage 3(b) final arrangement of the media items inside each presentation section is done
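
The two workflows can be written down as ordered lists of stages. The sketch below is only an illustration: the stage labels and the helper genre_follows_topic are hypothetical names, and the check encodes just the one constraint stated above, namely that genre selection presupposes a topic.

```python
# Hypothetical stage labels: "3a" = collecting media items, "3b" = ordering them.
WORKFLOW_1 = ["1", "2", "3a", "3b"]
WORKFLOW_2 = ["3a", "1", "2", "3b"]


def genre_follows_topic(workflow):
    """Return True if genre selection (stage 2) comes after topic selection (stage 1)."""
    return workflow.index("1") < workflow.index("2")


for name, workflow in (("workflow 1", WORKFLOW_1), ("workflow 2", WORKFLOW_2)):
    print(name, ":", " -> ".join(workflow), "| valid:", genre_follows_topic(workflow))
```

An ordering that places stage 2 before stage 1 would fail this check, which is the reason given above for excluding such workflows.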

Created presentations should be stored in the system and serve as a reference for new users as well as being a main force for enriching discourse rules and variations of genres. Additionally all the created presentations can be published on the web with explicit semantics.

2. SampLe vs. DISC

In general, each system that creates a presentation has to address the question of what a genre (story) template is and how it is represented and used by internal knowledge structures. Both DISC and SampLe contain templates and, at certain stages of the presentation building process, pursue similar or identical goals. Therefore I find it important to discuss the representation and authoring of templates here, in order to understand in detail whether they are the same (then I can reuse them and maybe the whole DISC methodology for automatically building initial presentation structures), slightly different (then I have to define what the differences are, why they exist, and which similar parts I could reuse), or completely different (then I can reference DISC as related work and move forward without it).

On the conceptual level the idea of DISC is not to have templates but to have flexible rules (evolving templates) that allow a story line to be built iteratively. In this way a story line is an evolving path through the knowledge space, which emerges from the exploration of this space. Practically, DISC combines templates and their extension rules in one ontology. The process of building a story switches back and forth between genre-related knowledge and domain knowledge:

biography->PrivateLife narrative unit->MakeSpouse narrative rule->isMarried->Saskia->NewMainCharacter=Saskia->NewGenre=biography...
->MakeOffspring narrative rule->hasChild

Since a new character in DISC can initiate a new story line with a new genre, there is no genre consistency through the story.
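
The alternation between genre knowledge and domain knowledge can be illustrated with a small sketch. It is not DISC's implementation: the data layout, the function build_story, the placeholder names MainCharacter and SomeChild, and the max_depth limit are all assumptions made for illustration, and the PrivateLife narrative unit level is left out for brevity.

```python
# Hypothetical toy data. Only Saskia, MakeSpouse, MakeOffspring, isMarried and
# hasChild come from the chain above; all other names are placeholders.
DOMAIN = {
    ("MainCharacter", "isMarried"): ["Saskia"],
    ("MainCharacter", "hasChild"): ["SomeChild"],
}

# Genre knowledge: narrative rule -> (domain relation, genre for the new character).
NARRATIVE_RULES = {
    "biography": {
        "MakeSpouse": ("isMarried", "biography"),
        "MakeOffspring": ("hasChild", "biography"),
    }
}


def build_story(character, genre, max_depth):
    """Alternate between narrative rules and domain relations, one level at a time."""
    story = {"character": character, "genre": genre, "children": []}
    if max_depth == 0:
        return story
    for rule, (relation, next_genre) in NARRATIVE_RULES.get(genre, {}).items():
        for related in DOMAIN.get((character, relation), []):
            # The related character becomes the new main character.
            sub_story = build_story(related, next_genre, max_depth - 1)
            sub_story["via"] = rule
            story["children"].append(sub_story)
    return story


story = build_story("MainCharacter", "biography", max_depth=2)
print([(child["via"], child["character"]) for child in story["children"]])
```

Because each rule names the genre for the next main character, nothing in this loop keeps the genre constant along a story line.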

DISC does not address the problem of finding actual media items once the abstract presentation structure is built. Thus, discourse structures (the Discourse class in DISC) are used neither in the model nor in the process of presentation building. In contrast, one of the problems addressed by SampLe is how to retrieve multimedia material based on an abstract structure. We claim that domain annotations are not enough if we want to assist an author at this stage of the process. SampLe introduces a Discourse ontology (to be referred to in the future as the Narrative Function ontology) to distinguish media material based on its possible role in a narrative (discourse).

The ordering of narrative units within the presentation structure is not explicitly present in DISC. At the later stages of the process the flow of the presentation is defined by the structured progression, where the order comes from hierarchical relations between units. Basically the order (and thus the priorities) is accidental, since we cannot rely on a particular order of statements in an RDF file. In SampLe it is important to assign an order to narrative units in the genre template, since a different ordering might violate the genre rules.
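
A minimal sketch of the ordering point, assuming a hypothetical NarrativeUnit record with an explicit order field (neither the class nor the field name comes from SampLe or DISC). A Python set stands in for an unordered collection of statements; the explicit field is what makes the sequence recoverable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NarrativeUnit:
    name: str
    order: int  # explicit position of the unit within the genre template


# Like statements in an RDF file, a set guarantees no particular iteration order.
units = {
    NarrativeUnit("conclusion", 3),
    NarrativeUnit("introduction", 1),
    NarrativeUnit("elaboration", 2),
}

# The sequence is recovered from the explicit field, not from storage order.
ordered = sorted(units, key=lambda unit: unit.order)
print([unit.name for unit in ordered])
```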

In SampLe the idea of evolving templates is present too. The difference is that they evolve not only through exploration of the knowledge space (this happens when I extend a template based on the structure of the domain ontology, which is an automatic process), but also independently of it. The evolution process would still exist even if no domain ontology were present. In other words, presentation structures do not depend on a domain ontology. Of course the connection to the domain ontology exists to help the system understand what each section has to talk about, but this connection is not the main driver of the mechanism which builds the story (presentation structure), and it can be provided after the presentation structure is built.

Differences:

in DISC genre templates and rules for their evolution are merged, in SampLe they are separated;
DISC's extension mechanism depends on domain ontology, in SampLe only one part of it is domain-dependent;
DISC has no control over genre consistency;
DISC does not address the problem of finding actual media items, therefore the Discourse ontology is not used;
there is no explicit order of narrative units in DISC.

Commonalities:

use evolving genre templates;
use of narrative rules as building blocks for genres;
each narrative rule is connected to a set of domain concepts/relations;
genre templates are defined on the instance level, which allows them to be flexible.

Advantages of DISC:
DISC allows a presentation to be built iteratively, based on a specification of which narrative rules can be applied to which character in a particular genre. A narrative rule can also change the genre which will be applied to the next main character. This has the advantage of being able to build presentations of varying depth [disadvantage: unlimited depth].

Advantages of SampLe model:
The new model allows a flexible mechanism to be specified for relating Discourse Structures to Divisions (Prologue/Main/Epilogue) within different genres, and narrative units to genres, together with a specification of order. This means that new variations of a genre can appear.

The part where SampLe could benefit from DISC ideas is the part where a template for a certain presentation needs to be provided. The inputs in this case are the same: the topic (= main character) and the genre. Currently in SampLe templates are independent of the rules for their extension (domain-dependent extensions), while in DISC they are merged together. Also, the DISC extension mechanism elaborates only on related characters, whereas in SampLe there is a genre-dependent extension mechanism that can also elaborate on the main character. Thus, the SampLe infrastructure should allow modifications of proposed genre templates, feedback on whether a newly built presentation structure still corresponds to the pre-selected genre, and the possibility to store this new presentation structure as another template for the genre.

3. Presentation structure life cycle in SampLe

Consequently, the top-level lifecycle of a genre template / presentation structure is the following:

templates for different genres built according to the linguistic rules (term:original genre template)
extensions to these templates depending on a class of the topic (term:original genre template)
extensions to a template based on the structure of the domain when we know the instance of the topic (term:presentation structure)
modifications of the structure made by an author (+ verification whether the new structure still corresponds to the selected genre) (term:presentation structure)
storage of the new structure as a new template (term:created genre template)

I will refer to the corresponding concepts in the text by the terms in brackets. I call 'a presentation structure' the structure an author is working on right now. Once the final version of this structure is verified as corresponding to a genre, it is stored as a 'created genre template'.

Genre template lifecycle in more detail (specifying only things that are necessary for each stage):

1. There are templates for each genre (biography, essay, article, etc.) that are built according to the accepted rules on what a biography, an essay, etc. have to be like. Note the presence of rules. This means that initially there can be no templates at all, just a set of rules that tells a developer (in human terms) or the system (as knowledge structures and rules applied to them) how to build one. (In DISC these rules are provided on the system level in the ontology, so there is no separation between knowledge structures and rules.) It is just for the sake of convenience that we have a number of templates already created, which reduces the amount of work that needs to be done during execution.

For example (slash defines alternatives):
a: refers to the domain namespace, disc: refers to the discourse (narrative function) namespace

On the human level: Genre: description essay | Specification of: essay
1 Introduce main character
2 Elaborate about different aspects of the main character
3 Conclude about main character

On the system level: Genre: description essay | subClassOf: essay
1 disc:introduction and a:X
2 disc:elaboration and a:X
3 disc:conclusion and a:X

On the human level with related characters: Genre: description essay | Specification of: essay
1 Introduce main character
2 Introduce related character
3 Show relations between main character and related character
4 Elaborate about different aspects of the main character
5 Conclude about relations between the characters
6 Conclude about main character

On the system level with related character: Genre: description essay | subClassOf: essay
1 disc:introduction and a:X
2 disc:introduction and a:Y
3 (disc:description or disc:comparison) and a:X and a:Y /
[{(disc:description or disc:comparison) and a:X}] and [{(disc:description or disc:comparison) and a:Y}]
4 disc:elaboration and a:X
5 (disc:conclusion or disc:comparison) and a:X and a:Y /
[{(disc:conclusion or disc:comparison) and a:X}] and [{(disc:conclusion or disc:comparison) and a:Y}]
6 disc:conclusion and a:X
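
One possible reading of the system-level notation is that each line is a condition a media item's annotations must satisfy: any one of the listed disc: functions and all of the listed a: concepts. The sketch below encodes the template with a related character that way, using the first alternative of items 3 and 5. The Section and MediaItem classes, the matches function, and the two media items are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    number: int
    functions: frozenset  # alternatives: any one disc: function suffices ("or")
    concepts: frozenset   # every a: concept is required ("and")


@dataclass(frozen=True)
class MediaItem:
    title: str
    functions: frozenset
    concepts: frozenset


def matches(section, item):
    """True if the item carries one of the functions and all of the concepts."""
    return bool(section.functions & item.functions) and section.concepts <= item.concepts


TEMPLATE = [
    Section(1, frozenset({"introduction"}), frozenset({"X"})),
    Section(2, frozenset({"introduction"}), frozenset({"Y"})),
    Section(3, frozenset({"description", "comparison"}), frozenset({"X", "Y"})),
    Section(4, frozenset({"elaboration"}), frozenset({"X"})),
    Section(5, frozenset({"conclusion", "comparison"}), frozenset({"X", "Y"})),
    Section(6, frozenset({"conclusion"}), frozenset({"X"})),
]

# Hypothetical annotated media items, invented for illustration only.
ITEMS = [
    MediaItem("item-a", frozenset({"introduction"}), frozenset({"X"})),
    MediaItem("item-b", frozenset({"comparison"}), frozenset({"X", "Y"})),
]

candidates = {
    section.number: [item.title for item in ITEMS if matches(section, item)]
    for section in TEMPLATE
}
print(candidates)
```

With these two items, item-a qualifies only for section 1, while item-b qualifies for sections 3 and 5, because disc:comparison is an accepted alternative in both.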

2. Extended genre templates based on the class of the topic. These templates are also created based on human rules.

On the human level: Genre: description essay | Specification of: essay | Topic: Movement
1 Introduce the movement
2 Describe principles of the movement
3 Present members of the movement and their works
4 Conclude about achievements and influences

On the system level: Genre: description essay | subClassOf: essay | Class: Movement
1 disc:introduction and a:Movement
2 disc:description and a:Principle and a:Movement
3 (disc:elaboration or disc:example) and a:Movement and a:Member
4 disc:conclusion and a:Movement

On the human level with related character:

On the system level with related character:

3. Extended genre templates based on the structure of the domain ontology. This is an automatic process.

On the human level: Genre: description essay | Specification of: essay | Topic: Movement | Concrete topic: De Stijl
1 Introduction to De Stijl
2 Artistic principles of De Stijl
2.1 Abstraction and simplicity as a basic principle of the movement
2.2 Primary colors and straight lines
3 De Stijl members and their works
3.1 Piet Mondrian
3.2 Theo van Doesburg
3.3 Gerrit Rietveld
4 De Stijl influences

On the system level: Genre: description essay | subClassOf: essay | Class: Movement | Instance: De Stijl
1 disc:introduction and a:De_Stijl
2
2.1 disc:description and a:Abstraction_and_simplicity and a:De_Stijl
2.2 disc:description and a:Primary_colors and a:De_Stijl
3
3.1 (disc:elaboration or disc:example) and a:Movement and a:Piet_Mondrian
3.2 (disc:elaboration or disc:example) and a:Movement and a:Theo_van_Doesburg
3.3 (disc:elaboration or disc:example) and a:Movement and a:Gerrit_Rietveld
4 disc:conclusion and a:De_Stijl
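
The step from the class-level template (Movement) to the instance-level structure (De Stijl) can be sketched as follows. This is an illustration under assumptions, not the SampLe mechanism: the relation names hasPrinciple and hasMember, the data layout, and the function extend are hypothetical. For simplicity the sketch writes the topic instance in every line, so section 3 reads a:De_Stijl where the listing above has a:Movement.

```python
# Hypothetical domain data; the relation names hasPrinciple and hasMember are assumptions.
DOMAIN = {
    "De_Stijl": {
        "hasPrinciple": ["Abstraction_and_simplicity", "Primary_colors"],
        "hasMember": ["Piet_Mondrian", "Theo_van_Doesburg", "Gerrit_Rietveld"],
    }
}

# Class-level template for Movement:
# (section number, discourse functions, domain relation to expand or None).
CLASS_TEMPLATE = [
    ("1", "disc:introduction", None),
    ("2", "disc:description", "hasPrinciple"),
    ("3", "(disc:elaboration or disc:example)", "hasMember"),
    ("4", "disc:conclusion", None),
]


def extend(template, topic, domain):
    """Expand each relation-bearing section into one subsection per related instance."""
    lines = []
    for number, functions, relation in template:
        if relation is None:
            lines.append(f"{number} {functions} and a:{topic}")
            continue
        lines.append(number)
        related_instances = domain[topic].get(relation, [])
        for index, related in enumerate(related_instances, start=1):
            lines.append(f"{number}.{index} {functions} and a:{related} and a:{topic}")
    return lines


presentation_structure = extend(CLASS_TEMPLATE, "De_Stijl", DOMAIN)
print("\n".join(presentation_structure))
```

The number of subsections is determined entirely by the domain data, which is what makes this step automatic once the topic instance is known.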

On the human level with related character:

On the system level with related character:

4. Rationale behind the old model

discourse structures and relations were modelled in an ontology since I needed to query them. It was sufficient to include initially needed relationships;
genre templates were a part of the implementation since they had to be flexible structures;
initially genre templates did not have an influence on relationships within the discourse ontology;

5. Rationale behind the new model

Initially the problem arose from the fact that the current SampLe implementation keeps one part of the genre- and discourse-related information in the ontology and another part in the implementation itself. The discourse ontology was inflexible and there was no relation between it and the genre templates. This problem was already on the agenda.
It also caused a practical inconvenience. After considering possible approaches to implementing item selection per section, the solution which seemed most appropriate turned out to be impossible with the current data structures. [The solution is to relate each selected media item to a section of the current presentation structure and send only this information through the form.]
Besides, with future extensions of SampLe in mind the current model was definitely inappropriate: there is no way to start building a presentation from the material collection phase and then look for an appropriate genre that fits the selected material.
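
The bracketed solution above, relating each selected media item to a section and sending only that relation through the form, could look roughly like this. The section and item identifiers are hypothetical, and the use of URL-encoded form data is an assumption about the transport, not a description of the SampLe implementation.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical section and media item identifiers.
selection = {
    "section-1": ["item-12", "item-7"],
    "section-2": ["item-3"],
}

# Only the item-to-section relation is sent through the form.
form_data = urlencode(selection, doseq=True)
print(form_data)

# The receiving side can rebuild the same mapping from the form data.
restored = parse_qs(form_data)
```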

After multiple attempts to model a genre template, the first conclusion was that it is not possible to fit all the information I wanted into the ontology. Then I realized that this was because different cases require different schemas... After analyzing again what is related to what, I got the following picture:

there is an essay that consists of Prologue/Main/Epilogue;
each of Prologue/Main/Epilogue is related to certain narrative units (a unit about artists, a unit about movement principles) depending on genre;
each of Prologue/Main/Epilogue is related to certain discourse structures (introduction, definitions, comparison, quote etc.) depending on genre;
narrative units should have priorities representing their order in the genre;

The trick was to turn the existing model upside down and start modeling from the Division perspective (rather than the genre perspective), where each of Prologue/Main/Epilogue is subClassOf Division.

The result of this model (MuseumV2.pprj) is:

The schema level of the ontology defines the hierarchy of classes and the relations between their instances. The "template" of a genre is in fact the interrelations of instances of those classes. This means that the model makes it possible to create any combination of Division - Genre - DiscourseStructure - Narrative Unit, resulting in multiple presentation structures. (Note that creating a nonsensical combination is not an issue, since presentations are created by a human.) Which structure is "standard" for an essay is specified separately. We need a standard structure to be able to identify whether a new presentation structure still has the features of an essay or has begun to correspond more to another genre.
Multiple presentation structures make it possible to see which narrative rules are applicable to which case and to extend the rules initially specified in the system. The fact that the rules are case-specific has 2 advantages: first, the context is already attached to the presentation structure from the start, so no additional actions or data structures are required; second, their common features can be generalized to a general case.
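
A minimal sketch of the instance-level idea, assuming hypothetical Python classes in place of the ontology classes in MuseumV2.pprj (whose actual contents are not shown here). A presentation structure is just a set of interrelated instances, and which one counts as "standard" for a genre is recorded separately. The structure name, the unit names, and the STANDARD table are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Division:
    """One of Prologue / Main / Epilogue, as used within a given genre."""
    kind: str
    genre: str
    discourse_structures: list = field(default_factory=list)
    narrative_units: list = field(default_factory=list)  # (order, unit name) pairs


@dataclass
class PresentationStructure:
    name: str
    genre: str
    divisions: list


# Hypothetical instance data.
essay = PresentationStructure(
    name="essay-structure-1",
    genre="essay",
    divisions=[
        Division("Prologue", "essay", ["introduction", "definitions"], [(1, "movement")]),
        Division("Main", "essay", ["comparison", "quote"], [(1, "principles"), (2, "artists")]),
        Division("Epilogue", "essay", ["conclusion"], [(1, "influences")]),
    ],
)

# Which structure is "standard" for a genre is kept apart from the structures themselves.
STANDARD = {"essay": "essay-structure-1"}


def is_standard(structure):
    """True if this structure is the one registered as standard for its genre."""
    return STANDARD.get(structure.genre) == structure.name


print(is_standard(essay))
```

Because the relations live on instances rather than in the schema, a second essay structure with different units per division can be added without changing any class definition.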

6. Common data structures issues (the added value of new meta-data modeling approach):

Problems of the approach discussed during the Discourse ontology workshop:

mixture of partOf and subClassOf relationships in a template;
we were trying to model a template according to the same notion of a template we have in real life;

The problem was that we were trying to model generic templates rather than build a model which can incorporate the necessary structures. If we have such a model, then certain presentation structures can be defined as being "generic" or better "standard", which would actually mean: "This presentation structure is a template for an essay", where presentation structure = InstanceOf (Essay).

Regarding the 2 alternative ways to model generic templates (blue_book/common_datastructures.txt):

they both specify the generic case;
the problem with specifying the generic case as your schema is that you are not able to say certain things about your structures, since these things are case-specific. In other words, the problem was that the schema should look different in different cases;

For example: I wanted to specify that certain discourse structures (introduction, description, comparison, quote etc.) are used in the particular top-level division section of the presentation (e.g. Prologue) but it only holds for a particular genre.

the flexibility problem in the second template arises because it models unchangeable (they are defined by the schema!) relations between which narrative units can belong to which genre and in which top-level section structure they can be used (Intro/Main/Conclusions -> in my terms they are Prologue/Main/Epilogue);
the extensibility problem of the first template was also caused by the fact that the template of a genre specifies a schema = unchangeable;

The remaining sections are questions that need to be answered by the implementation.

7. Was/is it possible to implement switching between phases?

8. Was/is it possible to save created presentations and reuse them?

9. Was/is it possible to fit collected material into a genre?

10. How to define correspondence of the presentation structure to a particular genre?

11. Was/is it possible to add context annotations in SampLe?

12. How to implement an extension schema in SampLe?

(a schema which allows iterative building of a presentation structure using domain knowledge together with flexible extension mechanism)
