by Katya, 06-06-2005

WARNING! This document started as a set of my notes, ideas and conclusions, so it was not a coherent story. I tried to make it readable by adding this introductory text, which describes the whole thinking/discovering process. The text refers to other parts of this document, which should be read in the order in which they are referred to. Complete coherence is still not guaranteed, though...

Prehistory of this blue note

After the first SampLe demo and the submission of the journal paper (which is now accepted) I started to think about what else I want to do within the SampLe environment. There were many possible things to jump on, so I had to choose. To get a better overview of what exactly these things are, I analysed possible workflows in SampLe. It turned out that there are 2 major processes that need to be supported; minor variations of these processes cover more or less all the required SampLe functionality. The 1st is the workflow implemented in the first demo; the 2nd is the workflow in which an author starts with repository exploration and collects media material which she then wants to arrange into a meaningful structure. I decided to start with the most general case, in which the system has no knowledge about what story needs to be built from the set of media items besides the information contained in the annotations of these items, our semantic network and the discourse model. The discourse model was the one (conceptually) used in the first demo, where it contained templates for different genres. A template is represented by domain and discourse concepts which describe what should be represented in which part of the discourse structure of the genre (Prologue, Main or Epilogue).

During this time I also worked with Mark van Assem on designing a more complex discourse model, one able to support actions such as saving a newly built presentation as an essay even though it does not completely correspond to one of the templates for an essay. The Hypertext paper which was intended to describe these models was not finished, since I could not provide any evaluation of the model. One way to evaluate it would be to build a demo offering the functionality this discourse model is good for. But because of the path currently selected, this evaluation has not taken place yet.

Further process

The stated problem was to produce one or more presentations out of a set of media items using the discourse model, which contains the templates for different genres, and then to evaluate which of these templates suit the particular set of media items best. I started the implementation by simply matching the annotations of media items to the section descriptions in each of the genre templates. A template contains domain classes as the content part of the description of what each section has to be about. I chose a strategy in which each instance of the section class found in the annotations produces a subsection. For example, section:Artist would be divided into 3 subsections (Piet Mondrian, Theo van Doesburg, Gerrit Rietveld) if we find references to 3 different artists in the annotations. This first approach is described in Case 1. The issues which came up are listed below.

'Things discovered' led to the list 'Possible things to change' (see both of them below; ignore the material in between for now). At this point I started to doubt whether templates have any added value in this process. My first idea was to compare 2 strategies: the 1st, where a presentation is built using templates, and the 2nd, where a presentation is built by dynamically building a template from the information we have from the set of media items. Then I also realized that many other aspects influence the result of the presentation building process (the 8 items in the list in Case 1). So I was thinking of working out different cases in which I would change one aspect and see how it influences the whole process. I had an idea that this might lead to a comparison of the different things you can achieve with a certain type of discourse ontology, discourse model etc. (The idea was that a number of research strategies in our group and in some other groups aim to produce a story/a presentation using domain and discourse descriptions of items, but there is no guidance on which strategy is better in which case for achieving a certain quality of results.) But, first, it would be very difficult to establish a framework like that, to prove that it should consist of exactly those attributes, and to describe in detail all the strategies used. Besides, after the discussion with Jacco and Frank we concluded that there are not enough systems to argue about such a framework, and in general I don't want to go this way. By going a bit further with the generation process I also realized that I would be comparing generation models (which is not a feasible goal), since the process of creating a presentation (a coherent story) from the set of media items requires a lot of information that is part of the particular generation model (and is not present in the ontologies).
Note: it would be interesting to investigate how much of the knowledge currently used (or planned to be used) by the generation model could be transformed into 'static' knowledge within ontologies.

After all that I decided to work further on the problem of creating a coherent story out of the set of media items: first using templates but improving the generation model, and then skipping templates and building the structure of a presentation based only on the knowledge contained within the set (plus the domain and discourse role ontologies, of course). It turns out that the generation model should contain a lot of various components and that a number of decisions have to be made (see 'Conclusions' and the blue text).
The current idea is to finish that and then to look again at the 1st version of the workflow to see which of the processes can be reused. (In particular, it should be possible to reuse presentations created with the second workflow as templates for the first one.) The final goal is to create an environment in which support for various workflows is available, and to see how many and which support processes we really need and which of the subprocesses are reused within higher-level processes.

Case 2 represents the first attempt at a strategy without templates. The 'Tracing annotations' part is the beginning of writing down my ideas about the logic which is used for annotating media (I was trying to understand what exactly I wanted to say when I attached a particular domain or discourse concept to a media item). I think that these types of decisions can be explicitly represented and used within generation processes. But that is a story for the future...

Remember to write for all the conclusions I make what this means on the more abstract level (in blue).

Case 1

(current situation - presentation.owl):

small domain ontology;
small discourse role ontology;
simple content templates;
small set of media items (9 texts, 8 images);

domain annotations: a small number of domain concepts attached to a media item;
discourse role annotations: a small number of discourse role concepts attached to a media item; they refer to the media item as a whole;
discourse model: templates are built by attaching domain and discourse concepts to a division in a template; each domain concept in the division creates a section, and each discourse role concept defines which discourse roles can be used within this division (see the Essay example below);
generation model: it matches media items against the domain and discourse role concepts in the templates. Each instance of a domain concept from the template creates a subsection.

genre: Essay
Prologue:
   C1
   [DR]
Main:
   C2
   C3
   C4
   [DR]
Epilogue:
   C5
   [DR]
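
A minimal sketch of the matching step of this generation model, under stated assumptions: the template is a plain dictionary from division to domain classes (standing in for C1..C5), annotations are (class, instance) pairs, and the discourse roles [DR] are left out. The names ESSAY, ANNOTATIONS and match_template are hypothetical and do not come from presentation.owl.

```python
from collections import defaultdict

# Hypothetical template: division -> domain classes (placeholders for C1..C5).
ESSAY = {
    "Prologue": ["Movement"],
    "Main": ["Artist", "Artwork"],
    "Epilogue": ["Movement"],
}

# Hypothetical annotations: media item -> list of (class, instance) pairs.
ANNOTATIONS = {
    "text1": [("Movement", "De Stijl")],
    "text2": [("Artist", "Piet Mondrian")],
    "text3": [("Artist", "Theo van Doesburg"), ("Movement", "De Stijl")],
    "image1": [("Artwork", "painting1")],
}


def match_template(template, annotations):
    """Return division -> section class -> instance -> media items.

    Every instance of a section class found in the annotations
    becomes a subsection of that section.
    """
    structure = {}
    for division, classes in template.items():
        sections = {}
        for cls in classes:
            subsections = defaultdict(list)
            for item, pairs in annotations.items():
                for item_cls, instance in pairs:
                    if item_cls == cls:
                        subsections[instance].append(item)
            sections[cls] = {k: sorted(v) for k, v in subsections.items()}
        structure[division] = sections
    return structure


structure = match_template(ESSAY, ANNOTATIONS)
```

In this toy data text3 carries two domain concepts, so it ends up under both Artist and Movement, and the Movement subsection shows up in the Prologue as well as the Epilogue.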

Things discovered:

- the small domain ontology makes the templates simple (even though they are created by a human author);
- multiple domain concepts per media item cause the media item to appear in different locations in the presentation structure, so additional strategies/actions are required to resolve this situation. The current idea is to use a notion of 'semantic closeness' to calculate where a particular media item should belong;
- simply matching annotations and creating a new subsection for each instance of a section class in the generation model creates a lot of mess. First, there is an uncontrollable appearance of a 'related character' in the Prologue and Epilogue. Second, I do not specify that only those domain concepts that have relations to the main character should be used in the presentation (the main character is not explicitly defined either).

Conclusions:

Improve template method first before going into comparison with grouping etc.

Current progress:
- main character: max appearance of a concept in annotations
- related character: second max OR all other instances of the main character class appearing in the annotations
- I've taken RelChar = second max
- build a section about RelChar in the beginning of Main if RelCharClass != NUCLasses;
- identify which concepts in annotations have relations to main/related characters (De Stijl/Neo-Plastism); it is done by rdf(concept,?,main/related) or rdf(main/related,?,concept)
- use only directly related concepts for creating sections/subsection;
- build rules for specifying the best matching pairs of main/related characters (this always depends on the situation at hand, so such pairs would not make sense);
- dealing with other movements: other movements are divided according to chronological order into pre- and post- groups, and are arranged chronologically inside each group as well. The pre- group appears in Main before the RelChar section. The post- group appears at the end of Main.
- dealing with other occurring concepts (indirectly related concepts): inside a section or subsection, allow media items whose annotation concepts are indirectly related to the section/subsection concept and are of the same class as that concept; e.g. in a subsection about Piet Mondrian I allow texts about Picasso, since they belonged to the same movement=Cubism.
- decide how to deal with examples: if there is a text talking about an image, place this image in the section with this text; if MainChar!=Artist, distribute images between Artists and Movements sections; if MainChar=Artist, distribute images chronologically.
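
A minimal sketch of the first steps in the list above (main character = most frequent concept, related character = second most frequent, direct relations checked in both directions). The data, the predicate names in TRIPLES and the function names are hypothetical; ties are broken by first occurrence, which is an arbitrary choice of this sketch.

```python
from collections import Counter

# Hypothetical annotations: media item -> domain concepts.
ANNOTATIONS = {
    "text1": ["De Stijl", "Piet Mondrian"],
    "text2": ["De Stijl", "Cubism"],
    "text3": ["De Stijl", "Piet Mondrian"],
    "text4": ["Pablo Picasso"],
}

# Hypothetical (subject, predicate, object) triples from the semantic network.
TRIPLES = [
    ("Piet Mondrian", "memberOf", "De Stijl"),
    ("Cubism", "precedes", "De Stijl"),
]


def rank_characters(annotations):
    """Return (main, related): the most and second most frequent concepts."""
    counts = Counter(c for concepts in annotations.values() for c in concepts)
    ranked = [concept for concept, _ in counts.most_common()]
    main = ranked[0]
    related = ranked[1] if len(ranked) > 1 else None
    return main, related


def directly_related(concept, character, triples):
    """True if a triple links the two in either direction,
    i.e. rdf(concept,?,character) or rdf(character,?,concept)."""
    return any({s, o} == {concept, character} for s, _, o in triples)


main, related = rank_characters(ANNOTATIONS)
```

With this toy data De Stijl becomes the main character and Piet Mondrian the related one; Cubism is directly related to the main character, while Pablo Picasso is not and would be left out of section creation.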

- elaborations about MainChar=De Stijl do not get picked for any section. This means that all other sections in the presentation already elaborate on the MainChar.
- check what happens if I have equal scores for characters
- what if the story needs to be created around artefacts?
- ?: Cubism doesn't win the 2nd max because, for example, cubist paintings get counted on their own and do not contribute to the Cubism score -> count Name:Value pairs differently. Letting them contribute to the Cubism score actually means that I assume that the concept "Cubism=Movement" is more important than a painting. This is quite logical, since this presentation is about De Stijl=Movement. At the same time this might be a very restrictive way of defining the main character. It might be that with the max measure certain topics will never win. Maybe the presentation is about the painting, but so many other concepts are needed for the explanation that the painting concept itself does not score very high. At the same time the movement concept can be present in many media items, since it is in a way a 'broader'/more general concept. So maybe we should provide different heuristics for calculating the score for the main character:

Movement scores with max;
Painting scores with max/3;
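
One possible reading of these two heuristics, sketched under an explicit assumption: "Painting scores with max/3" is taken to mean that a painting needs only a third of the occurrences a movement needs, which is the same as multiplying its raw count by 3. The weights, data and function names are hypothetical.

```python
from collections import Counter

# Assumed per-class weights (an interpretation, not a confirmed rule).
CLASS_WEIGHT = {"Movement": 1.0, "Painting": 3.0}

CONCEPT_CLASS = {"De Stijl": "Movement", "painting1": "Painting"}

# Hypothetical flat list of concept occurrences over all annotations.
OCCURRENCES = ["De Stijl"] * 5 + ["painting1"] * 2


def weighted_scores(occurrences, concept_class):
    """Raw occurrence count of each concept times the weight of its class."""
    counts = Counter(occurrences)
    return {
        concept: n * CLASS_WEIGHT.get(concept_class[concept], 1.0)
        for concept, n in counts.items()
    }


def main_character(occurrences, concept_class):
    """Concept with the highest weighted score."""
    scores = weighted_scores(occurrences, concept_class)
    return max(scores, key=scores.get)
```

With raw counts De Stijl (5) would beat painting1 (2); with these weights painting1 scores 6.0 against 5.0 and becomes the main character.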

It seems that I'm coming back here to the thought of prioritizing concepts in the ontology.

- there are multiple possibilities for identifying the main character;
- building a hierarchy of concepts can be useful/beneficial;
- the issue about managing a related character cannot be avoided;
- I can't rely solely on a template for building a coherent structure, I still have to provide additional mechanisms for managing all domain concepts appearing in the annotations of media items from the set;
- maybe building these mechanisms will show me the way to create more useful templates?
- grouping can't be avoided to a certain extent (I still do "grouping" when I create subsections);
- if I think about it, templates seem a logical step when you do not have any starting point or any initial information for presentation building. When building a presentation out of an existing set of media items, it is logical to assume that this set contains more than enough information to steer structure building. Templates should not be necessary in this situation.

Possible things to change:

- simple templates do not have much of an added value. Assumption: a grouping strategy might give the same results.
- enrich domain ontology in depth -> create more sophisticated templates. Assumption: Maybe there is an added value of richer templates;
- Assumption: the current set of media items is too small to notice an advantage of simple templates -> Extend the set.
- change the 'generation model'
(I'll call the way I'm constructing a presentation structure out of a set of media items a 'generation model', as opposed to the 'discourse model', which is what is in the file presentation.owl: the discourse knowledge a generation model operates upon. 'Discourse role annotations' are discourse role concepts attached to a media item.)
- change the way discourse role annotations are attached: not per text but per concept. Build a new model based on this new structure;
- produce requirements for a discourse model (e.g. do not repeat discourse roles in different divisions of a template);
- try out a low-level approach (set up the first text as the starting point, then build the sequence of texts based on the semantic closeness of the domain annotations attached to each text in the set).
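
A minimal sketch of the low-level approach in the last item, assuming that 'semantic closeness' can be approximated by the Jaccard overlap of two annotation sets. That measure is a stand-in chosen for this sketch; the notes do not define closeness, and a distance in the semantic network may be what is meant. All names and data are hypothetical.

```python
# Hypothetical annotations: text -> set of domain concepts.
ANNOTATIONS = {
    "text1": {"De Stijl", "Piet Mondrian"},
    "text2": {"Cubism", "Pablo Picasso"},
    "text3": {"Piet Mondrian", "Cubism"},
    "text4": {"De Stijl", "Theo van Doesburg", "Gerrit Rietveld"},
}


def closeness(a, b):
    """Jaccard overlap of two concept sets (assumed closeness measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def order_texts(annotations, start):
    """Greedy chain: repeatedly append the unused text closest to the last one."""
    sequence = [start]
    remaining = [t for t in annotations if t != start]
    while remaining:
        last = annotations[sequence[-1]]
        nxt = max(remaining, key=lambda t: closeness(last, annotations[t]))
        sequence.append(nxt)
        remaining.remove(nxt)
    return sequence


sequence = order_texts(ANNOTATIONS, "text1")
```

The greedy choice only looks at the previous text, so a text with little overlap (text4 here) can end up far from the texts it belongs with; that limitation is inherent to this kind of chaining.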

Questions:

- should I concentrate on 1 genre only?

Case 2

(presentation1.owl):

small domain ontology;
small discourse role ontology;
no content templates;
small set of media items (9 texts, 8 images);

domain annotations: a small number of domain concepts attached to a media item;
discourse role annotations: a small number of discourse role concepts attached to a media item; they refer to the media item as a whole;
discourse model: consists of discourse role concepts attached to divisions.

generation model: a presentation structure is built by grouping together texts that have the same concept in their domain annotations. Single-text groups are eliminated as noise. The group with the largest number of texts is not considered in the process, since it defines the topic of the presentation (in a presentation about De Stijl we don't want to have a section called De Stijl). The generation model builds the structure in the following way:

I decided to extend the number of media items for all further experiments, since the results will be more representative. What is in fact interesting to know is how I can achieve maximum coherence of presentations. I have to take a scaling factor and the specificity of my case into account. The current situation is that I have a restricted number of annotated media items, a small number of domain concepts in use, and an ontology that is not 'deep'.

Changing parameters:

- discourse role ontology - I don't want to change it (too complex ontology makes it difficult to annotate);
- domain ontology - expensive for me/for others;
The domain ontology is not small; it is shallow. The problem is that not many domain concepts are used for annotations, because the set of annotated media items is not big. So extending the amount of annotated material will automatically expand the domain ontology. The restrictions of the templates currently used come from the fact that these templates use classes, not instances. (The domain ontology is defined by the media items.)

Tracing annotations:
- When a text has a number of discourse role annotations it might mean that
a) this text represents all these roles;
b) these roles are alternatives and depend on the point of view;

Generation model remarks:

- do not eliminate singles: the concept a single represents might not be important enough to be a section by itself, but the information it presents can be very valuable for the overall presentation (e.g. a description of a painting). In general I should try to include all the selected media items in a presentation. This means eliminating concepts from the structure, but not media items from the presentation.
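
A minimal sketch of grouping with this remark applied: single-text concepts are dropped from the structure, but their texts stay in the presentation. The data shape and the names build_structure and unsectioned are assumptions of the sketch, not the implemented generation model.

```python
from collections import defaultdict

# Hypothetical annotations: text -> domain concepts.
ANNOTATIONS = {
    "text1": ["De Stijl", "Piet Mondrian"],
    "text2": ["De Stijl", "Piet Mondrian"],
    "text3": ["De Stijl", "Theo van Doesburg"],
    "text4": ["De Stijl", "painting1"],
}


def build_structure(annotations):
    """Group texts by concept; drop single-text concepts but keep their texts."""
    groups = defaultdict(list)
    for text, concepts in annotations.items():
        for concept in concepts:
            groups[concept].append(text)
    # The largest group is taken as the topic and gets no section of its own.
    topic = max(groups, key=lambda c: len(groups[c]))
    sections = {
        concept: sorted(texts)
        for concept, texts in groups.items()
        if concept != topic and len(texts) > 1
    }
    placed = {t for texts in sections.values() for t in texts}
    # Texts covered by no section remain part of the presentation.
    unsectioned = sorted(t for t in annotations if t not in placed)
    return topic, sections, unsectioned


topic, sections, unsectioned = build_structure(ANNOTATIONS)
```

Here the concepts Theo van Doesburg and painting1 disappear from the structure, while text3 and text4 are returned in unsectioned and still have to be placed somewhere, for instance by the closeness idea mentioned under 'Things discovered'.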

Just some useful stuff from the literature
Designing Multimedia for Learning: Narrative Guidance and Narrative Construction
... an Aristotelian concept of narrative (at its simplest, the concept that texts should have a beginning, a middle and an end) is located into western European thought and shapes our expectations.

7 July

For papers

Workflow paper

- count the combinatorially possible combinations, taking constraints into account
- find literature on different types of authoring for each phase in SampLe for text, video etc. I need to show that various sequences of processes appear and those sequences I got out of my analysis make sense

Generation model paper

- define boundary cases for comparison (deeper domain ontology, more detailed discourse structure) based on why I think templates do not have an added value;
- see what other parameters are involved and try to foresee their influence (not to program every possible case)

8 July/11 July

Problem with templates (a need for a meta-discourse model?)

- as the implementation of templateMapping (structuring a set of media items with the help of templates) and testing it with different main characters showed, different rules are needed in the generation model for managing other occurring concepts (other than those defined within a template). This problem occurs because a template contains only classes. Thus there is no means to specify how different instances of these classes can relate and what to do with each of these instances. E.g. a rule that if MainChar=Movement and there are other related movements, they should be divided between 2 sections: Preceding and Following movements. For an artist as the MainChar the same strategy would not make sense. Why?

Because the time-based relation is not the one that defines representative/meaningful relationships between 2 artists.

Conclusions:
- analyse which relationships I look for in the rules and find a way to include them into a discourse model/template
- the rules/dependencies in a discourse model/template should depend on: MainChar, Concept, Relation, where Concept is the current concept that needs to be placed within the discourse structure, and Relation specifies the relation between MainChar and Concept that determines where Concept is placed in the discourse structure. (Still sounds like DISC, doesn't it?)
- besides, there should be different templates for the same genre (Essay) with different main characters (e.g. Movement/Artist). Since the set of concepts used to build templates is very small, the templates are basically very similar. The only thing which is necessary is reshuffling the concepts (changing their order). The exception is the case of Artist as the main character, where a different template structure is needed.
I have a general model which defines the structure of genres for discourse (it maps annotations directly to the places of these annotation concepts within a discourse structure), but I don't have a corresponding structure for domain concepts.
- so there are certain dependencies between the class of the main character and the way a template should be built and later processed by the generation model. This brought me to the idea that it would be nice to have a "meta-discourse model", which would provide the rules for building a discourse structure based on the class of the main character and supply information on how to process this structure.
- on the other hand, it leads to the idea that templates the way I have them are not a good solution. With regard to a meta-model, it sounds pretty much like the DISC approach. So the solution is to reconsider the requirements I have for the format/attributes of a discourse structure that come from workflow analysis (this format has to enable the support of certain processes) and the requirements from the points above (the reasons I was thinking about a meta-model).
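
A minimal sketch of what explicit rules depending on MainChar, Concept and Relation could look like: a lookup table keyed by (class of the main character, class of the concept, relation). The relation names and section labels are hypothetical; the notes only give the Preceding/Following movements example.

```python
# Hypothetical placement rules:
# (main character class, concept class, relation) -> section in Main.
RULES = {
    ("Movement", "Movement", "precedes"): "Preceding movements",
    ("Movement", "Movement", "follows"): "Following movements",
    ("Movement", "Artist", "memberOf"): "Artists",
}


def place(main_class, concept_class, relation, rules=RULES):
    """Return the section for a concept, or None when no rule applies."""
    return rules.get((main_class, concept_class, relation))


section = place("Movement", "Movement", "precedes")
```

Keeping the rules as data rather than as code in the generation model is what would make them exchangeable; there is deliberately no ("Artist", "Artist", "precedes") entry, because the time-based split is not meaningful between two artists.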

Dependencies:

Generation model --isIndependentFrom--> Discourse model
Generation model <--definesMapping-- Discourse model --definesMapping--> Semantic Network
Since most of the system intelligence is currently in the generation model, the rules are not explicit and thus are not exchangeable, and they are difficult to change and reason about.

Results:

Requirements:
from workflow analysis:
- multiple structures possible for genre+topic
- identification of Prologue/Main/Epilogue parts within a discourse structure and their connection to discourse concepts
- explicitly defined order of sections - what does it depend on?
- do not try to completely restrict all attributes (there can be rules for specifying the order, but not templates)
from implementation/testing:
- all the specific rules for how various related characters should be treated in each specific case (for each class of the main character) should preferably also be included in a discourse model (e.g. what to do with the concepts having the same class as the main character).
- I'm also mixing rules for creating a discourse structure with rules for finding media items, e.g. if discStructure contains Text1 and Text1 talksAbout Artwork then include an image of this Artwork in the presentation if this image exists in the selection.
from DISC:
- overcome uncontrollable development of the story line (can be partially done in DISC)
- overcome uncontrollable appearance of related characters (still not sure)
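
The media-finding rule quoted in the requirements above (a text in the structure talks about an artwork, so an image of that artwork is included if it is in the selection) can be sketched separately from structure building. The dictionaries talks_about and image_of and the function name are hypothetical stand-ins for the annotations.

```python
def add_illustrations(structure_texts, talks_about, image_of, selection):
    """Return selected images of artworks that texts in the structure talk about."""
    chosen = []
    for text in structure_texts:
        for artwork in talks_about.get(text, []):
            for image in image_of.get(artwork, []):
                # Only images present in the author's selection are used.
                if image in selection and image not in chosen:
                    chosen.append(image)
    return chosen


images = add_illustrations(
    ["text1"],
    {"text1": ["painting1", "painting2"]},
    {"painting1": ["image1"], "painting2": ["image2"]},
    {"image1"},
)
```

Because the function takes the finished list of texts as input, it can run after any structure-building strategy, which is one way of keeping the two kinds of rules apart.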

Research questions:

- How to extend the DISC approach to support a variety of ways of building a discourse structure? Basically, how to extend it to conform to the requirements above. (How to bring it to a more abstract level?)

Things to discuss:

After I come up with some solution or approach, see how it relates to Stefano's former ideas about DISC improvements. Maybe we can have some collaboration on that. Besides, we should also relate it to the current framework for VOX POPULI (VOX POPULI is in fact a particular type of presentation a user of SampLe might want to build, so SampLe should be able, at least in theory, to support those discourse structure types). In the end we might come up with the basic components and processes a multimedia authoring system should have, based on our experiences with different media types and authoring tasks.

Next step (it makes sense to do it before the previous step, since changes in the annotation schema might have a strong effect on the generation model and the discourse model):

There are multiple reasons why I would like to change the way discourse annotations are attached:
- it might influence the process of building discourse structures
- if a user wants to add new media to the system's repository, I have to determine the way domain and discourse annotations are defined and attached. The requirements for the annotation schema in this case come from the knowledge the system has about media items at the point when annotations about them should be provided. These contextual annotations should/will have a much richer structure than the annotations I have now (since more information will be available about media items and the viewpoint on them will be known). On the other hand, it means that the system should be able to make sense/use of these new annotations. Also, it is probably useful to have all annotations uniformly defined.