Author: Joost
Date: Oct 26, 2004

The vicious triangle

This note is organized chronologically, which is not the best structure
for conveying a message; consequently it is probably hard to follow
completely. The aim is to give you some insight into what I am
thinking about.  The oldest contributions are at the top, the newest at the
bottom. In general, the most recent is the most relevant, but to
understand the context, reading the beginning is advised. If time is
precious, start reading at the scenarios marked by '**************'.


vicious-figure.pdf and vicious-table.pdf (found in the bluebook dir) are a
figure and a table which illustrate some of the ideas.

Intro

Over the past years we have talked a lot about "the vicious triangle" (or
variants), which embodies the mutual dependencies between content,
presentation structure and style. We used examples of trade-offs that
need to be made to illustrate the difficulties of multimedia documents
compared to text-based document structures (some of the trade-offs
apply to text models as well). In our own work, however, we mostly
avoided the triangle trade-offs, since we did not fully understand what
exactly the trade-offs are and when these decisions need to be made in
the generation process. This blue note is an attempt to boil down the
important factors of the triangle. We describe different scenarios
in which we identify the trade-offs and the influence they have on
the presentation.  The overall goal is to establish an
architecture/framework where these trade-offs are made explicit and
can be influenced by the author.

The Triangle

definitions

Content: The set of available media items

Structure: The hierarchical discourse structure of a
presentation. (Structured Progression)

[[lynda: I would like to leave discourse out and stick to SP
(importance, grouping, ordering)]]


Style: Perceivable elements of a presentation defining aesthetics and
semantics.

dependencies

Content   - Content [[lynda: this is actually ambiguous]]
The choice for a media item influences the choice for other media
items. For example, a media item that has already been used should not
be used again within the same presentation.


Content   - Structure
The choice for a media item influences the structure/discourse of a presentation

Content   - Style
The choice for a media item influences the style of a presentation

Structure - Content
The structure/discourse of a presentation influences the choice of
material used.

Structure - Structure 
The choice for a discourse should remain
constant/consistent. Structure here is Structured Progression, which
embodies discourse elements like genre. Once the genre is established, it
constrains the sub-genres (narrative units?) allowed. For example, if
the genre is biography, it can contain the narrative units private life and
factual data. If, however, the genre is fairy tale, these
sub-genres/narrative units are not really appropriate.

Structure - Style
The discourse/genre influences the style, e.g. a biography looks
different from a fairy tale.

If, because of layout constraints, a (sub) Presentation Structure
cannot be fitted on one page but needs to be distributed over two
pages, the fact that these two parts belong together needs to be
conveyed using layout.

Style     - Content
The style of a presentation requires media items not to conflict with
this style.

Style	  - Structure
If the style of the presentation is fixed (e.g. hard-coded margins,
borders and padding), this influences the structure of the
presentation when the content does not fit onto one screen.


Style	  - Style
Style should remain consistent.


note: I keep meaning discourse when I write structure. Is structure
really discourse? Probably structure is more than just discourse. It is
a structured progression. But is a structured progression then an
abstraction of discourse?

After a discussion with Lynda about our different understandings of what
structure in the vicious triangle means, I (we?) realized the triangle
is more like a 3-dimensional space than a 2D space. The extra
dimension is the abstraction level of the presentation: one extreme is the
final formatted presentation (a SMIL file), the other extreme is
the knowledge sources. Thus the goal of presentation generation is
to transform abstract knowledge into a concrete presentation. Although
less of the knowledge is explicit towards the end, the concrete product still
represents/encodes implicitly the knowledge contained in the original
knowledge sources. Structured Progression is (partly) constructed
using discourse and domain knowledge; this information is implicitly
stored in order, grouping and priorities. It is an abstraction needed to
provide domain/discourse-independent rules for presentation
formatting.


Plot - Presentation Structure - Structured Progression 
The plot is domain knowledge structured towards
presentation. Presentation Structure is an abstraction of document
structure. Structured Progression is an instantiation of Presentation
Structure.


note: Just Style, Content and Presentation Structure seems to be a bit
simplistic. There is also a User who influences all of these
categories. Similarly for the delivery context. Moreover, the terms
Style, Content and Presentation are rather abstract, while in fact the
trade-offs work at a more subtle level. E.g. Presentation Structure
contains (implicitly) domain structures and discourse structures.
Discourse structures include plot but also genre and to some extent
document structure. To be done: refine categories.


Refinement of categories
The following list provides, for each of the categories Content,
Structure and Style, the knowledge sources which influence the result for
the respective category. Note that subcategories can occur in different
categories. Typically they provide another viewpoint, though.


Presentation Structure

	Discourse: Reflects the intended message of the author,
	organized in such a way that it logically makes sense.

	Domain: Presentation Structure defines roughly the
	organization of content to convey a particular message. To
	make decisions about what information needs to be presented
	together you might need information about the domain. Topia, in
	contrast, doesn't need this, since the hierarchical grouping is
	determined by common attributes (independent of the
	domain). Disc knows about painters and artists and knows what
	is relevant information and how to structure it.

	User: The structure of the presentation is tailored to the
	request of the user. First of all, if the system responds to a
	query of the user, the query should be answered. Secondly, the
	presentation should take the background knowledge of the user
	into account in order to serve both a domain specialist and a
	novice.

	Document: Partly the structure of a blob of information is
	determined by the inherent structure of the document and the
	output media. For example, a report has a particular document
	structure which is different from the structure of an
	essay.

	Output media: The output medium influences the structure of a
	presentation. A paper medium has no temporal dimension;
	material can be presented spatially only. Within a film,
	however, the temporal structure is dominant.

	Device: The device influences what media can be used, and by
	doing so influences the presentation structure.

Content Selection:

	Domain: To select a particular media item you need to know
	what you want and how it is described by its metadata.

	Device: Selecting a media item only makes sense if it can be
	presented on the device you are using. Note that you might be
	able to adapt the media item to fit your
	requirement. Nevertheless this can be seen as just a larger
	collection of media items to choose from.

	User: A user might have a preference for particular modalities,
	or requirements which exclude modalities. For example, do not
	show x-rays of a painting to a 12-year-old.

	Discourse: A media item has a role within a
	presentation. Certain media items are more suited to be used
	within an introduction, others provide detailed information.

	Modality: The modalities available for presentation restrict
	the set of available media items.

	Document: The document structure (and presentation media)
	influence the media choice since they may not support the
	modality of the media item.

	Genre: The genre influences the media choice. A
	biography/documentary document might use a painting of the
	subject. For a more formal document such as a CV a picture
	might be preferred.

	Style: The color scheme of a presentation should match the
	media content.

Style Sheet:

	Design/Layout:	

	User: Apply bright colours for children. Larger fonts for
	people with bad vision.

	Content: The style is influenced by the content. Darker images
	work better on a dark background.

	Device: Do not use colours on black-and-white screens.

	Document Structure: A scientific paper/report has a
	formal/serious layout. A PowerPoint presentation typically is
	more colourful.

When talking about Style there are two kinds to distinguish. 1)
There is style in the classical sense (stylesheet), encoding the
perceivable style elements of a content/document structure (including
border, padding, colours etc.). And there is 2) style/design/layout
information in the "body" of the presentation. This is how media items
are combined which are not already formatted by the document structure
(e.g. a slideshow of images in a multimedia presentation). The first is
independent of the discourse (the discourse is implicitly in the
document structure). The second, however, is not. Every document is
partly 1 and partly 2; the level of detail in the document structure
(report, chapter, section, subsection) determines where the border
between 1 and 2 lies.

again:

Style:

	Layout: Part of Style is layout which is the spatial/temporal
	position of the elements in a presentation.

		Device: The layout should fit the display

		User: Not too many items on a screen

		Document Structure: If the document structure is
		report the layout typically has chapters, sections
		subsections etc.

		Discourse: If the document structure is relatively
		flat, discourse relations need to be expressed
		explicitly. For example an image, and a text
		explaining the image are next to each other and
		aligned to convey the relationship.

		Content: Media items influence the style because of their
		content. (black-white photograph - abstract art)


	Style Properties: Includes colour schemes, border, padding and
	margins definitions, fonts etc.

		Device: Don't use colours on black and white devices

		User: User preference for certain colours or colourblindness.
	
		Genre: Children stories typically have bright
		colours. Thrillers are black.

		Content: The colour scheme can be adapted to fit the
		content.


The vicious triangle identifies Presentation Structure (PS), which defines
the structure of a presentation by order, grouping and
priorities. From a discourse perspective the plot would resemble PS the
most. The difference, however, is that within a plot domain relations
exist, while in an SP they are mapped/transformed to order, grouping and
priorities.


Structured Progression

	Plot: a view upon a fabula. A fabula is a graph structure; the
	plot is hierarchically structured (a DAG if you like).
		
		Fabula: (User defined) Subgraph of the World, contains all domain
		knowledge for the presentation. 

		Genre: Narrative Units, Story templates, biography,
		essay etc.

		Document Structure: Report, letter, book, mm

		
Scenarios

So far we discussed examples bottom-up. This way we tried to identify the
trade-off dependencies within the vicious triangle. To check whether the "model"
is rich enough we now take a top-down approach. That is, we use
"existing (practical)" trade-off scenarios and see how they fit the model we have
described so far.


Colourblind User/User Preference - Corporate Colours (user-design)

Show N items - Screen limit (structure - device)

Content which represents the domain concept best - Content which is easier to access

note:
The problem with trade-offs is that most of them happen under the
surface and are not as easily explained and identified as the ones
mentioned above. An example of such a trade-off is the choice for a
document structure (such as a report/paper/biography), which influences
the structure of the material and the way it is presented. For more
domain-specific document structures (paper/biography) the structure is
mostly fixed a priori. It states the required parts of a
document/presentation and as such it is much like a template. The more
domain-specific the template gets, the more it limits the scope in
which it can be used. Besides a pre-fixed structure, a document
structure has the advantage that a generic style sheet can be used,
which makes style issues easier (scientific papers can all be
formatted with one template). On the downside, the results will have a
pre-cooked/unadapted and therefore 'boring' appearance.

So, these trade-offs exist, but not all of them are as explicit as
the examples above. A style-sheet-like approach in which an author makes
these trade-offs explicit might therefore not be feasible: the
consequences of a choice influence the presentation on multiple levels,
and an author cannot oversee this. Instead, a more high-level approach
such as "strategies" might be better suited to control
dependencies. This is also how formatting works in, for example, \LaTeX,
where an author can influence formatting by stating
preferences (adding/removing badness), but
it is the system which makes the final choice, trading off
all requirements. A similar approach is advocated by SRM-IMMPS
and Suzanne Loeber's MAO, which make use of experts that have an
overview of the system as a whole.

How would this work for real? Experts communicate with each other,
therefore protocols need to be established. This is related to the
previous dependency problem, since it needs to be clear what is
communicated when. This requires linearizing the process, which already
involves trade-offs to some extent...


Disc
----
World, User ->	Fabula
		Genre	-> Plot/Discourse
			   Document Structure
			   Media		-> Layout
						Style Props	-> Style
Aria
----						   
User	    ->  Plot/Discourse
		Document Structure
		Media Content	-> Layout
				   Style Props	-> Style
	

Topia
-----
World, User ->	Media Content
		Genre	->	Plot
				Document Structure ->	Layout
							Style Props -> Style

Sample
----
World, User ->	Fabula
		Genre	-> Plot/Discourse
			   Document Structure
			   Media		-> Layout
						Style Props	-> Style


todo: scenario, 4 concepts need to be compared what are the possible trade-offs.
      tailor image to fit scenario
      test table whether it still fits
      make links explicit (influences, subset, uses)
      identify knowledge bases/external knowledge


********************************
Leesclub scenario 
********************************

The objective of this simple scenario is to identify some of the
choices which a presentation engine needs to make. The choice made is
mostly arbitrary and depends on the goal of your presentation.

One of the goals in the presentation is to compare 4 concepts. The
ideal case, according to the rhetoric, would be that the four concepts can
be compared simultaneously. If the concepts can be represented by
images, this means 4 images are presented at once on the screen.

In a presentation about Rembrandt's work his use of chiaroscuro is compared to
the work of other chiaroscuro artists. The objective of this comparison
is to get the viewer acquainted with the chiaroscuro technique.

Suppose the situation is not optimal and the 4 selected images cannot
be presented together at once because of insufficient screen
space. To cope with this situation there are alternative ways in which
the presentation can adapt.

Content - Substitute 1 or more media items with smaller ones. 
          Badness:	- smaller media items lack detail 
			- for comparing, images should be of similar
			  quality.
	- Scale images down
	  Badness:	- smaller media items lack detail 

	- Choose alternative representation medium (audio/text instead
	  of images)
	  Badness: Audio and text are serializations of content which
	  might not be very well suited to comparing

Device  - Change to a device with a larger screen.
	 Badness: Inconvenient for a user

Presentation Structure - Do not show all images at once but use separate pages
	  Badness: comparing is harder especially for complex images

Plot	- Restructure the plot in such a way the comparison is not
	  necessary
	  Badness:	- expensive operation since intermediate
			results are typically no longer valid.

Discussion: Four images cannot be presented together. The suggested
possible solutions happen at different stages during the generation
phase. The substitution, for example, happens when the 4 concepts get
'materialized'; in Cuypers this is when a PS gets transformed to a
HFO. Changing the plot, in contrast, is done after the query has returned
its results, which need to be structured according to a narrative. The
choice for an alternative presentation medium might also influence the
presentation structure, which then needs to cope with time and
synchronization.
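The alternatives above can be written down as data, which makes the
trade-off explicit and lets an author steer it. What follows is only a
minimal sketch: the numeric badness values, the names Adaptation and
rank, and the idea of a per-level author penalty are all assumptions
made up for illustration, not part of Cuypers or any other system
mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adaptation:
    level: str    # where in the process the change is made
    action: str   # what is changed
    badness: int  # hypothetical cost, higher is worse


# Invented weights; only the options themselves come from the scenario.
OPTIONS = [
    Adaptation("content", "substitute smaller media items", 4),
    Adaptation("content", "scale images down", 3),
    Adaptation("content", "use audio/text instead of images", 6),
    Adaptation("device", "switch to a larger screen", 8),
    Adaptation("presentation structure", "use separate pages", 5),
    Adaptation("plot", "restructure plot to avoid comparison", 9),
]


def rank(options, penalty=None):
    """Sort options by badness plus an author-supplied penalty per level."""
    penalty = penalty or {}
    return sorted(options, key=lambda o: o.badness + penalty.get(o.level, 0))


# Without author input the cheapest option wins.
best = rank(OPTIONS)[0]

# An author who insists on keeping image detail penalises content changes.
best_keep_detail = rank(OPTIONS, {"content": 10})[0]
```

With these invented numbers the default choice is to scale the images
down, while the author who penalises content changes ends up with
separate pages. The point is not the numbers but that the author states
a preference and the system still makes the final choice.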


--

The objective is to generate a presentation about Rembrandt's use of
chiaroscuro for a user who clicked a link on the Rijksmuseum
website. The content provider wants to make sure the user feels the
presentation is part of the Rijksmuseum website. The corporate colours
of the Rijksmuseum website are a light shade of brownish-green. The
visitor is a young girl who likes bright colours. Since the
presentation is about 17th century art, the graphics designer of the
presentation wants to use dark colours with light accents to emphasise
the use of chiaroscuro, which was important in that time.

Content - Apply a filter to the content to match the design

	  Badness: - Changing content is not advised since it might
	  change the 'meaning' in an undesired way. Moreover it might
	  be disallowed because of copyright reasons and sometimes not
	  an option if the content (as is the case here) is the topic
	  of the presentation.

	- Choose an alternative image which illustrates the concept in
	an appropriate way and meets the design criteria.

	  Badness: not a realistic option since the available media
	  content is limited.

Device - Change to a device without colours.
	Badness: ignoring the problem

	
Plot - The structure/genre of the presentation can be changed
to be less serious, in which case a dark style is not
appropriate.

	Badness: expensive

--

A user queries the content database of the Rijksmuseum for the terms
chiaroscuro Rembrandt. The database consists of digital representations, in
multiple media types, of artifacts in the museum. The result set contains
4 images which need to be presented to the user. 3 images are
self-portraits depicting Rembrandt; one is by a student of Rembrandt
who used the chiaroscuro technique. The images just fit all on one screen
and they agree in style.

Content - Drop image which does not match the narrative
	Badness:	- result set is incomplete


Presentation Structure - acknowledge the 'domain' grouping and present
                         2 groups one of three images, one of 1.

	Badness: - requires more space
		 - Aesthetically less pleasing because balance is lost.

Style - Ignore grouping and present images together on one screen.
	Badness: - Grouping/structure lost which confuses the user who
	expects a relation.

      - Present images on one screen but convey grouping by setting
	different style properties.
	Badness: A user might still be confused if it is not clear
	what the groupings mean.


The scenarios described above are relatively similar in the sense
that they all describe a conflict which needs to be resolved. The
proposed solutions are sometimes a bit far-fetched; nevertheless the
options and choices are valid. There exists no predefined strategy
which would account for all cases. Moreover, the solutions typically
work at different stages during the processing chain. A choice for a
particular solution influences the generation process and might cause
unforeseen problems.  For example, a choice for a different media
type influences the presentation structure and style properties. When
making a choice one needs to know the consequences of the choice;
because of that, the seemingly local choices described above in
fact need to be seen within a wider scope. Finally, the possibility
exists of getting into an infinite loop when there is a three-way
dependency problem.


Theory vs Pragmatics.

As mentioned before, not all options to resolve a conflict are
realistic. For example, changing to a different device or revising the
whole plot structure is probably not advisable (although with film
generation revising the plot often is the only option, since
material is scarce). In general one can say a completely automatic
system which can cope with any situation is not realistic and not what
we are after. There exist hardwired choices which limit the scope of
possible adaptations. Cuypers, for example, uses depth-first
backtracking. This means the last choice made is revised when a
conflict arises; when that does not work, the choice before that is
revised, etc. This might be considered an implementation decision,
which in fact it is. Nevertheless, the architecture and data model of a
system typically make these choices implicitly. When choosing
an architecture one needs to understand the scope of the problems
which it can resolve.
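To make the revision order concrete, here is a minimal sketch of
depth-first (chronological) backtracking. It is not the Cuypers
implementation; the function name backtrack, the three choice points and
the constraint fits are hypothetical and chosen only to show that the
most recent choice is revised first.

```python
def backtrack(choice_points, consistent, chosen=()):
    """Depth-first search over choice points.

    On a conflict the most recent choice is revised first; earlier
    choices are only revisited once all later options are exhausted.
    """
    if not consistent(chosen):
        return None
    if len(chosen) == len(choice_points):
        return chosen
    for option in choice_points[len(chosen)]:
        result = backtrack(choice_points, consistent, chosen + (option,))
        if result is not None:
            return result
    return None


# Hypothetical choice points, in the order in which they are decided.
CHOICES = [
    ("4 images", "2 images"),   # content
    ("one page", "two pages"),  # presentation structure
    ("large", "small"),         # image size
]


def fits(chosen):
    # Hypothetical constraint: four large images do not fit on one page.
    return chosen != ("4 images", "one page", "large")


solution = backtrack(CHOICES, fits)
```

Here the conflict is resolved by revising the last choice (image size),
so the earlier choices for content and structure are never reconsidered.
That is exactly the hardwired bias described above: which part of the
presentation gets adapted follows from the order of the choice points.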

The Cuypers architecture is based on depth-first backtracking. This
typically works well for small-scale problems, since the whole
search tree can be overseen and enumerated. However, when the tree gets
bigger, complexity issues arise, with the result that, for the sake of
performance, some choices will never be revised. In Cuypers the content (PS)
dominates the layout, that is, the layout always gets adapted and never
the content. The underlying idea is that if every parent takes care of
its children then we'll end up with a reasonable resulting
presentation. What we cannot do (easily) with the parent-child
paradigm, however, is for example choosing a colour scheme. A colour
scheme is based on all content of the presentation and therefore is a
top-level choice. The media content, however, consists of leaf nodes, which
means a matching colour scheme has to be propagated upwards to the root node. In
the best case this is just rather inefficient, but in case two
children cannot agree on a scheme, things become more complicated.
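The colour scheme problem can be sketched as a bottom-up pass over the
presentation tree. This is an illustration only: the nested-dict
representation, the key names "schemes" and "children", and the function
name acceptable_schemes are assumptions, not the Cuypers data model.

```python
def acceptable_schemes(node):
    """Propagate acceptable colour schemes from the leaves to the root.

    A leaf (media item) lists the schemes it works with; an inner node
    accepts only the schemes all of its children accept.
    Assumes every inner node has at least one child.
    """
    if "schemes" in node:  # leaf: a media item
        return set(node["schemes"])
    child_sets = [acceptable_schemes(child) for child in node["children"]]
    return set.intersection(*child_sets)


# Hypothetical presentation tree in which all media items accept "dark".
tree = {"children": [
    {"children": [
        {"schemes": {"dark", "sepia"}},
        {"schemes": {"dark", "bright"}},
    ]},
    {"schemes": {"dark", "sepia"}},
]}

# Two children that cannot agree: the root is left without a scheme.
conflict = {"children": [
    {"schemes": {"dark"}},
    {"schemes": {"bright"}},
]}

agreed = acceptable_schemes(tree)
no_agreement = acceptable_schemes(conflict)
```

In the first tree only "dark" survives at the root. In the second the
result is empty, and nothing in the parent-child paradigm says which
child should give way, which is where an overview of the whole
presentation is needed.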

Some of the choices in presentation generation require an overview of
the whole process. Cuypers does not deal well with this situation
because of its architecture and implementation.

With the SRM the 'overview' problem is solved by experts. What the
experts precisely are and do is mostly undefined, and where defined,
rather vague. This holds especially for the links between the experts
and the generation process.


requirements:
- overview of the whole generation process, the choices and the
consequences
- manageable (for example) by rules

------------------------
Towards a solution
warning: very early/undeveloped idea

metaphor: game

In logic there exists a discipline of proving/deducing/solving
statements which uses game playing as a metaphor. There are two or
more opponents who each have their own strategy. Given a
propositional statement which needs to be proven true or false, one
player is in favor and the other denies it. They take turns attempting
to prove their position by modifying the statement according to the
rules of the game. For example, if the formula contains an 'and'
operator, this can be exploited by the falsifier, since she only needs
to prove one of the conjuncts false to win. There exist a number of
variations for different kinds of games (chance, hidden, strategy). The
metaphor of game playing might be usable for presentation generation.
The vicious triangle can be seen as a game between three players who have
different strategies to "win". The (abstract) presentation is the
playfield/model/structure. Players act on the moves of their
opponents. Since the players have different objectives, the playfield
looks different to each of them. For example, the structure player has
ordered his view according to presentation structure (grouping, order,
priorities). The design player, however, views her field more in terms of
categories: images, sections, paragraphs etc. The layout player sees
physical pages (and substructures). The content player sees media
items and combinations of media items.

When a player makes a move, for example when the content player selects a
media item to represent a domain concept, this influences the
playfield. Consequently the views of the design and the layout player
change (note that the view of the structure player does not change). The
layout player finds out the image doesn't fit the screen and scales
it down. The designer finds out the new image doesn't fit the colour
scheme and applies a filter. And so on.

From an implementation perspective this scenario needs a way of
transforming the abstract presentation into an appropriate view for
each of the players. Moreover, a change in one view propagates to
the views of the other players. The players "watch" a view and get
triggered if something changes in a way they do not desire.

problems: how to avoid cycles, - maybe a referee 

          the process needs to progress - maybe after a progression
step the game is played. Then a next step is made after which the game
is played again etc.
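A minimal sketch of one round of such a game, with a referee that stops
play once nobody wants to move and gives up after a fixed number of
rounds to avoid cycles. Everything here is an assumption for
illustration: the state keys, the two players and the function name play
are hypothetical, and a single shared dict stands in for the separate
views the players would really have.

```python
def layout_player(state):
    # Scales the image down when it is wider than the screen.
    if state["image_width"] > state["screen_width"]:
        return {"image_width": state["screen_width"]}
    return {}


def design_player(state):
    # Applies a filter when the image does not match the colour scheme.
    if state["image_tone"] != state["scheme"]:
        return {"image_tone": state["scheme"]}
    return {}


def play(state, players, max_rounds=10):
    """Referee: players take turns until nobody moves.

    Returns the stable state and the number of rounds played. Raises
    RuntimeError if no stable state is reached within max_rounds.
    """
    state = dict(state)
    for round_no in range(1, max_rounds + 1):
        moved = False
        for player in players:
            change = player(state)
            if change:
                state.update(change)
                moved = True
        if not moved:
            return state, round_no
    raise RuntimeError("no stable state: players keep undoing each other")


# The content player has just selected a wide, bright image.
start = {"image_width": 800, "screen_width": 640,
         "image_tone": "bright", "scheme": "dark"}
final, rounds = play(start, [layout_player, design_player])
```

In this run both players move in the first round and nobody moves in the
second, so the game ends after two rounds. The round limit is the
crudest possible referee; it detects a cycle but does not say which
player should lose.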
	  

architecture
implementation

conclusion: Because of the different strategies involved, an
architecture for presentation generation systems is not a linear
system. The typical software engineering principle of divide and
conquer might not be best suited to solve this problem. What are the
important aspects of a presentation which we would like to adapt, and
at what cost? The notion of an overview of the process is advocated by
the SRM and by Suzanne's MAO model. These models, however, are more
conceptual than architectural. Cuypers implemented part of the
model. Because of the architecture of Cuypers some parts of the models
cannot be implemented. This bluebook note is a first investigation of
what the dependencies between the different components are and which
trade-offs are involved.

INS2 Scenarios

User Oriented
-------------
Lynda describes a system in which the user (=visitor) is central. The
actions the system takes influence the Motivation, Ability and
Opportunity (MAO) of a user. The system has rules which optimize MAO.

Katya describes a user as the author/designer of a presentation. The
system supports the author in creating a presentation by providing
suggestions. From that perspective the author is more like the user,
except that the adaptations are interactive. The trade-offs become
more explicit (different work-flows) since a user needs to be
supported. Nevertheless, in SampLe the user takes the roles of
designer, author and content provider, so some of the trade-offs are
made in the head of the user.

Structure Oriented
------------------
Stefano describes a system which tries to convey an argument. The
system has knowledge about the structure of an argument. Furthermore it
knows the discourse effects of editing. To convey an argument it uses
this knowledge to find appropriate material.

Frank describes a scenario in which the conceptual structure of the
presentation is known and should not be changed: Form follows function.

Content/Media Oriented
--------------
Lloyd describes a scenario in which a user gets results from a
query. These results form the basis of a process which generates
structure around these results.


Discussion with Jacco:

After reading the group's scenarios, Jacco and I discussed
them. The main observation was the difference in processing
models. Similar to the discussion about the role of media data in a
discourse ontology (whether it was part of the plot or not), we found
that there are different basic assumptions in generating
presentations. Cuypers leaves the choice of which media to use almost
until the last moment. Lloyd's Topia, in contrast, and to some extent
Stefano's work start with media objects. These are two conflicting
views which are hard to unify in an architecture which emphasises the
processing chain. We discussed a model-view-controller architecture
which abstracts from an explicit processing chain and instead is based
on events. There are a number of "agents" who get notified when a
particular event occurs. For example, when a media item is added this
triggers a design agent to judge whether it fits the style of the
presentation. If not, it can adapt the style or remove the image again,
both of which generate new events.  These can trigger other agents to
perform an action. This architecture appears to be more
flexible and is closer to a real-life scenario in which a human author
gradually adds/removes material and changes work-flow, as Katya pointed
out in her scenario. Nevertheless, it also introduces the problem of
interfacing: what are the atomic actions an agent makes, and what is the
data structure it manipulates?

A real-life model of a big table and a
collection of media material which an author/designer arranges in such
a way that a presentation is constructed is not representative,
since half of the data structures are implicit in the author's head. In
other words, the status of the presentation is not what is on the
table. Having said that, producing the presentation is the process of
making the data structures in the author's head accessible by
means of physical representations. The viewer perceives the
representation (including structure, design etc.) and uses it to
construct her mental representation to match the structure the author
intends to convey. Hmm, this is sounding very much like semiotics, but
the point is that the presentation is the materialized interface of
communication. Generating such an interface automatically requires
data structures which are only in our brain to be made explicit; a
metaphor might therefore be hard to find.

We can, however, analyze how
a human authors a presentation, and we'll see it is not a linear
process, but it is not completely non-linear either. We start from
certain weighted assumptions which lay out the basic structure of
the process. If we create a biography about Rembrandt we can start
with a biography structure and find material which matches this
structure. If, however, we create a presentation about Rembrandt we can
start by looking for material and later decide a biography will suit
the data best. One dominates the other, which is a linear aspect. If
we look at the different generation systems we see that they all
perform similar tasks, except that the order of processing is different:

Aria:
	Structure
	Query
	Concept
	Instance

Disc:
The user enters a query which returns concepts. These concepts are
crawled according to a discourse scheme to find structure. The
concepts are then represented by Media.
	Query
	Concept
	Structure
	Instance
Behavior: If the concept doesn't match the structure, change the
structure. If the media doesn't fit the structure, change the media.

Noadster:
The user enters a query which results in media items. The media items
are annotated with concepts, which it uses to deduce structure.
	Query
	Instance
	Structure

Topia:
The user enters a query which returns media items. The metadata of
the media items is used to cluster the result.
	Query
	Instance
	Concept
	Structure
Behavior: Media is fixed, adapt structure to media

Hera:
Hera uses templates called slices which are domain independent. An
author creates slices based on domain dependent schema information. A
user enters a query which returns media items/instances which fit a particular slice.
	Structure
	Concept
	Query
	Instance

Backtracking or design alternatives happen according to the dominance
of the processing scheme. The real question thus is: can we create
general components whose order of use can be altered? That is, can we
create modules (like Query, Concept, Structure and Instance) which can
be combined arbitrarily, or is the order of these modules tightly
coupled with the contents of the respective modules?
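
To make the question concrete, each system can be read as nothing more
than an ordering of the same stage names. The sketch below is
illustrative only: the stage lists are copied from the overview above,
while the helper names `dominates` and `systems_where` are invented for
this note and do not exist in any of the systems.

```python
# Stage orderings copied from the system overview above.
ORDERINGS = {
    "Aria":     ["Structure", "Query", "Concept", "Instance"],
    "Disc":     ["Query", "Concept", "Structure", "Instance"],
    "Noadster": ["Query", "Instance", "Structure"],
    "Topia":    ["Query", "Instance", "Concept", "Structure"],
    "Hera":     ["Structure", "Concept", "Query", "Instance"],
}


def dominates(system, first, second):
    """True if stage `first` is processed before stage `second` in `system`."""
    order = ORDERINGS[system]
    return order.index(first) < order.index(second)


def systems_where(first, second):
    """Names of the systems in which `first` dominates `second`."""
    return sorted(
        name
        for name, order in ORDERINGS.items()
        if first in order and second in order and dominates(name, first, second)
    )


structure_first = systems_where("Structure", "Instance")
instance_first = systems_where("Instance", "Structure")
```

Under this reading Aria, Disc and Hera let structure dominate the
choice of instances, while Noadster and Topia do the reverse.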

In order to find this out we need to analyze the data structures used
in the modules at the different stages of the process chain. If
there is overlap in representation, this might suggest that there
exist levels of abstraction which might be unifiable.


Towards Scenario

The scenario needs to illustrate the need for an architecture in which
a top-down and a bottom-up approach are combined, or at least can work
together. In the latter case there needs to be support for the fact
that these two need to be combined. Alternatively we can just have a
TD approach, like Aria and Disc, *and* a BU approach like
Topia. Intuitively the combination is obvious: Topia (read: clustered
media items) at some stage needs high level structure too, since every
presentation has a higher level structure. In Topia this is realized
by a template-like document structure: 1) a hierarchical index-like
structure to give the detailed content context and 2) a detail window
which gives detailed information about the currently selected
topic. Disc in contrast manipulates the higher level discourse
structure (which is reflected in document structure). At some stage
however media items need to be included to reflect the higher level
concepts. The assumption here is that there is exactly one media item
which matches the concept. Disc abstracts over the fact that there can
be multiple media items, of different types, and that the media item
possibly does not exist.

The difference between the TD and BU approaches is what gets
changed/adapted. In Disc the discourse structure remains fixed, while
the choice of media items and the formatting of these media items is
flexible. Topia in contrast has a fixed media set of which the
clustering (=structure) can be adapted. What we are looking for is a
scenario where this trade-off matters. Both approaches at some point
switch strategy, from TD to BU or from BU to TD. The question is what
do we gain if we have an architecture which can cope with both
strategies?
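
The contrast can be sketched in a few lines. This is a toy
illustration under stated assumptions, not how Disc or Topia are
implemented: the media items, the concept names and the functions
`top_down` and `bottom_up` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical media items: (identifier, concept it depicts, media type).
MEDIA = [
    ("img1", "chiaroscuro", "image"),
    ("txt1", "chiaroscuro", "text"),
    ("img2", "self-portrait", "image"),
]


def top_down(structure, media):
    """Disc-like: the structure is fixed, media are chosen to fill it.

    A concept without a matching media item is left as None, which is
    the case the note says Disc abstracts over.
    """
    filled = {}
    for concept in structure:
        matches = [m for m in media if m[1] == concept]
        filled[concept] = matches[0][0] if matches else None
    return filled


def bottom_up(media):
    """Topia-like: the media set is fixed, structure comes from clustering."""
    clusters = defaultdict(list)
    for identifier, concept, _media_type in media:
        clusters[concept].append(identifier)
    return dict(clusters)


td = top_down(["chiaroscuro", "etching"], MEDIA)
bu = bottom_up(MEDIA)
```

In the top-down result the slot for "etching" stays empty and the
second chiaroscuro item is ignored; in the bottom-up result every item
is used but no slot for "etching" ever appears.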

A Disc scenario in which the structure gets adapted.
A Topia scenario which chooses the modality which fits best.


Top level uses modality knowledge, e.g. use graphics for spatial
information, natural language for temporal information (from srm premo)

Top-down

Bottom-up


PS_Media & Modality and (2bdone & Delivery Context)

Modality grammar rules form the basis of dealing with media
items. They present (create an HFO for) any sequence of one or more
media items, because there exist rules for presenting audible media and
for graphical media. In addition there are rules which present
combinations of audible and graphical items. Since all media items fit
one of these categories, any sequence of media items can be
presented. Of course this presentation of media items is a
one-size-fits-all approach and therefore hardly reflects the underlying
semantics of the media items.  A set of media items can be shallowly
structured by explicitly grouping them. We currently support "group"
and "alternative". Although modality grammar rules present any group
of media items, within a presentation you often want to differentiate
between media items based on the concepts they represent or the
rhetorical function they fulfill. Modality grammar rules know nothing
about these, which is why we need ps_media. PS_media deals with media
on a higher level: it basically represents a domain concept independent
of the media item it uses in the final presentation. Within ps_media
we can still select media based on the preferences of the user or the
device. Furthermore we can create media items if necessary (e.g.
captions with images) or transform media to fit the context (e.g.
text to audio).
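
The two levels can be sketched as follows. This is a rough
illustration, not the actual grammar rules: the tuple representation,
the `MODALITY` table, the file names and the function signatures are
assumptions, and the returned tuples merely stand in for an HFO.

```python
# Assumed mapping from media type to modality (illustrative only).
MODALITY = {"image": "graphical", "text": "graphical", "audio": "audible"}


def present(node):
    """Modality-level rule: present any media item or shallow grouping.

    A node is ("group", [nodes]), ("alternative", [nodes]) or
    (media_type, name). The return value is a stand-in for an HFO.
    """
    kind, payload = node
    if kind == "group":
        return [present(child) for child in payload]
    if kind == "alternative":
        return present(payload[0])  # naive: always take the first option
    return (MODALITY[kind], payload)


def ps_media(candidates, preferred_types):
    """Concept-level choice: select a media item by user/device preference.

    `candidates` maps media type to the item representing the concept.
    """
    for media_type in preferred_types:
        if media_type in candidates:
            return (media_type, candidates[media_type])
    raise LookupError("no candidate of a preferred media type")


# Hypothetical concept that can be shown as an image or as text.
concept_media = {"image": "painting.jpg", "text": "painting.txt"}
chosen = ps_media(concept_media, ["audio", "text", "image"])
shown = present(("group", [chosen, ("audio", "intro.mp3")]))
```

Note that `present` never looks at what an item means, only at its
modality, while `ps_media` decides per concept which item to hand over.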

yet another scenario (...attempt)
------------------------------

Within the grammar rules there is a function selectMedia which selects
a media item (hfo actually) from a set of alternatives. The
alternatives are all valid from a technical point of view and will not
make the constraint solving fail. The choice of which one to use is
part of the vicious triangle.

Cuypers used (some sort of) ccpp profiles to define the delivery
context of the presentation. The assumption here was that there are
different screen sizes which require adaptation in layout. However, the
screen size can become so small, for example with mobile phones, that
alternative media types are advisable. Typically the selection of a
different media type influences the structure of the presentation,
because media items typically cannot be substituted one for the other
while keeping the presentation semantically equivalent. If we consider
ccpp profiles for a mobile phone and a workstation web browser, the
first one preferably will use text or audio while the latter one
preferably uses images and audio.

example: Currently the description text of the images does not always
mention the term 'Chiaroscuro'. This however was the keyword which
selected the image in the first place. So if we are only allowed to
use text, the images whose textual descriptions lack 'Chiaroscuro'
should not be selected. In other words, the change to text-only
changes the presentation structure.
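
A minimal sketch of this example, assuming a much simplified notion of
a delivery profile: the profile names, the image records and the
function `select_examples` are hypothetical and are not the actual
selectMedia rule or a real ccpp profile.

```python
# Hypothetical delivery profiles: media types in order of preference.
PROFILES = {
    "mobile_phone": ["text", "audio"],
    "workstation_browser": ["image", "audio"],
}

# Hypothetical images, all selected by the keyword via their annotations.
IMAGES = [
    {"id": "img1", "description": "Strong Chiaroscuro in a group portrait."},
    {"id": "img2", "description": "A self-portrait from the later years."},
]


def select_examples(images, profile, keyword):
    """Return the ids of the examples that still make sense for `profile`."""
    if "image" in PROFILES[profile]:
        return [img["id"] for img in images]
    # Text only: keep an example only if its description mentions the keyword.
    return [
        img["id"]
        for img in images
        if keyword.lower() in img["description"].lower()
    ]


on_workstation = select_examples(IMAGES, "workstation_browser", "Chiaroscuro")
on_phone = select_examples(IMAGES, "mobile_phone", "Chiaroscuro")
```

The set of examples, and with it the structure built on top of them,
differs between the two profiles even though the query is the same.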

needs more...


For the text-only version the Aria demo substitutes text for images. As
a consequence the chiaroscuro elaboration text is shown together with
descriptions of the example paintings in a slide-show. It is
unclear to the user which text to read, and the relationship between
the texts is not clear either. For a textual presentation the
structure of the presentation needs to change to make sense. In the case
of the chiaroscuro presentation this means the elaboration text and
the examples should each be presented as individual sections. The
examples should be presented in subsections. There needs to be a
transition/glue between section 1 and 2.

Presentation structures were initially meant for this purpose. If you
choose a different output medium, such as report instead of multimedia,
the presentation structures would be: report, section, subsection,
whereas multimedia had the presentation structures: presentation, scene,
sub-scene. Basically they were just some sort of templates; for each
different type of output modality (report/multimedia) you need such
a template. The presentation structures themselves are structurally very
much alike: Presentation vs. Report, Scene vs. Section, Sub-Scene
vs. Subsection. These categories are presentation oriented: the
presentation structure Scene knows how to present its children
temporally. The presentation structure Section knows how to present its
children spatially, one below the other. This knowledge is encoded in
the presentation structure itself, which has some drawbacks: 1) every
output medium needs its own presentation structures; 2) for different
genres of presentations you need different structures; 3) if the
output modality is multimedia, but there is a preference for text, the
presentation structure should be able to cope.

What is needed: Presentation structures should have no embedded design
knowledge. Presentation structures should be independent from the
output medium. Presentation structures should be independent from the
input medium. Presentation structures should be tree structures.
This sounds very much like a structured progression but there is more
to it. A presentation structure should know at what level it occurs
in the tree. Moreover it needs to have a function/purpose (maybe these
can be combined?). I think there are at least three different types.
There are top-level structures, which are mainly influenced by the
document structure (and rhetoric, genre, discourse knowledge). Then
there is the concept level, whose structures are atomic nodes in the
discourse (but are not necessarily equivalent to media items). And
there is a level in between, which currently is the most vague. These
are functional units (maybe communicative devices?), typically with a
rhetorical function such as compare or contrast. In the vicious
triangle this assumes content and style are subordinate to
structure/rhetoric. I think this is the most typical case; however, if
we choose style as dominant then these structures have a function like
happy or dark, formal, playful. Note that the top-level structure is
kind of independent of this: whether you choose a rhetorically dominant
presentation or a style dominant presentation, you still need a
document structure. The same holds for the lower level/concept
structures: independent of a rhetorically or style dominant
presentation you always need to select and present media items.

What's next: The textual example as it is didn't work. Theoretically
we can fix it by adding another presentation structure for a text-only
document structure (e.g. report). However this has scalability issues
as mentioned before, and we can't cope with a textual preference within
a multimedia presentation. My suggestion is to create a generally
applicable presentation structure (much like it is at the moment) and
make all design/layout decisions explicit outside the presentation
structures (something like a style-sheet). This can serve as a
foundation for further extensions.
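
A minimal sketch of what this suggestion could look like, under the
assumption that a "style-sheet" is just a table keyed by output medium:
the class `Node`, the table `STYLE_SHEETS` and the function `render`
are hypothetical names, and the level names are the ones listed earlier
for report and multimedia.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Generic presentation structure: a tree node with no layout knowledge."""
    label: str
    children: list = field(default_factory=list)


# Hypothetical style sheets: per output medium, a name for each tree level
# and the way children are arranged. Only three levels are covered.
STYLE_SHEETS = {
    "report": {"names": ["report", "section", "subsection"],
               "arrange": "spatial"},
    "multimedia": {"names": ["presentation", "scene", "sub-scene"],
                   "arrange": "temporal"},
}


def render(node, medium, level=0):
    """Return (level name, arrangement, label) tuples for the whole tree."""
    sheet = STYLE_SHEETS[medium]
    lines = [(sheet["names"][level], sheet["arrange"], node.label)]
    for child in node.children:
        lines.extend(render(child, medium, level + 1))
    return lines


tree = Node("Chiaroscuro", [Node("Elaboration"),
                            Node("Examples", [Node("Example 1")])])
as_report = render(tree, "report")
as_multimedia = render(tree, "multimedia")
```

The same tree is rendered for both media; only the table consulted
changes, so a new output medium needs a new table entry rather than a
new set of presentation structures.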