14-5-2003

These are notes written during the writing of the ISWC2003 paper. They mainly focus on 'my' part, which was: given a Structured Progression, what are the issues when generating a presentation (hfo tree)?

joost

In a traditional document engineering task an author marks up the content of a document. Doing so makes the semantic function of the content explicit, which a style sheet can then use to transform the document into a final-form format. An example is this paper, which was marked up in \latex, defining sections, subsections, bibliography etc. The marked-up document can (in several transformation steps) generate various output formats for different platforms, such as PostScript for paper and HTML for hypertext. Making semantics explicit (through document structure) therefore allows us to abstract from the final format of the document.

Although \latex defines a rich environment in which semantic functions can be specified (list, emph), there is a point after which content is considered atomic. Figures, for example, are black boxes of which \latex knows nothing except their dimensions (which are specified by the author). A figure can thus not be reformatted if it does not fit on a page. To some extent the same holds for plain-text paragraphs, in which the relations between words\footnote{\latex does know about hyphenation of words.} and sentences are not made explicit. Rephrasing long sentences, for example, is not possible. In general this is not really a problem, however, because in text the atomic units are small and adaptive enough to fit the layout requirements.

In contrast, atomic units in multimedia (images, video, audio) are typically large compared to text. As a consequence, document structure for multimedia is not as fine-grained as its textual equivalent. Conveying semantics in a multimedia document is therefore, just as with the figure in \latex, the responsibility of the author, who understands the semantics of the atomic unit.
Relationships between media can be conveyed by design constructs such as alignment or similarly colored backgrounds. Grouping two items changes the message of the individual media items: for example, placing two equally sized images of Rembrandt and Saskia next to each other suggests a relationship between them. Grouping therefore requires knowledge about the content of the media items and, in addition, design structures that convey the relation between the media. Besides knowledge about the semantics of the media, their physical properties are also important: a text item and an image item are grouped differently than two images.

Finally, an author has to make sure the document physically fits the output medium. This sometimes involves a complete restructuring of the presentation, including (re)selection of the content representing concepts. Concepts are represented by media items, and choosing which media item represents a concept best is far from trivial, even if we assume that media items are annotated directly with domain concepts. The choice of a representing media item is influenced by platform constraints (no video on mobile phones), style (no happy colors in an obituary), document structure (no animation in a letter) and user profile (no Scientific American pictures for a 7-year-old). To some extent media items can be converted or transformed into media items that do not violate the constraints. Examples include converting text to audio for small-screen devices, or displaying several key frames of a video in a paper document.

When presentations are generated automatically, abstracting over the final medium in which the presentation is played or viewed, the traditional document engineering prerequisites, such as the content of the presentation and the document structure, are unknown a priori.
Still, the informative content used to produce, for example, both a multimedia presentation and a paper document about Rembrandt is very much alike. The structure of this shared content is called the Structured Progression (SP); it defines grouping, ordering and priorities of narrative unit instances. It describes the main message an author wants to convey, and relates and organizes the narrative units that contribute relevant information to this message. In sum, SP structures the information content of the presentation while abstracting over document structure and media content, which are the means to convey the message.

[[figure: bio -> SP -> DS. (latex) -> TeX -> foe -> paper ! DS -> fo bio -> SP -> hfo -> mm ]]

A document structure allows separation of presentation and content. However, if the atomic units of a document are too large, such as media items in a multimedia presentation, this separation is no longer realistic. An automatic formatter has to know some of the semantic content of an atomic unit in order to present it properly. This requires domain knowledge about the media item (what is being represented) and design knowledge about how relations are represented. It also includes preventing the inference of unintended relationships, which can arise when media items are presented together.

There exists a three-way dependency between the choice of media, the way they are grouped in order to convey semantics, and the document structure of the presentation. In addition, all of these are externally influenced by content, platform, user, and style.
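As a rough illustration of how such external influences constrain the choice of media, the following Python sketch filters candidate media items for a concept against a platform and a user profile. All names here (`MediaItem`, `Profile`, `select_media`, the example file names and the "general"/"expert" audience labels) are hypothetical and are not taken from the notes or from any existing system. Style, document structure and grouping are deliberately ignored, so this covers only one corner of the three-way dependency.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MediaItem:
    uri: str          # hypothetical identifier of the media item
    media_type: str   # e.g. "text", "image", "video", "audio"
    concept: str      # domain concept the item is annotated with
    audience: str     # assumed label: "general" or "expert"


@dataclass(frozen=True)
class Profile:
    supported_types: frozenset  # media types the platform can render
    audience: str               # audience level of the user


def select_media(concept: str, candidates: List[MediaItem],
                 profile: Profile) -> Optional[MediaItem]:
    """Return the first candidate for `concept` that violates no constraint."""
    for item in candidates:
        if item.concept != concept:
            continue
        if item.media_type not in profile.supported_types:
            continue  # platform constraint, e.g. no video on paper
        if item.audience != "general" and item.audience != profile.audience:
            continue  # user constraint, e.g. expert material for a lay reader
        return item
    return None  # nothing fits: content would have to be converted or reselected


candidates = [
    MediaItem("nightwatch.mpg", "video", "Rembrandt", "general"),
    MediaItem("xray-study.png", "image", "Rembrandt", "expert"),
    MediaItem("self-portrait.jpg", "image", "Rembrandt", "general"),
]
paper = Profile(frozenset({"text", "image"}), "general")
chosen = select_media("Rembrandt", candidates, paper)
print(chosen.uri if chosen else "no suitable media item")
```

With the paper profile above, the video is rejected by the platform constraint and the expert image by the user constraint, so the sketch falls through to the third candidate. Returning `None` marks the case where conversion or reselection of content would be needed.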
[[ figure:
            [content, platform, user, style]
                           |
                         media
                        /     \
        document structure - grouping
                 |                |
[content, platform, user, style]  [content, platform, user, style]
]]

[[examples
media:
  content: media represents content
  platform: no video on paper
  user: medical images for a doctor
  style: bright colors for children
document structure:
  content: no power point love letter
  platform: no temporal dimension on paper
  user: interactive multimedia for children
  style: MTV style -> multimedia, news -> paper
grouping:
  content: align media to convey relation
  platform: no slide-show on paper
  user:
  style:
]]

These choices are not orthogonal. For example, a choice of document structure influences the choice of media and media groupings, and vice versa. In addition to the interdependencies between the categories mentioned above, each of them individually depends on \emph{content (semantic)}, \emph{user profile}, \emph{platform profile} and \emph{style} (see~\cite{ins:smartstyle} for more detail and examples).

Observations:
* Is this why paper documents have more layers of document structure than multimedia presentations? Since there is no option of media grouping in textual documents?
* Overflow works for groupings based on document structure, while media groupings need fallible fo's. Semantics seem to be less important in document structure groupings than they are in media groupings. Two paragraphs/sections/scenes can be next to each other without the need for deep knowledge about their content. Media groupings, in contrast, need to know exactly what the relationships are. ]]

[[problem: lots of dependencies, and a need to provide general methods (rules) for transforming SP into a presentation ]]

[[1: media: presentation rules which use domain and design ontological knowledge to present concepts using atomic media items.
-- mapping from concepts to media is done here; genre-specific media (picture in bio) is retrieved; uses media type ontology; adds redundant media such as voice-overs and captions for figures; needs knowledge about the role of media]]

[[2: mediagroup: presentation rules which use domain and design knowledge to present groups of media items. Includes rules to combine different types (e.g. text-image, image-image). Uses hfo ontology; need different types of hfo then. Probably needs domain knowledge too, and genre. Argh! This makes the rules very specific, which we do not really want.]]

[[3: document structure (DS): rules which map SP-levels to DS-levels (chapter, section etc.). SP reflects genre leveling. How to decide whether to make a sub-level in the DS or use media grouping? DS defines how to convey boundaries (fade effects in scene transition, bold headers in text).]]

[[Are 1, 2 really separate steps? They'd better be, otherwise there is not much room for generalizations.]]
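To make step 3 a bit more concrete, here is a minimal Python sketch of mapping SP levels to DS levels. It does not answer the open question above. It simply assumes one naive policy: use a DS level as long as one is available, and fall back to a media grouping once the DS levels run out. The names `SPNode`, `PAPER_LEVELS`, `map_sp_to_ds` and the `"media-group"` label are hypothetical, and SP priorities are left out for brevity (sibling order in the list stands in for SP ordering).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SPNode:
    label: str
    # Children are kept in SP order; priorities are omitted in this sketch.
    children: List["SPNode"] = field(default_factory=list)


# Hypothetical DS levels for a paper document, outermost first.
PAPER_LEVELS = ["chapter", "section", "subsection"]


def map_sp_to_ds(node: SPNode, levels: List[str],
                 depth: int = 0) -> List[Tuple[str, str]]:
    """Flatten an SP tree into (DS construct, label) pairs, in SP order."""
    # Assumed policy: use a DS level while one is left, else a media grouping.
    construct = levels[depth] if depth < len(levels) else "media-group"
    result = [(construct, node.label)]
    for child in node.children:
        result.extend(map_sp_to_ds(child, levels, depth + 1))
    return result


bio = SPNode("Rembrandt", children=[
    SPNode("Career", children=[SPNode("Night Watch")]),
    SPNode("Private life", children=[
        SPNode("Saskia", children=[SPNode("Portraits of Saskia")]),
    ]),
])

for construct, label in map_sp_to_ds(bio, PAPER_LEVELS):
    print(f"{construct}: {label}")
```

Passing a shorter list of levels (say, only "scene" for a multimedia presentation) pushes more of the SP tree into media groupings, which is one way to look at the observation that paper documents have more layers of document structure than multimedia presentations.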