INS2 discourse terminology

This document is an attempt to come closer to a common discourse vocabulary. The primary goal is to provide a set of definitions that help describe the different research projects within the group. It may or may not become the basis for a more formal, machine-readable discourse ontology.

Status

This document is a work in progress. The initial version was based on the results of the July 30, 2004 leesklub (reading group). Feel free to add to or edit this document.

Comments from Stefano:

Comments from Alia:

Top-level discourse layer structure

We identified the following five layers. Disclaimer: the layers are meant to distinguish conceptually among separate but related issues. In practice, such a strict separation could be hard to maintain, and in some cases it might even be counterproductive.

  1. world: universe of discourse. In (narrative) film: all events with their chronological order and spatial location, causally related. In our systems: the underlying database or RDF repository.

  2. fabula: slice of the world relevant for our plot. In our systems: the query result. Note: the fabula does not include the associated media items. You do, however, need knowledge about the media items to avoid creating a fabula/plot that cannot be realized for lack of media items. Synonyms: story world. See also: prototypes, templates, procedures.

  3. plot (= structure): same events as fabula, but arranged and connected to the orderly sequence in which they are presented (quote from Lemon & Reis). In our systems: the structured progression. Note: this includes things the listener adds (à la Scott McCloud's "gutter"). If RST needs to be fitted into these 5 layers, it would be part of the plot. Synonyms: story line, discourse structure, path through fabula.

  4. story (= form): final manifestation or expression. In our systems: the multimedia presentation. Synonyms: presentation.

  5. user experience: the coherence of the story. Coherence has been defined for textual discourse as: "The connection which is brought about by something outside the text. This 'something' is usually knowledge which a listener or a reader is assumed to possess." Cohesion has been defined as: "The connection which results when the interpretation of a textual element (such as a reference) is dependent on another element in the text." [Both definitions from Renkema: "Discourse Studies: An Introductory Textbook"]

World -> Fabula -> Plot -> Story

Key property:
  - World: Events/concepts and the relations among them (ontologies used include domain ontology, but perhaps also discourse ontology). Note that this could be purely "conceptual" (background) knowledge but could also include meta data with information directly related to, or even about, the media items.
  - Fabula: Subset of the world, specified explicitly (by enumeration) or implicitly (by one or more queries).
  - Plot: Subset of the fabula, enriched with order, grouping and priorities (recurrence).
  - Story: A set of media items, implicitly/explicitly arranged within a space/time/link framework.

Additional/optional properties:
  - World: See Media items.
  - Fabula: Rating in terms of user interest, fitness for role in plot, fitness for overall style.
  - Plot: Interaction scheme? "The Message" (IWA-like). Genre-related, plot-level (non-visual) style properties (author persona, rhetoric style).
  - Story: "Visual" style properties, meta data (e.g. associated annotations with, for example, machine-readable versions of the content, logging of the design decisions taken by the system, copyright information, etc.).

User role:
  - World -> Fabula: Specify topic, locus of interest.
  - Fabula -> Plot: Choose author persona, genre, time available, knowledge of topic, etc. (see Susanne Loeber's MAO).
  - Plot -> Story: Modality preferences, "visual" style preferences.

Additional knowledge needed:
  - World -> Fabula: Functional role of events in plot (à la Sample), style properties to fit story ("yellow & fast").
  - Fabula -> Plot: Analysis of fabula ratings, events and relations with respect to genre/story templates and their "top-level" dependencies on output medium (film vs book), "top-level" modality issues.
  - Plot -> Story: Communicative devices? Graphic design (incl. designer's and content provider's prefs). Suitability of modality to express concept.
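The chain world -> fabula -> plot -> story can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Event` class, the sample data, the function names, and the choice of chronological order as the plot's ordering are hypothetical and do not describe any of the group's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    year: int
    topic: str


# World: the whole universe of discourse (hypothetical sample data).
WORLD = [
    Event("paints The Night Watch", 1642, "Rembrandt"),
    Event("is born", 1632, "Vermeer"),
    Event("is born", 1606, "Rembrandt"),
]


def select_fabula(world, topic):
    """Fabula: the subset of the world relevant to the plot (a query result)."""
    return [event for event in world if event.topic == topic]


def build_plot(fabula):
    """Plot: the same events, enriched with a presentation order."""
    return sorted(fabula, key=lambda event: event.year)


def render_story(plot):
    """Story: the final manifestation; text lines stand in for media items."""
    return [f"{event.year}: {event.topic} {event.name}" for event in plot]


story = render_story(build_plot(select_fabula(WORLD, "Rembrandt")))
for line in story:
    print(line)
```

Each function consumes only the output of the previous layer, which is the strict separation the disclaimer above warns may be hard to maintain in practice.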

Related definitions (see also chapter 2 of Frank's thesis):

discourse
complex communication (so it involves both layers 3 and 4, plot and story)
theme
The initial purpose of a story, which unites the story's separate elements. A theme actually performs two concurrent tasks. First, it arouses the interest of the receiver. In this respect, the underlying selection process for a theme deals merely with general human emotions or interests, which must be elaborated within particular and well-formed material. Theme and genre are linked in this task. The second role of a theme within a narrative is to stimulate and maintain interest. The effect of a theme depends, in this case, on the intended emotion the theme should evoke, since emotions are a powerful medium for maintaining attention.
genre
A genre is an abstract network of features, such as a set of possible narrative objects (Aristotle's Poetics provides a set for tragedy), and characteristic objects and actions from the real world, upon which the individual plot defines its structure. Examples of genres are the Russian folktales described by Propp (1968) or, for film, the screwball comedy as described in (Brunovska Karnick, 1995; Gehring, 1986). Thus, genre can be seen as the macro structure of a plot. See also stereotypes below.
prototypes
organize the identification of types of persons, actions, localities, etc.
templates
articulate common story formats, where each formal element represents a structural story movement, realizing the stages through which the agent of the story must pass, such as: Orientation, Complication, Evaluation, Resolution (this model is based on Waletzky; see also Bremond, Greimas and Propp).
procedures
organize the search for appropriate motivations and relations of causality, time and space.
stereotypes
A pattern of stereotypes is not neutral. It is not merely a way of substituting order for the great blooming, buzzing confusion of reality. It is not merely a short cut. It is all of these things and something more. It is the guarantee of our self-respect; it is the projection upon the world of our own sense of our own values, our own position and our own rights. The stereotypes are, therefore, highly charged with the feelings that are attached to them. They are the fortress of our tradition and behind its defenses we can continue to feel ourselves safe in the position we occupy. (Lippmann, 1934, p. 96).
conventionalisation
The process of plot construction, since conventions transform an event into a self-regulated, i.e. closed and self-maintained, structure.
narrative
A linear, non-interactive story. Also a specific genre (this may be a term that is better avoided)

Remarks & warnings:

System descriptions

This section describes the discourse aspects of the various systems developed within our group, using the terminology defined above.

Cuypers/Aria/SemInf (Jacco/Joost)

In the Cuypers ARIA demo, the world is the (relational, not RDF) Rijksmuseum ARIA database. The fabula is defined implicitly by a hard-coded template that also determines the SQL query used: give me a description of encyclopedia term X, and all images of artifacts made by artist Y that are somehow related to encyclopedia term X. Note that the results include the URLs of some media items (the images) and the complete text of others (the description of term X), while yet others are generated from the meta data (like the title and the captions). The fabula, as a set of SQL results, is first converted to XML by Cocoon. The plot is then generated by an XSLT style sheet, which converts the SQL tables to a tree (originally inspired by RST). The root of the tree contains three elements: a "topic", a list of "examples" and an "elaboration" of the topic. A document-format independent version of the story is then generated when Prolog rules convert the plot into an HFO tree, which is in turn converted by XSLT to SMIL 1.0, SMIL 2.0 or HTML+TIME.

The emphasis of the demo is on finding layouts that fit a particular screen size but still communicate the plot. As a result, the role of the user is quite limited. During query formulation, the user can select the term and artist. Depending on the user's level of expertise, the plot will or will not contain the description of the term. Other adaptation dimensions include the available bandwidth and, as the main focus, screen size.
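As an illustration of the plot tree with its "topic", "examples" and "elaboration" elements, here is a minimal Python sketch. It is not the demo's implementation, which uses an XSLT style sheet; Python's standard `xml.etree.ElementTree` is swapped in for readability. The function name, the element attribute `href`, and the sample values are hypothetical. Two further assumptions are made: that the elaboration holds the description of the term, and that it is expert users for whom the description is left out (the text above does not say which way the adaptation goes).

```python
import xml.etree.ElementTree as ET


def build_plot(term, description, image_urls, expert_user):
    """Arrange query results into a plot tree: topic, examples, elaboration."""
    root = ET.Element("plot")
    ET.SubElement(root, "topic").text = term

    examples = ET.SubElement(root, "examples")
    for url in image_urls:
        ET.SubElement(examples, "example", href=url)

    # Assumption: the term description is omitted for expert users.
    if not expert_user:
        ET.SubElement(root, "elaboration").text = description
    return root


plot = build_plot(
    term="chiaroscuro",
    description="Strong contrast between light and dark.",
    image_urls=["http://example.org/images/1.jpg"],
    expert_user=False,
)
print(ET.tostring(plot, encoding="unicode"))
```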

In SemInf the world consists of digital libraries whose material is annotated using DC within the OAI framework (OAI defines a protocol to download from DLs). The fabula is determined by an SQL query that returns all entries that have a term X in their value field. This is still an unstructured collection of media items. The plot is constructed in two phases. First, the system tries to discover relationships between media items (e.g. item A created item B). This results in a graph of relationships. The second phase involves relation pattern matching on the generated graph. An example pattern is "X created Y, Z describes Y". These relationships are then mapped to spatial relationships such as X left-of Y, Z below Y. The story is then generated in the same way as in the ARIA demo described above.
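The second phase can be sketched in Python for the single example pattern "X created Y, Z describes Y". This is a minimal sketch under stated assumptions, not SemInf's actual matcher: the representation of the relationship graph as a list of (subject, relation, object) triples, the function name, and the item names are all hypothetical.

```python
def match_created_describes(relations):
    """Match "X created Y, Z describes Y" and emit spatial constraints.

    `relations` is a list of (subject, relation, object) triples, an
    assumed encoding of the relationship graph from the first phase.
    """
    layout = []
    for x, first_relation, y in relations:
        if first_relation != "created":
            continue
        for z, second_relation, target in relations:
            if second_relation == "describes" and target == y:
                # Map the matched pattern to spatial relationships,
                # skipping constraints that were already emitted.
                for constraint in ((x, "left-of", y), (z, "below", y)):
                    if constraint not in layout:
                        layout.append(constraint)
    return layout


graph = [
    ("item_a", "created", "item_b"),
    ("item_c", "describes", "item_b"),
    ("item_d", "describes", "item_e"),
]
print(match_created_describes(graph))
```

The nested loop is quadratic in the number of relationships, which is acceptable for a sketch; a real matcher would more likely index the triples by object.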

DISC (Joost/Stefano)

In DISC the world is the domain ontology (or several domain ontologies, if there is a way to access them, such as ontology mapping). There is no explicit fabula step in DISC; instead, the plot is filled with facts from the world. This happens through multiple queries determined by rules. The domain ontology also contains the media items, which limits the step from plot to story, because the 'rendering' process is not free to choose the media items. This could be done differently, but this is the way we implemented it.

IWA/Vox Populi (Stefano)

In IWA the world is the annotated raw video footage plus annotated pictures. There will probably be an explicit fabula step, unlike in DISC. The step from plot to story will be based on Film Theory.

MediaStreams/Trip report generator (Joost)

The trip report generator basically has no automated discourse (yet). The world is therefore the (imagined) world of the author. The plot is also created by the author. The story is then generated by searching for material that meets the descriptions in the plot. If the material does not exist, the process fails. One can argue that the world is thus defined by the material in the database, which is of course true to some extent. However, there are no (explicit semantic) relationships within the material, so I believe world is not an appropriate term for the video material database.

SampLe (Katya)

The world in SampLe is the CHIME domain ontology. The user explores the world during the topic selection phase. There is no explicit fabula. Implicit fabula: when the topic of the presentation is selected, all concepts relevant to the topic can be considered the fabula. The plot is the chosen genre-dependent presentation structure, adjusted to the concrete instance of the topic (Movement=De Stijl). Each section of the plot is connected to a domain concept. The story is the final multimedia presentation, in which the plot is filled in with the particular manifestation of the concepts. Currently, media items start to participate in the process of presentation building only after the plot is built. Ideally, the media items available for each section of the plot could initiate changes in the plot.

The role of the user is quite substantial in SampLe. The user influences all the stages from fabula to story. By selecting the topic of the presentation, the user defines the fabula. The selection of a particular genre leads to the automatic creation of the plot. For each section in the plot, the user chooses media items and defines their order.

Topia (Martin)

Topia's world is the RDF-encoded Rijksmuseum Aria database. The fabula is a table with artifacts and their properties, retrieved in response to a user query. The plot is a hierarchy of all groups of retrieved artifacts that have one or more common predicate-object pairs. Cluster siblings in the hierarchy are sequenced in decreasing order of cluster weight. Artifact siblings are sequenced in increasing order of their year of creation.

The story is a hierarchical tree of the predicate-object pairs of the groups in the plot. Starting from the top level, the user goes down the hierarchy by clicking on one of the predicate-object pairs. The story shows the titles of all artifacts that are one or more levels below the last-clicked predicate-object pair in the hierarchy. A text string indicates artifacts that are displayed more than once. The predicate-object pairs of all clusters that contain the last-clicked artifact title are marked in green. On clicking an artifact title, the story shows an image of that artifact with its properties, and an indication of how the query matches each of these properties.

The user specifies a query string and the artifact properties whose values should match the query. For each property, the user selects a weight from a set of integer values; these weights are used to calculate the weights of the clusters in the hierarchical plot. Other user options concern the depth of the hierarchy in the story, the selection of clusters with specific predicates (property types), the inclusion of cluster weights, and the assignment of artifacts to only the lowest-level cluster in which they occur.
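The grouping and ordering rules can be sketched in Python as a flat, single-level simplification of the hierarchy. This is a sketch under stated assumptions, not Topia's algorithm: the text does not give the cluster weight formula, so the one used here (property weight times cluster size) is an assumption, as are the requirement of at least two artifacts per cluster, the dictionary layout of an artifact, and all names and sample data.

```python
from collections import defaultdict


def cluster_artifacts(artifacts, property_weights):
    """Group artifacts by shared predicate-object pairs and order the result."""
    groups = defaultdict(list)
    for artifact in artifacts:
        for predicate, obj in artifact["properties"].items():
            groups[(predicate, obj)].append(artifact)

    clusters = []
    for (predicate, obj), members in groups.items():
        if len(members) < 2:
            continue  # assumption: a pair is "common" only if shared by 2+
        # Assumed weight formula: user-chosen property weight * cluster size.
        weight = property_weights.get(predicate, 1) * len(members)
        # Artifact siblings: increasing year of creation.
        ordered = sorted(members, key=lambda artifact: artifact["year"])
        clusters.append({
            "pair": (predicate, obj),
            "weight": weight,
            "titles": [artifact["title"] for artifact in ordered],
        })

    # Cluster siblings: decreasing cluster weight.
    clusters.sort(key=lambda cluster: cluster["weight"], reverse=True)
    return clusters


artifacts = [
    {"title": "B", "year": 1650,
     "properties": {"creator": "artist_1", "material": "oil"}},
    {"title": "A", "year": 1640,
     "properties": {"creator": "artist_1", "material": "oil"}},
    {"title": "C", "year": 1660,
     "properties": {"creator": "artist_2", "material": "oil"}},
]
clusters = cluster_artifacts(artifacts, {"creator": 3, "material": 1})
for cluster in clusters:
    print(cluster["pair"], cluster["weight"], cluster["titles"])
```

Raising the weight of "material" above that of "creator" would move the material cluster to the front, which is the effect the user-selected property weights are described as having.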

$Id: discourse.html,v 1.26 2006/01/27 11:35:38 amin Exp $