Thesis Proposal
by Stefano Bocconi
Introduction
The research I will be doing in the multimedia group led by Lynda Hardman
will be in the field of Intelligent MultiMedia Presentation Systems (IMMPS).
This once hot topic remains central to the group's activity and
has led to the development of a multimedia presentation generation system
named Cuypers as a way to research and experiment with multimedia presentation issues.
Cuypers' intended goal is to automate the process of creating a presentation
from preexisting media items. The problem that Cuypers tries to solve is
that of breaking the monolithic structure of a multimedia presentation to
offer the user adaptability to his/her situation without human intervention.
A multimedia presentation is often a highly tailored product of a
human creator that fits the target user the creator has in mind well, but
does not fit all the other users, who, in a deeply information-connected
society, turn out to be the majority of the audience.
Factors that can compromise the effectiveness of a presentation can be as trivial
as screen dimensions that differ from those assumed in the design, low bandwidth or user language,
or more complex, such as user preferences (colors, fonts), user expertise and
user goals.
While text research has found solutions for many of the problems caused
by the above-mentioned factors by determining the building elements of a
text and their properties (e.g. headings, paragraphs, characters), multimedia
is very often handled as a whole, so that the user is forced to take a multimedia
presentation "as is" even if it is not suited for his/her situation.
We could then say that Cuypers tries to find the building blocks of a multimedia
presentation and their properties, as well as to make explicit the rules
a human creator would use in order to combine those building blocks in a
"sensible" (from the communication point of view) way.
Where we stand today
In my opinion Cuypers has made the most progress in
flexibility with respect to layout and bandwidth constraints, while
in general flexibility with respect to designer wishes, user goals and user
preferences still needs to be achieved.
Motivation for the current research
As often said, the amount of information available is steadily growing
while its structure is gradually fading. Where information once reached
people in a single-channel, authoritative way, such as via television
or newspapers, with the internet the information-gathering process now requires considerable and continuous
user action to get the information, which may be hidden
in some unknown digital data repository on the Web.
This is why a lot of interest has been focused on searching for the right information,
and the current Semantic Web initiative can also be seen as (or should also
result in) better support for information retrieval.
If we move one step further, i.e. we assume that more or less pertinent
information has been retrieved, the easiest strategy is to present the user
with a list of (links to) information items, trying to place at the top of
the list the information that best matches the user request.
If we want to present the different information items in a more natural way
(so not as a list), we need some criteria to "tell our story". That
is what motivates research into a discourse model that can guide the process
of creating a structured presentation composed of the information retrieved.
Proposal
My research will be concerned with the problem of composing semantically
annotated media items into an abstract presentation structure, where the term
abstract means that the focus will not be on the layout of the presentation
but on the way the media items must be organized to meet the communicative
goal selected by the user.
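As an illustration only, the idea of an abstract presentation structure can be sketched in a few lines of Python. Everything here is an assumption made for the example: the class names MediaItem and Segment, the compose function, the role labels and the sample items are hypothetical and are not part of Cuypers. The sketch only shows that items can be organized by communicative goal from their metadata alone, without any layout information.

```python
from dataclasses import dataclass, field


@dataclass
class MediaItem:
    """A media item known to the system only through its metadata."""
    uri: str
    metadata: dict


@dataclass
class Segment:
    """A node of the abstract structure: a discourse role plus its content."""
    role: str                                   # hypothetical role label
    items: list = field(default_factory=list)
    children: list = field(default_factory=list)


def compose(items, goal_roles):
    """Group items under the roles required by a communicative goal.

    goal_roles maps a role name to a predicate over item metadata.
    Roles keep the order in which they are given; layout is deliberately absent.
    """
    root = Segment(role="presentation")
    for role, matches in goal_roles.items():
        selected = [item for item in items if matches(item.metadata)]
        root.children.append(Segment(role=role, items=selected))
    return root


# Hypothetical items and goal, for illustration only.
items = [
    MediaItem("img/painting.jpg", {"type": "image", "subject": "painting"}),
    MediaItem("txt/artist-bio.txt", {"type": "text", "subject": "artist"}),
]
goal = {
    "introduce artist": lambda m: m.get("subject") == "artist",
    "show works": lambda m: m.get("subject") == "painting",
}
structure = compose(items, goal)
for segment in structure.children:
    print(segment.role, [item.uri for item in segment.items])
```

Changing the goal dictionary while keeping the same items yields a different structure, which is the kind of variation the research questions below ask about.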
Research Questions
Why do we need a discourse model?
- This is the problem statement. As said above, searching for information
is just the first step; how do we then present the information?
Did previous IMMPS use a discourse model? If not, why not?
- This could give an overview of the history of IMMPS and if/how they used discourse.
To what extent can a model of discourse be captured? To what extent can a
discourse model capture the presentation designer's intentions? How effective
is it to use a discourse model to generate presentations? How
do different discourse models affect the generated presentation?
- The need for guidance in presenting the information alone does not
motivate the choice of a discourse model as a solution. Is a discourse model
an effective way of capturing a designer's intentions? Can we model a discourse
to such an extent that it can be used by a system generating presentations?
In what way is the final presentation dependent on the discourse?
Given a fixed source (fixed number of media items), is it possible to generate
different presentations in terms of discourse model? Can the media
items used in the presentation be completely independent from the discourse
model?
- This point raises the issue of whether we can couple discourse
models and data sets without having to worry if they fit together. What requirements
(if any) does the discourse set on the media items? What limitations (if any) do the media items set on the discourse?
Can a (theoretical) presentation generation system reason only in terms of
a discourse model? How domain independent is this approach? To what extent
does the nature of hypermedia influence the possible applicable discourse
model?
- To what extent are the conclusions we find limited to hypermedia presentations generated in the Museum domain?
Application fields
The immediate field of application is in Cuypers within the proposed architecture.
I would also like to see to what extent the results/principles found in the
Museum domain are also applicable to generating a presentation from annotated
video segments, e.g. with Interview with America. This could provide insight
into one of the research questions, namely how domain independent the conclusions
we find are.
No go areas
- No feature extraction from the media items: all the knowledge about an item is in its metadata
- No media (text included) item generation
- No database technology
Chapter outline of thesis
Chapter 1
Introduction
Problem statement/Motivation
Research questions
Chapter outline
Chapter 2 (Definition, Scope and Historical Background)
What do we mean by a discourse model?
This is important because it will give the boundaries within which the research
will take place. In my opinion the discourse could include the research topics
now described as Presentation Abstraction and Presentation Flow in Cuypers,
but maybe some domain restrictions will be necessary.
Literature survey of discourse models
This will be an overview of the models that are available.
Literature survey of modelling tools and languages
This is about the possible implementations: what kinds of technologies we can use.
History of the (most important) IMMPS and whether they used discourse techniques, why, why not.
How the others did it.
Chapter 3 (How to use it, What discourse models to use, What technologies to use)
What a discourse model can do for us
The effectiveness of a discourse model as guidance in generating a presentation.
Models we consider suited for our scope/goal
Of the discourse models examined in chapter 2, the following ones are suited for our purpose because ....
Modelling tools and languages we consider suited for our scope/goal
Of the technologies examined in chapter 2, the following ones are suited for our purpose because ....
Examples (with Cuypers hopefully)
Example discourse and domain 1
Which discourse model?
Which method of incorporating it in the system?
It worked because, it didn't work because
Example discourse and domain 2
It worked because, it didn't work because
Chapter 4 (Generalisations vs Dependencies between discourse, data, domain)
Data and discourse
This chapter should make explicit all the requirements that the
data items must satisfy so that we can use discourse techniques, and point
out whether these requirements are general to all discourse models or specific
to each discourse model.
Examples are:
- Requirements on each media item, e.g. richness of annotations (metadata)
- Requirements on the relations between media items, like presence/absence of particular
relations, or on the "amount" of relations available (maybe a critical
mass is needed to be able to use a discourse)
This should answer the questions: What requirements does the discourse set on
the media items? What limitations do the media items set on the discourse?
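To make the two kinds of requirement concrete, here is a minimal sketch in Python. It rests on assumptions that are not part of the proposal: items are represented as a dictionary from identifier to metadata, relations as (subject, predicate, object) triples, and the "critical mass" as a simple relations-per-item threshold. The function name check_requirements and its parameters are hypothetical.

```python
def check_requirements(items, relations, required_fields, min_relations_per_item):
    """Return a list of problems; an empty list means the data set qualifies.

    items: dict mapping an item id to its metadata dict (assumed representation)
    relations: list of (subject, predicate, object) triples (assumed representation)
    required_fields: set of annotation names a discourse model is assumed to need
    min_relations_per_item: assumed "critical mass" threshold
    """
    problems = []

    # Requirement on each media item: richness of annotations.
    for item_id, metadata in items.items():
        missing = sorted(required_fields - set(metadata))
        if missing:
            problems.append(f"{item_id}: missing annotations {missing}")

    # Requirement on the relations: enough of them relative to the number of items.
    density = len(relations) / len(items) if items else 0.0
    if density < min_relations_per_item:
        problems.append(
            f"relation density {density:.2f} below threshold {min_relations_per_item:.2f}"
        )
    return problems
```

Whether a per-item average is the right measure of "critical mass" is itself an open question; a given discourse model might instead require particular relation types to be present.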
Discourse and Domain
Can we abstract our conclusions from the Museum domain? How about video (IWA)?
Can we still generate a presentation from video fragments using
discourse techniques?
Discourse and Multimedia
How specific to Multimedia are our results? How does the nature (or what aspects) of Multimedia influence the discourse?
Conclusions
How the research questions were answered.
Future research directions
Literature
Some hints:
Multimedia
- Maybury, Mark T. (1993). Intelligent Multimedia Interfaces.
- Koegel Buford, John F. (1994). Multimedia Systems. Addison Wesley.
- The International Journal on the Development and Application of Standards for
Computers, Data Communications and Interfaces. Volume 18, Numbers 6 and 7,
December 1997. Special Issue: Intelligent Multimedia Presentation Systems.
- Davis, Marc E. - Media Streams: Representing Video for Retrieval and
Repurposing. Ph.D. Thesis February 1995. Massachusetts Institute of Technology.
Knowledge Representation:
- Sowa, John F. (2000) - Knowledge Representation - Logical, Philosophical and Computational Foundations.
From Frank on Narrative:
- Brooks, K. M. (1999). Metalinear Cinematic Narrative: Theory, Process, and Tool.
MIT PhD Thesis.
<http://ic.media.mit.edu/icSite/icpublications/Thesis/brooksPHD.html>
- Black, J. B., & Bower, G. H. (1980). Story understanding as problem
solving. Poetics, 9, 223 - 250.
- Black, J. B., & Wilensky, R. (1979). An evaluation of story grammars.
Cognitive Science, 3, 213 - 230.
- Bordwell, D. (1989). Making Meaning - Inference and Rhetoric in the
Interpretation of Cinema. Cambridge, Massachusetts: Harvard University
Press.
- Chatman, S. (1978). Story and Discourse: Narrative Structure in Fiction
and Film. Ithaca, New York: Cornell University Press.
- Lehnert, W. G. (1983). Plot Units: A Narrative Summarization Strategy.
In W. G. Lehnert & M. H. Ringle (Eds.), Strategies for Natural Language
Processing (pp. 375 - 412). Hillsdale, New Jersey: Lawrence Erlbaum
Associates.
- Lehnert, W. G., Dyer, M. G., Johnson, P. N., Yang, C. J., & Harley, S.
(1983). BORIS - An Experiment in In-Depth Understanding of Narratives.
Artificial Intelligence, 20, 15 - 62.
- Propp, V. W. (1968). Morphology of the Folktale. University of Texas
Press.
- Ricoeur, P. (1985). Time and Narrative. Chicago: The University of
Chicago Press.
- Schank, R. C., & Abelson, R. (1977). Scripts, Plans, Goals And
Understanding. Hillsdale, New Jersey: Lawrence Erlbaum Associates.
- Schank, R. C., Kass, A., & Riesbeck, C. (1994). Inside Case-Based
Explanation. Hillsdale, N.J.: Lawrence Erlbaum Associates.
- Wilensky, R. (1983b). Points: A Theory of the Structure of Stories in
Memory. In W. G. Lehnert & M. H. Ringle (Eds.), Strategies for Natural
Language Processing (pp. 345 - 376). Hillsdale, New Jersey: Lawrence
Erlbaum Associates.
- Wilensky, R. (1983c). Story grammars versus story points. The Behavioral
and Brain Sciences, 6(4), 579 - 623.
- Wilensky, R. (1990). A Model for Planning in Complex Situations. In J.
Allen, J. Hendler, & A. Tate (Eds.), Readings in Planning (pp. 263 -
274). San Mateo: Morgan Kaufmann Publishers.
Personal interests
Over these months I have seen that my interests are focused in general on the "abstract reasoning" field. Some examples are:
- Discourse structure (how do I tell my story?)
- Knowledge representation and Semantics (also web-enabled)
- Reasoning on retrieved data semantics in order to compose a presentation
- Principle of Compositional Semantics (a la Marcos)
These points coincide more or less with the above layers of the proposed architecture for Cuypers.
In my opinion my research will involve the following steps:
- Read about discourse and model its characteristics
- Investigate which abstract elements play a role when presenting a topic
to a user, and relate them to each other and to the user goal
- Investigate in what Presentation Flow structures these abstract elements can be organized to be presented to the user.
- Investigate the dependencies: on the data, on the domain and on Multimedia.
Of the above-mentioned steps, the second one interests me most and, were I to choose among them, that would be the one.
Required knowledge
Knowledge relative to Discourse and Narrative.
Knowledge relative to Knowledge Representation and Semantics.
General knowledge about Multimedia, in particular IMMPS.
Devil's advocate
- Didn't we see all this before, given the fact that Discourse, knowledge representation and reasoning are old research topics?
- Yup.
However, your contribution is to take the "woolly" semantics of discourse and
narrow it down to something that can be applied computationally in the creation
of multimedia presentations. You then get to use existing tools for implementing
it. You aren't trying to create new discourse models (although I suspect
that the work may creep into that terrain). You aren't trying to
create new KR&R tools (see Frank van Harmelen :-), but you do want to find the most appropriate for your/our problem.
- What is new in here?
- Putting a discourse model in the system.
(You know about Eliza? The computer psychotherapist? "She" would fool people
into thinking that she was listening and talking to them. There was no explicit
discourse model.)
Also, discourse model for dynamic time-based media along with hyperlinks. (My thesis was _only_ about adding links to time-based media...)
- What are the research questions?
- Hey - this was your job...
"To what extent can a model of discourse be captured?"
"How do different discourse models affect the generated presentation?"
"To what extent do media items need to be annotated with attributes from the discourse model?"
"To what extent does the nature of hypermedia influence the "flow" of the discourse model?"
Have a look at Hongjing's research questions.
Somewhere in /ufs/lynda/tmp/Hongjing/main2.pdf (It's my scratch disk - you probably need to log in to mensa first?)
They are not world shattering, but clearly stated, and answered within the thesis.
- Did not Frank do all this yet? Did not Frank do everything yet, but just would not tell us?
- Of course he has done all this and will not tell us. It is thus our job to reveal it to the world...
Firstly, which discourse model are we going to use to start with?
(Everyone "disses" RST..) Is there an existing model that we can pluck from
the shelf? (I doubt it.) What needs to be done to the models to improve
them?
- How am I going to find a real family-supporting job with this up-in-the-sky research?
- Nae chance. But seriously. What would you want to do afterwards? The chances
of doing a fun postdoc somewhere are fairly high (given research will still
be paid for 3 years from now). Have you already come across groups you would
be interested in? Europe? USA? Amsterdam?? Experience shows that getting
to know people and the opportunities around is vital.