Author: Alia, Michiel
CWI participants: Alia, Michiel, Lynda
Number of participants: around 78 PhD students/researchers
Image restoration problem: lost information / image degradation. Causes can include: quantization, low resolution of the capturing device, motion during capture, cross-channel degradation, etc.
Applications: video printing, etc.
Types of degradation: motion, turbulence, etc.
Steps in restoration:
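The steps themselves weren't captured in these notes, but they all start from the standard linear degradation model (my addition from the restoration literature, not from the slides):

$$ g(x, y) = (h * f)(x, y) + n(x, y) $$

where f is the original image, h a degradation kernel (blur, motion, sensor response), * denotes convolution, and n additive noise; restoration is then the ill-posed inverse problem of recovering f from g.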
Motivation: HCI question on perceptual intelligence.
Application:
-avatars for learning,
- verification (yes/no: confirm a claimed identity) and identification (determine who the person is)
- tactile interface for disabled persons
Technology needed: face recognition (face shape and lip contour recognition)
- shape-based visual features
- post-processing by normalization; reason: visual and audio samples need to be synchronized (see the sketch below)
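The notes don't say how this normalization works; below is a minimal sketch of one common approach, resampling the visual features to the audio frame rate (the frame rates and function names are my assumptions, not from the talk):

```python
import numpy as np

# Assumed rates: lip-contour features at video rate (25 fps),
# audio features at 100 frames per second.
VIDEO_FPS = 25.0
AUDIO_FPS = 100.0

def upsample_visual(visual_feats: np.ndarray) -> np.ndarray:
    """Linearly interpolate visual feature vectors to the audio frame rate."""
    n_vid = visual_feats.shape[0]
    n_aud = int(n_vid * AUDIO_FPS / VIDEO_FPS)
    t_vid = np.arange(n_vid) / VIDEO_FPS  # timestamps of video frames
    t_aud = np.arange(n_aud) / AUDIO_FPS  # timestamps of audio frames
    # Interpolate each feature dimension independently.
    return np.stack(
        [np.interp(t_aud, t_vid, visual_feats[:, d])
         for d in range(visual_feats.shape[1])],
        axis=1,
    )

feats = np.random.rand(50, 10)        # 50 video frames, 10-dim lip features
print(upsample_visual(feats).shape)   # (200, 10): audio-rate frames
```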
Future of multimedia: content on demand
Trend 2: from uniform reception to personalized content creation and content consumption
- personalized content (MPEG-21)
Trend 3: from passive devices to ambient and smart environments and smart content
- smart environment and smart content
The vision of ambient intelligence
sensitive, adaptive, and responsive to consumers, and anticipatory of consumer desires
required:
- smart, personalized, and interconnected systems
the vision: the connected planet
Part 1. Media management with ambient DB.
vision:
- integrated media management (merge the in-home digital network (IHDN), BCN, and BAN)
- high-level P2P metadata management
current implementations:
Philips' proposed solution: a real-time and distributed content analysis system.
content management in consumer systems:
Ten years from now, around 100 TB of storage will be needed.
The problem: how do we find the content?
The answer: we need metadata.
To do this, content analysis is necessary: when there is no metadata, to enrich the provided metadata, to enable searching, to modify content to improve the viewing experience, and to enable new features.
enable "intelligent chaptering" of multimedia
Real-time and distributed content analysis system
Features of the distributed content analysis system:
- extensible, scalable, flexible, upgradable
- easy-to-use software architecture
- efficiency
- reliability
- connectivity (TCP/IP backbone; WiFi/QoS in the future)
- fusion and management of data sources
conclusion:
1. semantically powerful production metadata
-> currently lost due to unresolved business models
2. production follows rules: film grammar
-> prior knowledge of these rules can be applied in domain-specific solutions
3. metadata standardization
Semantics types, organized from the technical perspective to the human perspective:
- content semantics (general knowledge): extracted from image data, e.g. beach, sea, boy
- content semantics (private knowledge): captured during creation, e.g. Nico digging
- situational semantics: feedback and auto-classification, e.g. Turkey
- retrieval semantics: manual classification and management, e.g. cute
- social semantics: e.g. common vacation
core question in multimedia:
how to represent the ontology?
how to query the ontology?
which ontology to construct, and how?
how to populate the ontology?
An ontology is a formal specification of a shared conceptualization of a domain of interest. (Gruber 93)
A taxonomy is a segmentation, classification, and ordering of elements into a classification system according to the relationships between them.
The rest of the presentation is on RDF, OWL, etc.
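To make the represent/populate/query questions concrete, a minimal sketch with rdflib (the vocabulary is invented for illustration):

```python
from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.org/vacation#")
g = Graph()

# Represent: a tiny taxonomy -- Beach and Sea are kinds of Scene.
g.add((EX.Scene, RDF.type, RDFS.Class))
g.add((EX.Beach, RDFS.subClassOf, EX.Scene))
g.add((EX.Sea, RDFS.subClassOf, EX.Scene))

# Populate: one annotated photo.
g.add((EX.img001, RDF.type, EX.Beach))
g.add((EX.img001, RDFS.label, Literal("Nico digging")))

# Query: which resources depict some kind of Scene?
q = """
SELECT ?img ?label WHERE {
    ?img a ?cls .
    ?cls rdfs:subClassOf ex:Scene .
    ?img rdfs:label ?label .
}
"""
for row in g.query(q, initNs={"ex": EX, "rdfs": RDFS}):
    print(row.img, row.label)
```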
Desired situation: loose coupling; a standard vocabulary with predefined meaning; automatic ad hoc coupling and integration of data.
DOLCE is well suited for a high-quality multimedia ontology because it has an Ontology of Information Objects (OIO) and Descriptions and Situations (D&S).
Methodology for design pattern definition:
1. identification of the most important MPEG-7 functionalities:
- decomposition
- annotation
- general: describe digital data by digital data, at an arbitrary level of granularity
2. definition of design patterns for decomposition and annotation based on D&S and OIO (see the sketch below)
3. additional patterns are needed for: complex data types of MPEG-7, and semantic annotation using domain ontologies
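A rough sketch of what the decomposition + annotation patterns could look like as triples (property and class names are my invention, not the actual MPEG-7/DOLCE patterns):

```python
from rdflib import Graph, Namespace, Literal, RDF, XSD

EX = Namespace("http://example.org/mm#")
g = Graph()

# Decomposition: a video decomposed into a temporal segment ...
g.add((EX.video42, RDF.type, EX.Video))
g.add((EX.video42, EX.hasSegment, EX.seg1))
g.add((EX.seg1, EX.startSec, Literal(12.0, datatype=XSD.double)))
g.add((EX.seg1, EX.endSec, Literal(30.5, datatype=XSD.double)))

# ... annotation: the segment is linked to a concept from a domain ontology.
g.add((EX.seg1, EX.depicts, EX.GoalEvent))

print(g.serialize(format="turtle"))
```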
Multimedia Annotation
http://www.mindswap.org/2003/PhotoStuff/
Examples of search engines: Google, Teoma, WiseNut, AltaVista, AllTheWeb, Lycos
quite effective (at some things)
highly visible (mostly)
commercially successful (some of them so far)
What is IR? Retrieval of unstructured data.
Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.
TREC and CLEAR are competitions where retrieval algorithms are tested.
Experimental Methodology of IR
Cleverdon - Cranfield
Lancaster - Medlars
Keen - Cranfield/ Smart
Saracevic - SWRU
Salton - Smart
Sparck Jones - Ideal Test Collection
Stairs
TREC (international) / INEX (Europe)
Evaluation types in IR:
ABNO/OBNA ("all but not only" / "only but not all") (Fairthorne)
Precision, Recall -> trade off (Cleverdon)
Probabilistic versions (Swets)
Measure-theoretic (Bollmann)
precision = relevant retrieved documents / all retrieved documents
recall = relevant retrieved documents / all relevant documents (e.g. 6 relevant documents retrieved out of 20 relevant documents in total = 0.3)
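Spelling out the definitions behind that example (a minimal sketch; I assume 10 documents were retrieved, 6 of them relevant, with 20 relevant documents in the collection):

```python
def precision(relevant: set, retrieved: set) -> float:
    return len(relevant & retrieved) / len(retrieved)

def recall(relevant: set, retrieved: set) -> float:
    return len(relevant & retrieved) / len(relevant)

relevant = set(range(20))                          # 20 relevant docs exist
retrieved = set(range(6)) | {100, 101, 102, 103}   # 10 retrieved, 6 relevant

print(precision(relevant, retrieved))  # 6/10 = 0.6
print(recall(relevant, retrieved))     # 6/20 = 0.3
```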
Some things about IR that are different from other fields:
- A posteriori vs. a priori: IR research is based on the data as-is, in contrast with the Semantic Web, which starts from the structure.
- OWA vs. CWA: IR uses the open-world assumption.
- Information vs. knowledge: based on information rather than knowledge.
- Data-driven vs. theory-driven.
- Contingency vs. necessity.
- Ostensive vs. extensive.
Topic - Databases - IR
Data - structured - unstructured
Fields - clear semantics - no fields
Queries - defined (relational algebra, SQL) - free text (natural language)
Recoverability - critical (concurrency control, recovery, atomic operations) - downplayed, still an issue
Matching - exact - imprecise (need to measure effectiveness)
note to self/eculture: maybe semantic retrieval should learn from database retrieval research rather than from traditional IR?
Useful references:
Look up: "Postulates of Impotence", Swanson 1988:
- an information need cannot be expressed independent of context
- it is impossible to instruct a machine
What is striking to me from CvR's talk: the hypotheses that IR uses are the same as the presentation hypothesis:
association hypothesis
relation hypothesis
cluster hypothesis
deduction/induction
Representation of information
discrimination without representation (specificity)
representation with discrimination (exhaustivity)
Types of evaluation in IR:
1. system evaluation
2. user evaluation
3. operational evaluation
SWIRL workshop: Strategic Workshop on Information Retrieval in Lorne
http://www.cs.mu.oz.au/~alistair/swirl2004/
Part 1. Information Extraction
classic
adaptive
web-based
multimedia
merging redundant information - smushing
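A minimal sketch of smushing as I understand it: merge records that share a value for a uniquely identifying property (the key choice and data are illustrative):

```python
def smush(records):
    """Merge records that share a value for a uniquely identifying key."""
    by_email, merged = {}, []
    for rec in records:
        key = rec.get("email")
        if key is not None and key in by_email:
            # Fill gaps in the existing record with the new non-empty values.
            by_email[key].update({k: v for k, v in rec.items() if v is not None})
        else:
            rec = dict(rec)
            merged.append(rec)
            if key is not None:
                by_email[key] = rec
    return merged

people = [
    {"name": "Alice", "email": "alice@example.org", "phone": None},
    {"name": None, "email": "alice@example.org", "phone": "555-0100"},
]
print(smush(people))  # one record with name, email, and phone filled in
```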
Part 2. Ontology Learning: learning concept hierarchies, learning relations
http://www.smartweb-project.de/start_en.html
Ontologies provide:
taxonomic organization of concepts
relations between concepts (type and cardinality constraints)
instantiation relations
types of ontologies
top-level ontologies (general concepts such as time and event, independent of a particular domain)
domain ontology (describes the vocabulary related to a generic domain)
task ontology
application ontology
SWIntO: SmartWeb Integrated Ontology
sport-event-ontology
navigation ontology
multimedia ontology
discourse ontology
linguistic information
Integration of the above domains via DOLCE and SUMO
Information extraction: the task of filling certain given target knowledge structures on the basis of text analysis.
NLU is not the same as IE.
NLU:
- aims at understanding text
- deep NLP techniques
- requires knowledge representation
- very difficult task
- no system yet performs NLU to a reasonable extent
IE:
- aims only at extracting information to fill predefined templates (see the toy sketch below)
- shallow NLP
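A toy illustration of the IE side: a shallow surface pattern fills a predefined template (the pattern and template are made up; real MUC-style systems are far richer):

```python
import re

# Predefined template: who acquired whom.
ACQUIRE = re.compile(
    r"(?P<buyer>[A-Z]\w*(?: [A-Z]\w*)*) acquired "
    r"(?P<target>[A-Z]\w*(?: [A-Z]\w*)*)"
)

def extract_acquisitions(text):
    """Fill {buyer, target} templates using a shallow surface pattern."""
    return [m.groupdict() for m in ACQUIRE.finditer(text)]

text = "Philips acquired Example Systems for an undisclosed sum."
print(extract_acquisitions(text))
# [{'buyer': 'Philips', 'target': 'Example Systems'}]
```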
Classic IE
benchmarking campaigns: Message Understanding Conference (MUC)
sponsored by DARPA
Adaptive IE
pro: no handwritten rules; turn to machine learning
challenges: contextual ambiguity, annotating relations, scalability, accurate recognition
extracting relations on the web
input: multimedia resources and an ontology or template schema
output: a KB representing the information extracted from the various resources, linked together in a meaningful way
requires:
processing different media
duplicate detection and merging
detecting and handling inconsistencies
What do we need semantics for? In general, we want to make queries whose type of semantics is not fixed or known in advance; instead, we want them to cover different forms of semantics.
Semantic levels can be stacked in a pyramid; from top to bottom this may include, among others: social (manual annotation), retrieval (feedback), situational (capture), content (extracted from data). Higher in the pyramid, the human perspective matters more than the technical perspective.
At the moment, metadata is available in many data formats, for example Exif. We need a unified model that covers all forms of metadata of any semantic type. If we want to share knowledge, we need a common language. MH: I wonder if we should be talking about a language here. A language is used for communication. Isn't an ontology for the representation of concepts?
Taxonomy, thesaurus (adds specific relations: narrower, synonym), topic map (adds relations from concepts to documents). Ontologies now sit between the front-end (thesauri and information retrieval) and the back-end (first-order logic and reasoning).
Replace MPEG-7 with a high-quality multimedia ontology. Is MPEG-7 a good basis for this? Start with a well-designed foundational ontology, for example DOLCE.
MH: Can these ontologies be useful to describe domain-specific events, for example in cultural heritage, starting from the top-level events of this combination? D&S and OIO sit below DOLCE and inherit its clear separation of objects and events. How about making an ontology for cultural heritage? What would it need? Events, situations??
The retrieval loop: question - answer - feedback. MH: What about not just giving a direct answer, but already taking possible feedback into account? Present results in context.
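One classic way to implement the feedback arrow of this loop is Rocchio query expansion (not named in the talk; the weights below are the textbook defaults):

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant docs and away from non-relevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)  # negative term weights are usually dropped

q0 = [1.0, 0.0, 0.0]                      # original query in term space
rel = [[0.8, 0.6, 0.0], [0.9, 0.5, 0.0]]  # docs the user marked relevant
nonrel = [[0.0, 0.1, 0.9]]                # docs marked non-relevant
print(rocchio(q0, rel, nonrel))           # query shifted toward the relevant docs
```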
Precision and recall were developed in 196x by Cleverdon. MH: P&R can only be measured on a fixed test set, with a mapping from predetermined queries to relevant documents.
I asked if there are techniques to measure precision and recall without a fixed dataset and queries -- SIGIR 2006, Voorhees (does stuff with WordNet).
Laura asked how to measure more or less relevant -- reference: Järvelin and Ingwersen. This I should really check.
Meta thoughts: in IR, structure is a posteriori (find the structure in the data) vs. a priori, where we start with a structure. MH: Of course, we should be in the middle: from an a priori structure, we can try to find complex, a posteriori structure in the data.