Author: Alia, Michiel
CWI participants: Alia, Michiel, Lynda
Number of participants: around 78 PhD students/researchers
Image restoration problem: lost information / image degradation. Causes can include: quantization, low resolution of the capturing device, motion during capture, cross-channel degradation, etc.
Applications: video printing, etc.
Types of degradation: motion, turbulence, etc.
Steps in restoration:
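The steps themselves weren't captured in these notes, but they all start from the standard linear degradation model (my addition from the restoration literature, not from the slides):

$$ g(x, y) = (h * f)(x, y) + n(x, y) $$

where f is the original image, h a degradation kernel (blur, motion, sensor response), * denotes convolution, and n additive noise; restoration is then the ill-posed inverse problem of recovering f from g.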
Motivation: HCI question on perceptual intelligence.
Application:
-avatars for learning,
- verification (yes/no: confirm a claimed identity) and identification (determine who the person is)
- tactile interface for disabled persons
Technology needed: face recognition (face shape and lip contour recognition)
- shape-based visual features
- post-processing by normalization; reason: visual and audio samples need to be synchronized (see the sketch below)
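The notes don't say how this normalization works; below is a minimal sketch of one common approach, resampling the visual features to the audio frame rate (the frame rates and function names are my assumptions, not from the talk):

```python
import numpy as np

# Assumed rates: lip-contour features at video rate (25 fps),
# audio features at 100 frames per second.
VIDEO_FPS = 25.0
AUDIO_FPS = 100.0

def upsample_visual(visual_feats: np.ndarray) -> np.ndarray:
    """Linearly interpolate visual feature vectors to the audio frame rate."""
    n_vid = visual_feats.shape[0]
    n_aud = int(n_vid * AUDIO_FPS / VIDEO_FPS)
    t_vid = np.arange(n_vid) / VIDEO_FPS  # timestamps of video frames
    t_aud = np.arange(n_aud) / AUDIO_FPS  # timestamps of audio frames
    # Interpolate each feature dimension independently.
    return np.stack(
        [np.interp(t_aud, t_vid, visual_feats[:, d])
         for d in range(visual_feats.shape[1])],
        axis=1,
    )

feats = np.random.rand(50, 10)        # 50 video frames, 10-dim lip features
print(upsample_visual(feats).shape)   # (200, 10): audio-rate frames
```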
Future of multimedia: content on demand
Trend 2: from uniform reception to personalized content creation and content consumption
- personalized content (MPEG-21)
Trend 3: from passive devices to ambient and smart environments and smart content
- smart environment and smart content
The vision of ambient intelligence
sensitive, adaptive, and responsive to consumers, and anticipatory of consumer desires
required:
- smart, personalized, and interconnected systems
the vision: the connected planet
Part 1. Media management with ambient DB.
vision:
- integrated media management (merge the in-home digital network (IHDN), BCN, and BAN)
- high-level P2P metadata management
current implementations:
Philips' proposed solution: a real-time and distributed content analysis system.
content management in consumer systems:
Ten years from now, around 100 TB of storage will be needed.
The problem: how do we find the content?
The answer: we need metadata.
To do this, content analysis is necessary: when there is no metadata, to enrich the provided metadata, to enable searching, to modify content to improve the viewing experience, and to enable new features.
enable "intelligent chaptering" of multimedia
Real-time and distributed content analysis system
Features of the distributed content analysis system:
- extensible, scalable, flexible, upgradable
- easy-to-use software architecture
- efficiency
- reliability
- connectivity (TCP/IP backbone; WiFi/QoS in the future)
- fusion and management of data sources
conclusion:
1. semantically powerful production metadata
-> currently lost due to unresolved business models
2. production follows rules: film grammar
-> prior knowledge of these rules can be applied in domain-specific solutions
3. metadata standardization
Semantics types, organized from the technical perspective to the human perspective:
- content semantics (general knowledge): extracted from image data, e.g. beach, sea, boy
- content semantics (private knowledge): captured during creation, e.g. Nico digging
- situational semantics: feedback and auto-classification, e.g. Turkey
- retrieval semantics: manual classification and management, e.g. cute
- social semantics: e.g. common vacation
core question in multimedia:
how to represent the ontology?
how to query the ontology?
which ontology to construct, and how?
how to populate the ontology?
An ontology is a formal specification of a shared conceptualization of a domain of interest. (Gruber 93)
A taxonomy is a segmentation, classification, and ordering of elements into a classification system according to the relationships between them.
The rest of the presentation is on RDF, OWL, etc.
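To make the represent/populate/query questions concrete, a minimal sketch with rdflib (the vocabulary is invented for illustration):

```python
from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.org/vacation#")
g = Graph()

# Represent: a tiny taxonomy -- Beach and Sea are kinds of Scene.
g.add((EX.Scene, RDF.type, RDFS.Class))
g.add((EX.Beach, RDFS.subClassOf, EX.Scene))
g.add((EX.Sea, RDFS.subClassOf, EX.Scene))

# Populate: one annotated photo.
g.add((EX.img001, RDF.type, EX.Beach))
g.add((EX.img001, RDFS.label, Literal("Nico digging")))

# Query: which resources depict some kind of Scene?
q = """
SELECT ?img ?label WHERE {
    ?img a ?cls .
    ?cls rdfs:subClassOf ex:Scene .
    ?img rdfs:label ?label .
}
"""
for row in g.query(q, initNs={"ex": EX, "rdfs": RDFS}):
    print(row.img, row.label)
```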
Desired situation: loose coupling; a standard vocabulary with predefined meaning; automatic ad hoc coupling and integration of data.
DOLCE is well suited for a high-quality multimedia ontology because it has an Ontology of Information Objects (OIO) and Descriptions and Situations (D&S).
Methodology for design pattern definition:
1. identification of the most important MPEG-7 functionalities:
- decomposition
- annotation
- general: describe digital data by digital data, at an arbitrary level of granularity
2. definition of design patterns for decomposition and annotation based on D&S and OIO (see the sketch below)
3. additional patterns are needed for: complex data types of MPEG-7, and semantic annotation using domain ontologies
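A rough sketch of what the decomposition + annotation patterns could look like as triples (property and class names are my invention, not the actual MPEG-7/DOLCE patterns):

```python
from rdflib import Graph, Namespace, Literal, RDF, XSD

EX = Namespace("http://example.org/mm#")
g = Graph()

# Decomposition: a video decomposed into a temporal segment ...
g.add((EX.video42, RDF.type, EX.Video))
g.add((EX.video42, EX.hasSegment, EX.seg1))
g.add((EX.seg1, EX.startSec, Literal(12.0, datatype=XSD.double)))
g.add((EX.seg1, EX.endSec, Literal(30.5, datatype=XSD.double)))

# ... annotation: the segment is linked to a concept from a domain ontology.
g.add((EX.seg1, EX.depicts, EX.GoalEvent))

print(g.serialize(format="turtle"))
```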
Multimedia Annotation
http://www.mindswap.org/2003/PhotoStuff/
Examples of search engines: Google, Teoma, WiseNut, AltaVista, AllTheWeb, Lycos
quite effective (at some things)
highly visible (mostly)
commercially successful (some of them so far)
What is IR? Retrieval of unstructured data.
Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.
TREC and CLEAR are competitions where retrieval algorithms are tested.
Experimental Methodology of IR
Cleverdon - Cranfield
Lancaster - Medlars
Keen - Cranfield/ Smart
Saracevic - SWRU
Salton - Smart
Sparck Jones - Ideal Test Collection
Stairs
TREC (international) / INEX (Europe)
Evaluation types in IR:
ABNO/OBNA ("all but not only" / "only but not all") (Fairthorne)
Precision, Recall -> trade off (Cleverdon)
Probabilistic versions (Swets)
Measure-theoretic (Bollmann)
precision = relevant retrieved documents / all retrieved documents
recall = relevant retrieved documents / all relevant documents (e.g. 6 relevant documents retrieved out of 20 relevant documents in total = 0.3)
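Spelling out the definitions behind that example (a minimal sketch; I assume 10 documents were retrieved, 6 of them relevant, with 20 relevant documents in the collection):

```python
def precision(relevant: set, retrieved: set) -> float:
    return len(relevant & retrieved) / len(retrieved)

def recall(relevant: set, retrieved: set) -> float:
    return len(relevant & retrieved) / len(relevant)

relevant = set(range(20))                          # 20 relevant docs exist
retrieved = set(range(6)) | {100, 101, 102, 103}   # 10 retrieved, 6 relevant

print(precision(relevant, retrieved))  # 6/10 = 0.6
print(recall(relevant, retrieved))     # 6/20 = 0.3
```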
Some things about IR that are different from other fields:
- A posteriori vs. a priori: IR research is based on the data as-is, in contrast with the Semantic Web, which starts from the structure.
- OWA vs. CWA: IR uses the open-world assumption.
- Information vs. knowledge: based on information rather than knowledge.
- Data-driven vs. theory-driven.
- Contingency vs. necessity.
- Ostensive vs. extensive.
Topic - Databases - IR
Data - structured - unstructured
Fields - clear semantics - no fields
Queries - defined (relational algebra, SQL) - free text (natural language)
Recoverability - critical (concurrency control, recovery, atomic operations) - downplayed, still an issue
Matching - exact - imprecise (need to measure effectiveness)
note to self/eculture: maybe semantic retrieval should learn from database retrieval research rather than from traditional IR?
Useful references:
Look up: "Postulates of Impotence", Swanson 1988:
- an information need cannot be expressed independent of context
- it is impossible to instruct a machine
What is striking to me from CvR's talk: the hypotheses that IR uses are the same as the presentation hypothesis:
association hypothesis
relation hypothesis
cluster hypothesis
deduction/induction
Representation of information
discrimination without representation (specificity)
representation with discrimination (exhaustivity)
Types of evaluation in IR:
1. system evaluation
2. user evaluation
3. operational evaluation
SWIRL workshop: Strategic Workshop on Information Retrieval in Lorne
http://www.cs.mu.oz.au/~alistair/swirl2004/
Part 1. Information Extraction
classic
adaptive
web-based
multimedia
merging redundant information - smushing
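A minimal sketch of smushing as I understand it: merge records that share a value for a uniquely identifying property (the key choice and data are illustrative):

```python
def smush(records):
    """Merge records that share a value for a uniquely identifying key."""
    by_email, merged = {}, []
    for rec in records:
        key = rec.get("email")
        if key is not None and key in by_email:
            # Fill gaps in the existing record with the new non-empty values.
            by_email[key].update({k: v for k, v in rec.items() if v is not None})
        else:
            rec = dict(rec)
            merged.append(rec)
            if key is not None:
                by_email[key] = rec
    return merged

people = [
    {"name": "Alice", "email": "alice@example.org", "phone": None},
    {"name": None, "email": "alice@example.org", "phone": "555-0100"},
]
print(smush(people))  # one record with name, email, and phone filled in
```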
Part 2. Ontology Learning: learning concept hierarchies, learning relations
http://www.smartweb-project.de/start_en.html
Ontologies provide:
taxonomic organization of concepts
relations between concepts (type and cardinality constraints)
instantiation relations
types of ontologies
top-level ontologies (general concepts such as time and event, independent of a particular domain)
domain ontology (describes the vocabulary related to a generic domain)
task ontology
application ontology
SWIntO: SmartWeb Integrated Ontology
sport-event-ontology
navigation ontology
multimedia ontology
discourse ontology
linguistic information
Integration of the above domains via DOLCE and SUMO
Information extraction: the task of filling certain given target knowledge structures on the basis of text analysis.
NLU is not the same as IE.
NLU:
- aims at understanding text
- deep NLP techniques
- requires knowledge representation
- very difficult task
- no system yet performs NLU to a reasonable extent
IE:
- aims only at extracting information to fill predefined templates (see the toy sketch below)
- shallow NLP
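A toy illustration of the IE side: a shallow surface pattern fills a predefined template (the pattern and template are made up; real MUC-style systems are far richer):

```python
import re

# Predefined template: who acquired whom.
ACQUIRE = re.compile(
    r"(?P<buyer>[A-Z]\w*(?: [A-Z]\w*)*) acquired "
    r"(?P<target>[A-Z]\w*(?: [A-Z]\w*)*)"
)

def extract_acquisitions(text):
    """Fill {buyer, target} templates using a shallow surface pattern."""
    return [m.groupdict() for m in ACQUIRE.finditer(text)]

text = "Philips acquired Example Systems for an undisclosed sum."
print(extract_acquisitions(text))
# [{'buyer': 'Philips', 'target': 'Example Systems'}]
```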
Classic IE
benchmarking campaigns: Message Understanding Conference (MUC)
sponsored by DARPA
Adaptive IE
pro: no handwritten rules; turn to machine learning
challenges: contextual ambiguity, annotating relations, scalability, accurate recognition
extracting relations on the web
input: multimedia resources and an ontology or template schema
output: a KB representing the information extracted from the various resources, linked together in a meaningful way
requires:
processing different media
duplicate detection and merging
detecting and handling inconsistencies
What do we need semantics for? In general, we want to make queries whose type of semantics is not fixed or known in advance; instead, we want them to cover different forms of semantics.
Semantic levels can be stacked in a pyramid; from top to bottom this may include, among others: social (manual annotation), retrieval (feedback), situational (capture), content (extracted from data). Higher in the pyramid, the human perspective matters more than the technical perspective.
At the moment, metadata is available in many data formats, for example Exif. We need a unified model that covers all forms of metadata of any semantic type. If we want to share knowledge, we need a common language. MH: I wonder if we should be talking about a language here. A language is used for communication. Isn't an ontology for the representation of concepts?
Taxonomy, thesaurus (adds specific relations: narrower, synonym), topic map (adds relations from concepts to documents). Ontologies now sit between the front-end (thesauri and information retrieval) and the back-end (first-order logic and reasoning).
Replace MPEG-7 with a high-quality multimedia ontology. Is MPEG-7 a good basis for this? Start with a well-designed foundational ontology, for example DOLCE.
MH: Can these ontologies be useful to describe domain-specific events, for example in cultural heritage, starting from the top-level events of this combination? D&S and OIO sit below DOLCE and inherit its clear separation of objects and events. How about making an ontology for cultural heritage? What would it need? Events, situations??
The retrieval loop: question - answer - feedback. MH: What about not just giving a direct answer, but already taking possible feedback into account? Present results in context.
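One classic way to implement the feedback arrow of this loop is Rocchio query expansion (not named in the talk; the weights below are the textbook defaults):

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant docs and away from non-relevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)  # negative term weights are usually dropped

q0 = [1.0, 0.0, 0.0]                      # original query in term space
rel = [[0.8, 0.6, 0.0], [0.9, 0.5, 0.0]]  # docs the user marked relevant
nonrel = [[0.0, 0.1, 0.9]]                # docs marked non-relevant
print(rocchio(q0, rel, nonrel))           # query shifted toward the relevant docs
```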
Precision and recall were developed in 196x by Cleverdon. MH: P&R can only be measured on a fixed test set, with a mapping from predetermined queries to relevant documents.
I asked if there are techniques to measure precision and recall without a fixed dataset and queries -- SIGIR 2006, Voorhees (does stuff with WordNet).
Laura asked how to measure more or less relevant -- reference: Järvelin and Ingwersen. This I should really check.
Meta thoughts: in IR, structure is a posteriori (find the structure in the data) vs. a priori, where we start with a structure. MH: Of course, we should be in the middle: from an a priori structure, we can try to find complex, a posteriori structure in the data.