3rd European Semantic Web Conference (ESWC 2006)
Authors: Raphael, Véronique, Lora
CWI participants: Raphael, Lora
# participants: around 300
A nice conference, with more attendees than the previous year (around 300 this time). What I would call a "Core European Semantic Community" was there, and I personally feel at home with them :-) 181 papers were submitted, for a final 26% acceptance rate.
All the workshop and poster/demo proceedings are available electronically.
Workshop URL:
http://gate.ac.uk/conferences/eswc2006/multimedia-tutorial/
Participants: #20 (CWI: Raphael, Lora)
Personal opinion: Overall disappointing!
The signal analysis guy just enumerated everything they can extract... but it is so
far away from what users expect! The SW guy talked only about text! Michael gave a nice talk, where
we could begin to see some integration between multimedia and SW technologies. But his talk was too
short. He will give us an extended talk during his visit to CWI on July 13th-14th.
Globally, once again, we see perfectly how difficult it is to find people who have a vision and original
ideas about how to combine user needs, computer analysis results, and SW technologies! The gap
is not only semantic! It is also between the communities: they don't talk to each other.
Stefan Rüger,
Imperial College, London, UK.
Multimedia information retrieval background. He is on the Scientific Board of K-Space.
Content-based search: query by example. Goals: querying with a sound, image, keywords, to retrieve
text, video, images, sounds, etc.
Automatic analysis makes it possible to automatically annotate images with a bag of words. But there is no
interpretation: the Semantic Gap! Ontologies can define these terms and turn them into semantically defined concepts.
He didn't say a word about how that can be done!
Related projects for bridging the Semantic Gap (K-Space, MMKM,
Mesh)
Polysemy: an image = 1000 words! Use approximation and some relevance feedback to narrow the
interpretation. Towards a recommendation system, very much like what Lloyd is doing in CHIP.
Quick state of the art on feature analysis and the RGB colour model: what we can extract from images
(colour histograms, structure maps, texture descriptors, text index, localisation features, shape
analysis, etc.).
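To make the feature list above concrete, here is a minimal sketch of the simplest descriptor mentioned, a coarse colour histogram. The 2-bits-per-channel quantisation and the toy pixel data are my own illustrative assumptions, not anything from the talk.

```python
# Minimal sketch of one low-level feature from the list above: a coarse,
# normalised RGB colour histogram (2 bits per channel = 64 bins).

def colour_histogram(pixels, bits=2):
    """Quantise each 8-bit RGB channel to `bits` bits and count bin occupancy."""
    bins = 1 << bits                     # bins per channel
    shift = 8 - bits
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r >> shift) * bins * bins + (g >> shift) * bins + (b >> shift)
        hist[idx] += 1
    total = len(pixels)
    return [count / total for count in hist]

# Toy "image": three reddish pixels and one bluish pixel
pixels = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (10, 10, 250)]
hist = colour_histogram(pixels)
print(sum(hist))  # 1.0 (normalised)
```

Real systems compute such histograms per image region and compare them with distance measures to answer query-by-example; the semantic interpretation of the bins is exactly what is missing.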
Summary: many different low-level features can be computed, but to what end?
Valentin Tablan,
University of Sheffield, UK.
On the Semantic Web side, but only text-oriented :-( He talked only about the
SEKT project.
Semantic annotation, why? It enables SW technologies to aggregate search results, summarize, do inference...
and retrieve better through conceptual search!
Extracting Semantics from Text: based on Information Extraction (named entity recognition, relation
extraction, event detection, etc.). Not only a bag of words: it looks for structure in the text.
HLT tools could be used for segmenting multimedia content (splitting video into segments). Use of the ASR
transcript: not precise. Correct the ASR output with the Web (good-quality entities)!
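As a toy illustration of the first IE step mentioned above, named entity recognition, here is a deliberately naive capitalisation-based recogniser. Real HLT tools such as those from Sheffield use gazetteers and grammars, so this regex is only an assumed stand-in for the idea.

```python
import re

# Crude named-entity candidate finder: maximal runs of capitalised words.
# Real NER also classifies entities (person / location / organization).

def naive_entities(text):
    """Return maximal runs of words starting with a capital letter."""
    return re.findall(r"[A-Z][A-Za-z]+(?:\s+[A-Z][A-Za-z]+)*", text)

ents = naive_entities("Valentin Tablan works in Sheffield on the SEKT project.")
print(ents)  # ['Valentin Tablan', 'Sheffield', 'SEKT']
```

Note that sentence-initial words are false positives here; this is exactly the kind of ambiguity that gazetteer- and grammar-based tools resolve.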
Michael Hausenblas,
Joanneum Research, Graz, Austria.
The EU NM2 project:
creation of technologies for non-linear, interactive, narrative-based movie production. Demo on the web
site.
Programme:
http://www.kbs.uni-hannover.de/~henze/swp06/swp06programm.html
Proceedings:
http://www.kbs.uni-hannover.de/~henze/swp06/swp.pdf
Lora attended this workshop, so I went to the Industry Forum.
Programme:
http://www.eswc2006.org/industry_programme.html
Participants: #50
Keynote speaker, from Microsoft Cambridge.
Driving force for digitization:
Tools for health care: the SNOMED resource. SNOMED-CT can be represented in OWL. The DL reasoner and
classification procedure are useful to people who use SNOMED. The talk showed that they need the
expressive power of OWL in the health care domain!
Challenges: SW tools have difficulty with the scale of SNOMED-CT (around 400,000 classes!). They
are willing to adopt RDF only if the tools prove to be a mature technology, scalable to their problem;
otherwise, they will go for the traditional relational data model.
Creation of a drilling ontology for oil exploration. Ontology-enabled active
search (normal keyword search + query expansion tactics using concept specialization and
generalization, or the relations). OWL Full is used since they do not need the reasoning part. They
just want a common, shared vocabulary usable for annotating documents.
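The query expansion tactics mentioned above can be sketched in a few lines: take the query terms and add their direct specializations and/or generalizations from the vocabulary. The mini drilling vocabulary below is invented for illustration; the project's real ontology is of course different and much larger.

```python
# Toy broader/narrower vocabulary (invented); in the real system these
# relations come from the OWL Full drilling ontology.
NARROWER = {
    "drilling": ["offshore drilling", "directional drilling"],
    "equipment": ["drill bit", "derrick"],
}
BROADER = {
    "offshore drilling": ["drilling"],
    "directional drilling": ["drilling"],
    "drill bit": ["equipment"],
    "derrick": ["equipment"],
}

def expand(terms, specialize=True, generalize=True):
    """Return the query terms plus their direct specializations/generalizations."""
    expanded = set(terms)
    for t in terms:
        if specialize:
            expanded.update(NARROWER.get(t, []))
        if generalize:
            expanded.update(BROADER.get(t, []))
    return expanded

result = sorted(expand(["drilling"]))
print(result)  # ['directional drilling', 'drilling', 'offshore drilling']
```

The same mechanism works with relations other than subsumption, which is the other tactic the talk listed.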
What do they need? Semi-automated annotation using domain ontologies, and visualization of annotated
knowledge. It seems that some collaborative work involving oil companies, suppliers, research and
supporting industries is being set up (Semantic Web Collaboration project).
DoCoMo: the biggest mobile operator in Japan, 3rd in the world.
The voice and data markets are saturated. Mobile operators need to offer new services to users
(multimedia content; ubiquitous / contextual / personalized services). The machine should take care
of interoperability issues! The Semantic Web is a key technology for them.
Innovative applications: ContextWatcher. There is a context ontology in OWL DL.
Example scenario: a very funny video, where we see a big line-up to buy transport tickets in front
of an automatic machine. A guy has just a 50-euro note and the machine simply refuses it. A
smart guy arrives with his phone, holds it (camera embedded) in front of the metro map at his location,
clicks, holds the phone over the destination, clicks, and the payment is done; the ticket comes out of
the machine or is written to the phone! This is what they want to do...
Conclusion: contextual intelligence, strong ties with the SW.
The time for discovery and development of a new molecular entity (NME) has gone from approximately 4 years in the 1960s to 14 years now!
Knowledge Web talk ...
Use of Semantic Web technologies at AFP.
A news agency is a content provider: general news or specialized news, national perspective or
international coverage, different media, etc. AFP is one of the 3 worldwide news agencies. It works in
different languages (French, English, German, Spanish, Portuguese, and Arabic), daily, with 1200
journalists, 200 photo-reporters, and 2000 stringers in 165 countries, and provides text, photos and videos.
Facts: we are all under a huge flow of information; it is more difficult for professionals, and a jungle for
the layman (archives = 7M news stories per managed language).
IPTC: founded in 1965. Standards are: NITF, IIM, NewsML, SportsML, NewsCodes
Link IPTC / W3C? Is it us?
Use of the latest IPTC standards for our multimedia content, but: THIS IS NOT ENOUGH! We need new tools to
automate this enrichment in a new environment by implementing automatic taxonomies, ontologies and
pre-indexing.
Challenge: a news agency's business has changed; today it is about adding value to content and
delivering personalized and contextualized content to customers.
Objectives: search (active mode), alerts (passive mode), browse (interactive mode)!
Cluster text-photo for a specific event, improve access to archives, etc.
Current problems: full-text search = too noisy; sorting/rating = inefficient; multi-criteria search = not
mastered; keyword-based. They would like Semantic Navigation! E.g.: how to get the story and the
associated images/illustrations/videos for a particular event? Relying on the caption is not
enough!
They have developed a small ontology relating the 5 top concepts: person, location, organization,
event, work.
Conclusion: the traditional "push content" model from the news provider to the customers is replaced by
interactive push/pull services: customers select and retrieve what they need. These features have to be
implemented across their different media, but the requirements of multimedia news make things difficult!
Semantics is a maze, we need HELP! Don't forget, journalists have to work to produce these news
services, AND the key will be in the links between NewsItems of the same nature or of different nature,
leading to multimedia content management as early as possible in the production chain.
Question: what tools do you use? Internally developed tools that allow journalists to manually annotate the news stories.
Mondeca presentation. Mention of their 3 PhD theses (and Laurence too :-)
Presentations of the Wolters Kluwer use case scenario.
My abstract is really ambitious, don't read it!
Which Semantic Web?
See the conclusion here.
Goal: build a bridge between lightweight classification system and ontologies. 4 steps process:
1/ disambiguating labels (convert natural language labels to propositional DL labels, use WordNet);
2/ disambiguating edges;
3/ understanding classification alternatives;
4/ making classification choices.
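Step 1 above (converting a natural-language label into a propositional DL label using WordNet) can be rendered as a toy: each word contributes the disjunction of its senses, and the words are conjoined. The mini sense inventory below is invented for illustration; the authors query the real WordNet.

```python
# Invented mini sense inventory standing in for WordNet; "java" is the
# classic ambiguous label.
SENSES = {
    "java": ["java#island", "java#coffee", "java#language"],
    "programming": ["programming#activity"],
}

def label_to_dl(label):
    """Conjoin the words of a label; each ambiguous word becomes a disjunction of senses."""
    parts = []
    for word in label.lower().split():
        senses = SENSES.get(word, [word])   # unknown words pass through as-is
        parts.append("(" + " OR ".join(senses) + ")")
    return " AND ".join(parts)

dl = label_to_dl("Java programming")
print(dl)
```

Steps 2-4 then use the edge context and classification choices to discard the senses that make the label inconsistent with its parents.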
Motivation: many thesauri exist. SKOS: a standard RDF Schema vocabulary.
Research question: can a step-wise method be developed to assist conversion to SKOS?
All results, conversion tools available at:
http://thesauri.cs.vu.nl/eswc06/
Method steps: 1/ thesaurus analysis; 2/ mapping to SKOS (a table); 3/ conversion program
(developing the code). This is more or less what George has done for the NewsML IPTC codes.
Use case: MeSH, 23,000 descriptors, polyhierarchy.
Lessons learned: some things are easy to convert; others are not expressible in SKOS (qualifiers, compound
concepts, etc.). They provided feedback to SKOS for adding support for compounds and for considering
skos:Term (multilingual; terms should have an identity).
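The conversion-program step (step 3) can be sketched as a record-to-triples mapping. The flat record format and the sample descriptor below are my own assumptions; real MeSH records carry many more fields, which is precisely where the "not expressible in SKOS" cases appear.

```python
# SKOS Core namespace as standardized at the time of the workshop.
SKOS = "http://www.w3.org/2004/02/skos/core#"

def to_skos(record, base="http://example.org/mesh/"):
    """Turn one flat thesaurus record into (subject, predicate, object) triples."""
    uri = base + record["id"]
    triples = [(uri, SKOS + "prefLabel", record["label"])]
    for alt in record.get("synonyms", []):
        triples.append((uri, SKOS + "altLabel", alt))
    for broader in record.get("broader", []):
        triples.append((uri, SKOS + "broader", base + broader))
    return triples

# Invented sample record for illustration only
record = {"id": "D001921", "label": "Brain",
          "synonyms": ["Encephalon"], "broader": ["D002490"]}
triples = to_skos(record)
for t in triples:
    print(t)
```

A polyhierarchy falls out naturally here: a record with several broader identifiers simply yields several skos:broader triples.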
Scientific problem: ontologies often need to be built in a decentralized way; ontologies must be given
to a community in a way such that individuals have partial autonomy over them (need for local control
and local updates); and ontologies have a life cycle that involves iterating back and forth
between construction/modification and use. While recently there have been some initial proposals to
consider these issues, they lack the rigor of mature approaches, i.e. these recent
proposals lack the depth of methodological description that would make the methodology
usable, and they lack a proof of concept in a long-lived case study.
Proposal: distributed ontology modelling (see the former publication by Pinto et al.); they tested
their methodology in a real-life application in the travel domain.
Use case: an ontology to categorize documents, for sharing and retrieving knowledge.
Social Network Analysis (SNA) emerged in the late 1970s. Presentation of the benefits of SNA for
analyzing ontologies (Semantic Network Analysis, SemNA).
Eigensystem analysis for structural analysis of a graph.
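The eigensystem analysis mentioned above can be illustrated with plain power iteration computing eigenvector centrality on a small graph. The toy concept graph below is invented for illustration; SemNA applies this kind of analysis to the graph structure of real ontologies.

```python
# Power iteration: repeatedly multiply a score vector by the adjacency
# structure and renormalise; it converges to the principal eigenvector
# for a connected, non-bipartite graph.

def eigenvector_centrality(adj, iterations=100):
    """Principal-eigenvector scores for an undirected graph given as an adjacency dict."""
    score = {n: 1.0 for n in adj}
    for _ in range(iterations):
        new = {n: 0.0 for n in adj}
        for n, neighbours in adj.items():
            for m in neighbours:          # each node passes its score to its neighbours
                new[m] += score[n]
        norm = sum(v * v for v in new.values()) ** 0.5 or 1.0
        score = {n: v / norm for n, v in new.items()}
    return score

# Toy concept graph: a triangle plus one pendant node (invented example)
adj = {
    "Person": ["Event", "Location", "Work"],
    "Event": ["Person", "Location"],
    "Location": ["Person", "Event"],
    "Work": ["Person"],
}
c = eigenvector_centrality(adj)
```

As expected, the hub concept ("Person") scores highest and the pendant ("Work") lowest; in SemNA such scores point at the structurally central concepts of an ontology.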
Presentation of the TANGRAM project: http://iis.fon.bg.ac.yu/TANGRAM/home.html a learning environment for the domain of Intelligent Information Systems, and the TANGRAM application: http://ariadne.fon.bg.ac.yu/TANGRAM/app/
Two relationships between the SW and personalization: personalization technologies to enhance the usability
of SW applications; SW technologies to enhance user-adaptive applications.
E-learning domain.
Different levels of semantics: what is semantic, how to represent it, the interrelations between different semantic levels/units, where to get the semantics from?
The basic idea is that different users have different needs and viewpoints on one domain, an
application-dependent viewpoint. There are some big ontologies that gather different viewpoints; his idea is to
use the logical definitions of concepts in the "big" ontology to create local (and smaller) consistent
ontologies. Multi-viewpoint reasoning, with automatic selection of the features of the definitions
used to create the sub-ontology (features that are common to different concept definitions).
A new logic notation is needed.
Problem: a global ontology has to exist to generate the different viewpoints, as the method creates
the new subsumption relations from a global ontology gathering different viewpoints. Check
the evaluation and the use cases in the paper.
Linked to OntoRank
Conditional preferences to express statements, in a conditional preference base. Grading the statements:
yuppies express a 70% preference for red cars (not a yes/no statement).
Goal: visual tools for extending/repairing ontologies. An Information Visualisation tool to visually
clean an ontology.
Method: spatial reasoning to clean ontologies.
They make a visual interpretation of the DL definitions of concepts, and do the cleaning using
mereotopological axioms: a visual representation of the KB instances, checking consistency against
mereotopological axioms (?? the presentation was not very clear...).
Method to represent an ontology graphically, including cases where the membership of an instance in a
class is uncertain.
Background of the speaker: PhD in Dublin, automatic annotation of Web Services. Uses two sets of features for mapping the concepts of 2 ontologies:
DRAGO tool: distributed DL as a framework, distributed reasoning architecture for the Semantic Web
(see former paper).
Mapping concepts is often taken into account, but concepts are not the only "things" in ontologies
(more and more people are interested in relations!!!). "Wedding" can be a concept or a relation -> it has to
be mapped! -> extend ontology-mapping languages to map ELEMENTS of ontologies, not only concepts.
Presented different use cases of KR that lead to heterogeneous mappings of ontology elements (a triple to a
concept, a concept to a triple).
Defined the constructs "more general than" / "more specific than" (equivalence is expressed by both
relations together), applied to ontology elements. They modeled rules for concept-to-concept, role-to-role,
concept-to-role and role-to-concept mappings. Some heterogeneous mappings are not taken into account
in DDL. The different ways of mapping heterogeneous elements are covered in the paper.
More and more structured data on the Web: Swoogle indexes 1.5M SW documents. There is both a need and new
possibilities to map different sources of information.
The oMAP ontology alignment tool combines different classifiers, including one based on the OWL definitions
of entities.
Inspiration: the GLUE system, which combines several specialized components for finding the best set of mappings.
Terminological classifier: based on distance metrics computed from WordNet (using WN synonyms).
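A terminological classifier of the WordNet-synonym flavour described above can be sketched as synonym-set overlap between two concept labels. The tiny synonym table below is an invented stand-in for WordNet, and Jaccard overlap is one simple choice of distance metric among those such tools combine.

```python
# Invented mini synonym table standing in for WordNet synsets.
SYNONYMS = {
    "car": {"car", "auto", "automobile", "machine"},
    "automobile": {"car", "auto", "automobile", "machine"},
    "person": {"person", "individual", "someone"},
}

def term_similarity(a, b):
    """Jaccard overlap of the synonym sets of two concept labels."""
    sa = SYNONYMS.get(a.lower(), {a.lower()})
    sb = SYNONYMS.get(b.lower(), {b.lower()})
    return len(sa & sb) / len(sa | sb)

print(term_similarity("Car", "Automobile"))  # 1.0 (shared synset)
print(term_similarity("Car", "Person"))      # 0.0 (disjoint synsets)
```

In an alignment tool this score is only one signal; the other classifiers (e.g. the OWL-definition one) are combined with it to rank candidate mappings.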
... the rest was too fast for Véronique :-)