Author: Raphael, Zeljko
CWI participants: Raphael, Zeljko
Number of participants: around 35
Overall, the reviewers were very impressed with the technical and administrative progress of the project.
WP1 issues: We should encourage longer-term exchanges and higher-quality exchanges (more productive in terms of publications, collaboration activities, etc.). Do not forget to report any exchange!
Rough estimate according to Craig: a budget of 120 000 euros for the K-Space exchanges, of which only 20 000 euros have been spent so far!
Would it be possible for Alia to initiate an exchange with DCU before June? Are we talking about a two-week exchange or a longer one?
Preliminary feedback from the review for WP3:
Spatio-temporal structuring: a focus on feature point tracking is highly recommended;
Content description: the reviewers ask for a focus beyond MPEG-7 descriptors, but the WP is
already doing that, since it focuses on algorithms that already go beyond MPEG-7. This points to a
misunderstanding of the term 'descriptor' and of WP3.4 'content description'.
Main objective: structuring based on multimodal low-level feature analysis. Initial software modules for multimedia content structuring. Segmenting TV streams. Multimodal (video + audio) drum transcription gives much better results than using only the visual or only the audio source.
Detecting Salient Events in Football Videos.
Salient feature detection, salient temporal modeling and pattern mining, highlight detection.
Video collection: Smartweb Data, World Cup 2002/2006, World Cup 2006 from Irish broadcasting.
Developing tools for salient feature extraction and for mining temporal patterns between salient and
content events.
Developed a baseline AST system. Processing of conference transcriptions, followed by language model
adaptation and increasing the vocabulary. Future plans include French.
Music video genre classification: preliminary SVM classification experiment by GET and DCU
Music instrument classification: Preliminary experiments, 60% accuracy on duets
Recognition of camera motion and motion-based video structuring:
Camera motion types
Moving object segmentation: spatio-temporal segmentation of moving objects in image sequences
D3.4 is due by Month 18. Present the MPEG-7 profiles. Investigate versions 2 and 3 of MPEG-7.
The TOC will be distributed soon.
New activity led by CWI: set up a (semantic) wiki for gathering practical experience with using
MPEG-7 (examples, descriptions, profiles, etc.) and listing relevant tools.
Only minor comments from the reviewers. COMM (Core Ontology of MultiMedia) is available at: http://multimedia.semanticweb.org/ontology/. Work is now under way on a Java API for COMM (COMMAPI). Some changes to the model were made based on the API implementation. New proposal for the handling of datatypes (simplifies MPEG-7, in line with DOLCE).
Action point for Paul: organize a meeting between DFKI, CWI, KU, ITI and UEP to harmonize all the approaches.
CWI will improve the communication of COMM. The idea is to give practical examples of how to use the multimedia ontology, and to reference the API, on http://multimedia.semanticweb.org/ontology/
CWI will investigate the use of Semantic Web technologies and extensions in the news domain (liaison with WP5.3). I have talked with Krishna (QMUL), and I have shown him the AFP images (from the World Cup 2006). Interestingly, we have identified everything that automatic analysis could detect: among others, the presence or absence of the ball (an action), the stadium or not, the nets (for the goal), player versus crowd versus spectators, the grass, the various flags, etc. I will send Krishna a subset of the photos we have (10%, about 600 photos) so that he can start some experiments.
Pressure to make all tools from WP5.3 compatible with K-Sems. There is no reason and there are no resources to have /facet on K-Sems, so no plan has been made for that.
Zeljko presented what he would like to do in Multimodal Interaction using results from other
WP partners. Are there any concrete plans?
Talk with Krishna (QMUL) about what to extract from the AFP images (see above).
Talk to Jana Urban (GU) about what they do with the BBC news programs from TrecVid. It seems we can get
a ground truth annotation from the TrecVid 2005 test set (news programs).
Decisions: general architecture, playing video within the tool
Problems with precise positioning in Web-based video players and with control over scripts (VLC, RealPlayer…);
all players tried so far have a problem with precision (QuickTime is also an option)
Ontology browser: Java solution or other, still not resolved; not a problem, more a matter of functional requirements
COMM API almost finished
Interfacing with analysis tools: existing prototype available
Platform independence – only if we do not play video!? The main issue: who will do the AJAX programming?
Identify potential problems in Web-based applications in the next few weeks
No decision reached on player and architecture
AJAX is definitely more desirable, but there is a lack of expertise, a lot of implementation work, and little manpower (e.g. most of the partners that are in favor of AJAX will not actually do the implementation)
Reviewers are happy to see we are carrying on in 2007. We can do feature detection again. We have to show interactive retrieval as year 2 of a 2-year plan. For year 3, we may do summarization of rushes.
36 tools have been promised (based on the questionnaires) but only 21 are referenced in the repository.
We need to know which ones are missing and understand why they have not been collected yet.
Furthermore, does this list still reflect what is going on? Are the lists of existing/planned tools
still valid?
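One way to find the missing tools is to compare the questionnaire list against the repository list. The sketch below is only an illustration: the tool names are hypothetical placeholders (the real lists are not part of these notes), and matching on lower-cased, trimmed names is an assumption that may need adjusting if partners named their tools differently in the questionnaires.

```python
def missing_tools(promised, referenced):
    """Return the promised tool names that do not appear in the repository.

    Names are compared case-insensitively and with surrounding whitespace
    ignored (an assumption about how the two lists were written down).
    """
    def normalize(name):
        return name.strip().lower()

    referenced_keys = {normalize(name) for name in referenced}
    return sorted(
        name.strip() for name in promised
        if normalize(name) not in referenced_keys
    )


# Hypothetical placeholder names, NOT the actual K-Space tools.
promised = ["Tool A", "Tool B", "tool c", "Tool D"]
referenced = ["tool a", "Tool C "]

for name in missing_tools(promised, referenced):
    print("promised but not in repository:", name)
```

With the real lists (36 promised, 21 referenced), the result could be sent to the partners concerned together with the question of whether each tool is still planned.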
CWI has provided the IWA corpus. Partners would be interested in doing feature analysis on this corpus.
CWI should identify a subset of last year's TrecVid features (39 features for TrecVid 2006) that
would be interesting to have. All K-Space partners will then run the analysis around May, as part of
the TrecVid feature detection task.
GET has agreed to do the speech transcription on the corpus.
The corpus will be composed of 400 hours of news magazines, science news and news reports from the Sound and Vision Institute (NL) + 200 additional hours of non-commercial news! Topics will express the need for video concerning people, things, events, locations, etc.
Planned participation: high-level feature extraction, search task, rushes summarization
It is clear from the review that there are many cross-WP activities going on. What are they? Who is
involved?
Proposal: based on review slides, DCU to generate master list of cross-WP activities, identify a
"champion" of each activity, list becomes a living document reviewed at each meeting.
The TOC of the third newsletter has been finalized. There will be a big headline about the Multimedia Semantics XG, reporting on its first year of activity.
Decision: The association will be registered in Germany, by DFKI. It will be a Verein e.V.
BIG fight on the first K-Space Book!
2nd book: Vassilis feels that chapters on the work in WP4 are missing ... I volunteered to start a discussion with Vassilis and Simon to identify what is missing!