I came back last week from Japan, where I presented at Multimedia Modeling 2000 and also gave visiting talks at two labs. At Multimedia Modeling I presented our full paper "Inter-dimensional Hypermedia Communicative Devices for Rhetorical Structure", the product of Jim Davis's several-month visit. I also presented the SMIL tutorial and chaired a session. Most importantly, I met with the committee to discuss Multimedia Modeling 2001, which I am co-chairing with Dick Bulterman, and which will be held here at CWI on November 5, 6 and 7.

I also visited and gave the MMM2000 paper talk at the Lab. of AI & Knowl. Comp. at the University of Electro-Communications in Chofu, Tokyo. The reason was to meet with Alexandra Cristea, a postdoc there who will soon be moving to the Netherlands and has expressed interest in working with INS2 at CWI. And finally, on a last-minute invitation, I presented the SMIL tutorial at ATR just outside Kyoto, roughly the Japanese equivalent of CWI, and got the full tour. I also exchanged a lot of business cards and ate a lot of (surprisingly inexpensive) sushi.

ATR-MIC
-------

The division of ATR I visited was the ATR Media Integration & Communications Research Laboratories (http://www.mic.atr.co.jp/index.html), which has mostly MIT Media Lab-like projects. The director is Ryohei Nakatsu, who gave the keynote speech at the conference -- basically an overview of ATR MIC projects. Lynda and I attended a workshop on Interactive Stories with him and a colleague at Multimedia 98 in England, where his colleague presented the speech-recognition virtual reality interactive story "Romeo and Juliet in Hades" -- the sequel to Shakespeare's "Romeo and Juliet" we've waited 400 years for but, like most sequels, with better visual effects.

Two papers at Multimedia Modeling 2000 were by ATR MIC folk. One was "Transmitting Visual Information: Icons become words" by Bernard Champoux, about using "intuitive" icons as words for international communication. He integrated the demo with the presentation well, and it was a well-running and visually impressive demo. He is a graphic artist with a real skill for making good icons. The idea itself, however, didn't float for me. It had no investigation into linguistics, for example. And while the icons may work for vocabulary, there was little examination of how the icon setup could handle grammar and more complex sentences. But in his defense, the system was tested on linguistically disabled students and shown to help them communicate, so it has at least one helpful application.

The other paper was "Modeling Complex Systems for Interactive Art on the Internet" by Christa Sommerer. She is primarily a visual artist who applies computers to art to explore what computers can, and would need to, do for art. One project of hers turns text communication into graphics, so that a conversation becomes a single, expanding graphic image: basically a 3D growing plant-like entity. You can click on graphic components to get back that part of the conversation. Another is a shared work with artificial life forms that evolve and also "eat" the text characters of chat-box conversations, which lets users watch their conversation turn into an evolving ecosystem.

Perhaps the most "short-term applicable" of her projects is a wall-sized touch-sensitive screen, with speech-recognition microphones, that does image searches on the Web. You sit in front with no interface but your head-mounted microphone and your fingertips.
You say a word or phrase, and the system does a text-based search for images matching it. As the images are found, they appear from the side of the screen and float around on it. Eventually, many matching images swirl around the screen, around and on top of each other. If you see an image you like, you can put your fingertips around it and "hold it still".

The lab as a whole has projects along this vein. One was a "personal robot" project that explored having robots form emotional bonds with humans. It featured a meter-high robot with touch sensors on its head, shoulders, belly and "hands". It responded "affectionately" to touch, would complain if you covered its eyes, and would visually recognize humans and go to them, seeking "affection". This weekend I saw a toy belonging to some friends that performed most of these functions, but cost only f30.

The nicest demo was from the "Reality Enhancement" lab: a physical child's book read through computer-screen goggles. Each left page of the book had a large graphic icon that the computer camera would recognize and replace, in the goggles, with a mini VR world. You could move the book, or yourself, around to see the mini-world, which always stays on the left page, from all angles. Click a button, and you're *in* the world, moving around immersed in it. Click again, and you're back out, looking at the world from above. Flip the page and a new world pops up.

MMM 2000
--------

The meeting to discuss Multimedia Modeling 2001 was primarily about setting the dates and establishing the tone we wish to set. Setting the dates of Nov 5-7 was easy. Setting a tone, of course, is a much longer process. Several on the committee wanted more emphasis on the human, and modeling, aspects of multimedia. A preliminary CFP will go out in the next month, followed shortly by the full CFP setting the new, revived tone. Muriel Jourdan has agreed to be the Program Committee chair for MMM2001.

The tutorial went smoothly, though at 1-1/2 hours it was shorter than most. Given the conference, the questions were more theoretical than most. Patrick Sénac of ENSICA, France, asked a lot of network-oriented questions, such as how SMIL handles network communication and how SMIL compares with the packaging capabilities of MPEG-4.

The paper presentation went smoothly, though at the allotted 20 minutes it was also shorter than most. One question was about whether our setup was too complicated. The questioner compared our setup to a feature reputedly in a recent release of Microsoft PowerPoint, though Microsoft's was much simpler, and doubted whether our setup could be brought quickly to implementation and market acceptance. I said "Sure, give us 7 years." The longer response I gave was that this paper was an exploration of one of many possible directions for generating hypermedia presentations from some type of higher-level abstraction, that authoring for generation is a very large problem with many facets that will take a long time to develop, that this is research, not business, etc.

Another question was why we chose Mann and Thompson's RST over the questioner's particular pet rhetorical model, set up by a colleague of his. The response was that the particular rhetorical model used didn't matter, that they were for the most part interchangeable.
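For those who haven't seen RST: it models a discourse as a tree of relations, each joining a nucleus (the essential part) to satellites that support it. As a thumbnail sketch -- in an ad hoc notation I'm making up here, not the encoding from the paper, with invented file and region names -- an "elaboration" relation between an image and a text describing it:

    <rhet relation="elaboration">
      <nucleus>   <media src="nightwatch.jpg"/> </nucleus>
      <satellite> <media src="nightwatch.txt"/> </satellite>
    </rhet>

could be realized by showing the satellite alongside its nucleus, that is, as a SMIL par with the satellite in a caption region (a "sequence" relation would become a seq instead):

    <par>
      <img  src="nightwatch.jpg" region="main"/>
      <text src="nightwatch.txt" region="caption"/>
    </par>

The choice of presentation device conveying each rhetorical relation is what the paper calls a communicative device, and swapping RST for another rhetorical model mostly means swapping the relation names on the left-hand side of that mapping.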
The best paper IMHO was from our friends at INRIA Grenoble, on "A Proposal for a Video Modeling for Composing Multimedia Documents". It was about structuring video and (de)composing it, and then integrating these components in a presentation. Perhaps of interest to our MPEG-7(-ish) work. Our buddies from Brazil did another SMIL-ish paper called "Improving the Expressiveness of XML-based Hypermedia Authoring Languages". Another usual suspect, Patrick Sénac, presented a FrankC-oriented paper, "Time in Multimedia Transport Protocol".

Being in Japan, many papers focused on themes around VR and the recreation of human expressiveness within it, resulting in a lot of Monty Python-esque renderings of human facial communication. One project took videos and accompanying speech, recognized the speech, translated it to another language, resynchronized the translation and -- here's where Monty comes in -- made animations of the new lip movements corresponding with the old and superimposed them on the old lips. The lip animation was actually pretty convincing. The animation of the teeth exposed by the opening lips, however, was not in the scope of this paper, and their animation appeared as two square image plates that moved up and down at right angles -- very, very Monty. Perhaps tooth reanimation will be presented next year. But despite the visual silliness these facial recreation projects have historically had at MMM, this year showed a noticeable improvement in quality.

Overall, this conference, and the series as a whole, is rewarding for us because it presents the more "human", graphic and sometimes abstract side of multimedia, and many more authoring concerns, than other MM conferences such as ACM MM and IEEE MM, which tend to be more "plumber" and "quick to market".

Alexandra Cristea
-----------------

Under the pretext of giving a talk at her lab, I got to meet and know Alexandra Cristea, who is moving this spring to the Netherlands and has expressed interest in working in INS2. The talk itself went well, as did the demos (thanks FrankC and Joost). The audience was rather quiet, and at the end of the talk the lab head, Prof. Okamoto, gave a summary in Japanese. Alexandra confessed later that most of the audience understood little English and had no idea what I was saying -- they were just there to hear a native English speaker speak English. The more expressive members of the audience asked questions about the applicability of SMIL and generation to education, a major theme of the lab. I described SMIL's adaptivity constructs (see the P.S. below for a sketch) and discussed how user models in our generation process could account for level of education and topic familiarity.

Alexandra (http://www.ai.is.uec.ac.jp/u/alex/) has an AI background and is interested in how AI can be used for education with multimedia. She was raised in Romania, schooled in the German language, speaks and writes excellent English, and is quite capable in Japanese as well (to the extent that I could judge -- I have a summary of our research she wrote in Japanese, if anyone's interested). She's also quite personable and a hard worker. Her interest in tailored presentation for education, and in the use of AI for it, complements our research very well.

She's moving to the Netherlands regardless of employment, to be with her Dutch boyfriend. Since our initial communications a few months back, her prospects for a professor-track position in Eindhoven have become more promising -- which would be a better position for her than a time-limited postdoc at CWI. Her being at Eindhoven still puts her in a good position to work with us -- perhaps even better, since we don't have to pay her ;). I'll keep in touch with her and see what her plans are.

-Lloyd
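P.S. For anyone curious about the adaptivity constructs mentioned above: the heart of it is SMIL's switch element with system test attributes. A minimal sketch -- the element and attribute names are SMIL 1.0, but the file names and the scenario are made up:

    <smil>
      <body>
        <switch>
          <!-- chosen when the player's language preference is Japanese -->
          <par system-language="ja">
            <audio src="narration-ja.rm"/>
            <img src="slides.png" dur="30s"/>
          </par>
          <!-- no test attributes, so this always matches: the default -->
          <par>
            <audio src="narration-en.rm"/>
            <img src="slides.png" dur="30s"/>
          </par>
        </switch>
      </body>
    </smil>

The player goes through the children of the switch in order and plays the first one whose test attributes all match the user's settings. In our generation work, a user model covering things like education level and topic familiarity would drive the same kind of selection, but at generation time rather than play time.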