I came back last week from Japan, where I presented at Multimedia Modeling 2000 and also gave visiting talks at two labs. At Multimedia Modeling I presented our full paper "Inter-dimensional Hypermedia Communicative Devices for Rhetorical Structure", the product of Jim Davis's several-month visit. I also presented the SMIL tutorial and chaired a session. Most importantly, I met with the committee to discuss Multimedia Modeling 2001, which I am co-chairing with Dick Bulterman, and which will be held here at CWI on November 5, 6 and 7.

I also visited and gave the MMM2000 paper talk at the Lab. of AI & Knowl. Comp. at the University of Electro-Communications in Chofu, Tokyo. The reason was to meet with Alexandra Cristea, a postdoc there who will soon be moving to the Netherlands and has expressed interest in working with INS2 at CWI. And finally, on a last-minute invitation, I presented the SMIL tutorial at ATR just outside Kyoto, roughly the Japanese equivalent of CWI, and got the full tour. I also exchanged a lot of business cards and ate a lot of (surprisingly inexpensive) sushi.

ATR-MIC
-------

The division of ATR I visited was the ATR Media Integration & Communications Research Laboratories (http://www.mic.atr.co.jp/index.html), which has mostly MIT Media Lab-like projects. The director is Ryohei Nakatsu, who gave the keynote speech at the conference -- basically an overview of ATR MIC projects. Lynda and I attended a workshop on Interactive Stories with him and a colleague at Multimedia 98 in England, where his colleague presented the speech-recognition virtual reality interactive story "Romeo and Juliet in Hades" -- the sequel to Shakespeare's "Romeo and Juliet" we've waited 400 years for but, like most sequels, with better visual effects.

Two papers at Multimedia Modeling 2000 were by ATR MIC folk. One was "Transmitting Visual Information: Icons become words" by Bernard Champoux, about using "intuitive" icons as words for international communication. He integrated the demo with the presentation well, and it was a well-running and visually impressive demo. He is a graphic artist with a real skill for making good icons. The idea itself, however, didn't float for me. It had no investigation into linguistics, for example. And while the icons may work for vocabulary, there was little examination of how the icon setup could handle grammar and more complex sentences. But in his defense, the system was tested on linguistically disabled students and shown to help them communicate, so it has at least one helpful application.

The other paper was "Modeling Complex Systems for Interactive Art on the Internet" by Christa Sommerer. She is primarily a visual artist who applies computers to art to explore what computers can, and would need to, do for art. One project of hers turns text communication into graphics, so that a conversation becomes a single, expanding graphic image: basically a 3D growing plant-like entity. You can click on graphic components to get back that part of the conversation. Another is a shared work with artificial life forms that evolve and also "eat" the text characters of chat-box conversations, which lets users watch their conversation turn into an evolving ecosystem.

Perhaps the most "short-term applicable" of her projects is a wall-sized touch-sensitive screen, with speech-recognition microphones, that does image searches on the Web. You sit in front with no interface but your head-mounted microphone and your fingertips.
You say a word or phrase, and the system does a text-based search for images matching it. As the images are found, they appear from the side of the screen and float around on it. Eventually, many matching images swirl around the screen, around and on top of each other. If you see an image you like, you can put your fingertips around it and "hold it still".

The lab as a whole has projects along this vein. One was a "personal robot" project that explored having robots form emotional bonds with humans. It featured a meter-high robot with touch sensors on its head, shoulders, belly and "hands". It responded "affectionately" to touch, would complain if you covered its eyes, and would visually recognize humans and go to them, seeking "affection". This weekend I saw a toy belonging to some friends that performed most of these functions, but cost only f30.

The nicest demo was from the "Reality Enhancement" lab: a physical child's book read through computer-screen goggles. Each left page of the book had a large graphic icon that the computer camera would recognize and replace, in the goggles, with a mini VR world. You could move the book, or yourself, around to see the mini-world, which always stays on the left page, from all angles. Click a button, and you're *in* the world, moving around immersed in it. Click again, and you're back out, looking at the world from above. Flip the page and a new world pops up.

MMM 2000
--------

The meeting to discuss Multimedia Modeling 2001 was primarily about setting the dates and establishing the tone we wish to set. Setting the dates of Nov 5-7 was easy. Setting a tone, of course, is a much longer process. Several on the committee wanted more emphasis on the human, and modeling, aspects of multimedia. A preliminary CFP will go out in the next month, followed shortly by the full CFP setting the new, revived tone. Muriel Jourdan has agreed to be the Program Committee chair for MMM2001.

The tutorial went smoothly, though at 1-1/2 hours it was shorter than most. Given the conference, the questions were more theoretical than most. Patrick Sénac of ENSICA, France, asked a lot of network-oriented questions, such as how SMIL handles network communication and how SMIL compares with the packaging capabilities of MPEG-4.

The paper presentation went smoothly, though at the allotted 20 minutes it was also shorter than most. One question was about whether our setup was too complicated. The questioner compared our setup to a feature reputedly in a recent release of Microsoft PowerPoint, though Microsoft's was much simpler, and doubted whether our setup could be brought quickly to implementation and market acceptance. I said "Sure, give us 7 years." The longer response I gave was that this paper was an exploration of one of many possible directions for generating hypermedia presentations from some type of higher-level abstraction, that authoring for generation is a very large problem with many facets that will take a long time to develop, that this is research, not business, etc.

Another question was why we chose Mann and Thompson's RST over the questioner's particular pet rhetorical model, set up by a colleague of his. The response was that the particular rhetorical model used didn't matter, that they were for the most part interchangeable.
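For those who haven't seen RST: it models a discourse as a tree of relations, each joining a nucleus (the essential part) to satellites that support it. As a thumbnail sketch -- in an ad hoc notation I'm making up here, not the encoding from the paper, with invented file and region names -- an "elaboration" relation between an image and a text describing it:

    <rhet relation="elaboration">
      <nucleus>   <media src="nightwatch.jpg"/> </nucleus>
      <satellite> <media src="nightwatch.txt"/> </satellite>
    </rhet>

could be realized by showing the satellite alongside its nucleus, that is, as a SMIL par with the satellite in a caption region (a "sequence" relation would become a seq instead):

    <par>
      <img  src="nightwatch.jpg" region="main"/>
      <text src="nightwatch.txt" region="caption"/>
    </par>

The choice of presentation device conveying each rhetorical relation is what the paper calls a communicative device, and swapping RST for another rhetorical model mostly means swapping the relation names on the left-hand side of that mapping.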
The best paper IMHO was from our friends at INRIA Grenoble, on "A Proposal for a Video Modeling for Composing Multimedia Documents". It was about structuring video and (de)composing it, and then integrating these components in a presentation. Perhaps of interest to our MPEG-7(-ish) work. Our buddies from Brazil did another SMIL-ish paper called "Improving the Expressiveness of XML-based Hypermedia Authoring Languages". Another usual suspect, Patrick Sénac, presented a FrankC-oriented paper, "Time in Multimedia Transport Protocol".

Being in Japan, many papers focused on themes around VR and the recreation of human expressiveness within it, resulting in a lot of Monty Python-esque renderings of human facial communication. One project took videos and accompanying speech, recognized the speech, translated it to another language, resynchronized the translation and -- here's where Monty comes in -- made animations of the new lip movements corresponding with the old and superimposed them on the old lips. The lip animation was actually pretty convincing. The animation of the teeth exposed by the opening lips, however, was not in the scope of this paper, and their animation appeared as two square image plates that moved up and down at right angles -- very, very Monty. Perhaps tooth reanimation will be presented next year. But despite the visual silliness these facial recreation projects have historically had at MMM, this year showed a noticeable improvement in quality.

Overall, this conference, and the series as a whole, is rewarding for us because it presents the more "human", graphic and sometimes abstract side of multimedia, and many more authoring concerns, than other MM conferences such as ACM MM and IEEE MM, which tend to be more "plumber" and "quick to market".

Alexandra Cristea
-----------------

Under the pretext of giving a talk at her lab, I got to meet and know Alexandra Cristea, who is moving this spring to the Netherlands and has expressed interest in working in INS2. The talk itself went well, as did the demos (thanks FrankC and Joost). The audience was rather quiet, and at the end of the talk the lab head, Prof. Okamoto, gave a summary in Japanese. Alexandra confessed later that most of the audience understood little English and had no idea what I was saying -- they were just there to hear a native English speaker speak English. The more expressive members of the audience asked questions about the applicability of SMIL and generation to education, a major theme of the lab. I described SMIL's adaptivity constructs (see the P.S. below for a sketch) and discussed how user models in our generation process could account for level of education and topic familiarity.

Alexandra (http://www.ai.is.uec.ac.jp/u/alex/) has an AI background and is interested in how AI can be used for education with multimedia. She was raised in Romania, schooled in the German language, speaks and writes excellent English, and is quite capable in Japanese as well (to the extent that I could judge -- I have a summary of our research she wrote in Japanese, if anyone's interested). She's also quite personable and a hard worker. Her interest in tailored presentation for education, and in the use of AI for it, complements our research very well.

She's moving to the Netherlands regardless of employment, to be with her Dutch boyfriend. Since our initial communications a few months back, her prospects for a professor-track position in Eindhoven have become more promising -- which would be a better position for her than a time-limited postdoc at CWI. Her being at Eindhoven still puts her in a good position to work with us -- perhaps even better, since we don't have to pay her ;). I'll keep in touch with her and see what her plans are.

-Lloyd
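P.S. For anyone curious about the adaptivity constructs mentioned above: the heart of it is SMIL's switch element with system test attributes. A minimal sketch -- the element and attribute names are SMIL 1.0, but the file names and the scenario are made up:

    <smil>
      <body>
        <switch>
          <!-- chosen when the player's language preference is Japanese -->
          <par system-language="ja">
            <audio src="narration-ja.rm"/>
            <img src="slides.png" dur="30s"/>
          </par>
          <!-- no test attributes, so this always matches: the default -->
          <par>
            <audio src="narration-en.rm"/>
            <img src="slides.png" dur="30s"/>
          </par>
        </switch>
      </body>
    </smil>

The player goes through the children of the switch in order and plays the first one whose test attributes all match the user's settings. In our generation work, a user model covering things like education level and topic familiarity would drive the same kind of selection, but at generation time rather than play time.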