Blue note 3: On DOMs and SYMM

Jacco van Ossenbruggen

I wrote this blue note because I'm not sure what to think about the DOM stuff and want to start a discussion within the group. The key problem I have with the current DOM terminology is that it assumes one single interface to

  1. the information in the document,
  2. the data structures in the player, and
  3. the functionality of the player.

I did not see any convincing arguments why this should be realized into one API. I'm no DOM expert, so please correct me if I'm wrong.

If I read the DOM Level 1 Core stuff, I get the impression this was originally designed as a read-only API to the document itself (point 1 above), as parsed by an XML parser, and that the API was extended to also allow modification of the document tree (because this is what the DOM "level 0" was used for in the 3.0 generation HTML browsers). By making the DOM writable, the clear distinction between 1) and 2) disappears. I also think 2) implicitly includes 1). I would prefer to keep these two things separate, but it seems that I am one of only a few.

If I read the Level 1 HTML DOM stuff, I see an API which was clearly intended to standardize the level 0 habits of IE 3.0 and NS 3.0, and it looks more like something which was designed to be writable from the beginning. This is clearly an API to the document as it is currently displayed in the browser (point 2 above).

Finally, if I read the DOM Level 2 stuff and the SYMM charter, then we are talking about DOMs that combine 1), 2) and 3). I have the feeling that giving scripts the opportunity to access and manipulate all this information in a multimedia environment (with multiple threads, active media objects, timers, etc.) is asking for trouble.


Response from Jack:

I sort of agree with Jacco. Stuffing the API to the document being played and the API to the player showing it together is ugly for static (HTML) documents, but for timed documents like SMIL it is not only ugly but possibly unworkable.

We should raise this point when the WG gets around to the SMIL DOM. A simple solution would be to disallow modifying a document while it is playing. It would be more elegant to remove the player-API from the DOM, but I think that current HTML practice makes that impossible.
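The "simple solution" could look roughly like the sketch below. It is only an illustration under our own assumptions: the GuardedDocument wrapper, its method names and the PlayingDocumentError exception are hypothetical and not part of any DOM draft. The tree itself is handled by Python's xml.dom.minidom.

```python
from xml.dom import minidom


class PlayingDocumentError(RuntimeError):
    """Raised when a script tries to modify a document during play-out."""


class GuardedDocument:
    """Hypothetical wrapper: the document is read-only while it is playing."""

    def __init__(self, xml_text):
        self._doc = minidom.parseString(xml_text)
        self.playing = False  # player state, kept outside the document tree

    def get_attribute(self, tag, name):
        # Reading is always allowed.
        return self._doc.getElementsByTagName(tag)[0].getAttribute(name)

    def set_attribute(self, tag, name, value):
        # Writing is refused during play-out.
        if self.playing:
            raise PlayingDocumentError("document is playing; modification refused")
        self._doc.getElementsByTagName(tag)[0].setAttribute(name, value)


doc = GuardedDocument('<smil><body><video src="a.mpg" dur="10s"/></body></smil>')
doc.set_attribute("video", "dur", "12s")  # allowed: the document is not playing
doc.playing = True
try:
    doc.set_attribute("video", "dur", "15s")  # refused: the document is playing
    refused = False
except PlayingDocumentError:
    refused = True
```

Note that the playing flag lives on the wrapper and not in the tree, so the player state does not leak into the document.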


Note from SYMM teleconference: DOM provides an API to the player not only for scripts, but also for external applications.


Response from Frank:

At AIdministrator we work with DOM extensively (both the XML core and the
HTML extension). I'd be interested in discussing this with you.
You asked for comments on your note, so I venture an uninvited reply.

You wrote:

> the current DOM terminology [...] assumes one single interface to
> 
>   1. the information in the document,
>   2. the data structures in the player, and
>   3. the functionality of the player.
> 
> I did not see any convincing arguments why this should be realized into one
> API.

I agree (of course) with you that DOM is meant for (1). (And, as you state
correctly, both read and write access).

As it turns out, browsers have also decided to adopt DOM for their internal
document representation. This is presumably what you mean by (2)? But in fact,
(2) is just an example instance/application of (1): browsers/players using
DOM to represent (1). You seem to suggest that (1) and (2) are different
things. I disagree: (2) is just an example of how to use (1). Notice also
that browsers/players only use DOM to represent the >*document*< (since
that's all that DOM is good for). All kinds of other browser/player
data-structures are not represented via DOM (nor could they be).

Now concerning (3): Lynda showed me some of the comments on the plans for
SMIL-DOM. Some of these make good sense, such as developing a SMIL specific
DOM (which allows access to the document in terms that make sense in SMIL,
and not just in terms of the XML that is used to represent SMIL). But some
other things really struck me as a BAD IDEA, and presumably this is what you
mean under (3). As an example, the charter mentions using DOM to represent
whether a document is currently playing/finished/paused in the browser (and
if this is both read and write, to also affect pausing the playing of a
document). It seems strange to me to use DOM for this (and I guess we agree
on this). The playing status of a document is not a property of the document,
but a property of the player. And since DOM is just what it means, such a
property should not be represented in DOM.
The analogy that I can think of with HTML is the status of whether to
display images or not (a mode-switch in most browsers). This is not a
property stored with a document, nor should it be, so it should not (and
indeed is not) part of HTML-DOM.
If people are really suggesting to abuse DOM to represent player-status of a
document they are in my understanding abusing/corrupting the idea behind DOM. 

Summary:
- On (1) we agree (that was the easy one),
- on (2) we agree that this is done, but we disagree if you suggest that this
  is different from (1): it is just an example of (1), so nothing to complain
  about.
- on (3) we agree that this would be a very bad idea. 
  
Perhaps this helps,
perhaps this is all open doors to you.

Anyway, feel free to do with all this what you want:
add it to your BB-note, distribute in your group or the W3C working group,
or simply read and remove!

Groetjes,

Frank.
    

Our comments to Nabil's first DOM/SMIL draft

CWI's position on the relationship between DOM and SMIL

A standardized API supporting access, modification, deletion and addition of all information in a SMIL document is an important requirement. However, we feel that this requirement is sufficiently covered by the DOM Core specification. We feel that the SYMM WG should _not_ develop a SMIL specific "convenience interface", as has been done for HTML by the DOM (HTML) Level 1 specification. The HTML interface was developed to ensure backward compatibility with existing browsers, which is irrelevant for current SMIL browsers. Additionally, there should be no need for a new DOM for each new XML-based language. Implementing a SMIL-specific DOM is not without additional cost, which reduces the likelihood of acceptance.

We support DOM Core for SMIL, but unrestricted modification of the DOM of a playing time-based document could result in many problems. To solve these problems, a transaction model should be developed as soon as possible. Such a transaction model, however, is not specific to SMIL, and the SYMM WG should try to convince the DOM WG of the urgency of including a transaction model in the DOM Level 2 specification. Additionally, the SYMM WG should specify which parts of the DOM of a SMIL document are "locked" during document play-out.
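To make the idea of a transaction model with "locked" parts a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the Transaction class, the choice of PermissionError, and in particular the set of locked attributes are ours and are not taken from any DOM or SMIL draft.

```python
from xml.dom import minidom

# Assumed lock set, for illustration only: timing attributes may not
# change during play-out.
LOCKED_WHILE_PLAYING = {"begin", "dur", "end"}


class Transaction:
    """Hypothetical all-or-nothing batch of attribute changes on one element."""

    def __init__(self, element, playing):
        self._element = element
        self._playing = playing
        self._pending = {}

    def set(self, name, value):
        # Changes are only recorded here; the tree is untouched until commit().
        self._pending[name] = value

    def commit(self):
        if self._playing:
            locked = sorted(n for n in self._pending if n in LOCKED_WHILE_PLAYING)
            if locked:
                # Reject the whole batch, so no partial update becomes visible.
                self._pending.clear()
                raise PermissionError("locked during play-out: " + ", ".join(locked))
        for name, value in self._pending.items():
            self._element.setAttribute(name, value)
        self._pending.clear()


doc = minidom.parseString('<smil><body><video src="a.mpg" dur="10s"/></body></smil>')
video = doc.getElementsByTagName("video")[0]

tx = Transaction(video, playing=True)
tx.set("title", "Intro")  # harmless on its own
tx.set("dur", "20s")      # touches a locked attribute
try:
    tx.commit()
    committed = True
except PermissionError:
    committed = False
```

Because the batch is rejected as a whole, the harmless "title" change is not applied either; the player never sees a half-updated element.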

An application should be able to monitor and influence the play-out of a SMIL document. To support this behavior, the current event sets specified by the DOM Events draft should be extended with an event set for time-based media objects (e.g. events such as on_begin as specified by Nabil's draft). Again, we feel that this event set is _not_ SMIL specific (it also applies, for instance, to applets and other dynamic objects in HTML documents) and should therefore be part of DOM Level 2. The same applies to related methods on elements (such as elem.start() and elem.stop()). Such methods are also useful to media elements in other XML languages, including HTML.
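A rough sketch of what such a language-independent event set could look like to an application follows. The TimedElement class and add_event_listener are hypothetical names of ours; on_begin, start() and stop() come from the discussion above, while the on_end event name is an assumption.

```python
class TimedElement:
    """Hypothetical time-based media element that generates its own events."""

    def __init__(self, name):
        self.name = name
        self._listeners = {}

    def add_event_listener(self, event_type, callback):
        self._listeners.setdefault(event_type, []).append(callback)

    def _dispatch(self, event_type):
        # Call listeners in registration order.
        for callback in self._listeners.get(event_type, []):
            callback(self, event_type)

    def start(self):
        self._dispatch("on_begin")

    def stop(self):
        self._dispatch("on_end")  # "on_end" is an assumed event name


log = []
clip = TimedElement("video1")
clip.add_event_listener("on_begin", lambda elem, ev: log.append((elem.name, ev)))
clip.add_event_listener("on_end", lambda elem, ev: log.append((elem.name, ev)))
clip.start()
clip.stop()
```

Nothing in this sketch refers to SMIL: the same interface would fit an applet or any other dynamic object in an HTML document.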

Conclusion

CWI supports the development of DOM-based interfaces to SMIL applications. We are not convinced, however, that SMIL specific extensions to the DOM are required. Additionally, we feel that, given the current workload of the working group, the specification of a plug-in interface should be deferred to future versions of SMIL.

Reaction on Nabil's draft.

http://www.inrialpes.fr/opera/people/Nabil.Layaida/smil/smil-dom.html

Nabil, your draft gave us some good insights into the advantages and problems related to the use of both DOM Level 1 and the draft Level 2 spec in the context of SMIL - good work!

A more detailed reaction is included below.

Cheers, Jacco

Introduction: Although we are not (yet) convinced that a SMIL-specific DOM needs to be developed, we do feel that SYMM should state explicitly the specific requirements of SMIL when it comes to implementations of the current DOM Core Level 1 by SMIL applications, or proposed extensions to DOM Level 2. Because of this, and the impact these requirements will have on the rest of the draft, we would like to see the "Requirements" section given top priority in the next version of the draft.

DOM Level 1: The current section is mainly about controlling the presentation, which is an important issue, but should probably be discussed in another section because it is beyond the scope of DOM Level 1.

Plug-ins: Although we do not want to specify a full plug-in API, we would like to make all assumptions made by the DOM about the functionality of the plug-in explicit (e.g. whether a plug-in is expected to expose a play/stop/pause/rewind interface, etc).

Basic event flow in DOM vs SMIL: We think Nabil raises a very fundamental issue here in that the DOM draft seems to suggest (correct me if I'm wrong) that nodes in the tree are either passive or reactive, but not active in that they can generate events. We should get this straight with the people from the DOM WG. As Nabil already states, we would like to add event types to inform the application about media (pre)loading. I'm not sure whether "load" and similar events in the HTML Event Set can be used for this purpose.

SMIL Node Element Interface: The current text seems to mix up the terminology for "Node" and "Element". I agree with Philippe (http://lists.w3.org/Archives/Member/symm/1999JanMar/0250.html) that several attributes are redundant. The duration, beginDate and endDate attributes are more interesting, especially if these attributes refer to the values computed by the SMIL engine, and not to those supplied by the document. Note that it may be confusing to have a "duration" attribute in the SMIL-specific DOM which returns a different value than the "duration" attribute of the DOM Core interface. On the other hand, returning the same value would make the duration attribute redundant in the SMIL specification.
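The difference between the value supplied by the document and the value computed by the engine can be illustrated with the sketch below. It is deliberately simplified and rests on our own assumptions: the helper names are hypothetical, only the "<number>s" form of dur is parsed, begin offsets are ignored, a seq without dur is taken to last as long as the sum of its children, and any other container as long as its longest child.

```python
from xml.dom import minidom


def parse_seconds(value):
    # Only handles the simple "<number>s" form, for illustration.
    return float(value.rstrip("s"))


def computed_duration(element):
    """Hypothetical engine-side duration, in seconds."""
    dur = element.getAttribute("dur")
    if dur:
        return parse_seconds(dur)
    children = [c for c in element.childNodes if c.nodeType == c.ELEMENT_NODE]
    durations = [computed_duration(c) for c in children]
    if element.tagName == "seq":
        return sum(durations)          # children play one after another
    return max(durations, default=0.0) # children play in parallel


doc = minidom.parseString(
    '<par><video src="a.mpg" dur="10s"/><audio src="b.wav" dur="12s"/></par>'
)
par = doc.documentElement

document_value = par.getAttribute("dur")  # what DOM Core reports: ""
engine_value = computed_duration(par)     # what a timing engine could report: 12.0
```

Here DOM Core has nothing to report for the par element, because the document supplies no dur attribute, while the engine can still compute a duration. Only the second value requires an implementation of the timing model.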

In fact, the attributes and functions which return information which is only available from a SMIL engine (i.e. an engine that implements the SMIL timing model) are for me the only valid ingredients of a SMIL-specific DOM. I'm still not sure whether we should call such an API a "DOM" or a "SMIL Player API" (SPA ?! :-).

$Id: dom_symm.html,v 1.1 2001/02/21 19:28:36 lynda Exp $