Vol.28 No.2, April 1996
In the spring of 1994, the CHI '95 Conference Committee decided to produce an electronic Conference Proceedings and Companion, to be delivered on CD-ROM(1). The CD-ROM version of the Proceedings and Companion was delivered to attendees of the CHI '95 Conference. Soon after the conference, the fourth author created the World Wide Web (or "Web") version based on the CD-ROM contents, which is accessible via: http://www.acm.org/sigchi/chi95/Electronic/chi95cd.htm. This report describes the rationale and development process for the CD-ROM, and introduces the ACM/SIGCHI experiment in electronic, Web-based Conference publication.
We assume the reader is familiar with the World Wide Web; with HTML, the Hypertext Markup Language used to create Web documents; and with the interest in academic and corporate circles, among others, in Web-based electronic publishing. The Web is a hypertextual (actually hypermedia) interface to the Internet, where information is represented in the form of hyperlinked documents (Web pages) containing text, graphics and hyperlinks. The Web realizes a vision of world-wide electronic and hyperlinked information that began with Vannevar Bush in 1945 (Bush, 1945), and was later articulated in more contemporary terms by Ted Nelson in 1965 (see Barrett, 1989, and Nelson, 1990 for more recent discussions of this vision).
The CHI '95 Committee decided at the outset to use HTML as the electronic document format, and World Wide Web browsers as the means of viewing documents. The decision to use HTML and Web browsers was not casual for two reasons. First and foremost, it represented a different strategy from two other successful ACM electronic publications delivered on CD-ROM. SIGGRAPH produced CD-ROM based electronic Proceedings in 1993 and 1994 (Steve Cunningham was the production editor), using Adobe Acrobat readers and Adobe's Portable Document Format (PDF), derived from PostScript document files(2). Second, we knew at the outset that HTML provided limited formatting capabilities. For example, in 1994 and 1995 HTML did not display multiple columns, mathematical symbols or tabular information. Nonetheless, we chose this strategy for two reasons.
In fact, PDF and HTML are not incompatible. Web browsers typically are used in conjunction with so-called "helper" applications that can display GIF images or QuickTime videos in separate viewers, launched from hyperlinks in HTML Web documents. Similarly, an Acrobat reader can be launched as a "helper" application to view PDF documents. A program called Adobe Exchange can be used to create links within PDF documents. However, in our opinion, HTML authoring currently remains more accessible than PDF authoring.
We remain convinced that these reasons for using Web-based documents and HTML are valid. However, we did not appreciate at the outset the relative difficulty of creating all the links necessary, and the problems dealing with richer formats, especially mathematical symbols in some CHI '95 submissions, problems we will discuss in the Problems section.
In addition to using HTML documents, we decided to ship a default suite of document and image viewing software for both the Macintosh and DOS/Windows personal computers. This is not necessary for CD-ROM users who are connected to the Web, because they can use existing Web browsers to access the CD-ROM Home Page as a local file. However, at the time we planned the CD-ROM, we believed that many people in the SIGCHI community would not have access to the Web. The production editors, for example, did not have ready access to Web browsers for either PC or Macintosh machines when we started the production process. We had access to NCSA Mosaic clients in our corporate environment, but these machines did not have CD-ROM players. We had also done an informal survey of CHI '94 authors, which suggested that at that time, Web awareness and access were limited (at least within the SIGCHI community).
We chose MacWeb and WinWeb HTML document browsers because the Enterprise Integration Network (EINET) gave the conference an affordable licensing arrangement. We tried to license other Web browsers, but found their licensing fees unacceptable. We began the production process shortly after NCSA Mosaic, in particular, decided to license Mosaic for commercial use, and consequently did not wish to make it available for non-commercial distribution on CD-ROM. We licensed image viewers available as ShareWare on the Web for a nominal fee. We also included QuickTime Players from Apple Computer, Inc., for both platforms, and a FreeWare version of Adobe's Acrobat PostScript reader for reasons we will discuss below.
We also decided not to require authors to produce "Web-ready" HTML documents. We did strongly urge authors to submit HTML, and we provided a template in the Author Kit, but we did not require it. Neither the production editors nor many members of the CHI '95 Committee had authored HTML documents at that point either. We had also included a short survey with the Conference submission cover sheet of authors' familiarity with the Internet and HTML, and their ability to submit electronic documents in various formats. About half of the potential authors claimed to be unfamiliar with HTML, and/or unable to produce HTML documents. As a consequence of all these facts, it did not seem appropriate to demand that our colleagues in the SIGCHI community author HTML documents (again, a decision made at the beginning of 1994).
The Electronic Proceedings was organized around a Home Page (or Cover Page) intended to resemble the form and content of the hardcopy Proceedings (and Companion) front and back matter. Figures 1 and 2 show the top sections of this Home Page. A slightly modified version of this Home Page can also be accessed via the URL http://www.acm.org/sigchi/chi95/Electronic/chi95cd.htm. Comparing this to the printed Proceedings, it is obvious that there are differences, but the major elements of the print Proceedings front and back matter are accessible as links in the electronic version. At the top of the Home Page is a graphic of the CHI '95 Conference. Immediately following this are four buttons, and a link to the ACM Copyright notice. We assumed that users familiar with the contents would want to be able to link to the Table of Contents and Indexes immediately. This contrasts with the print Proceedings, in which the Table of Contents is preceded by various greetings and acknowledgments, and the indexes are at the back of the volume. In the electronic Proceedings, this material appears as links further down the Home Page document. These items are in roughly the same order as they appear in the printed volume.
Figure 1: Top section of the Home (or "Cover") Page
Figure 2: Section of the Home Page showing items corresponding to "front and back matter"
Table 1 summarizes the number of papers and images in each category of submission for the Proceedings and Companion. It also summarizes the 29 documents corresponding to front and back matter, including Table of Contents, Author and Keyword Indexes, and so on. There were 66 Technical Papers in the Proceedings, and an additional 10 Design Briefings, both categories of papers 8 pages or less in length. In the printed Companion, there were 66 Short Papers, each two pages long (one paper was not submitted to the CD-ROM), and two page summaries of events in other Conference venues. The numbers in parentheses for the "Total Submitted" column are the totals that appeared in the printed Conference Proceedings and Companion.
The CD-ROM contents were almost complete compared to the corresponding print volumes. The numbers in the "GIF Images" column are the image sets submitted by authors in GIF form. The numbers in parentheses for the "GIF Images" column are the total GIF images for the submission category (i.e., a single set of images could correspond to multiple images). Finally, the HTML and GIF images occupied roughly 24 megabytes of the CD-ROM, and the QuickTime movies 128 megabytes. The total of 152 megabytes is considerably less than the 640 megabyte capacity of the CD-ROM media.
The production editors converted the files used to create printed camera-ready versions of the front and back matter to HTML. Much of the formatting for the camera-ready version was lost in this conversion, but the paper titles, authors and affiliations should match exactly the print volumes. We also included the printed session organization of the papers in each submission category, including time and place of the paper presentations. This information is probably not useful in the long run except for conference attendees, but we did not want to reorganize the electronic Proceedings and Companion any more than necessary without taking more time than we had to think through the issues. Figures 3 and 4 show the top portion of the Proceedings Table of Contents.
Figure 3: Top portion of the Proceedings Table of Contents. Papers were organized by session, and paper citations by standard title and author(s) and brief affiliation(s).
Figure 5 shows the top sections of a Paper in the Proceedings. Again, the essential form and content of the papers is very similar to the printed volume (and the CHI '95 Conference Format Guidelines), with a couple of exceptions. As we described above, documents formatted with HTML, and viewed with Web browsers are essentially a linear sequence of text and images, without tabs or column formatting. The electronic documents therefore do not have two column text, text that flows around embedded graphics, or author names and affiliations formatted as separate columns under the title.
Many authors included URLs to Web home pages at their University or elsewhere, and URLs to papers they cited that were already in electronic form on the Web. We allowed this, on the assumption that the electronic Proceedings would eventually be available on the Web, and such links were one major rationale for using the Web as the basis for electronic publishing. Most Web browsers handle unresolvable links in a graceful way, so we did not believe these URLs would create problems for CD-ROM users without Internet access. There are larger policy issues relating to linking archival documents to other Web sites, which we discuss in the section on Future Plans. From a formatting perspective, embedded URLs to Web sites are simply links that do not lead anywhere on the CD-ROM version of the electronic Proceedings.
Table 1: The Proceedings contained full-length Technical Papers and Design Briefings. The Companion contained two-page Short Papers, and two-page summaries of various Conference venues (e.g., Workshops and Panels). The numbers in parentheses in the "Total Submitted" column are the total submissions in the printed Conference Proceedings and Companion. The numbers in each of the other columns are the number of authors who submitted one or more items of the respective kind (e.g., files for the document bodies or images). The numbers in parentheses in the "GIF Images" column are the total GIF images on the CD-ROM for the respective submission category.
Category                                       Total Submitted  HTML No  HTML Yes  GIF Images  Other Images  Email  Disk  FTP
Front/Back (e.g., Table of Contents, Indexes)  29               29       0         0 (33)      33
Papers (8 page)                                66 (66)          49       17        32 (445)    26            24     17    25
Design Briefings (8 page)                      10 (10)          8        2         3 (73)      6             2      9     0
Short Papers (2 page)                          65 (60)          44       24        27 (134)    23            33     19    16
Panels                                         13 (13)          12       1         0 (2)       2             8      4     1
Workshops                                      12 (13)          12       0         0           0             7      4     1
Demonstrations                                 18 (18)          10       8         6 (29)      3             7      9     3
Videos                                         15 (15)          10       6         6 (28)      2             3      3     9
Organization Overviews                         8 (8)            7        1         1 (1)       0             6      2     0
Doctoral Consortium                            19 (19)          12       7         4 (25)      4             10     3     6
Interactive Posters                            29 (29)          20       9         7 (45)      4             21     5     3
Social Action Posters                          3 (3)            2        1         0           0             3      0     0
Interactive Experience                         7 (7)            6        1         2 (19)      3             1      3     3
SIG Proposal                                   12 (15)          9        3         0           0             10     2     0
Tutorial Summaries                             29 (29)          23       6         2 (8)       1             10     11    1
Total                                          340              224      85        89 (842)    75            151    91    68
Footnotes were represented as links in Web documents. Since superscripts are not available, and single numbers do not make prominent links, the production editors generally put footnote references in parentheses and included the word "Footnote". We did not make this modification consistently for all papers with footnotes, but when we made it, this is how we did it. We put footnotes in a separate section after the References section, and put back links from the footnote text to the footnote reference in the body of the text.
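The footnote convention described above can be sketched in a few lines of Python. This is only an illustration, not the procedure the production editors followed (their edits were made by hand). The `[[fnN]]` placeholder syntax, the anchor names `refN` and `fnN`, and the function name are all assumptions introduced for the example, and the body is assumed to end with the References section.

```python
import re

# Hypothetical placeholder syntax for a footnote reference in the body text.
MARKER = re.compile(r"\[\[fn(\d+)\]\]")


def link_footnotes(body: str, footnotes: list[str]) -> str:
    """Turn [[fnN]] markers into "(Footnote N)" links and append the notes.

    Each reference links forward to its footnote, and each footnote
    carries a back link to the reference in the body.
    """

    def to_link(match: re.Match) -> str:
        n = int(match.group(1))
        if not 1 <= n <= len(footnotes):
            raise ValueError(f"no footnote text for marker {n}")
        return f'<A NAME="ref{n}" HREF="#fn{n}">(Footnote {n})</A>'

    linked_body = MARKER.sub(to_link, body)
    notes = [
        f'<P><A NAME="fn{n}" HREF="#ref{n}">Footnote {n}</A>: {text}</P>'
        for n, text in enumerate(footnotes, start=1)
    ]
    # The footnote section is appended after everything else in the body.
    return linked_body + "\n<H2>Footnotes</H2>\n" + "\n".join(notes)
```

Because each back link names a single anchor, a footnote referenced more than once in the body can only point back to one of its references.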
Images in Web documents can be handled as embedded or inline images that open up when the document opens up, or as hyperlinks that open up in separate image viewing "helper" application windows, launched when the user clicks on the links for the images. We asked all authors to treat images as links to images that would be viewed in separate viewers. However, the choice should really be determined by the size and number of images in the document. In-line images take time to open when a document as a whole is opened. Large images and numerous images take longer to open than smaller and fewer images. We could not experiment with these issues for every paper. Authors who submitted HTML, and chose to include in-line images were allowed to do so. In a few cases where images were relatively small, and integral to the flow of text, the production editors made editorial decisions to convert linked images to in-line images. This editorial decision took time, and we made it in only a few cases when we were working on documents for other reasons.
In general, we did not link references to the corresponding citation in the References section due to the effort this would have required. However, the use of HTML certainly calls for doing so. Authors who submitted HTML sometimes did this, and we obviously accepted these links. Links to citations in the References section should also have a backlink to the original reference in the text body, if possible. Multiple references to the same paper cannot have a unique link to the text body, however.
Two papers made substantial use of mathematical symbols. Equations treated as separate paragraphs were straightforward to handle. In one case, an author submitted the body of the paper in RTF, but extracted a separate file of equations to make it easier to create embedded images. The production editors converted these equations into GIF images, and embedded them in the body of the document in appropriate locations.
Several other papers made less substantial use of symbols. In these cases the production editors also created separate GIF images from screen dumps, and embedded them in the documents. Unfortunately, text flow around images is not supported, so in some of these cases the images with symbols do not flow with text as the authors intended.
An alternative strategy for handling text that HTML cannot format is to convert the documents to PostScript, and then to Adobe's Portable Document Format (PDF), and use Adobe's Acrobat PDF viewer to view these documents. We included freeware versions of the Acrobat reader, with the intent to convert the few papers we had with mathematical symbols, but in the end did not convert any documents to PDF.
We included buttons in HTML documents that linked readers to consistent landmarks in documents (see Figure 2). For technical papers (or paper summaries in the Companion venues), buttons labelled "Introduction", "Conclusion", and "References" provided intra-document links which, when clicked, brought the corresponding document section to the top of the browser window. A "Cover Page" button returned the reader to the "Home" or "Cover" page of the Proceedings. Other buttons labelled "Proceedings Table of Contents" or "Author Index" opened up the corresponding web pages.
Technically, the document buttons are not necessary because readers can scroll to these sections, and Web browsers have a readily available "Home" button. However, we thought that the buttons provided a convenient shortcut to these sections of the paper. Another, more effective way to navigate quickly to these sections is to have a Table of Contents for each paper, linked to the corresponding sections (with backlinks to the Table of Contents). A couple of authors included a Table of Contents linked in this way. However, in general, we did not ask authors to do this, nor did we have the time to create these tables. Also, for two-page Short Papers, a Table of Contents is unnecessary.
Figure 4: Additional detail on the Proceedings Table of Contents.
Papers were organized by session, and paper citations by standard title and author(s) and brief affiliation(s). The title of each paper was a hyperlink for the HTML version of the paper (highlighting indicating hyperlink does not show in the figure).
Figure 5: Top section of a Proceedings Paper, showing title, author and affiliations, navigation buttons, and copyright pointer.
The CD-ROM contents were developed using a DOS/Windows PC, a Quadra 650 Macintosh computer, and an OS/2 machine connected to the Internet and to a Unix machine via an X-Windows emulator. We used Microsoft Word for editing HTML files, and PhotoStyler and Photoshop on the PC and Macintosh, respectively, for editing images. Originally we set up DOS/Windows machines at the first author's worksite for both archiving files and converting to HTML. Over time, however, it became more useful to work at home in the evenings, so most of the production work on the integrated CD-ROM archives was done on a DOS/Windows machine and a Macintosh (Quadra 650) at home. The PCs at neither site were connected via the network, but both had 128 megabyte optical drives intended for archiving files and transferring large files (e.g., 10 megabyte QuickTime movie files). The Macintosh had a 200 megabyte SyQuest drive for archival purposes, a modem-style PC to Macintosh cable link using DynaLink software, and StuffIt Deluxe software for decoding encoded Macintosh files.
Table 2 displays key milestones in the production process, and some high-level dependencies. Author kits were distributed with paper acceptances at the end of November, 1994. Proposals for different conference venues were due at different times, but two deadlines were paramount: January 5 for Papers, Design Briefings, and summaries of Tutorials, Workshops, Panels, Videos and so on, and February 15 for Short Papers, and summaries of Posters and Special Interest Groups proposals.
Table 2: Timeline of key production milestones.
The 2600 finished CD-ROM disks had to be delivered to the Denver Convention Center on Friday, May 5 for early registrants. CD-ROM booklets and plastic case inserts required two weeks to print, and the CD-ROMs required one week for production, beginning with delivery of the printed matter. The CD-ROM manufacturer requested an additional week of margin for potential problems, which we could not accommodate. Working backwards, this meant that artwork for the booklets and insert had to be completed and delivered to the printer in early April. Nominally, we had a full four months to produce the HTML contents of the CD-ROM, acquire the necessary viewing software, design the artwork, and develop install instructions and electronic equivalents of the front and back matter for the printed Proceedings and Companion (e.g., Table of Contents). We will discuss the reality of this timeline in our Problems section.
The Author Kit asked authors of accepted submissions to send electronic files to the CD-ROM production team via electronic mail, diskettes, and/or FTP transfer. The first author managed all file input. The distribution of submission methods by author is shown in Table 1. The electronic mail system available could only handle ASCII text. ASCII text, RTF (Rich-Text Format), PostScript or MIF (FrameMaker Interchange Format) files were generally transferred reliably via E-mail. GIF or other kinds of images had to be uuencoded. Many authors also used compression software, tar archives for collections of files (document body, images, etc.), and binhex encoding of text and non-text files. E-mail files were downloaded to a hard drive archive. Many other authors sent files on diskettes. In some cases, the files on the diskettes were also compressed (e.g., using pkzip, or StuffIt on the Macintosh) or encoded using binhex or uuencode. Other authors submitted files by making FTP sites available to the first author, who then transferred files to a hard drive archive (a corporate firewall and resource constraints precluded authors from using FTP to transfer files directly).
We organized files into subdirectories on a hard drive corresponding to Conference venues (e.g., Papers, Design Briefings, Workshops, and so on), and backed this up on a 128 Mb optical cartridge on a DOS/Windows PC. Files received on Macintosh formatted diskettes were stored separately on a Macintosh, and backed up on an 88 Mb SyQuest cartridge. This is the same organization of files that appears on the CD-ROM.
Once files were organized and logged, they had to be converted to source document format. Many files were received as native RTF, MIF, LaTeX, or ASCII, generally requiring only the stripping out of electronic mail headers. In other cases, files had to be decoded from binhex, uuencoded, or compressed file formats, using a Unix machine to which the first author had access, or conversion software on a Macintosh.
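For readers unfamiliar with the decoding step, the following is a minimal sketch of recovering one uuencoded file from a saved E-mail message. It is not the tooling the production team used (they relied on a Unix machine and Macintosh conversion software). It uses Python's standard `binascii.a2b_uu`, and the function name `uudecode` is an assumption made for the example. Mail headers and any other text before the `begin` line are simply skipped.

```python
import binascii


def uudecode(text: str) -> tuple[str, bytes]:
    """Decode the first uuencoded file found in text.

    Returns (filename, data). Everything before the "begin" line,
    such as electronic mail headers, is ignored.
    """
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("begin "):
            parts = line.split(" ", 2)  # "begin", mode, filename
            if len(parts) == 3:
                filename = parts[2]
                break
    else:
        raise ValueError("no 'begin' line found")

    chunks = []
    for line in lines:
        if line.strip() == "end":
            return filename, b"".join(chunks)
        if line:  # skip blank lines a mail system may have inserted
            chunks.append(binascii.a2b_uu(line))
    raise ValueError("missing 'end' line; the file may be truncated")
```

A missing `end` line is reported as an error rather than silently accepted, since a truncated transfer is one of the cases in which a fresh set of files has to be requested from the author.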
Once documents were available in source format (RTF, MIF, LaTeX, ASCII) the files had to be converted to HTML. Many authors submitted HTML source files. Table 1 shows the distribution for various submissions categories: for example, somewhat less than a third of the Technical Papers authors submitted HTML (17 out of 66). All HTML files (whether sent by authors or generated by the production team) then had to be edited to include:
Non-HTML source files also had to be converted to HTML formatting at the level of paragraphs, section headings, title, and authors and affiliations. We had software tools available to convert RTF and MIF to HTML file formats, and HTML editors to convert ASCII text to HTML. However, in the end, we did not use these tools as extensively as we thought we would, for reasons we discuss below. The first author, in particular, generally converted RTF or MIF files to ASCII format, and manually edited HTML tags. We used Microsoft Word for editing RTF files and HTML files, and FrameMaker for MIF formatted files. Manually converting files took varying times to complete depending on format, ease of rendering into HTML, and so on. The first author estimates that each paper required between 15 and 30 minutes to convert, less time of course for papers submitted as HTML, and more time for papers with mathematical symbols and/or tables. Mathematical symbols and tables were captured as screen images, edited and converted to GIF image format using PhotoStyler.
The issues here were discussed in the previous "Form and Content" section, and pertain to handling of footnotes, text body references to papers cited in the References section, images as in-line or as links, and so on. To summarize briefly: When the production editors converted non-HTML documents to HTML, we did minimal formatting of these items. When authors submitted HTML we could use, we generally preserved whatever they did. For example, we did not link paper references to citations in the References section if the authors did not do so. We did link footnotes to footnotes text, and put in back links. We did attempt to handle mathematical symbolism that could not be rendered with HTML by making images of this material. Images were generally included as links to GIF viewers rather than embedded as in-line images, with a few exceptions.
Images came in various forms, including GIF, Macintosh PICT files, PC BMP (Bitmap) files, PostScript, TIFF, and RTF specifications embedded in the composite documents. GIF images could theoretically be used as received. Other formats had to be converted to GIF format. The first author did this using PhotoStyler on a DOS/Windows machine or PhotoShop on a Macintosh. As Table 1 indicates, about half (35 of 76) of the authors of Technical Papers and Design Briefings provided GIF figures directly. Many of the RTF, MIF, or PostScript graphics were authored in the corresponding word processor, and these had to be extracted in these same formats in separate files. We asked authors to do this for us, and most did. In these cases, the separate files containing these figures or tables were displayed and captured as screen bitmaps, opened in PhotoStyler, and converted to GIF images. Often clipping was required to frame the figure or table appropriately.
As we discussed in the earlier "Form and Content" section, images can be presented to readers as embedded in-line images that appear when documents are displayed, or as links that display images in separate windows corresponding to a separate image viewing "helper" application. Generally, when the production editors converted non-HTML documents to HTML, we treated images as links. Authors who submitted HTML documents handled images in both ways based on their own editorial preferences, and we left these decisions intact. In a few cases, the production editors converted linked images to in-line images. The criterion for doing so is that embedded images should be small enough not to disrupt the flow of text, or seriously slow down the opening of the document. In a few cases, authors submitted very large images that required four screens to view in full. Such images take a long time to open, require a lot of scrolling to view, and in our opinion, make it very difficult to maintain a sense of place in the document body. In these cases, opening the image in a separate window, at the viewer's discretion, seems essential.
QuickTime videos were generally produced for the Macintosh. To work on Microsoft Windows, QuickTime files have to be "flattened" to eliminate the Macintosh "resource fork" in the Macintosh file format. We hired a QuickTime consultant to convert these files, create an identifying icon, and to create a file identifier that would allow Web browsers to recognize the files as QuickTime formats, and launch the appropriate QuickTime player. QuickTime (and other videos) always open as separate windows corresponding to QuickTime (or other) video player applications.
The main editorial control we exercised over videos was to require QuickTime (MOV) files and to limit each submission to 10 megabytes of file size. We did not have resources to convert video tapes to QuickTime videos, and we limited video file size (hence video length, and quality) out of concern for CD-ROM capacity. As it turned out, we did not have as many videos as we anticipated, and had plenty of room on the CD-ROM. In the end, we also accepted videos larger than 10 megabytes if authors asked for more space.
Making digitized video files is difficult, especially given the size limitations of storage media (even CD-ROM disks), and bandwidth limitations of the Internet. As a result, digitized video stored on CD-ROMs and available over the Web is currently limited in size, and requires trade-offs among three attributes of the video: window size, frame rate and number of colors used.
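The trade-off among window size, frame rate and number of colors can be made concrete with some simple arithmetic. The sketch below computes the uncompressed data rate and the running time that fits in a given file size. The specific figures in the usage note (160 by 120 pixels, 10 frames per second, 256 colors) are illustrative assumptions, not values taken from the CHI '95 submissions, and real QuickTime files are compressed by a codec-dependent ratio, which is modeled here only as a single hypothetical `compression_ratio` parameter.

```python
def uncompressed_bytes_per_second(width: int, height: int,
                                  fps: float, bits_per_pixel: int) -> float:
    """Raw video data rate: pixels per frame x color depth x frame rate."""
    return width * height * bits_per_pixel * fps / 8


def seconds_within_budget(budget_bytes: int, width: int, height: int,
                          fps: float, bits_per_pixel: int,
                          compression_ratio: float = 1.0) -> float:
    """Running time that fits in budget_bytes.

    compression_ratio is an assumed overall ratio (raw size divided by
    stored size); 1.0 means no compression at all.
    """
    rate = uncompressed_bytes_per_second(width, height, fps, bits_per_pixel)
    return budget_bytes * compression_ratio / rate
```

Under these assumptions, a 10 megabyte limit holds roughly 55 seconds of uncompressed 160 by 120 video at 10 frames per second in 256 colors (8 bits per pixel). Doubling the window to 320 by 240 quadruples the pixel count and cuts the running time to about a quarter.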
The CD-ROM manufacturing process is relatively well-defined by CD-ROM manufacturers, with a couple of exceptions discussed in the "Problems" sections below. We delivered the CD-ROM files on two 88 megabyte SyQuest cartridges, formatted for the Macintosh, and organized in a folder structure exactly as we intended them to be organized on the CD-ROM for the Macintosh. DOS/Windows software was delivered on diskettes. We created a hybrid CD-ROM in the ISO 9660 format, and the Macintosh HFS file system. The ISO 9660 standard means that most CD-ROM players can read the files on the CD-ROM. The HFS formatting means that Macintosh users are able to view files in the Macintosh file and folder system.
The ISO 9660 standard requires that file names conform to the DOS file naming convention of eight character file names and three character file extension. This requirement is the reason we asked authors to provide files in the DOS file naming convention, a requirement that was not popular with Macintosh or Unix users (and was often ignored). File names also should be capitalized in both the file system, and in the HTML link path specifications. On Macintosh and MS-DOS/Windows PC systems, file name case is not an attribute Web browsers distinguish, so differences in case are ignored. However, on Unix systems, case matters, and case differences between link path specifications and file names in the file system can render HTML links inoperative.
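Both naming requirements lend themselves to a mechanical check. The following sketch, which was not part of the original production process, tests whether a file name fits the upper-case eight-plus-three convention and reports link targets that match a file only when case is ignored (the situation that works on Macintosh and DOS/Windows but fails on Unix). The function names are assumptions for illustration, and the permitted character set (letters, digits and underscore) reflects the strictest interchange level of ISO 9660 as we understand it.

```python
import re

# Upper-case letters, digits and underscore; up to 8 characters,
# optionally followed by a dot and an extension of up to 3 characters.
DOS_8_3 = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")


def is_dos_name(name: str) -> bool:
    """True if name fits the upper-case 8.3 file naming convention."""
    return DOS_8_3.fullmatch(name) is not None


def case_mismatches(link_targets: list[str],
                    filenames: list[str]) -> list[tuple[str, str]]:
    """Return (link, file) pairs that differ only in letter case.

    Such links resolve on case-insensitive systems but break on Unix.
    """
    exact = set(filenames)
    by_upper = {name.upper(): name for name in filenames}
    return [
        (target, by_upper[target.upper()])
        for target in link_targets
        if target not in exact and target.upper() in by_upper
    ]
```

For example, `is_dos_name("CHI95CD.HTM")` holds, while `introduction.html` fails on both length and case. Links that match no file at all, in any case, are deliberately left out of the mismatch report.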
As we described in the Time Line section, the CD-ROM manufacturing process begins with the delivery of the CD-ROM file archive, and the printed matter which includes CD-ROM booklet and the cardboard insert for the plastic case (sometimes called "Jewel Box"). We had the CD-ROM manufacturer print this material as well as press the CD-ROM disks. Once a CD-ROM image has been mastered, a so-called "validation copy" is pressed for review. When the contents of this disk are verified, the CD-ROM production process is initiated.
The production editors contributed the textual material for the CD-ROM booklet and insert, and hired a graphic designer to create the artwork. The first author wrote text and did preliminary layout of the booklet and plastic case insert text, created placeholder booklet cover graphics using MacDraw, and reviewed wording and legal issues with ACM and CHI '95 Committee members. The Conference consulting firm assisted us in modifying existing CHI '95 Conference graphic material for the CD-ROM booklet format, and disk label. (See Acknowledgments for details of this help.)
Proofing in this context means ensuring that the links, HTML formatting and image quality of all the HTML documents are correct. We proofed many but not all documents, and attempted to correct numerous HTML tagging and image problems. We will describe proofing problems below. In general, proofing is a very time-consuming process when done en masse. Correcting problems with images required working with authors, which was time-consuming, and not always successful. We discuss these problems below in the "Images" subsection of the "Problems" section.
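Part of this proofing, the check that local links resolve, can be automated. The sketch below is an assumption about how such a check might look today, not a description of what was done for the CD-ROM, where proofing was manual. It uses Python's standard `html.parser` to collect `HREF` and `SRC` targets and reports those that point at files missing from a directory. URLs to other Web sites and same-page section links are skipped, and the function and class names are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the targets of <A HREF=...> and <IMG SRC=...> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.targets: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        for name, value in attrs:
            wanted = (tag == "a" and name == "href") or \
                     (tag == "img" and name == "src")
            if wanted and value:
                self.targets.append(value)


def broken_local_links(html_text: str, base_dir: str) -> list[str]:
    """Return link targets that name no existing file under base_dir."""
    collector = LinkCollector()
    collector.feed(html_text)
    broken = []
    for target in collector.targets:
        parts = urlsplit(target)
        if parts.scheme or parts.netloc or not parts.path:
            continue  # external URL, or a link within the same page
        if not (Path(base_dir) / parts.path).is_file():
            broken.append(target)
    return broken
```

Whether a link that differs from the file name only in case is reported depends on the file system where the check runs, so a check of this kind is most informative on a case-sensitive (Unix) system. It says nothing about HTML formatting or image quality, which still have to be proofed by eye.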
There were problems in every stage of the development process, and we discuss them in terms of the production tasks described in the previous section. Table 3 summarizes some relevant statistics which we cite below. The estimates of problems in this table are approximate and conservative because not all problems were documented in the tracking method the first author (and production editor) used. In most cases, a problem was recorded when it required contacting an author about files. But some idea of the scope of problems can be gained from the summary. Qualitative discussion of these problems is more relevant.
Table 3: Quantitative summary of major categories of problems for all Proceedings and Companion paper categories. The counting is approximate and very coarse-grained. A "problem" represented difficulties that involved working with an author to obtain fresh files and/or resolve problems with files. The unit of counting is "an author's document body, image set, submission episode". Problems varied within a category, and a single counted problem may have entailed multiple sub-problems.
Category                    Total Submitted  Content  Images  Submission Method  Late
Papers (8 page)             66               6        26      6                  5
Design Briefings (8 page)   10               2        3       2                  0
Short Papers                68               5        13      12                 7
Panels                      13               2        0       0                  3
Workshops                   12               0        0       0                  0
Demonstrations              18               1        3       3                  1
Videos                      14               0        2       1                  2
Organization Overviews      8                1        0       0                  3
Doctoral Consortium         19               0        2       1                  3
Interactive Posters         29               1        3       3                  0
Social Action Posters       3                3        0       1                  0
Interactive Experience      7                1        1       0                  1
SIG Proposal                15               0        0       0                  4
Tutorial Summaries          29               1        1       1                  5
Total                       311              23       54      30                 34
The computers we used were not as integrated as we would have desired, and became less so over time with equipment failures and incompatibilities. This resulted in a lot of diskette shuffling, and file compressing and decompressing, between the machine on which files were received via e-mail or FTP and the DOS/Windows PC and Macintosh in another location. We had optical disk drives on all the PC computers, but the cartridge formats proved to be incompatible, and two of three drives broke down over the course of the project. In the end, we used diskettes to transfer files. For HTML document files and GIF images, this was easy enough, but for the original source files (e.g., in RTF or MIF) and QuickTime videos, diskette transfer was more of a problem. The first author edited HTML files, converted images to GIF format, and compiled the CD-ROM file archive on an MS-DOS/Windows PC.
Between January 5 and May 5 we nominally had four months. However, due to the problems we discuss below, and the fact that the production editors had full-time careers with typical work-related deadlines, this four-month period turned out to be far from sufficient.
There were a surprising number of problems with e-mail transfer of files, fewer problems with diskettes, and still fewer with FTP. The frequency of these problems is summarized in Table 3. All forms of compression and encoding presented problems in uncompressing and decoding. Archives of file collections created with tar were especially problematic, most likely due to incompatibilities between the versions of Unix available to authors and the production team. Essentially none of the tar archives sent to the first author could be decoded on the first attempt. Several iterations with authors were generally required, and the first author usually ended up requesting FTP access to unarchived file collections, or e-mail transfer using simpler UUENCODING of individual files.
BINHEX and UUENCODING generally worked, if the latter was done using Macintosh software. But in a few cases even these mechanisms did not work and we had to request a fresh set of files from authors.
Electronic mail generally worked, although the files received might be unusable for other reasons. It is possible that the mail system corrupted these files, but we do not believe this to be the case. The mail system did not always handle MIME attached documents. Such documents, even when sent as RTF or MIF ASCII text files, were "encoded" in a way we could not decode. Whenever possible, we would find an FTP alternative for acquiring these files. Even completely unformatted text documents were sometimes problematic, as the text was received as long single lines, truncated by the mail system we used. Such files had to be resent with line returns.
Problems with diskette transfer of files involved apparently "bad" diskettes which had to be resent, or diskettes formatted for Sun Workstations, which are not readable on MS-DOS/Windows PC machines (or Macintosh). The latter format problem was soluble using software available on the Internet which allows the Macintosh to read Sun formatted diskettes. In two cases, diskettes were readable but completely blank. In other cases, one or more image files in a set were missing, and had to be resent.
We also had problems with file names. We had asked people to use DOS file naming conventions because of the ISO 9660 requirements, and the use of a PC workstation with a file format that could only handle files so named. But many authors sent files using Macintosh or Unix file naming conventions anyway. We had also asked authors to use a naming scheme involving their initials so we could organize and track files. For example, a paper from the first author should have been named "rlm_bdy.htm" and an image in this paper named "rlm_fg1.gif". Many authors sent files that did not conform to DOS file naming conventions, e.g., "chi95.paper.tar". Such files had to be renamed (and the encoded files renamed, once unpacked). In the aggregate, especially for authors who sent many GIF files, this consumed a large amount of time.
For large numbers of GIF images, the DOS file naming conventions are awkward. A few authors sent two versions of large images, a so-called "thumbnail" embedded as an in-line image in the paper, and a larger version for detail in an image window, opened as a link in the document. Expressing multiple image names, and distinguishing image versions, with eight characters is difficult.
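The naming constraints described above lend themselves to a mechanical check before files are archived. The following is a minimal sketch in Python, assuming a simplified "8.3" rule: a base of up to eight characters and an optional extension of up to three, restricted here to letters, digits and underscore. The function names and the permitted character set are illustrative assumptions, not a description of any tool used in production.

```python
import re

# Assumed rule: 1-8 character base, optional 1-3 character extension,
# restricted to letters, digits and underscore (a conservative subset).
_DOS_8_3 = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def is_dos_8_3(name: str) -> bool:
    """Return True if `name` fits the assumed 8.3 pattern."""
    return _DOS_8_3.fullmatch(name) is not None


def nonconforming(names):
    """Return the names that would need renaming before archiving."""
    return [n for n in names if not is_dos_8_3(n)]
```

Run over a submitted file set, such a check would flag a name like "chi95.paper.tar" while accepting "rlm_bdy.htm" and "rlm_fg1.gif".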
A final category of problem with receiving document input was late submissions. In fact, barely half the papers came in on time, and the rest dribbled in over the entire production process. Several authors asked for more time, and this was always granted. Tracking down late papers consumed time. As Table 3 indicates, by April 1 we still had roughly 34 papers outstanding (about 10 percent).
The heart of the production process was creating HTML documents, with links to GIF images and QuickTime videos. Table 1 shows the number of submissions in HTML form for various submission categories. But as indicated above, even HTML submissions had to be edited for additional elements including copyright notice and navigation buttons the production team believed were useful.
Non-HTML files had to be converted to HTML. We tried using tools to convert RTF and MIF to HTML but they created as many problems as they solved. In particular, they produced spurious headings, character formatting tags that enclosed no text, and spurious links to non-existent GIF images. These problems may have resulted from our failure to establish a well-defined mapping between the RTF or MIF formats authors used, and HTML tags we wanted these formats to map into.
As a result of these problems the production editors generally manually edited ASCII files derived from RTF or MIF (as well as LaTeX in a few cases). This required adding paragraph and header tags, reformatting title, author and author affiliations (typically multicolumn in the submitted source files), and adding tags for bulleted lists, figure and table references (and captions), and word and phrase emphasis (italic or bold).
The front and back matter for the Proceedings and Companion came in as source files with the formatting used to produce the camera-ready copy stripped out. We then manually converted the files to HTML, with links to other documents. Readers interested in the details of this formatting can look at the source files for all these HTML documents on the CD-ROM or the Web version.
Handling image formats turned out to be quite difficult. Nearly all the TIFF images (6 sets of all submitted image sets) failed to open correctly in either PhotoStyler or PhotoShop. GIF images were routinely distorted or failed to open (54 or 33% of all submissions). This was quite vexing because GIF image problems often came on top of problems simply getting uncompressed or decoded files in the first place. Even GIF images submitted on diskette were sometimes corrupted (not viewable). In some cases, we iterated with authors on three transfer methods -- diskettes, e-mail transfer and FTP -- before getting a set of GIF images that worked. In many cases, we simply ran out of time to iterate, and these problems remain on both the CD-ROM and the Web version.
Figures or Tables in RTF, MIF or PostScript format were transformed into GIF images by capturing these images on screen, editing the bitmaps in PhotoStyler, and converting to GIF. Several of these images were bigger than the 17 inch PC display (XGA resolution), and had to be pieced together from more than one screen dump. These large images also require scrolling to view in Web browsers.
PICT files (a standard Macintosh graphic format) generally could be easily viewed and converted even with PhotoStyler on the PC. But in a few cases, they did not show up in PhotoStyler on the PC, even though they did so in PhotoShop on a Macintosh. In general, we used the DOS/Windows PC as our main archiving and editing system because it had more tools, and was faster than the Macintosh. But problems like these with PICT files required us to shift some work to the Macintosh.
No one set of image problems was particularly difficult, but in the aggregate the problems consumed considerable time.
The main problem the production editors had with QuickTime files was getting them to the Macintosh to be played and converted to dual platform format. A few QuickTime files came compressed on Macintosh formatted diskettes, and were uncompressed using BINHEX decoding tools on the Macintosh. Most files came via FTP and had to be transferred to a PC, compressed on diskettes, physically transferred to another site, decompressed, and then transferred via a modem-capacity phone cable to the Macintosh machine. Files transferred in this way lost their resource fork, and could not be played or their quality verified without editing the resource file. We hired a QuickTime consultant (identified on the CD-ROM insert) who transformed the source files along lines we discussed above. However, a couple of files that we were not able to verify before sending to him (via mail services) were so corrupted by the various transfer processes (or at the source) that we had to iterate on transferring files to him.
Proofing in this context means making sure the links to documents and images worked, and that the images were intact. We did not proof papers consistently as we went along because we anticipated help on this from our CD-ROM preview volunteers, and we anticipated doing a final proofing when all the papers and images were assembled, and the Table of Contents, Indexes etc. were all in place. Unfortunately, preview copies were harder to produce than we anticipated, and several key production processes were delayed (see the next section 5.8), so we were not able to take advantage of colleagues' help to do all the proofing needed. As a result, some papers on the CD-ROM have formatting problems, and/or problems with the GIF images authors sent. These problems were transferred to the Web version, of course.
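Part of the proofing task defined above, checking that links to documents and images resolve, can be automated. The following is a minimal sketch using Python's standard `html.parser` module. It assumes that links are relative references to files stored alongside the HTML document; the class and function names are hypothetical, and the sketch does not check whether an image is intact, only whether its file exists.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse


class _RefCollector(HTMLParser):
    """Collect the values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases attribute names, so HREF and href both match.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def broken_local_refs(html_path):
    """Return relative references in `html_path` whose target file is missing."""
    with open(html_path, encoding="latin-1") as f:
        collector = _RefCollector()
        collector.feed(f.read())
    base = os.path.dirname(html_path)
    missing = []
    for ref in collector.refs:
        parts = urlparse(ref)
        if parts.scheme or parts.netloc or not parts.path:
            continue  # skip remote URLs and fragment-only links
        if not os.path.exists(os.path.join(base, parts.path)):
            missing.append(ref)
    return missing
```

Note that on file systems that ignore case, `os.path.exists` will not reveal a mismatch between the case of a link and the case of a file name, so a check of this kind would need to be run on a case-sensitive system to catch that class of problem.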
There were two problems with the CD-ROM manufacturing process. The first concerns the case of file names. The ISO 9660 CD-ROM format standard requires that file names be in upper case. On DOS/Windows or Macintosh machines, lower case and upper case file names are treated the same by Web applications. We had defined all file names to be lower case in the file system and in the HTML document links. These file name cases were consistent of course, but unfortunately, the software used in the CD-ROM mastering process converted file names to upper case, but did not affect file names referred to in the link specifications embedded in HTML documents. Again, the difference in case does not matter for DOS/Windows or Macintosh machines, but it does matter for Unix machines. Hence the CD-ROM links will not work when used in a drive controlled by a Unix machine.
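One way the case mismatch described above might be avoided is to rewrite link targets to upper case before mastering, so that links and file names agree on case-sensitive systems. The following is a sketch only, in Python: it assumes double-quoted `href` and `src` attribute values, and the function name is hypothetical. It is not the procedure used for the CHI '95 CD-ROM.

```python
import re

# Matches href="..." or src="..." with double-quoted values (an assumed,
# simplified attribute syntax; unquoted or single-quoted values are ignored).
_ATTR = re.compile(r'\b(href|src)\s*=\s*"([^"]*)"', re.IGNORECASE)


def uppercase_local_links(html: str) -> str:
    """Upper-case the file part of relative links; leave remote URLs alone."""

    def fix(match):
        attr, target = match.group(1), match.group(2)
        if "://" in target or target.startswith(("mailto:", "#")):
            return match.group(0)  # remote, mail, or fragment-only link
        # Keep any "#fragment" as written; only the file part changes case.
        path, sep, fragment = target.partition("#")
        return f'{attr}="{path.upper()}{sep}{fragment}"'

    return _ATTR.sub(fix, html)
```

For example, a link written as `HREF="rlm_bdy.htm"` would become `HREF="RLM_BDY.HTM"`, while a link to an `http://` address would be left unchanged.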
The second problem is the cost of pre-mastering. CD-ROM manufacturers create a CD-ROM archive used to create a mastering CD-ROM, and to create a so-called validation copy. When this copy is "validated" by the customer, the mastering CD-ROM is used to press the full run of CD-ROM copies. This mastering process is relatively expensive. It is not expected that production editors will iterate on the contents of the CD-ROM. We did, since we found several problems with the validation CD-ROM copy, and this iterative process added unexpected expense to the CD-ROM manufacturing process.
We had intended to do substantial iterative design with the CHI '95 Conference Committee by producing preview CD-ROM copies, as well as have committee members proof subsets of documents, and provide feedback about overall form and content, installation procedures, etc. We did make a few preview CD-ROM copies, but had too many problems to do this on the scale we had hoped to. First, we had to find a volunteer with CD-ROM publishing equipment to help us. As we suggested above, professional CD-ROM manufacturers are not set up to do multiple preview copies or iteration, at acceptable cost. We did find a volunteer to help, and we produced a handful of nominally dual platform CD-ROMs playable on both PC and Macintosh machines. However, the preview copies would only work on DOS/Windows, and many of our volunteers did not have ready access to PC computers. The reason for this is that the ISO 9660 CD-ROM publishing standard, which allows CD-ROM files to be read on most computer CD-ROM drives, appends version numbers to files on different platforms. In effect, the CD-ROM publishing software renames all files: e.g., "rlm_bdy.htm" or "rlm_fig1.gif" become "rlm_bdy.htm;1" or "rlm_fig1.gif;1". There are undoubtedly historical and technical reasons for doing this, but it wreaks havoc with hypertext link file references embedded in documents, because these are not versioned ("renamed"). Most CD-ROM publishing software has a way to suppress versioning. Unfortunately, the software we used for mastering did not allow this.
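The mismatch between versioned file names and unversioned link references can be illustrated with a short sketch. The Python below assumes the version suffix takes the form of a semicolon followed by digits, as in the examples above; the function names are hypothetical, and the comparison ignores case as an additional assumption.

```python
import re

# Assumed suffix form: a semicolon followed by one or more digits, e.g. ";1".
_VERSION_SUFFIX = re.compile(r";\d+$")


def strip_version(name: str) -> str:
    """Remove a trailing version suffix such as ';1', if present."""
    return _VERSION_SUFFIX.sub("", name)


def find_on_disc(link_target, disc_names):
    """Return the disc file name that a link target refers to, or None.

    Names are compared without the version suffix and without regard to case.
    """
    wanted = link_target.lower()
    for name in disc_names:
        if strip_version(name).lower() == wanted:
            return name
    return None
```

A link to "rlm_bdy.htm" fails against a file named "rlm_bdy.htm;1" under an exact comparison, but is found once the suffix is stripped.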
Faced with this problem, we found another volunteer with CD-ROM publishing equipment who could suppress versioning (see the Acknowledgments). By then we were running out of time, and could only produce a couple of preview CD-ROM copies readable on both Macintosh and PC computers. At this point we discovered another serious problem. During the production of the CD-ROM file archives, the first author of this report had used Microsoft Word to do editing. This is because dragging and dropping tags from an HTML template to an HTML document was convenient. However, at some point the author had apparently inadvertently turned on the so-called "Smart Quote" feature when entering hypertext link specifications. Smart quotes look like this: “Smart Quote” in contrast to ordinary quotes, like this: "Smart Quote". Unfortunately, smart quotes are not translated into quotes on the Macintosh. Two weeks before the final deadline for shipping CD-ROM file archives, we discovered that the Macintosh versions of our file archives were all "corrupted" with smart quotes. Another volunteer (see Acknowledgments) solved this problem by manually converting smart quotes to ordinary quotes for the entire collection of 340 HTML files.
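The conversion described above was done by hand, but the substitution itself is simple to express. The following Python sketch assumes the file contents have already been decoded to text (for files saved by Word for Windows, the Windows-1252 code page is a plausible but assumed source encoding); the function name is hypothetical.

```python
# Curly ("smart") quotation marks mapped to their plain ASCII equivalents.
_SMART_TO_PLAIN = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def plain_quotes(text: str) -> str:
    """Replace smart quotes in `text` with ordinary quotes."""
    return text.translate(_SMART_TO_PLAIN)
```

Applied to a link specification written with smart quotes around the file name, this yields the same specification with ordinary double quotes, which browsers on either platform can interpret.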
Many of the problems we discussed reflect limitations in time and experience, and could be solved with more of both. However, two perspectives perhaps elevate these problems to something of more general interest to the HCI research community.
First, while we believe that with this year's experience we will be able to solve many of the problems we experienced, we also believe that the state of the art in electronic publishing, and especially Web publishing, is simply primitive. Problems with images in particular are vexing. Discussions with others experienced in Web publishing suggest that problems with GIF images across platforms are real, and not trivial to resolve. Problems with incompatibilities in certain text characters across computer platforms should also not exist.
Note that many of the problems we cited are not hard to resolve on a small scale. It may be tedious to do so, but otherwise straightforward. The difficulty arises when these problems are tackled in the aggregate, on the scale that we faced. Problems like doing thorough proofing of links and HTML tags can be resolved by changing the production process (see Recommendations section below). Notably, we believe individual authors should now be in a position to do their own HTML authoring, and we will recommend this in the future.
The second general observation is that a few problems we observed bear similarities to generalizations about end user difficulties often reported by HCI researchers and practitioners, including early work by the first author and colleagues (e.g., Mack, Lewis, and Carroll, 1983). In particular, as we noted above, many authors did not use the MS-DOS file naming conventions or file identifying conventions we requested. We also got numerous phone calls and notes asking us how to prepare electronic submissions, questions which were answered in the instructions. We attribute this in most cases to authors not reading our instructions, a seeming universal of computer use. Of course, the guidelines for electronic submissions were also new, and relatively detailed. As professionals committed to user-centered design, we naturally draw the conclusion that the form and content of these authoring guidelines need to be improved.
One reason we were unable to do sufficient proofing is that many significant tasks unexpectedly bunched up at the end of the production process (i.e., beginning of the Conference). For example, the software for the CD-ROM arrived the last week we had, and we had no margin for all the tailoring that had to be done. Another example: The artwork for the CD-ROM "Home Page" arrived two days before the CD-ROM files had to be sent out, in the form of an encapsulated PostScript file that was problematic to edit as a GIF image.
Setting aside for the next section the larger question of whether SIGCHI should continue pursuing electronic publication, and do so using HTML and Web technology, we believe two things will make electronic publication vastly easier in the future for both authors and production editors.
First, and foremost, is the fact that both parties have access to an example. Whatever problems may exist with the CHI '95 electronic Proceedings and Companion, the publication provides a wealth of examples and lessons for future authors and production editors. This report is a first attempt to identify these lessons. The Web version of the proceedings makes the examples available to anyone with Web access. Second, we believe the production process can be vastly simplified by asking individual authors to write and proof their own papers. Just as authors are expected to produce camera-ready print manuscripts, so also authors should be responsible for creating "web-ready" documents. This shifts many of the problems we experienced to authors, but also distributes the problems in a manageable way.
In fact, shifting the authoring burden to authors is not uncontroversial, and we will discuss this in more detail in the next, and final section.
Several other recommendations pertain to providing production support for production editors. For example, it would help if production editors had access to CD-ROM publishing software to create preview CD-ROMs for review and work out problems of form and content, and for delivering final contents to CD-ROM manufacturers. The Conference could have reduced production mastering cost had we had access to a CD-ROM publishing system.
The CHI '95 Conference Electronic Proceedings and Companion is an experiment in electronic Conference related publication. We delivered it on CD-ROM for technical and policy reasons, but the intent is to start a larger experiment. It should be appreciated that this experiment ultimately involves everybody with an interest in, and need for participating in publications about HCI-related research and applications.
ACM is committed to electronic publication, and has approved experiments in electronic publication by various special interest groups (e.g., SIGGRAPH, SIGCHI, SIGPLAN OOPSLA conferences, etc.). ACM is exploring a broad set of issues relating to archival document formats, copyright issues, business models for payment and access, relation between print and electronic publications, and so on. ACM policy on, and plans for electronic publication have been outlined in the Communications of the ACM (ACM, 1995). In particular, ACM plans to support a variety of document formats, with long-term plans to archive documents in SGML format. ACM also plans to provide various SGML related document templates, conversion and viewing tools for authors and production editors. Electronic publication services began rolling out in 1995.
Based on our experience producing the CHI '95 Conference electronic publication, the production editors have made preliminary recommendations for solving some of the key problems discussed in this report. The main recommendation is to ask authors to submit "web-ready" documents. As we indicated, 25% (85 of 340) of authors already did so. Several others indicated that they would have done so with a little more time or incentive. We also do not recommend shipping HTML document viewing software with the CD-ROM, on the assumption that most computer and HCI professionals should have access to the Internet, and the Web in 1996 and beyond. However, this decision will undoubtedly leave some readers unable to view the contents. For that matter, we also do not know how many people have access to CD-ROM players on their computers for the CHI '95 CD-ROM.
To be clear, these recommendations are under review by ACM/SIGCHI and the CHI 96 Conference Committee. The recommendations are not uncontroversial. We appreciate as past authors of CHI Conference-related publications, that authors are typically pressed for time, and do not need another authoring task. The only way this will be acceptable is if authors (and the SIGCHI community at large) experience the value of electronic publication, and have better tools emerge to support authoring. Both issues require an active debate by the SIGCHI community, both at the level of elected representatives, and at the grass-roots level. We plan to provide an electronic forum for this debate via a bulletin board (chi95cdrom@acm.org, or send a note to the first author at mack.chi@xerox.com). We also plan to evaluate the usage of the CD-ROM and Web version. And we expect to host some activities at CHI 96 relating to electronic publication and larger issues of using the Web to support research communication.
As for tools, we identified several in discussing the production process. We also plan to provide Web pointers to tools, and improved authoring guidelines for future CHI conferences. Software vendors will make Web publishing easier also. There already exist add-ons for Microsoft Word and Word Perfect that will save documents as HTML files, and similar functions should soon be available for other word processing systems.
Image editing remains the most difficult problem, because image editing software is expensive, and many individuals simply will not have access to necessary tools. Similarly for producing QuickTime (or MPEG) digitized videos. As a consequence, CHI conferences, if not ACM or SIGCHI, will likely have to provide support for authors for image editing at least, if not for producing videos.
The CD-ROM contents were transferred to the Web during the conference by the fourth author. The problems with the CD-ROM contents were of course transferred to the Web contents. The publication team has been working with authors to "proof" these uploaded electronic papers.
More interesting, perhaps, are the changes in the Web version of the electronic publication. For example, authors may access papers directly from other Web sites, without necessarily using the Table of Contents or Index that we provided in the CD-ROM. Consequently, each paper needs to have information about its origins in the CHI '95 Conference. We have had to add this information to each electronic document. The front and back matter on the CD-ROM, like the Table of Contents, contained information closely paralleling the content of the print Conference Proceedings and Companion. Examples include session information about when papers were presented, and how papers were grouped in presentations. This information is irrelevant beyond the Conference. Consequently, new organizing schemes are under consideration, focused more on content of papers, and not the episodic organization of the conference.
Still more significant are new opportunities the Web version of the publication provides for linking Conference publications to other information on the Web, reflecting authors' larger research programs.
A key rationale for using Web publishing is the belief that network hyperlinking will be an important capability for HCI researchers in the SIGCHI community. The CD-ROM of course did not allow authors (or users) to take advantage of this, but the establishment of the Web version of the CD-ROM contents does so. As we indicated above, authors have already begun to include URLs to other Web sites with University Home Pages and electronic versions of papers cited in the Reference sections of papers. Authors did this in anticipation of Web access. Figure 5 shows an example of one author's paper with links to the author's personal home page. Other papers included links to papers cited in the reference section which were already available in electronic form.
As of this publication date, ACM is working out its policies regarding how authors should take advantage of Web linking capabilities, and the potential for modifying archival but electronic versions of Conference publications. Some examples of issues: To what extent should authors be able to modify published papers? To what extent can authors take advantage of Home Pages, and other archival material related to broader research initiatives of which the Conference paper may be only one result? Can authors working for Corporations include links to Corporate Home Pages? For purposes of the CHI '95 electronic publishing experiment, ACM and SIGCHI have agreed to allow authors to provide links to other Home Pages, papers and data connected to the research cited in the paper. Content editing will, however, be restricted to fixing HTML formatting problems incurred in the production process.
The value of electronic publication in general will probably not be realized for a few years, as a critical mass of publications accumulates and enables research tasks like searching to be done on a large scale. Several papers presented at the Conference discuss other capabilities that will make the added effort of electronic publication worthwhile to authors (annotating Web documents, see Roscheisen & Winograd, 1995; search and clustering of related documents, Andrews, Kappe, & Maurer, 1995). Also, we expect authors will come to more routinely exploit the use of multimedia to communicate ideas and results. Static images presented on a computer screen often are of higher quality than their print counterparts, at least where production quality does not permit color or high resolution hardcopy printing. Several short videos that accompanied papers in the CD-ROM, and/or the live conference presentations (but not shipped on the CD-ROM) are actually quite useful in quickly conveying interactions that take time to envision with conventional text and static images.
These are a few quick examples of how the production editors believe electronic publication, and Web publication in particular, will come to have value to researchers in the SIGCHI community. It is also clear that printed publications have many advantages. We expect that both will co-exist for some time.
Questions of the long-term use and utility of electronic publications will require discussion and empirical evaluation by various interest groups. Activity along these lines is already underway as of writing this report. Volunteers in SIGCHI and the CHI '95 and CHI 96 Conference Committees are planning at least two evaluations of CD-ROM and Web publication usage. One is aimed at surveying a sample of CHI '95 Conference participants and authors at some point to assess their use and views of the utility of the CD-ROM and Web versions of the Proceedings. The Publishing Board of ACM and SIGCHI also plans to evaluate longer-term Web usage in order to help develop a longer-term business model and assessment of the utility of electronic, and especially Web-based electronic publication. Finally, we have set up an ACM supported bulletin board for sharing problems, solutions, and views of electronic publication (send a note to chi95cdrom@acm.org, or send a note to the first author at mack.chi@xerox.com).
Many people contributed to various aspects of the CD-ROM production, and its establishment on the Web, and we would like to take this opportunity to thank them. Peter Polson lent encouragement and brought the first author into conversations with various people within ACM who helped CHI '95 plan CD-ROM production strategies. These people included Steve Cunningham of UCLA, who produced CD-ROM Conference Proceedings for ACM SIGGRAPH in 1993 and 1994, and various people involved with publications in the ACM, including Mark Mandelbaum, Nora Cortes-Comerer and later Susan Siedun, who helped enormously with shepherding software licensing terms and conditions, among other things. Thanks to Jakob Nielsen and colleagues at Sun Microsystems, and Irv Katz of Educational Testing Service, for feedback on "beta" versions of the CD-ROM, and also for providing advice on a wide range of issues that came up in planning and implementation of the CD-ROM. Special thanks are also owed to Larry Diamond who helped size the production process early on, Scott Robertson of US WEST, and CHI '95 Conference Co-Chair, for filling in on the print publications with Irv Katz, and for advice on various technical and budgeting matters, Mark Wilkes of IBM Boulder for advice, including the decision to use IBM Software Manufacturing to produce the CD-ROM, Scott Schweitzer, then of IBM Research who volunteered several days and equipment to produce "beta" versions of the CD-ROM for early review, Irv Katz again for pressing a Macintosh preview CD-ROM, Rick Gondella of Conference Logistics and Consulting for producing unexpected and last minute CD-ROM booklet artwork, and Beth Maier, of Prodigy Services, Inc., for helping fix the "Smart Quote" formatting problems with the HTML documents. Finally, of course, we thank all the authors who contributed to the CD-ROM, and especially the authors who contributed HTML documents and GIF images.
Robert Mack is a Research Staff Member at the IBM Watson Research Center in Yorktown Heights, NY. Bob has been doing HCI research since 1981 when he joined IBM Research. Bob is currently manager of an HCI group working on user interface approaches for digital library technology. He and Jakob Nielsen co-edited the recently published book "Usability Inspection Methods" (published by J. Wiley, 1994). Bob can be reached at maier@watson.ibm.com
IBM Research, 30 Saw Mill River Road, Hawthorne, NY 10598, USA
Linn Marks is a multimedia designer/researcher at Cognitive Design Studio. She is currently writing a book and developing a CD-ROM on multimedia design. She has been designing multimedia applications since 1989, first at MIT's Project Athena, and later at IBM Research. She has given tutorials on multimedia design at several conferences, including CHI conferences. The focus of her research is discourse structures in interactive media. She can be reached at lmarks@cdstudio.com.
Cognitive Design Studio, 53 Manville Road, Pleasantville, New York USA; lmarks@cdstudio.com
Dave Collins is President of Outback Software, Ltd., a consulting company specializing in client-server systems written in Smalltalk. He spent 26 years with IBM, most recently at the IBM Watson Research Center, where he worked on distributed multimedia systems. Prior to that he taught object-oriented programming and user interface design at the IBM Corporate Education Center, and worked as a systems engineer on telecommunications and database systems. He is the author of "Designing Object-Oriented User Interfaces" (published by Benjamin/Cummings, 1995). Dave can be reached at dcollins@acm.org
Outback Software, 53 Manville Road, Pleasantville, New York, USA; dcollins@acm.org.
Keith Instone is a Research Associate with the Computer Science Department at Bowling Green University, Ohio, USA. He is the SIGLINK Newsletter Editor and CHI96 Hypermedia Support Chair. He can be reached at instone@cs.bgsu.edu (and http://www.cs.bgsu.edu/~instone/)
Research Staff, Computer Science Department, Bowling Green University, Bowling Green, Ohio, USA