SIGCHI Bulletin
Vol.28 No.3, July 1996

HCI in Italy: A Visual Approach to HCI

P.Bottoni, M.F.Costabile, S.Levialdi, P.Mussio

The Pictorial Computing Laboratory (PCL) brings its past experience in image processing and pattern recognition to bear on the design of interactive systems. Over the last ten years, a model for visual interactive computing has been developed based on the following abstraction: in interactive activities, human beings communicate with computers through digital messages representing (part of) the state of the computation. This model, called Com2, is described, and an interaction with a system developed according to it is shown as an example, highlighting the use of visual languages for human-computer interaction.

Introduction

The Pictorial Computing Laboratory (PCL) brings its past experience in image processing and pattern recognition to bear on the design of interactive systems. Over the last ten years, a model for visual interactive computing has been developed based on the following abstraction: in interactive activities, human beings communicate with computers through digital messages representing (part of) the state of the computation [1,2]. The basic operations in human-computer interaction (HCI) thus become the recognition and generation of the symbolic patterns characterising the exchanged messages.

The human and the computer exchange these messages through the interface of the computer. With current technologies, the messages are in most cases the images appearing on the screen, composed mainly of text, sketches, and pictures. In the PCL perspective, however, the design of the interface does not concern merely the pictorial aspect of the screen; rather, it requires considering a whole model of the expected HCI activity, situated in the specific cultural context of the user, who has to accomplish some tasks by means of the computer.

The pictorial nature of information is particularly relevant in fields such as biomedicine, astronomy, engineering, and remote sensing, which is why pattern recognition and image processing became well-established and important disciplines. In the present era of world-wide interconnection via the Web, images play the additional role of supporting human communication through computers as well as communication between humans and computers. This activity generally takes place by means of iconic environments such as visual interfaces and window systems.

The images contained in the exchanged messages have a nature similar to that of words in a natural language, thus generating a visual language, i.e. a language based on the use of visual representations. In our visual interactive computing activity, such languages are defined as formal languages. This formalisation is useful in checking the consistency between the meanings associated with the images by the user and those associated by the system. Problems such as the semantic ambiguity of icons and the choice of adequate visual metaphors [3] can be addressed in the framework provided by the formal definition of a visual language.

In HCI, we consider both the user and the computer program as two entities communicating through images, thus forming a symmetrical closed loop through which information flows (communication aspect), and in which reasoning and computation activities are performed in turn by the machine and by the user (computational aspect). Hence the proposed model, called Com2 (Communication-Computation model), is the basis of our approach to interactive computing. We describe this model in the next section, and then show, as an example, an interaction with a system developed according to it. The example also highlights the use of visual languages for HCI. Finally, we draw some conclusions.

The Com2 model

Human users need the computer to perform, simplify or support some tasks in their daily activities. To this aim, they must communicate their requirements to the computer and properly interpret its output messages. In the past era of batch systems, users provided all the input information to the computer at once and let the machine perform the computation. This was a very poor type of interaction between users and computers, and it supported many tasks badly. Emphasis was given to the computational process, while the communication aspect was almost completely neglected. With the highly interactive devices available today, users constantly provide instructions to the computer and receive feedback. The interface is the means through which information flows from the user to the program run by the machine and back to the user.

In HCI, the two main participants, namely the human user H and the computer program C, must be explicitly taken into account. In the case of visual interfaces, the message M exchanged between users and computer is an image visible on the computer screen.


Figure 1: The Com2 Model


The screen becomes the communication surface between users and programs, thus assuming the role of a bi-directional channel in the transmission of the messages.

Both participants perform a kind of computation, so that the overall process is described by the symmetrical closed loop depicted in Figure 1, which represents the Com2 (Communication-Computation) model at the basis of our approach to visual HCI.

The user H, who is performing a task with the computer C, looks at the screen and interprets the displayed image to understand the state of the current process, decides which action to perform next (computational activity performed by the user) and instructs C to have the action executed. Since communication occurs through a visual interface, in order to convey the instructions to C, H composes a visual message on the screen, i.e. materialises his or her thinking as an appropriate image M on the screen. The message M generated by H is captured and interpreted by C, which executes the actions required by H (computational activity performed by C). Then C materialises the results of its computation as an image M' on the screen.
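As a rough illustration of this closed loop, the following Python sketch models a few Com2 cycles. All names (ComputerSide, interpret, compute, materialise, user_turn) are hypothetical illustrations, not the PCL implementation; the image exchanged on the screen is reduced here to a plain data structure.

    # Minimal sketch of the Com2 interaction loop (hypothetical names only).
    class ComputerSide:
        """The program C: interprets the user's message, computes, materialises a reply."""

        def __init__(self):
            self.document = []          # the state of the computation

        def interpret(self, message):
            # int: map the visual message M produced by H to a description.
            return message["action"], message.get("argument")

        def compute(self, action, argument):
            # The computational activity performed by C.
            if action == "append":
                self.document.append(argument)
            elif action == "clear":
                self.document.clear()

        def materialise(self):
            # mat: render the new state as the message M' shown on the screen.
            return {"view": list(self.document)}


    def user_turn(m_prime):
        """The user H: interprets M', decides the next action, composes a new message M."""
        # The decision is scripted here; in reality it is the user's own reasoning.
        return {"action": "append", "argument": f"line {len(m_prime['view']) + 1}"}


    c = ComputerSide()
    m_prime = c.materialise()
    for _ in range(3):                       # three turns of the closed loop
        m = user_turn(m_prime)               # H: interpret, decide, materialise M
        action, arg = c.interpret(m)         # C: interpret M
        c.compute(action, arg)               # C: compute
        m_prime = c.materialise()            # C: materialise M'
    print(m_prime)                           # {'view': ['line 1', 'line 2', 'line 3']}

Each turn of the loop alternates the interpretation, computation and materialisation activities of H and C, as in Figure 1.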

For each message M two interpretations exist: the first is the one intended by the generator, the second is the one associated with the message by the receiver. Correct communication occurs if the two meanings evoke similar reactions from the two communicators. The machine should interpret a given message and react to it in a way which is isomorphic to the interpretation and reaction of the user. In other words, when the human refers to a real object in his or her world, the computer must refer to the corresponding virtual object in the virtual world, and when the user refers to a real activity in his or her world, the computer must refer to the corresponding activity in its virtual world, and vice versa. For example, if the user is updating a document, the result of an updating computation is a virtual document which, when printed, is the same as the one the human would have obtained by updating a real document in the real world.

The Com2 model imposes a discipline on the interaction, exploiting the concept of state. To this end, the message M produced by C specifies to H not only the results of the computation, but also the set of possible actions that can be performed next by the user (i.e. those that are legal in the current state of the interaction). This discipline implicitly prevents the user from producing faulty states, achieving a conversational style of interaction in the sense advocated in [4].
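A minimal, hypothetical continuation of the sketch above: if the materialised message M' also advertises the set of legal next actions, the program can refuse any gesture outside that set before computing, keeping the dialogue in a consistent state.

    # Hypothetical sketch (not the PCL implementation): the message M' lists the
    # actions that are legal in the current state, and any other action is refused.
    def guarded_step(last_message, user_message):
        action = user_message["action"]
        if action not in last_message["legal_actions"]:
            raise ValueError(f"'{action}' is not legal in the current state")
        # ... perform the computation associated with 'action' ...
        return action

    m_prime = {"view": [], "legal_actions": ["append", "clear"]}
    guarded_step(m_prime, {"action": "append"})        # accepted
    # guarded_step(m_prime, {"action": "delete"})      # would raise ValueError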

The Com2 model is an evolution of the model proposed by Abowd and Beale in [5], schematised in Figure 2. The latter extends Norman's classical model of interaction [6] by making the role of the system explicit. Norman, in contrast, concentrated wholly on the user's view of the interaction cycle, divided into two major phases: execution and evaluation. Execution involves going from the user's intention to action; evaluation refers to the user's interpretation of the perceived messages coming through the interface. In Abowd and Beale's model, system and user play symmetrical roles. Indeed, the model considers four main components in an interactive system: User, System, Input, and Output. Each component has its own language; an interaction cycle is seen as a language translation from one component to the next in the cycle, as shown by the labelled arcs in Figure 2. Input and Output together constitute the Interface.

The Com2 model further stresses the symmetrical roles of the user and system components: it subsumes Input and Output under the concept of a message exchanged in both directions, and highlights that the activities carried out by the two components share the same abstract structure. Moreover, as computer scientists, we are interested in the computational aspects of interaction, as well as in deriving from the interaction model criteria for the design of anthropocentric systems [7].


Figure 2: A Schematisation of the Abowd-Beale Model of HCI. For Each Activity the Corresponding One in Com2 is Indicated in Parentheses


Messages are of a visual type, and they are materialised on the screen. The set of images which can be generated on the screen during an interaction is called the Pictorial Language (PL). The activities of interpretation and materialisation of these images are carried out in different ways by the two participants. However, the two must associate the same meaning with the same message in order to communicate. Hence, it is necessary to formalise the set of possible meanings associated with the images. In the Com2 model, this set of meanings is formalised as the set of possible descriptions of the images in PL. These descriptions are in turn strings of attributed symbols, which constitute a description language DL. Moreover, it is also necessary to relate elements of the image unambiguously to elements in the description, to avoid mistakes and misunderstandings.

In summary, the set of messages exchanged in visual HCI is characterised by the triple VL=<PL, DL, Co>, where Co is the set of pairs of relations defining the interpretation of the images and the materialisation of the descriptions. This structure defines a visual language; it can be seen as a set of visual sentences [8]. Visual sentences are defined as triples <i,d,<int,mat>>, where i is an element of PL, d is an element of DL, and int and mat are the interpretation and materialisation functions. This definition allows: a) the precise derivation of the meaning of every possible message which can be generated during an interactive session; b) the development of an architecture for adequate visual interaction [7,9]. In the next section, an example of the use of these concepts is presented. This approach can be extended to situations where several persons must communicate for a profitable interaction to achieve a goal, as in collaborative work; a visual language designed for this purpose is described in [10].
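A rough Python rendering of the triple may help fix the idea of a visual sentence; all names, types and the toy int/mat pair below are our assumptions for illustration, not the formalisation given in [8].

    # Hypothetical sketch of a visual sentence <i, d, <int, mat>>.
    from dataclasses import dataclass
    from typing import Callable, List

    Image = List[List[int]]          # element of PL (here: a grey-level pixel grid)
    Description = str                # element of DL (a string of attributed symbols)


    @dataclass
    class VisualSentence:
        """A triple <i, d, <int, mat>> linking an image to its description."""
        image: Image
        description: Description
        int_fn: Callable[[Image], Description]     # interpretation: PL -> DL
        mat_fn: Callable[[Description], Image]     # materialisation: DL -> PL

        def consistent(self) -> bool:
            # Communication is correct when interpreting the image yields the
            # stored description (the meaning both participants associate with it).
            return self.int_fn(self.image) == self.description


    # Toy int/mat pair: a blank image is described as <Blank>, anything else as <Marked>.
    def toy_int(image: Image) -> Description:
        return "<Blank>" if all(p == 0 for row in image for p in row) else "<Marked>"

    def toy_mat(description: Description) -> Image:
        return [[0, 0], [0, 0]] if description == "<Blank>" else [[1, 1], [1, 1]]

    vs = VisualSentence([[0, 0], [0, 0]], "<Blank>", toy_int, toy_mat)
    print(vs.consistent())   # True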

Visual languages in HCI

To better grasp the concept of a visual language for HCI, let us describe a typical interactive session with an HCI system for the analysis of biopsies to detect liver diseases. Physicians use an interactive system (called ABI, Assistant for Biopsy Interpretation), developed on the basis of the Com2 model. Histologists examine biopsies to identify the characteristic structures, i.e. liver cells (hepatocytes) and their components (nuclei). The case sketched here has general validity: in many fields of science and technology, human experts analyse pictorial data and use them for communication and reasoning according to certain rules which are not universal, but situated in a specific context, i.e. established by the community of human experts working in that context. Experts collectively determine the types of structures significant for certain tasks when appearing in a certain class of images. Significant structures are called characteristic structures of the image. For each type of characteristic structure, the community agrees on a name to denote it, and on the features necessary to characterise an instance of it (i.e. a characteristic structure). Human experts then use pairs composed of a characteristic structure and its description to reason on images, synthesise their conclusions as images, and communicate them. A characteristic pattern is the association of a characteristic structure and an attributed symbol, defining the name and properties of the structure. Note that this concept of characteristic pattern can be used to interpret text as well as images.

Figure 3 shows a screen dump of the ABI system during a process in which a histologist interactively interprets a biopsy. The image on the screen materialises the current state of the interpretation, as recorded by ABI. Two windows are presented to the histologist. The first (Image: Biopsy 13) materialises a grey-level image together with a set of tools which help its reading. The second (Structure Type List) materialises the list of the characteristic structure types. Nuclei and cells are the types of structure the histologist is interested in.


Figure 3: A Screen Dump of the ABI system.

From the histologist's point of view, the interpretation of a biopsy is a two-step interactive activity. First, he or she has to select the type of structure of interest for the next actions. To this end, the histologist points at and clicks a name in the Structure Type List with a first mouse gesture (communication). Then he or she has to identify structures of this type present in the image (computation). For each structure, the histologist steers the mouse with a continuous gesture following what he or she perceives to be the closed boundary of the structure on Image:Biopsy 13 (communication).

On the ABI side, the system computes, on the basis of the first user action (selection of a type name), the type of the structures selected by the user and communicates it back by highlighting the selected type in the Structure Type List. Each time the user draws a boundary in Image:Biopsy 13, the system computes the description of the identified structure and communicates it back by materialising the boundary in Image:Biopsy 13 in the style associated with the selected type in the Structure Type List.
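The following sketch illustrates this two-step dialogue; the class and method names (ABISession, select_type, draw_boundary) are invented for illustration and are not the actual ABI implementation.

    # Hypothetical sketch of the two-step ABI dialogue described above.
    class ABISession:
        def __init__(self, structure_types):
            self.structure_types = structure_types     # e.g. ["Nucleus", "Hepatocyte"]
            self.selected_type = None
            self.structures = []                       # descriptions built so far

        def select_type(self, name):
            # Step 1: the histologist clicks a name in the Structure Type List;
            # the system echoes the selection by highlighting it.
            if name in self.structure_types:
                self.selected_type = name
            return self.selected_type

        def draw_boundary(self, boundary_points):
            # Step 2: a continuous mouse gesture along a perceived boundary;
            # the system computes a description and rematerialises the contour
            # in the drawing style associated with the selected type.
            description = {"type": self.selected_type,
                           "id": len(self.structures) + 1,
                           "boundary": boundary_points}
            self.structures.append(description)
            return description


    session = ABISession(["Nucleus", "Hepatocyte"])
    session.select_type("Nucleus")
    session.draw_boundary([(10, 12), (11, 14), (13, 13), (10, 12)])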

In the image shown in the main window in Figure 3, the histologist has identified, contoured and described four cell nuclei and one hepatocyte cell (enclosing one of the nuclei). The current description of the whole image, seen as a structure in itself, is <Biopsy,13,4,1>, where the symbol "Biopsy" denotes the type of structure, 13 is the value of the attribute Image Identifier, 4 is the value of the attribute Current # of Hepatocyte Nuclei, and 1 is the value of the attribute Current # of Classified Hepatocytes. Other image structures are similarly described. For instance, the histologist describes the rightmost Hepatocyte as <Hepatocyte, 1, (240,260), 4473, 483, normal, 'HNucleus3'>, where 'normal' is the nominal value of the attribute State and 'HNucleus3' is the nominal value of the attribute Substructures, pointing at the description of the third instance of a hepatocyte nucleus. This description is in turn an attributed symbol, possibly with an attribute Substructures of its own. The concatenation of the descriptions of all the identified (sub)structures forms the description d of the image. A function int is defined which associates with each recognised structure in the image of Figure 3 an attributed symbol in d.
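For illustration only, these descriptions can be pictured as nested attributed symbols. The field names below follow the attributes named in the text (Image Identifier, State, Substructures, etc.); the dictionary structure, and the neutral names given to the numeric attributes the text leaves unnamed, are our assumptions.

    # Hypothetical data-structure rendering of the attributed-symbol descriptions.
    nucleus3 = {"type": "HNucleus", "id": 3}          # third hepatocyte nucleus

    hepatocyte1 = {
        "type": "Hepatocyte",
        "id": 1,
        "attr_1": (240, 260),           # attribute name not given in the text
        "attr_2": 4473,                 # attribute name not given in the text
        "attr_3": 483,                  # attribute name not given in the text
        "State": "normal",
        "Substructures": [nucleus3],    # points at the description of HNucleus3
    }

    biopsy13 = {
        "type": "Biopsy",
        "Image Identifier": 13,
        "Current # of Hepatocyte Nuclei": 4,
        "Current # of Classified Hepatocytes": 1,
    }

    # The description d of the whole image is the concatenation of the
    # descriptions of all identified (sub)structures.
    d = [biopsy13, hepatocyte1, nucleus3]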

A histologist often schematises his or her results by diagrams such as the one shown in Figure 4. In order to associate an image like that of Figure 4 with the histologist's description d specified above, a function mat_icon is defined which assigns an iconic representation to each element in d. In this way, a visual sentence is defined as vs=<i,d,<int,mat_icon>>. The formalisation proposed in [8] defines the components of a visual sentence as two-dimensional (the image i) and one-dimensional (the description d) strings, and mappings between symbols and substrings (the pair int and mat). In this way, techniques developed for the definition of formal languages can be extended to the definition of visual languages.
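A hypothetical mat_icon, sketched below, simply maps each attributed symbol of a description to a conventional icon chosen by the structure's type (and, for hepatocytes, by its State); the icon names are invented for illustration.

    # Hypothetical sketch of a mat_icon function over the description d above.
    ICON_BY_TYPE = {
        "Biopsy": "biopsy_summary.icon",
        "Hepatocyte": "hepatocyte_{state}.icon",
        "HNucleus": "nucleus.icon",
    }

    def mat_icon(description):
        """Materialise a description (a list of attributed symbols) as a list of icons."""
        icons = []
        for symbol in description:
            template = ICON_BY_TYPE.get(symbol["type"], "unknown.icon")
            icons.append(template.format(state=symbol.get("State", "")))
        return icons

    # Applied to the description d sketched above, this yields a schematic image
    # in the spirit of Figure 4:
    # mat_icon(d) -> ['biopsy_summary.icon', 'hepatocyte_normal.icon', 'nucleus.icon']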

Note that the materialisation of a description does not necessarily reproduce the original image the description has been derived from. In the example of Figure 4, the results of the classification are schematised by conventional icons in a notation familiar to the physician, which clarifies the meaning of the analysed image, the one in the main window in Figure 3.


Figure 4: a) An Iconic Materialisation of the Interpretation of the Biopsy in Figure 3;
b) The Set of Icons Used for the Materialisation

The above discussion shows some of the visual languages exploited in the interaction. The histologist uses a first visual language VL1 to interact with the classification program (see Figure 3). An example of a visual sentence in this language has the following components: the image i is the image of the whole screen shown in Figure 3; the description d is the program managing i; int is computed by the programs controlling the input from the user; mat visualises the results of the classification program with respect to i. The system allows the histologist to restate the data in a familiar notation, so that some characteristics of the observed data can be highlighted.

This new image (Figure 4) is a sentence of a second visual language VL2, which materialises the same set of data in a different way. This second representation can support a simulation visual language VL3, which animates the document, allowing the physicians to interact with their model of the population of cells (see [2] for details).

From the point of view of system usability, it is worth noting that in all these interactive activities the physician interacts with ABI using notations that are familiar to him or her, feeding data and obtaining results in forms that he or she can immediately understand, and also exploiting the new document-animation capabilities that the computer offers.

From the machine side, the formal definition allows a uniform design of the interface as a system of visual languages, each one tailored to the management of an activity. The definition does not constrain the visual language to any programming paradigm, so that the developed systems offer capabilities of multiparadigm interaction. The adoption of object-oriented techniques allows the proper definition of incremental compilers and interpreters for all the languages in the system [9]. In this way, the computer controls the syntactic correctness of the messages exchanged during interaction, and acts as a symbol-crunching machine, i.e. it performs the symbolic computations dictated by the meaning of these images. It thus becomes a support for the reasoning activity of the human user, who retains the main responsibility for the ongoing process.

Returning from the example of interaction with ABI to the general problem of visual HCI, there are plenty of examples of interactive computing through visual languages. In the perspective of the Com2 model, a visual language is the set of visual sentences produced during a user-computer dialogue. One line of work developed in the PCL group concerns visual query systems, i.e. systems using visual representations to interact with databases. A multiparadigm approach has been proposed in [11] to allow querying through different visual languages sharing the same description language: a query to a database can be formulated by using different visualisation paradigms (see also [12, 13]). The user can compose his or her query on the screen by filling the cells of a form in a form-based paradigm, or by composing appropriate icons in an icon-based paradigm, thus generating different images on the screen. The user thus has different visual sentences available to convey the meaning of a query, and can adopt the pictorial language he or she finds more expressive, or with which he or she is more acquainted. The proposed formalisation of visual languages clearly expresses the different components (image and int function) one has to manage to allow multiple representations of the same meaning, as necessary in this case. On the other hand, the query result retrieved by the computer can also be visualised on the screen with different representations, as shown in [11], thus requiring a different mat function for each visualisation.
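To make the shared-description idea concrete, here is a hypothetical sketch (the function names, field names, and the textual form of the description-language query are all our assumptions): two pictorial front ends, one form-based and one icon-based, translate into the same description-language query, while two different mat functions visualise the same result set in different ways.

    # Hypothetical sketch: two visual query paradigms sharing one description language.
    def int_form(form_cells):
        """Interpret a filled form (field -> value) as a DL query string."""
        conditions = " AND ".join(f"{field}={value}" for field, value in form_cells.items())
        return f"SELECT * WHERE {conditions}"

    def int_icons(icon_composition):
        """Interpret a composition of icons as the same kind of DL query string."""
        conditions = " AND ".join(f"{icon['attribute']}={icon['value']}"
                                  for icon in icon_composition)
        return f"SELECT * WHERE {conditions}"

    # Two different pictorial sentences, one shared meaning in DL.
    q1 = int_form({"disease": "cirrhosis", "age": ">50"})
    q2 = int_icons([{"attribute": "disease", "value": "cirrhosis"},
                    {"attribute": "age", "value": ">50"}])
    assert q1 == q2

    # Two mat functions visualising the same result set in different ways.
    def mat_table(rows):
        return "\n".join(", ".join(str(v) for v in row) for row in rows)

    def mat_count(rows):
        return f"{len(rows)} matching records"

    rows = [("Biopsy 13", "cirrhosis", 57), ("Biopsy 21", "cirrhosis", 63)]
    print(mat_table(rows))
    print(mat_count(rows))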

The properties of the int and mat functions and of the sets of characteristic patterns have been studied to formally characterise visual sentences and hence visual languages. Such a study translates into formal terms the requirements and properties of visual languages with respect to interaction. Families of rewriting systems can also be defined which specify visual languages with the desired properties [7].

Conclusions

The possibility of an easy and natural interaction adds value to the processed data, since they may be visualised, explored, backtracked and re-computed as the users' needs arise. In everyday practice there is a growing need for end-user computing [14], in the sense of making available to users tools for the creation and use of programs tailored to the person who is using them; one possible way to tailor programs to individuals is through interaction in which users may drive the application according to their motivation, personal tastes and expectations, and select the visual languages most suited to their culture.

The PCL has proposed the Com2 model for HCI, focusing on communication and computation issues. As to the communication aspect, the Com2 model allows adequate interaction between human and computer by adopting user-defined visual languages whose alphabet symbols are either those traditionally used by humans to describe their tasks or newly defined ones. According to Postman, the language forged by users to describe task execution encompasses knowledge about the application domain as well as specific information on the objects that must be manipulated [15]. Hence, Com2 encompasses and makes explicit this knowledge through the use of visual languages. Regarding the computation aspect, the visual sentences in the visual languages are interactive programs, and can be used in simulation and in data manipulation activities.

On the basis of the Com2 model, the PCL has developed: a) a methodology for tailoring each visual language to a specific environment [7]; b) an architecture to implement the tailored Com2 interactive environments [9].

Acknowledgements

Work partially supported by the Italian National Research Council, grants 95.00485.CT12 and 94.00070.CT01.

References

[1]
P. Mussio, M. Finadri, P. Gentini, F. Colombo, "A bootstrap approach to visual user-interface design and development", The Visual Computer, 8(2), 1992, 75-93.
[2]
P. Mussio, M. Pietrogrande, M. Protti, "Simulation of hepatological models: a study in visual interactive exploration of scientific problems", J. Vis. Lang. and Comp., 2(1), 1991, 75-95.
[3]
T. Catarci, M. F. Costabile, M. Matera, "Visual Metaphors for Interacting with Databases", ACM SIGCHI Bulletin, 27(2), 1995, 15-17.
[4]
S. E. Brennan, "Conversation as direct manipulation: An iconoclastic view", in B. Laurel, ed., The Art of Human-Computer Interface Design, pp. 393-404, Addison-Wesley, 1990.
[5]
A. Dix, J. Finlay, G. Abowd, R. Beale, Human Computer Interaction, Prentice Hall, 1993.
[6]
D. Norman, The Design of Everyday Things, Penguin, 1991.
[7]
G. Biella, P. Bottoni, M. Mariotto, P. Mussio, "The design of Anthropocentric Cooperative Visual Environments", Proc. IEEE Symp. Vis. Lang. '95, IEEE CS Press, 1995, 45-52.
[8]
P. Bottoni, M. F. Costabile, S. Levialdi, P. Mussio, "Formalizing visual languages", Proc. IEEE Symp. Vis. Lang. '95, IEEE CS Press, 1995, 334-341.
[9]
N. Bianchi, P. Bottoni, P. Mussio, M. Protti, "Cooperative Visual Environments for the design of Effective Visual Systems", J. Vis. Lang. and Comp., 4(4), 1993, 357-382.
[10]
K. D. Swenson, "A Visual Language to Describe Collaborative Work", Proc. IEEE Symp. Vis. Lang. '93, IEEE CS Press, 1993, 298-303.
[11]
S. K. Chang, M. F. Costabile, S. Levialdi, "Reality Bites-Progressive Querying and Result Visualization in Logical and VR Spaces", Proc. IEEE Symp. Vis. Lang. '93, IEEE CS Press, 1993, 100-109.
[12]
T. Catarci, S. K. Chang, M. F. Costabile, S. Levialdi, G. Santucci, "A Graph-based Framework for Multiparadigmatic Visual Access to Databases", IEEE Transactions on Knowledge and Data Engineering, to appear.
[13]
T. Catarci, M. F. Costabile, A. Massari, L. Saladini, G. Santucci, "A Multiparadigmatic Environment for Interacting with Databases", ACM SIGCHI Bulletin, 28(3), July 1996.
[14]
J. C. Brancheau, C. V. Brown, "The Management of End-User Computing: Status and Direction", ACM Computing Surveys, 25(4), 1993.
[15]
N. Postman, Conscientious Objections: Stirring Up Trouble About Language, Technology, and Education, Vintage Books, New York, 1988.

About the PCL group

Paolo Bottoni, Maria Francesca Costabile, Stefano Levialdi (Director) and Piero Mussio are members of the Pictorial Computing Laboratory (PCL) at the University of Rome "La Sapienza". Other members are Luigi Cinque, Roberta Mancini, Marilena De Marsico, and Carlo Bernardelli.

The group's primary research theme is the analysis and use of images in human-computer communication, providing an integrated view of a number of disciplines, such as pattern recognition, human-computer interaction and visual interfaces, which until now were considered separately. Since interaction between users and programs generally takes place within iconic environments, the need has arisen to formalise visual languages in order to manage the execution of programs by non-skilled people. Other research lines in the PCL include pattern description by multiresolution, visual language definition and implementation, visual interactive environments, visual querying, image indexing and retrieval, metaphor analysis and synthesis, and usability studies. Prototypes of systems in these fields have been implemented.

The PCL opened in 1993 and, since then, a number of links with other universities and research groups worldwide have been established. PCL, jointly with the database and user-interface group of the Dipartimento di Informatica e Sistemistica of the University of Rome, runs every two years the International Workshop on Advanced Visual Interfaces (AVI), sponsored by ACM. These workshops are major events in which new ideas and experimental systems are presented. The last workshop was held in May 1996 in Gubbio, Italy, and the proceedings are published by ACM Press.

Authors' Addresses

P.Bottoni, S.Levialdi, P.Mussio
Dipartimento di Scienze dell'Informazione, Università di Roma "La Sapienza", Via Salaria, 113, 00198 Roma, Italy

bottoni@dsi.uniroma1.it
levialdi@dsi.uniroma1.it
mussio@dsi.uniroma1.it

M. F. Costabile
Dipartimento di Informatica, Università di Bari, Via Orabona 4, 70126 Bari, Italy

fcosta@iesi.ba.cnr.it
