The Proximal User Interface
Imagine you're in a cave in the dark, said the philosopher Michael Polanyi.
You've a stick in your hand which you wave about in front of you to find
out if there's a hole in the floor in front of you, or a passage to right
or left, or you're just about to bump your head. At first you are aware
of the impressions of the stick on your hand and interpret them into
knowledge of your surroundings.
But soon you forget the impressions themselves and start interpreting them immediately, without conscious thought.
What has happened is that
the stick has become 'part' of you; it has become 'proximal' rather than
'distal'. (Polanyi, 1967, 'The Tacit Dimension'.)
The same happens with the steering wheel when you learn to drive.
A 'distal' tool is cognitively distant from the user while a 'proximal'
tool is cognitively in close proximity to the user. The distal tool
demands and absorbs the user's attention as s/he handles it; the
proximal tool does not, but allows the user to concentrate on the
real task for which the tool is being used.
Just as an architect 'thinks with his pencil' so the user
should be able to 'think with his software' - without interruption.
The proximal user interface is one that has become so much a part of you that
you are not aware of using a user interface, but you just get on with the
task you are engaged in.
Imagine you are drawing a box and arrows
diagram that expresses a causal model, using a modelling package,
and you wish to redirect an arrow from one box to another.
A typical sequence of user actions is:
- Click arrow, to select it.
- Up to toolbar to find 'Modify Link' tool. Requester appears (helpfully
obscuring part of your diagram).
- Click 'Redirect'.
- Choose whether you want to redirect the source end or the destination end.
- Hit 'OK'; requester disappears.
- Click the new destination (or source) box; arrow is redrawn.
Six separate actions for what is one single action in the mind of the user.
All this time the user's attention is on handling the user interface
rather than on the task s/he was doing.
Wouldn't it be much better if the user could just 'pick up' the
end of the arrow and drop it into the new box?
(If you were using a general-purpose drawing package rather than
a modelling package, which understands that arrows link to boxes,
then it would have been even worse: you would have had to
carefully align the arrow with the box.)
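To make the contrast concrete, here is a minimal sketch in Python of the 'pick up and drop' alternative. It is an illustration only, not how any particular modelling package works: the names Link, redirect and drop_link_end are hypothetical, and I assume a link simply records the names of its source and destination boxes plus a weight. The point is that the end the user grabbed and the box under the pointer at release are enough to specify the whole operation, so no requester is needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    source: str          # name of the box the arrow starts from
    dest: str            # name of the box the arrow points to
    weight: float = 1.0  # example of semantic detail carried by the link


def redirect(link: Link, end: str, new_box: str) -> Link:
    """Re-attach one end of a link, keeping everything else about it."""
    if end not in ("source", "dest"):
        raise ValueError("end must be 'source' or 'dest'")
    setattr(link, end, new_box)
    return link


def drop_link_end(link: Link, grabbed_end: str, box_under_pointer: Optional[str]) -> Link:
    """One gesture: the grabbed end plus the drop target define the redirection."""
    if box_under_pointer is None:
        return link  # dropped on empty space: leave the link unchanged
    return redirect(link, grabbed_end, box_under_pointer)


link = Link("smoking", "cancer-risk", weight=0.7)
drop_link_end(link, "dest", "lung-irritation")
print(link)
```

Because the same link object is kept, its weight (and anything else attached to it) survives the redirection.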
While, with months of intensive practice, almost
any user interface can become second-nature, we reserve the term 'proximal'
for those that become second-nature very quickly. For this to happen some
important criteria must be met. Mainly, it boils down to the user
interface not 'getting in the way' between the user and the task being
carried out.
The problem with the distal user interface, which demands a shift
in your attention away from your main task, is not just that it
takes more time, nor just that it 'feels' more clumsy, though
these are bad enough. It is that it interrupts your flow of thinking
about the main task. And, especially when you are engaged in creative activity,
thoughts are extremely transient and easily forgotten if you
are interrupted. (The phone rings; drat! I've forgotten that
wonderful idea I had.)
The standard approach to GUI often results in too distal a user interface.
This is partly because software designers do not realise the importance
of proximality. But the cause is also more fundamental: many GUI tools
tend to create distal UIs by their very nature, because
they emphasize the syntax of user actions.
There are three levels (actually more) of user action:
- Lexical - low-level actions like position mouse over a certain
rectangle of screen, click mouse, drag, hit a key.
- Syntactic - in terms of symbols: hit a tool on tool-bar, operate
a slider, hit a radio button, enter a number or text.
- Semantic - what you are trying to achieve with the actions:
- "I'll link smoking to cancer-risk."
- "No, that's wrong; smoking links to lung-irritation."
- "I'll redirect the link to lung-irritation instead."
If one sees a user action as an event, one must specify parameters
of the event:
- What type of action,
- What to take action on,
- Parameters of the action.
and indeed this is what we do when we rename (type of action) file
'fred' (item to take action on) to 'joe' (parameter of the action)
using a command on a command line interface. We focus on the
syntax of the command, which is often {verb, item, parameters}
but could be anything you like as long as you are consistent.
Treating syntax like this is very powerful, allowing a wide
range of types of action on an even wider range of items and
allowing any amount of qualification of the action to any
degree of complexity you choose. Just think of the number
of parameters the List command has on the Amiga's CLI, or the
ls command has on Unix.
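The {verb, item, parameters} syntax can be sketched as a small data structure. This is a hypothetical illustration in Python, not the parser of any real shell: the names Command and parse are assumptions, and real command lines add quoting, options and keywords that are ignored here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Command:
    verb: str                                        # type of action
    item: str                                        # what to take action on
    params: List[str] = field(default_factory=list)  # parameters of the action


def parse(line: str) -> Command:
    """Split a command line into verb, item and parameters, in that order."""
    words = line.split()
    if len(words) < 2:
        raise ValueError("expected at least a verb and an item")
    return Command(verb=words[0], item=words[1], params=words[2:])


cmd = parse("rename fred joe")
print(cmd.verb, cmd.item, cmd.params)
```

Any other ordering would do as well, as long as parse and the user agree on it; the sketch only fixes one consistent choice.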
The standard GUI likewise emphasizes the syntax (maybe because it
emerged largely as a graphical equivalent of the command
line interface). It merely offered a different lexical level
- mouse clicks and movements replace key-hits. So people who
were nurtured in the CLI era - and had reacted against it -
could find an equivalent that was much easier to learn and
remember, and more direct.
So, in the past (led by Apple Mac and followed by Windoze) GUIs
(usually) tended to separate item selection from item action, and to select
the action by means of choice from menu or requester. Except
where the range of action choice was very small, this often
led to three steps for the user:
- Select item or other parameters of action (e.g. click on it).
- Activate action choice (e.g. hit toolbar icon).
- Choose action (from the requester that appears and hit 'OK').
Thus the User Interface textbooks tend to teach us.
But a new type of GUI is emerging, one that is found to some extent
in computer games and also in packages like LightWave
that were built somewhat outside the mainstream of conventional UI teaching
and research. Everything is more direct, more 'proximal'. Interestingly,
many of these packages were developed by the Amiga
community, which remained outside the mainstream of academic
research.
A new paradigm is emerging for user interfaces, led not by
the academics and their theories of syntax but by the practitioners
and what they find useful and good. It is one of involvement rather
than of control.
Winograd and Flores, in their 1986 book Understanding Computers
and Cognition, argue that the way information technology has
developed has been based on a positivist, rationalist world-view,
which sees our relationship with the world as one of distant
observers and controllers. The prototype 'rational man' distances
himself from the situation, gathers all the facts about it, reasons
carefully about it and then seeks to exercise control over it.
They then argue that this has led to many problems, and that we
need to move towards another paradigm, based more on Heidegger's
existentialism. While I myself do not agree fully with existentialism,
some points it makes have some validity. Winograd and Flores
emphasise 'thrownness' - that is, we are 'thrown' into situations
as involved actors, rather than being separate, distant, neutral
observers and controllers.
Management science of the 1970s and 1980s saw the neutral, distant
observer-controller as the ideal, and so business software
tended to be of this ilk. But there is a trend in management
towards greater involvement, and a similar trend in software
(especially outside business software - think of software to aid
painting, video, and also games).
In traditional business software the user is assumed to be
a distant observer-controller, who asks the software to give
information and then gives commands to the software to take action.
Thus the user is seen as separate from the data, and
the software is a distal tool to allow access to and control
of the data.
In more recent software - even some business software - the user is
seen as intimately involved with the world that is the data.
The conventional GUI, which separates item, action and parameters
in its syntax, suits the distal approach of conventional business
software, and that is why it is suited to Windoze. But when the
user is to be closely involved the software tool must become more
proximal; item, action and parameters must not be separated and
syntax must be de-emphasized.
Note that much games software tends towards proximal user interface
because of the need for close involvement of the user with the
software - resulting in close mapping between lexical actions and
semantic effect in the field of play. The conventional community
could learn a lot from games UIs.
What is needed for proximal software is a direct link between
lexical and semantic. That is, the low level user actions must
map directly into meaningful actions at the knowledge level. The
action required to redirect a link must be as simple and direct
as the concept of redirection itself.
Proximal user interface has a lot in common with 'direct manipulation'.
Recently people like Shneiderman, the 'father' of direct manipulation,
have been saying that there must be this direct link from lexics to
semantics. This is the way to make it ultra-direct - that is,
towards proximality.
But the proximal user interface goes beyond direct manipulation. It
embraces a number of principles. It
says something about how to achieve a wide variety of actions
on a wide variety of items:
- It takes note not just of what type of object the mouse is over
but also where on the object the mouse is clicked. For example, start the
mouse operation at the edge of a box either to start drawing a new link
or to redirect an existing one.
- It uses the left hand to qualify the mouse operation, such as
using left-shift to mean 'redirect link' as opposed to 'draw new
link'. (Principle of the left hand)
- It treats the mouse operation not just as an event but as a
process. During a process other things can happen.
- For instance, a key can be hit during the mouse
operation, e.g. hit space bar and a bend is inserted in the
link.
- Note that the left hand can 'hover' over one or two important
qualification keys, so the user gets used to them and it quickly
becomes second-nature (proximal).
- The use of requesters is allowed, but mainly for filling in
detail.
- Note that there is an increasing cognitive effort involved as we
move up the following scale:
- the simple unqualified operation
- using a qualifying key at the start
- hitting a key during an operation
- bringing up a requester and hitting 'OK'
- The principle of graded effort says that the amount of cognitive effort in an operation should be
related to various characteristics of the operation:
- The frequency of the operation; the more frequent ones are more direct.
- The importance of the operation; creating structure of knowledge
is more important than filling in detail and therefore should
be more direct; you can use requesters to fill in detail.
- The 'danger' or irreversibility of the operation, especially with
regard to losing information. See the irreversibility
principle, below.
- The irreversibility principle says that
we must give more cognitive effort to operations that destroy
more information. Drawing a new link loses no information,
redirecting a link loses a little, but deleting a link loses a lot
(including the shape of the link and all semantic information like
the weight of the link). Look at your favourite drawing software, and
you'll find in most that it takes less effort to delete a link
than to redirect one - the wrong way round.
- The clutter principle recognises that in
representing real life knowledge there will be lots of it, so
clutter is inevitable. So there must be ways of handling it
intelligently.
- The principle of visual cues says that as
we draw on the easel we retain a memory of where we have drawn
what, and that a number of visual cues build up. This means
in particular that automated layout algorithms should be used
with care since they destroy visual cues.
- The principle of large easels recognises the
need for a large easel on which to draw knowledge, larger even than
a 2000 by 2000 screen. How do we scroll around it? As explained
in the paper by Basden, Brown, Tetlow and Hibberd,
the Amiga's hardware-assisted
smooth scrolling mechanism offers a lot of advantages.
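Several of the principles above (where on the object the operation starts, the left-hand qualifier, and the mouse operation as a process) can be sketched together in a few lines of Python. This is a hedged illustration and not the Istar implementation: the classes Box, Link and LinkDrag, the 4-pixel edge margin, and the rule that shift picks up the first link ending at the box are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class Box:
    name: str
    x: int
    y: int
    w: int
    h: int

    def contains(self, p: Point) -> bool:
        return self.x <= p[0] <= self.x + self.w and self.y <= p[1] <= self.y + self.h

    def on_edge(self, p: Point, margin: int = 4) -> bool:
        """True if p is inside the box and within `margin` pixels of its border."""
        if not self.contains(p):
            return False
        dx = min(p[0] - self.x, self.x + self.w - p[0])
        dy = min(p[1] - self.y, self.y + self.h - p[1])
        return min(dx, dy) < margin


@dataclass
class Link:
    source: str
    dest: str
    bends: List[Point] = field(default_factory=list)


class LinkDrag:
    """A mouse operation treated as a process: press, optional keys, release."""

    def __init__(self, boxes: List[Box], links: List[Link]) -> None:
        self.boxes, self.links = boxes, links
        self.mode: Optional[str] = None      # None, "new" or "redirect"
        self.source: Optional[str] = None    # source box of a new link
        self.target: Optional[Link] = None   # existing link being redirected
        self.bends: List[Point] = []

    def _box_at(self, p: Point) -> Optional[Box]:
        return next((b for b in self.boxes if b.contains(p)), None)

    def press(self, p: Point, shift: bool = False) -> bool:
        """Start at a box edge; the shift qualifier means 'redirect', not 'draw new'."""
        box = self._box_at(p)
        if box is None or not box.on_edge(p):
            return False  # interior presses are left for other operations
        if shift:
            self.target = next((l for l in self.links if l.dest == box.name), None)
            if self.target is None:
                return False
            self.mode = "redirect"
        else:
            self.source, self.mode = box.name, "new"
        self.bends = []
        return True

    def key(self, name: str, p: Point) -> None:
        """Keys hit during the drag qualify it: space inserts a bend at p."""
        if self.mode is not None and name == "space":
            self.bends.append(p)

    def release(self, p: Point) -> Optional[Link]:
        """Commit on release over a box; releasing elsewhere abandons the drag."""
        box = self._box_at(p)
        mode, self.mode = self.mode, None
        if mode is None or box is None:
            return None
        if mode == "new":
            link = Link(self.source, box.name, list(self.bends))
            self.links.append(link)
        else:
            link = self.target
            link.dest, link.bends = box.name, list(self.bends)  # old shape is lost
        return link
```

Nothing is committed until release, so an abandoned drag leaves the diagram untouched. Note that a redirect discards the old bends, which is the small loss of information the irreversibility principle refers to.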
There are several examples of 'proximal' user interface on
the Amiga platform. Why on the Amiga? I think because:
- The Amiga has developed a different 'culture' from the
mainstream, less affected by the standard GUI fashions because
it has been outside the mainstream academic discussions
that have assumed distal UI is OK.
- The Amiga was designed for movement and animation, and
most early software that used it was games - which required
proximal involvement, even though the term had not then
been applied to user interfaces.
The 3D modeller, LightWave, has a user interface that is
extremely easy to use in places, especially for the
frequent user actions of moving the scene around to view
it from different angles, and moving objects around the scene.
It fulfils, to some extent, the principle of graded effort
in particular.
The multimedia authoring software is noted for its ease of
use. (Indeed I now use it at the University of Luleå in Sweden
for teaching newcomers to information technology
how to control computers.)
Many operations are highly proximal, including re-ordering
screens in the script, moving text around screens, etc.
The Istar software
was designed specifically with the principles of proximality in mind.
In fact, some were in the intuition rather than in the mind
while it was being designed. It allows the construction of
complex knowledge bases (inference nets, semantic nets and
Bayesian nets) by drawing box and arrows diagrams. See
the paper by Basden, Brown, Tetlow and Hibberd, listed below, for details.
None of these packages is perfect; indeed the principles mentioned
above are not in their final form. Much more work needs to be done
to arrive at truly proximal user interfaces.
But it is probably worth
setting these ideas in their historical context.
- First we had
command line interfaces. These could very seldom be direct
manipulation, let alone proximal. They used key-hitting as
their lexical medium, and this translated into textual words
used as symbols. Each command was an event, ended with the Carriage Return key.
The big issue when this type of user interface was being developed was
the syntax: what syntax is best for what purposes? Do we
put the item first or the action first, and where do the
parameters go? Answer: normally put the action first, the
item second and then add parameters using the '-' sign or
keywords. The Amiga CLI (based on Tripos) is a sophisticated
version of this.
- Next we had the standard GUI we know today. As mentioned above,
this has often taken the form of a graphical equivalent of the
CLI. The idea is the same: events in which we specify item, action
and parameters. The item is now identified, not by name, but by icon.
The action is identified either by mouse gesture or by menu. The
parameters are then requested interactively via requesters. The main
difference is the use of graphical point-and-click. Each part of
the event is a sub-event, and the famous GOMS model
(Card, Moran and Newell, 1983) deals with this well.
Some elements of proximality started to emerge, such as drag and
drop, but usually in an ad-hoc manner.
- Now that we have the graphical user interface, we can move away
from the linear, discrete event, to the continuous process -
the proximal user interface. This could not have happened without
the graphical user interface developing first. It is a further
step in the evolution of user interface ideas.
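The sub-event view of the standard GUI lends itself to a rough estimate of the kind the GOMS family of models is used for. The sketch below uses the Keystroke-Level Model, the simplest member of that family, with commonly quoted operator times. The times, the operator strings, and in particular where I have placed the mental-preparation operator M are assumptions for illustration, not measurements of any real package. It compares the six-step requester sequence for redirecting a link with a single pick-up-and-drop drag.

```python
# Commonly quoted Keystroke-Level Model operator times, in seconds (assumed values).
KLM_SECONDS = {
    "P": 1.10,  # point at a target with the mouse
    "B": 0.10,  # press or release a mouse button (a click is "BB")
    "M": 1.35,  # mental preparation before a decision
}


def estimate(ops: str) -> float:
    """Sum the operator times for a string of KLM operators, ignoring spaces."""
    return round(sum(KLM_SECONDS[op] for op in ops if op != " "), 2)


# Requester sequence: click arrow, find tool, click 'Redirect',
# choose which end, hit 'OK', click the new box.
REQUESTER = "PBB MPBB MPBB MPBB PBB MPBB"

# Proximal alternative: decide, point at the arrow end, press, drag, release.
DRAG = "MPB PB"

print("requester:", estimate(REQUESTER), "seconds")
print("drag:     ", estimate(DRAG), "seconds")
```

Under these assumptions the requester sequence costs more than three times as much as the drag, and that is before counting the interruption to the user's train of thought, which the model does not capture.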
Notice what has been left out of the discussion - some of those
issues that have been uppermost in the minds of user interface
researchers for the last two decades:
- Standardization of user interface. Because user interface
standards have grown up in an era in which the distal-proximal
issue was not recognised, they are not suited to the
proximal user interface. Also, the standards tend
to cover only the syntactic and lexical levels, not the semantic,
which is of especial concern to proximal working.
So the proximal user interface must set its own, new standards,
in some respects different from those that pertain today.
The principles outlined above are an attempt to move
towards that.
- Learnability and the novice user. During the 1980s computers
spread to all and sundry, after a decade in which they were
esoteric powerful wonderful machines. Only an elite could
make them do what was needed; they were just too difficult to
use. Standardization helped new people, but there was also
an emphasis on learnability: help systems, wizards and so on.
The proximal user interface does not address the issue of learnability.
It addresses the issue of what the user finds once they have learned
to use their tool. There is good reason why it should do so. Today,
as we enter the 21st century, there is a growing population of
people who have already been through a learning process, and
are facing this very question: now that I've learned it, I find
it rather cumbersome; what now? Also, since people are much less
afraid of computers and already have rough ideas about them,
they find it easier to learn new ways of using the computer.
Now, it might be that proximality is an issue that runs counter
to these. That is, the more proximal, the less standard
and the less easy to learn. But I do not believe so.
I believe that, with some creative thinking, researchers will
be able to come up with something that is both proximal and
easy to learn.
Current user interfaces do contain some elements of proximality but,
as suggested above, these tend to be introduced in an ad-hoc manner
since principles of proximality have not been given attention
until recently. Here are a number of them:
- Drag and Drop. This is a reasonably proximal way of 'giving'
one item to another, perhaps to act on it. One item can be a
program and the other a file containing a picture, sound sample,
text document or whatever. Drag and drop for such purposes is
proximal in that there is a direct mapping between the lexical
actions and the semantic.
- Action buttons. The user 'hits' a button with the mouse (mouse
click when its cursor is over the button's area). There is a direct
link from the click to the action primitive of activating the action.
- Activating icons. Double-click on icons is also reasonably
proximal because the quick repetition of a simple action (click)
is, to the user who has mastered it, just a single action -
and it doesn't take long to master it.
But there are also elements of current graphical user interfaces that
are not proximal.
- Icon selection followed by menu 'Open'. This is often an alternative
to the double-click. It is less proximal since it involves two separate
actions, on two separate areas of the screen.
- Resizing a shape. In most drawing packages, to resize a shape one
must take two separate actions - first select (handles appear) then
carefully aim the mouse at the appropriate handle and drag it.
So, how can we bring the proximal user interface about? There are two
steps we must take:
- More research is needed. For instance, what are the principles?
Those above are not a complete set.
- How do we implement it? Some thoughts on this are discussed
in a digest of an email discussion I've had.
Here are some references for you to follow up:
- Basden A, Brown A J, (1996),
"Istar - a tool for creative design of
knowledge bases", Expert Systems, v.13, n.4, pp.259-276, November 1996.
- Basden A, Brown A J, Tetlow S D A, Hibberd P R, (1996),
"Design of a user
interface for a knowledge refinement tool", Int. J. Human Computer Studies,
v.45, pp.157-183.
- Basden A, Hibberd P R, (1996),
"User interface issues raised by knowledge
refinement", Int. J. Human Computer Studies, v.45, pp.135-155.
- Card S K, Moran T P, Newell A, (1983),
The Psychology of Human-Computer Interaction,
Lawrence Erlbaum Associates, Hillsdale, NJ, USA.
- Polanyi M, (1967), The Tacit Dimension, Routledge and Kegan Paul.
- Winograd T, Flores F, (1986),
Understanding Computers and Cognition,
Addison-Wesley.
Copyright (c) Andrew Basden, 1997.
Comments and questions, via "R @ basden . u-net . com",
will be gratefully received and taken seriously.
Page last updated: 30 June 1997.