Information Capacity of Full-Body Movements
Antti Oulasvirta (acm) (google)
If you’re going to remember one(?) thing:
A modern take on Fitts' law. Rather than just looking at distance to a target, this paper looks at the information-carrying capacity of movements, and applies this to areas as diverse as gestures and rehearsed dance moves.
The problem of designing interfaces with full-body control is that the number of possible movements is too enormous to study empirically. Our solution is to ask a user to produce an overlearned motor act, such as signing one’s name, in each condition. The overlearned motor act is a surrogate for any complex movement that a user could produce with practice.
The user could express genuine information with two-handed interaction: throughput was 182.7 bps with the dominant hand removed, 217.8 bps with the non-dominant hand removed, and 322.1 bps with both hands. Thus, bimanual gesturing genuinely increased TP over that of single-handed gesturing.
Overview:
This method accommodates continuous movement of multiple limbs. Throughput (TP) is calculated as mutual information in repeated motor sequences. It is affected by the complexity of movements and the precision with which an actor reproduces them.
The new metric extends the Fitts-TP metric by considering
• the shape of continuous trajectory as the source of information instead of target width and distance and
• the accuracy of the reproduced movement as the source of noise instead of end-point variation.
Moreover, the speed of performance affects the rate of information, as in Fitts-TP.
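For contrast, the Fitts-style throughput being extended here is simple to compute from target distance, target width, and movement time. A minimal sketch (the distances and times are illustrative, not from the paper):

```python
import math

def fitts_throughput(distance, width, movement_time):
    """Classic Fitts-style throughput: index of difficulty (bits)
    divided by movement time (seconds), using the Shannon
    formulation ID = log2(D/W + 1)."""
    index_of_difficulty = math.log2(distance / width + 1)
    return index_of_difficulty / movement_time

# Example: a 256 mm reach to a 16 mm target completed in 1.0 s
# gives ID = log2(17) ~ 4.09 bits, i.e. ~ 4.09 bps.
tp = fitts_throughput(256, 16, 1.0)
```

Note how the target geometry supplies the information and the end-point spread supplies the noise; the paper's metric replaces both with properties of the trajectory itself.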
The metric allows researchers to examine any scenario wherein users’ motion can be represented as a sequence of vectors of movement features, from mouse movements to full-body motion. Naturally occurring movement can be analyzed, with the precondition that the data include matchable repetitions. The known extensions of Fitts’ law from discrete to continuous movements are predictive models of MT and do not carry an interpretation in information theory. Moreover, they are incapable of dealing with multi-feature arbitrary trajectories in 3D space.
Excerpts:
To assess joint human–computer performance, the “tempting but naïve” solution is to examine average speed and accuracy in a task. This approach, however, overlooks the fact that data from easy and from difficult motor acts are incommensurable. Information theory has contributed to the measurement of user performance in HCI by providing a metric that collapses data on speed and accuracy into a single metric: throughput.
Information capacity denotes the rate at which the user could have sent messages, given her speed and accuracy for given target properties. Selecting targets with the mouse, for instance, yields throughputs of 3.7–4.9 bps [17]. Although the metric has been contested, no better alternatives exist for comparing performance across tasks, conditions, and devices.
This paper extends the measurement of throughput from aimed movement to full-body movement—that is, multiple contributing limbs in continuous movement that does not need to be aimed at targets prescribed by an experimenter. In so-called configural movements, the goal is to produce a shape or pattern in movement.
We calculate throughput from mutual information of two or more deliberately repeated movement sequences. Our definition of mutual information captures the intuition that a skilled actor can produce complex (surprising) movements and reenact them precisely at will.
Analyzing precision in repeated efforts allows us to distinguish the controlled from uncontrolled aspects of movement. A newborn, for example, while able to produce complex-looking movements, does not have the capacity to reproduce them.
Since our I(x; y) excludes most of the uncontrolled movements and inaccuracies due to the actor’s inability to repeat the movement precisely, it provides a measure of the controlled information in x and y.
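The paper's estimator is more elaborate (it aligns the repetitions and models residuals), but the core intuition can be illustrated with a toy Gaussian-channel estimate: treat one repetition y as a noisy copy of the other x, estimate I(x; y) per frame from their correlation, and scale by the frame rate. All parameters below are invented for illustration:

```python
import numpy as np

def gaussian_mi_per_frame(x, y):
    """Toy estimate of I(x; y) per frame for two aligned 1-D
    repetitions, assuming jointly Gaussian signals:
    I = -0.5 * log2(1 - r^2), with r the Pearson correlation."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log2(1.0 - r ** 2)

rng = np.random.default_rng(0)
frame_rate = 120.0                                     # frames/s (illustrative)
x = np.cumsum(rng.normal(size=2000))                   # a "movement" trajectory
y = x + rng.normal(scale=0.1 * x.std(), size=x.size)   # precise reenactment
z = rng.normal(size=2000)                              # unrelated movement

tp_precise = gaussian_mi_per_frame(x, y) * frame_rate  # high: controlled
tp_random = gaussian_mi_per_frame(x, z) * frame_rate   # near zero: uncontrolled
```

A precise reenactment yields a large throughput while an uncorrelated "repetition" yields almost none, matching the newborn example: complex-looking movement alone carries no controlled information.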
Extensions of Fitts’ law models to continuous aimed movements [1] covered only path width and length originally but were later extended to curvature [11]. However, to our understanding, these models have no interpretation in information theory.
Sometimes imposing constraints may lead to underestimation or overestimation of capacity, as in the case of sliding movements on a physical surface.
In handling of p-dimensional sequences, p > 1, where each time frame xt is composed of p measured movement features, it would be invalid simply to add up the information throughput of all of the features. For us to calculate the “genuine” capacity of the leg, any correlation in the movement of the knee and the calf must first be removed.
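The redundancy penalty is easy to see for Gaussian features: the joint entropy of correlated features falls short of the sum of their marginal entropies by -0.5 * log2(det R), where R is the correlation matrix. A small sketch, with an invented covariance standing in for two strongly correlated features such as knee and calf:

```python
import numpy as np

def gaussian_entropy_bits(cov):
    """Differential entropy (bits) of a multivariate Gaussian:
    0.5 * log2((2*pi*e)^p * det(cov))."""
    cov = np.atleast_2d(cov)
    p = cov.shape[0]
    return 0.5 * np.log2((2 * np.pi * np.e) ** p * np.linalg.det(cov))

# Invented covariance for two strongly correlated movement features
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

joint = gaussian_entropy_bits(cov)
sum_of_marginals = sum(gaussian_entropy_bits(cov[i:i + 1, i:i + 1])
                       for i in range(2))
redundancy = sum_of_marginals - joint
# redundancy = -0.5 * log2(det(R)) ~ 1.2 bits here: naively adding
# per-feature capacities overcounts by exactly this amount.
```

Decorrelating the features first (the paper's approach for p-dimensional sequences) removes this shared information before capacities are summed.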
One drawback of the second-order autoregressive model is its short “memory”: a human observer can easily detect repeats in a movement, but the model considers each repetition as surprising as the first instance.
When CTW (canonical time warping, used to align the repeated sequences before comparison) was removed, TP fell by a factor of 6.7, to 43 bps. The actor’s high TP was achieved at the expense of accuracy in timing.
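The short-memory point above can be demonstrated directly: an AR(2) predictor fit to a long-period repeating signal keeps treating each cycle as new, leaving residuals close to the raw signal variance, even though the signal is perfectly predictable with a longer memory. A sketch with invented parameters, using a plain least-squares AR fit:

```python
import numpy as np

def ar2_residual_var(x):
    """Fit x_t ~ a*x_{t-1} + b*x_{t-2} by least squares and
    return the variance of the one-step prediction residual."""
    X = np.column_stack([x[1:-1], x[:-2]])
    y = x[2:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (y - X @ coef).var()

rng = np.random.default_rng(1)
period = 200
template = rng.normal(size=period)   # one complex "movement"
repeated = np.tile(template, 10)     # the same movement, ten times over

# AR(2) looks only two frames back, so the repeats stay "surprising":
ratio = ar2_residual_var(repeated) / repeated.var()   # close to 1

# A predictor with a 200-frame memory, by contrast, is perfect:
lag_resid_var = np.var(repeated[period:] - repeated[:-period])  # exactly 0
```

So the AR(2) noise model charges the actor full price for every repetition, inflating the apparent complexity of repeated patterns.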
Furthermore, to understand which limbs are the best candidates for controlling an interface, we estimated each limb’s contribution to the capacity. We averaged raw TPs per movement feature across the dances. As the adjacent figure shows, the two hands and the right foot had the largest throughputs, all above 12 bps. Markers for the torso, head, and distal parts of the feet had far lower values. This analysis reveals a laterality effect (left vs. right hand) and suggests that torso and leg movements were less well-rehearsed and less important aspects of the teacher’s dancing. An interface designer could use such information when mapping human movements to virtual controls.