HUMAN PERFORMANCE IN MOTION PLANNING

$\bar{X}_{i..} = \frac{1}{J}\sum_{j} \bar{X}_{ij.}$ is the average for row $i$ of factor A over the related subjects' scores,

$\bar{X}_{.j.} = \frac{1}{I}\sum_{i} \bar{X}_{ij.}$ is the average for column $j$ of factor B over the related subjects' scores,

$\bar{X}_{...} = \frac{1}{I}\sum_{i} \bar{X}_{i..} = \frac{1}{J}\sum_{j} \bar{X}_{.j.}$ is the overall mean of all the scores.

With this notation, if we only test for the main effect of factor B (similarly for the main effect of factor A), the null and alternative hypotheses can be written as

• $H_0(B)$: $\mu_j = \mu$ for all $j$
• $H_1(B)$: $\mu_j \neq \mu$ for at least one $j$

The Mean Square Between, $MS_{Bb}$, for the $J$ levels of factor B is built from the estimated variance of the estimated column means (this ignores factor A). That is,

$$MS_{Bb} = \frac{nI}{J-1}\sum_{j}\left(\bar{X}_{.j.} - \bar{X}_{...}\right)^2 \qquad (7.12)$$

The Mean Square Within is the same error variance $MS_w$ considered above; it is equal to the average of the (separately estimated) variances within the individual cells. Under the null hypothesis,

$$\frac{MS_{Bb}}{MS_w} \sim F(J-1,\; N-IJ) \qquad (7.13)$$

That is, the ratio of the two mean squares has an F distribution with $(J-1)$ degrees of freedom in the numerator and $(N-IJ)$ degrees of freedom in the denominator, the latter being the total number of observations $N$ minus the total number of cells, $IJ$.

Interaction Between Factors. Unfortunately, the main effects are not sufficient to answer questions such as, “Does the effect of factor A remain the same at different levels of factor B?” For example, in some observations of Experiment One the main effect of the visibility factor is that it significantly affects the subjects’ path length: The invisible task results in longer path lengths compared to the visible task. However, we notice that the subjects’ scores on the visibility factor are also affected by the interface factor: In the physical test (in the booth) the visibility factor has no significant effect on the path length, whereas in the virtual test the visibility factor has a significant effect on the path length. This suggests that the tests on main effects may be missing such interaction effects.
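Equations (7.12) and (7.13) can be illustrated with a minimal Python sketch. It assumes a balanced design, with the same number $n$ of scores in every cell. The function name `main_effect_b`, the nested-list layout, and the toy scores are hypothetical and are not taken from the experiments.

```python
from statistics import mean


def main_effect_b(cells):
    """Return (MS_Bb, MS_w, F) for factor B in a balanced I x J layout.

    cells[i][j] holds the n scores for level i of factor A and level j of
    factor B. A balanced design (equal n in every cell) is assumed.
    """
    I, J, n = len(cells), len(cells[0]), len(cells[0][0])
    N = I * J * n
    cell_mean = [[mean(c) for c in row] for row in cells]
    # Column means of factor B, averaged over the levels of factor A.
    col_mean = [mean(cell_mean[i][j] for i in range(I)) for j in range(J)]
    grand = mean(col_mean)
    # Eq. (7.12): scaled spread of the column means around the overall mean.
    ms_b = n * I * sum((m - grand) ** 2 for m in col_mean) / (J - 1)
    # Error variance: pooled squared deviations inside the individual cells.
    ss_w = sum((x - cell_mean[i][j]) ** 2
               for i in range(I) for j in range(J) for x in cells[i][j])
    ms_w = ss_w / (N - I * J)
    # Eq. (7.13): this ratio is referred to F(J - 1, N - IJ).
    return ms_b, ms_w, ms_b / ms_w


# Hypothetical scores: rows are levels of A, columns are levels of B.
toy = [[[10, 12], [20, 22]],
       [[11, 13], [21, 23]]]
ms_b, ms_w, f_ratio = main_effect_b(toy)
print(ms_b, ms_w, f_ratio)
```

The returned ratio would then be compared against the F distribution with $(J-1)$ and $(N-IJ)$ degrees of freedom, for instance through a statistics library or a printed table.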
Such interaction effects can be tested using the following formulas (for details refer to Refs. 126 and 127):

$$SS_{AB} = n\sum_{i}\sum_{j}\left(\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X}_{...}\right)^2, \qquad MS_{AB} = \frac{SS_{AB}}{(I-1)(J-1)} \qquad (7.14)$$

$$F = \frac{MS_{AB}}{MS_w} \sim F[(I-1)(J-1),\; N-IJ] \qquad (7.15)$$

where $MS_{AB}$ represents the Interaction Mean Square between factors A and B. The ratio has an F distribution with $(I-1)(J-1)$ degrees of freedom in the numerator and $(N-IJ)$ degrees of freedom in the denominator. In the case considered now, the results of the F test may show that the effect of the visibility factor on the path length depends on the type of interface used in a given task. In other words, there is an interaction between the visibility factor and the interface factor. One way to express interactions is by saying that one effect is modified (qualified) by another effect.

When the data indicate an interaction between factors, the notion of a main effect has no meaning. In such cases, tests of simple effects can be more useful than tests of main effects. Simple effect tests are done via one-way analysis of variance across the levels of one factor, performed separately at each level of the other factor. For example, if we suspect an interaction between the visibility factor and the interface factor, we might undertake simple effect tests for the visibility factor separately at the virtual and physical levels, and see what conclusions can be drawn from the results.

7.4.5 Implementation: Two-Way Analysis for Path Length

We are now ready to perform the analysis of variance on the Experiment One data. From other tests above, we already know that the direction factor has a significant effect on the path length. We know, further, that the left-to-right task is significantly easier for the subjects (it results in shorter paths) than the right-to-left task. We now want to analyze the combined effect of the visibility and interface factors on the subjects’ performance.
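Before turning to the data, the interaction mean square of Eq. (7.14) can be sketched in a few lines of Python. The sketch assumes a balanced design. The function name `interaction_ms` and both toy data sets are hypothetical: one is purely additive, and in the other the second factor matters at only one level of the first.

```python
from statistics import mean


def interaction_ms(cells):
    """Return MS_AB of Eq. (7.14) for a balanced I x J layout.

    cells[i][j] holds the n scores for level i of factor A and level j of
    factor B; equal n in every cell is assumed.
    """
    I, J, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[mean(c) for c in row] for row in cells]
    row_mean = [mean(cell_mean[i]) for i in range(I)]
    col_mean = [mean(cell_mean[i][j] for i in range(I)) for j in range(J)]
    grand = mean(row_mean)
    # What is left of each cell mean once row and column effects are removed.
    ss_ab = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(I) for j in range(J)
    )
    return ss_ab / ((I - 1) * (J - 1))


# Hypothetical additive data: the column effect is the same in both rows.
additive = [[[10, 12], [20, 22]],
            [[11, 13], [21, 23]]]
# Hypothetical interaction: the column effect appears only in the second row.
crossed = [[[9, 11], [9, 11]],
           [[9, 11], [19, 21]]]
ms_ab_additive = interaction_ms(additive)
ms_ab_crossed = interaction_ms(crossed)
print(ms_ab_additive, ms_ab_crossed)
```

Dividing the result by $MS_w$ gives the F ratio of Eq. (7.15). In the additive case the interaction mean square vanishes, which is the numerical counterpart of "no interaction."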
Even though the underlying data are not known to obey the normal distribution, we justify using ANOVA by the known robustness of the F test. The data set was first separated into the LtoR and RtoL data sets. The ANOVA variables are:

• Dependent variable: Path length.
• Independent variables:
  1. Visibility factor, with two levels: visible and invisible.
  2. Interface factor, with two levels: virtual and physical.

In the tables of results that appear here, the following terms are used:

df effect: degrees of freedom for a given effect, including main and interaction effects.
MS effect: Mean Square for an effect, including main and interaction effects.
df error: degrees of freedom for the error variance, or Mean Square Within.
MS error: Mean Square for the error variance, or Mean Square Within.

Rows with the effect names “1” and “2” correspond to main effects. Rows with more than one digit in the name, such as “12” or “123,” relate to the corresponding interaction effects.

Results. For the left-to-right task, the summary of ANOVA results appears in Table 7.9. The p-levels for the visibility and interface factors are about 0.01, and the p-level for the interaction is much greater than 0.01. This means the main effects of both factors are marginally significant, and there is no interaction. We therefore conclude that for both visible and invisible environments, the path length is affected only slightly by the interface factor. This agrees with our earlier finding that for the physical task the path length is slightly shorter than for the virtual task. And, for both physical and virtual tasks, the path length is only slightly affected by the visibility factor. Again, this agrees with our earlier finding that in the visible environment the path length is slightly shorter than in the invisible environment. The summary of ANOVA results for the right-to-left task appears in Table 7.10.
Here the p-levels for the visibility factor, the interface factor, and the interaction are all greater than 0.01. Therefore, the main effects make no significant difference for the dependent variable, and there is no interaction. The conclusion is that for either the visible or invisible environments, the path length for the physical task is not significantly different from the path length in the virtual task. Also, in either of the physical or virtual tasks, the path length in the visible environment does not significantly differ from the path length in the invisible environment.

TABLE 7.9. ANOVA Results for Path Length: Interface and Visibility Factors; LtoR Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

Effect  df Effect  MS Effect  df Error  MS Error  F-Value   p-Level
1       1          18828.49   91        3036.650  6.200416  0.014587
2       1          21314.74   91        3036.650  7.019164  0.009508
12      1          201.26     91        3036.650  0.066277  0.797418

TABLE 7.10. ANOVA Results for Path Length: Interface and Visibility Factors; RtoL Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

Effect  df Effect  MS Effect  df Error  MS Error  F-Value   p-Level
1       1          5283.137   91        12592.88  0.419534  0.518801
2       1          77.399     91        12592.88  0.006146  0.937684
12      1          8598.556   91        12592.88  0.682811  0.410782

7.4.6 Implementation: Two-Way Analysis for Completion Time

In the previous section we analyzed the effects of test factors on the length of paths generated by the human subjects in Experiment One. We will now analyze how these same factors affect another performance indicator, the task completion time. Each completion time score is random and independent (for the 48 subjects tested here); this meets the “sampling assumption” of nonparametric statistics and analysis of variance. Even though a closer look at the completion time data shows that they do not obey a normal distribution (as the ANOVA assumption requires), we still use ANOVA, counting on the known robustness of the F test.
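Looking back at Tables 7.9 and 7.10, each F-Value entry should equal the ratio of MS Effect to MS Error in the same row. The short Python check below recomputes those ratios from the printed mean squares. The helper name `max_f_discrepancy` is our own, and the check covers only the arithmetic of the tables, not the p-levels.

```python
# Rows copied from Tables 7.9 and 7.10:
# (effect label, MS effect, MS error, reported F-value).
table_7_9 = [
    ("1", 18828.49, 3036.650, 6.200416),
    ("2", 21314.74, 3036.650, 7.019164),
    ("12", 201.26, 3036.650, 0.066277),
]
table_7_10 = [
    ("1", 5283.137, 12592.88, 0.419534),
    ("2", 77.399, 12592.88, 0.006146),
    ("12", 8598.556, 12592.88, 0.682811),
]


def max_f_discrepancy(rows):
    """Largest absolute gap between MS effect / MS error and the reported F."""
    return max(abs(ms_effect / ms_error - f_reported)
               for _, ms_effect, ms_error, f_reported in rows)


gap_ltor = max_f_discrepancy(table_7_9)
gap_rtol = max_f_discrepancy(table_7_10)
print(gap_ltor, gap_rtol)
```

Any gap that remains comes from the rounding of the printed mean squares.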
First, to analyze the effect of all factors on the completion time data, a three-way analysis of variance was done. The ANOVA variables are as follows:

• Dependent variable: Completion time.
• Independent variables:
  1. Direction factor, with two levels: LtoR and RtoL.
  2. Visibility factor, with two levels: visible and invisible.
  3. Interface factor, with two levels: virtual (simulation) and physical (booth).

Second, since we are more interested in the visibility factor and the interface factor, and since the performance in the LtoR task significantly differs from that in the RtoL task, a two-way ANOVA was implemented. The ANOVA variables are:

• Dependent variable: Completion time.
• Independent variables:
  1. Visibility factor, with two levels: visible and invisible.
  2. Interface factor, with two levels: virtual and physical.

Results. The summary of ANOVA results for all three factors used in Experiment One appears in Table 7.11. The p-levels for the interface factor, the direction factor, and the interaction between them are less than 0.01. This means that these two main effects likely significantly affect the dependent variable (completion time), and there is interaction between them. The p-levels for the remaining main effect, visibility, and for interactions with this factor are greater than 0.01. This means that there is no significant difference for these effects and interactions. However, given that an interaction has been detected, we should not be forming any conclusions from the results in Table 7.11 until we separate the factor levels.

TABLE 7.11. ANOVA Results for Completion Time: Direction, Interface, and Visibility Factors
ANOVA Effects Studied: 1 = interface, 2 = visibility, 3 = direction

Effect  df Effect  MS Effect  df Error  MS Error  F-Value   p-Level
1       1          5248143.0  184       51978.68  100.9672  0.000000
2       1          189719.0   184       51978.68  3.6499    0.057626
3       1          3475104.0  184       51978.68  66.8563   0.000000
12      1          12523.0    184       51978.68  0.2409    0.624127
13      1          427953.0   184       51978.68  8.2332    0.005494
23      1          45246.0    184       51978.68  0.8705    0.352049
123     1          52450.0    184       51978.68  1.0091    0.316444

TABLE 7.12. ANOVA Results for Completion Time: Interface and Visibility Factors, LtoR Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

Effect  df Effect  MS Effect  df Error  MS Error  F-Value   p-Level
1       1          1339396.0  92        32864.84  40.75467  0.000000
2       1          210132.0   92        32864.84  6.39382   0.013155
12      1          6858.0     92        32864.84  0.20867   0.648886

Hence the data for the left-to-right task, which is one of the two levels of the direction factor, were analyzed separately. The summary of related ANOVA results appears in Table 7.12. The p-level for the interface factor is smaller than 0.01, the p-level for the visibility factor is about 0.01, and the p-level for the interaction is greater than 0.01. Therefore, the main effect of interface is statistically significant, the main effect of visibility is marginally significant, and there is no interaction between them. This agrees with our knowledge that for both visible and invisible tasks, the completion time for the physical task is significantly shorter than for the virtual task. Similarly, for both physical and virtual tasks the completion time is slightly shorter in the visible environment than in the invisible environment.

The summary of ANOVA results for the right-to-left task appears in Table 7.13. The p-level for the interface factor is smaller than 0.01, while the p-levels for the visibility factor and the interaction are greater than 0.01. This means the main effect of the interface factor is statistically significant, and there is no interaction.
This agrees with test observations: For both visible and invisible tasks the completion time in the physical task was significantly shorter than in the virtual task. Similarly, for both physical and virtual tasks the completion time in the visible environment shows no significant difference from that in the invisible environment.

TABLE 7.13. ANOVA Results for Completion Time: Interface and Visibility Factors, RtoL Task
ANOVA Effects Studied: 1 = interface, 2 = visibility

Effect  df Effect  MS Effect  df Error  MS Error  F-Value   p-Level
1       1          4336700.0  92        71092.52  61.00080  0.000000
2       1          24833.0    92        71092.52  0.34930   0.555959
12      1          58115.0    92        71092.52  0.81746   0.368286

7.5 RESULTS: EXPERIMENT TWO

Recall that Experiment Two was designed to analyze the effect of subjects’ training and the related effect of the visibility factor on human performance. A total of 12 subjects took part in this study. In the first group, which included six subjects, on day 1 each subject performed six different training tasks, plus one test task at the end, all in the visible environment. About one week later, on day 2, the same subjects performed the same six training tasks, plus the same test task, this time in the invisible environment. In the second group, the remaining six subjects did the same tasks in the opposite order: tests in the invisible environment on day 1 and tests in the visible environment on day 2. The specific task was right-to-left movement of the arm, the same as in Experiment One (recall that this is a more difficult task compared to the left-to-right task).

We therefore have a training factor Day, with two levels, day 1 and day 2. Subjects were expected to learn the motion planning skill through repeated exercise. As in Experiment One, human performance was measured in each task by the path length and the completion time (variables Path and Time). Path length is the measure of motion generated by the arm manipulator during the task.
Completion time is the time it takes the subject to complete the task. Both measure the subjects’ proficiency in carrying out motion planning. We suppose that both the path length and the completion time may be affected by such factors as training and visibility of the scene, and we would like to quantify those effects.

In statistical terms, the training and visibility factors are independent variables, whereas the path length and completion time are dependent variables. The objective of data analysis is to test whether the training and/or visibility factor improves the overall human performance in motion planning. If in terms of both dependent variables the improvement in subjects’ performance turns out to be significant, follow-up tests on the separate effects on human performance should be conducted, to explain which specific aspects of human performance are responsible for such effects. Multivariate analysis of variance (MANOVA) is a good technique for data analysis of overall effects [128].

Multivariate analysis of variance is conceptually a straightforward extension of the univariate ANOVA technique described above. The major distinction is that whereas ANOVA evaluates mean differences on a single dependent variable, MANOVA evaluates mean vector differences simultaneously on two or more dependent variables. In addition, the MANOVA design accounts for the fact that dependent variables may be correlated. For instance, the two dependent variables in Experiment Two, the path length and completion time, are indeed relatively highly correlated, with the correlation coefficient 0.79. In this case, MANOVA should provide a distinct advantage over separate ANOVAs. In fact, performing separate ANOVA tests carries an implicit assumption that either the dependent variables are uncorrelated or such correlations are of no importance.

7.5.1 The Technique

Assumptions.
Of the three assumptions listed below, the first and, in part, the second are required by MANOVA (and are the same as for the statistical tests considered above):

1. Observation scores are randomly sampled from the population of interest. Observations are statistically independent of one another.
2. Dependent variables have a multivariate normal distribution within each group of interest. This means that (a) each dependent variable is distributed normally, (b) any linear combination of the dependent variables is distributed normally as well, and (c) all subsets of the variables have a multivariate normal distribution. In practice, it is unlikely that this and the next assumption are met precisely. Fortunately, similar to ANOVA, MANOVA is relatively robust to violations of these assumptions, and it tends to perform well whether or not the data violate them.
3. Homogeneity of covariance matrices. That is, all groups of data are assumed to have a common within-group population covariance matrix. This can be likened to the assumption in ANOVA of homogeneity of variance for each dependent variable, or the assumption that the correlation between any two dependent variables must be the same in all groups. If the number of subjects is approximately the same in the experimental groups, a violation of the assumption of covariance matrix homogeneity leads to a slight reduction in statistical power [128–130].

Multivariate Null Hypothesis. Hypotheses in MANOVA are very similar to those in univariate ANOVA, except that vectors of means are considered instead of single values (scalars) of means. For a simple example, imagine we carry out a one-way MANOVA for the visible task and invisible task groups. We would like to know if the scores of path length and completion time came from the same population that includes visible and invisible task data.
That is, we want to compare the population mean vector for the dependent variables for one group with the population mean vector for the dependent variables for another group. Suppose $\mu_{ij}$ represents the mean of the dependent variable $i$ for group $j$, $i = 1, 2$, $j = 1, 2$. The mean vector for group $j$ can be written as

$$\boldsymbol{\mu}_j = \begin{pmatrix} \mu_{1j} \\ \mu_{2j} \end{pmatrix}$$

Then the multivariate null hypothesis $H_0$ can be written as an equality of vectors:

$$H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \boldsymbol{\mu}$$

The alternative hypothesis $H_1$ in this case says that for at least one variable there is at least one group with a population mean different from that in the other group(s):

$$H_1: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2$$

Calculating MANOVA Test Statistics. Derivation of the MANOVA test statistics is similar to that in ANOVA but involves relatively cumbersome matrix operations and equations. Hence we will limit the discussion to a conceptual level (see Ref. 130 for more detail).

Recall that ANOVA attempts to test if the amount of variance explained by the independent variable (namely, $SS_b$, see Section 7.4.3) significantly exceeds the variance that has not been explained (namely, $SS_w$). The variance here is a function of the sum of squares of deviations from the mean for an entire group (the latter being called the sum of squares, SS). The ANOVA’s F statistic is the ratio of the mean square between, $MS_b$, to the mean square within, $MS_w$.

Instead of scalars of dependent variables, MANOVA employs a vector of dependent variables. A single sum of squares is replaced with a complete (total) matrix of sums of squares and cross-products, $SP_t$. Along its diagonal the matrix has the sums of squares that represent variances for all dependent variables, and in its off-diagonal elements it has cross-products that represent covariances of variables. Just as in univariate ANOVA, MANOVA divides matrix $SP_t$ into the within-group matrix, $SP_w$, and the between-group matrix, $SP_b$.
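The partition of $SP_t$ into $SP_w$ and $SP_b$ can be checked numerically. The Python sketch below does so for two groups and two dependent variables. The function name `sp_matrices`, the helper names, and the scores are all hypothetical; they only illustrate the bookkeeping and are not data from the experiments.

```python
def mean_vec(rows):
    """Component-wise mean of a list of (y1, y2) score pairs."""
    return [sum(r[k] for r in rows) / len(rows) for k in range(2)]


def sscp(rows, center):
    """2 x 2 matrix of sums of squares and cross-products around `center`."""
    return [[sum((r[a] - center[a]) * (r[b] - center[b]) for r in rows)
             for b in range(2)] for a in range(2)]


def sp_matrices(groups):
    """Return (SP_t, SP_w, SP_b) for a list of groups of (y1, y2) pairs."""
    everyone = [r for g in groups for r in g]
    grand = mean_vec(everyone)
    sp_t = sscp(everyone, grand)
    sp_w = [[0.0, 0.0], [0.0, 0.0]]
    sp_b = [[0.0, 0.0], [0.0, 0.0]]
    for g in groups:
        m = mean_vec(g)
        within = sscp(g, m)
        for a in range(2):
            for b in range(2):
                # Scatter of the scores around their own group mean.
                sp_w[a][b] += within[a][b]
                # Scatter of the group mean around the grand mean.
                sp_b[a][b] += len(g) * (m[a] - grand[a]) * (m[b] - grand[b])
    return sp_t, sp_w, sp_b


# Hypothetical (path length, completion time) pairs for two groups.
group_1 = [(1, 3), (2, 2), (3, 4)]
group_2 = [(4, 8), (5, 7), (6, 9)]
sp_t, sp_w, sp_b = sp_matrices([group_1, group_2])
print(sp_t, sp_w, sp_b)
```

The diagonal entries of each matrix are ordinary sums of squares, so the univariate identity $SS_t = SS_w + SS_b$ reappears along the diagonal.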
From algebra, the matrix determinant expresses the amount of generalized variance, or the total variability that is present in the underlying data and is expressed through the dependent variables. One can hence compare the generalized variance of one matrix with another.

Wilks’ lambda test is perhaps the most widely used statistical test of multivariate mean differences [130]. It derives from the following idea. Since matrix $SP_b$ represents the amount of explained variance and covariance, and matrix $SP_w$ represents the remaining variance and covariance, in the case of a significant effect one would expect matrix $SP_b$ to have a larger generalized variance compared to matrix $SP_w$. Wilks’ lambda index, $\Lambda$, is defined as a ratio of determinants of the two matrices:

$$\Lambda = \frac{|SP_w|}{|SP_t|} = \frac{|SP_w|}{|SP_w + SP_b|} \qquad (7.16)$$

where $SP_t$, $SP_w$, and $SP_b$ are the total, within-group, and between-group SP matrices, respectively. We associate the value of $\Lambda$ with the effect’s significance: the value can be interpreted as the proportion of unexplained variance, so the smaller $\Lambda$ is, the stronger the evidence for an effect.

The main effects and interaction effects in multiple-way MANOVA are conceptually the same as those in ANOVA. While computations are more complex in MANOVA, their underlying logic is the same as in ANOVA. If an overall significant multivariate effect is found, the next natural step is to submit the data to further testing, to see whether all dependent variables or some specific dependent variables are affected by the independent variables. Performing multiple univariate ANOVAs for each of the dependent variables is a common method for interpreting the respective effects. One attempts to identify specific dependent variables that contributed to the overall significant effect.

Repeated Measures MANOVA.
In our statistical tests so far, all independent variables involved in ANOVA and MANOVA were also between-subjects variables (or factors); we were interested in differences between means or mean vectors of several distinct groups of subjects. The observed scores were independent of each other at different levels of the between-subjects variables. However, in Experiment Two we also want to study the difference in responses of the same subjects before and after treatment; in our case, the treatment is training. Such a variable is called a repeated measures variable, and its analysis is called repeated measures MANOVA.

In a repeated measures design the several response variables are results of the same test carried out by the same subjects, applied a number of times or under more than one experimental condition. For example, in Experiment Two each subject was assessed as to their path length and completion time on day 1 and again on day 2. The variable “day” is a repeated measures variable, as well as a within-subjects variable. In other words, a between-subjects variable is a grouping variable, similar to the visibility or interface in our study, whereas a within-subjects variable is one for which every subject is measured at every level. For example, a within-subjects variable may be “time,” or “day,” or “training factor.”

A study can involve both within- and between-subjects independent variables. Our Experiment Two analysis constitutes a 2 (days) by 2 (visibility levels) repeated measures MANOVA, or repeated measures ANOVA. The first independent variable, day, is a within-subjects (repeated measures) variable, and the second independent variable, visibility, is a between-subjects variable.

Repeated measures MANOVA is an extension of the standard MANOVA. The underlying principles of both are almost the same. In the standard MANOVA, vectors of means are compared across the levels of independent variables.
In the repeated measures MANOVA, vectors of mean differences are compared across the levels of independent variables. Mean differences are the differences in values of dependent measures between levels of the within-subjects variable. These can be seen as new dependent variables. If, for example, the dependent variables were measured for each subject at four different time moments, say at times T1 through T4, these original four variables would be transformed to three alternative derived difference variables, denoted (T1–T2), (T2–T3), and (T3–T4). These three new variables directly address the questions of interest. The repeated measures MANOVA, therefore, compares the vectors of means across the new transformed variables, not the original scores.

When conducting a repeated measures MANOVA, a sphericity assumption must be met. It requires that the covariance matrix for the transformed variables be a diagonal matrix with equal diagonal entries. That is, the values (variances) along the diagonal of the transformed covariance matrix should be equal, and all the off-diagonal elements (covariances) should be zeros. The purpose of the sphericity assumption is to ensure the homogeneity of covariance matrices for the new transformed variables [131, 132].

7.5.2 Implementation Scheme

Experiment One. Recall that in Experiment One the observation scores in each task were measured on two dependent variables, path length and completion time. Subjects were randomly selected, and sets of scores were mutually independent. Further, the two dependent variables were correlated, with the correlation coefficient 0.74. We take this correlation into account when performing the significance test, since the overall set of dependent variables may contain more information than each of the individual variables. This suggests that the Experiment One data can be a candidate for a multivariate analysis of variance, MANOVA.
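With two dependent variables, as here, the Wilks’ lambda of Eq. (7.16) reduces to a ratio of 2 × 2 determinants. The minimal Python sketch below illustrates that computation. The helper names `det2` and `wilks_lambda` are our own, and the two matrices are hypothetical numbers chosen for illustration, not values computed from the Experiment One data.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(sp_w, sp_b):
    """Eq. (7.16): |SP_w| / |SP_w + SP_b| for 2 x 2 SP matrices."""
    sp_t = [[sp_w[a][b] + sp_b[a][b] for b in range(2)] for a in range(2)]
    return det2(sp_w) / det2(sp_t)


# Hypothetical within-group and between-group SP matrices.
sp_w = [[4.0, 2.0], [2.0, 4.0]]
sp_b = [[13.5, 22.5], [22.5, 37.5]]
lam = wilks_lambda(sp_w, sp_b)
print(round(lam, 4))
```

A value close to 0 means that little of the generalized variance is left unexplained by the grouping, while a value close to 1 means the groups are hardly separated. Turning the value into a p-level requires a further approximation (see Ref. 130), which this sketch does not attempt.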
Since, as discussed in the previous section, the effect of the direction factor in Experiment One is statistically significant, we separately perform two sets of MANOVAs: one for the left-to-right task and the other for the right-to-left task. When performing MANOVA for the left-to-right task, the data set forms a two-way array, 2 (visibility) × 2 (interface). For the right-to-left task, the data set also forms a two-way array, 2 (visibility) × 2 (interface). The results of analysis should answer questions such as: (1) Does human performance improve in the visible environment compared to the invisible environment? (2) Does human performance improve in a test with the physical arm manipulator as compared to the virtual arm manipulator? (3) Does the effect of the visibility factor work across the levels of the interface factor?

[...]

TABLE 7.14. Descriptive Statistics for the Data in Experiment Two

Variable    Valid N  Mean    Minimum  Maximum  Std Dev
day1-path   12       96.68   24.39    232.55   66.59
day1-time   12       432.67  65.00    900.00   333.66
day2-path   12       129.04  15.13    393.90   107.99
day2-time   12       432.42  36.00    900.00   365.89
vis-path    12       88.83   15.13    181.92   62.20
vis-time    12       360.25  36.00    900.00   620.83
invis-path  12       136.89  ...

SENSITIVE SKIN: DESIGNING AN ALL-SENSITIVE ROBOT ARM MANIPULATOR

... robot motion planning with uncertainty, were undertaken in the author’s laboratories at Yale University in the late 1980s [106, 134] and at the University of Wisconsin-Madison in the 1990s [135]. These included our own work on whole-sensitive ...

Sensing, Intelligence, Motion, by Vladimir J. Lumelsky. Copyright 2006 John Wiley & Sons, Inc.
... improve, though only a little bit, human performance in motion planning tasks such as ours.

Effects of the Motion Direction Factor. This factor has two components, left-to-right direction and right-to-left direction of motion. These two tasks took place in the same scene and with the same two-link arm manipulator. The only difference was that in the first task one was asked to move the arm from position S (start) to position ...

... involving motion where, given enough training, humans become extremely adept; an acrobat on the trapeze is but one example. There is a big difference, however: The acrobat does a once-and-for-all learned motion, whereas our tasks require constant spatial reasoning. Our test protocols do not allow a subject to simply memorize a task. We want our subjects to learn how to do a class of tasks; we want them to ...

... time, and so using human performance as a benchmark would be an “apples-to-apples” comparison. We tend to associate motion planning tasks with “thinking” and intelligence: If our robots perform well in such tasks, we not only can be proud of the robots’ performance but can also use this fact in technical systems. If human performance in motion planning tasks turns out to be less than ideal, and the results ...

... to take us to task: How can we be sure that those strategies can be implemented in real systems? After all, those chapters imply that an important prerequisite of sensor-based motion planning algorithms is a whole-body sensing ability, that is, an ability by the robot to sense surrounding objects at every point of its body. This is a hardware component, and such hardware is hardly an off-the-shelf item. Is it even ...
... (effect) in the ANOVA left-to-right and in ANOVA right-to-left, respectively. The fact that no significant effect of the interface factor on the path length was found in the more difficult right-to-left task is surprising. It suggests that the importance to a human operator of the type of interface fades as the spatial tasks become harder. To put it bluntly, in nontrivial teleoperation motion planning tasks ...

... Is it even feasible? We are therefore obliged to convince the reader that the whole-body sensing is indeed feasible. In fact, this is exactly how the research had proceeded: Once some sensor-based algorithms appeared, the work started in earnest on appropriate sensing hardware. It soon became clear that the appropriate sensing device should look like a sensitive skin covering the whole robot body. Research ...

DISCUSSION

... robot intelligence; mere improvements in the control means will not go far enough. The difference in the factor effects on the two dependent variables, path length and completion time, is not hard to explain. The length of a path generated by a human subject is, in general, independent of how quickly or slowly one moves the arm or how continuous its motion is. If the subject stops to think how ...

... standpoint of the robot’s overall motion plan. The latter function is done by the Path Planner unit (see Figure 8.1). The Path Planner makes sure that each step is implemented according to the sensor-based motion planning algorithm used. (More detail on the overall scheme can be found in Ref. 115.) As discussed in prior chapters, motion planning algorithms’ requirements to the whole-body sensing include two major ...