A PERMUTATION TEST FOR THE STRUCTURE OF A COVARIANCE MATRIX

By TRACY LYNNE MORRIS

Bachelor of Science in Mathematics, Oklahoma State University, Stillwater, Oklahoma, 1994
Master of Science in Applied Mathematical Sciences, University of Central Oklahoma, Edmond, Oklahoma, 2001

Submitted to the Faculty of the Graduate College of Oklahoma State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY, May, 2007

Dissertation Approved:
Dr. Mark Payton, Dissertation Adviser
Dr. William Warde
Dr. Stephanie Monks
Dr. Douglas Aichele
Dr. A. Gordon Emslie, Dean of the Graduate College

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. Mark Payton for his support, expertise, encouragement, and advice, without which I would not have been able to complete this dissertation. I would also like to thank the other members of my committee, Dr. William Warde, Dr. Stephanie Monks, and Dr. Douglas Aichele, for reviewing my work and providing invaluable guidance, and to extend my gratitude to the entire Department of Statistics. Special thanks also go to Dr. Mauricio Subieta and the High Performance Computing Center; without Dr. Subieta's help and the use of the computing center I would undoubtedly be running simulations for many months to come. Finally, I would like to thank my family and friends for their support. I am especially thankful for my husband, Mark, and my mother, Kay, for their loving encouragement and emotional support throughout this process.

TABLE OF CONTENTS

1. INTRODUCTION
2. PARAMETRIC TESTS FOR THE STRUCTURE OF A COVARIANCE MATRIX
   2.1 Tests of Sphericity
   2.2 Test of Compound Symmetry
   2.3 Test of Type H Structure
   2.4 Test of Serial Correlation
   2.5 Test of Independence of Sets of Variates
   2.6 Factor Analysis / Structural Equation Modeling
3. BOOTSTRAPPING
4. PERMUTATION TESTS
5. PROPOSED TEST
   5.1 Permutation Tests of Sphericity and Compound Symmetry
   5.2 Permutation Test of Type H Structure
   5.3 Permutation Test of All Other Covariance Structures
6. SIMULATIONS
   6.1 Test of Sphericity
   6.2 Test of Compound Symmetry
   6.3 Test of Type H Structure
   6.4 Test of Serial Correlation
   6.5 Test of Independence of Sets of Variates
7. CONCLUSIONS
8. FUTURE WORK
BIBLIOGRAPHY
APPENDIX
   A.1 R Code
      A.1.1 Randomization Test of Sphericity
      A.1.2 Randomization Test of Compound Symmetry
      A.1.3 Permutation Test of Type H Structure
      A.1.4 Randomization Test of Type H Structure
      A.1.5 Randomization Test of Serial Correlation
      A.1.6 Randomization Test of Independence of Sets of Variates
      A.1.7 Randomization Test of Independence of Sets of Variates for n=5 and (5,5)
   A.2 Combinations of d and D Used in the Type H Simulations
   A.3 Bias Correction for the Serial Correlation Parameters
   A.4 Number of Simulated Data Sets for the Test of Independence of Sets of Variates for n=5 and (5,5)

LIST OF TABLES

4.0.1 All Possible Permutations of the Observed Data
5.1.1 Observed Data
5.1.2 Some Permutations of the Centered Observed Data
6.1.1 Simulated Type I Error Rates for the Test of Sphericity
6.1.2 Simulated Power vs. Non-Homoscedasticity for the Test of Sphericity
6.1.3 Simulated Power vs. Non-Zero Correlation for the Test of Sphericity
6.1.4 Simulated Power vs. Type H for the Test of Sphericity (p=3)
6.1.5 Simulated Power vs. Type H for the Test of Sphericity (p=5)
6.1.6 Simulated Power vs. Type H for the Test of Sphericity (p=10)
6.1.7 Simulated Power vs. Serial Correlation for the Test of Sphericity (σ² = 1)
6.2.1 Simulated Type I Error Rates for the Test of Compound Symmetry (p=3)
6.2.2 Simulated Type I Error Rates for the Test of Compound Symmetry (p=5)
6.2.3 Simulated Type I Error Rates for the Test of Compound Symmetry (p=10)
6.2.4 Simulated Power vs. Type H for the Test of Compound Symmetry (p=3)
6.2.5 Simulated Power vs. Type H for the Test of Compound Symmetry (p=5)
6.2.6 Simulated Power vs. Type H for the Test of Compound Symmetry (p=10)
6.2.7 Simulated Power vs. Serial Correlation for the Test of Compound Symmetry (σ² = 1)
6.3.1 Simulated Type I Error Rates for the Test of Type H (p=3)
6.3.2 Simulated Type I Error Rates for the Test of Type H (p=5)
6.3.3 Simulated Type I Error Rates for the Test of Type H (p=10)
6.3.4 Simulated Power vs. Serial Correlation for the Test of Type H (p=3)
6.3.5 Simulated Power vs. Serial Correlation for the Test of Type H (p=5)
6.3.6 Simulated Power vs. Serial Correlation for the Test of Type H (p=10)
6.4.1 Simulated Type I Error Rates for the Test of Serial Correlation (p=3)
6.4.2 Simulated Type I Error Rates for the Test of Serial Correlation (p=5)
6.4.3 Simulated Type I Error Rates for the Test of Serial Correlation (p=10)
6.4.4 Simulated Power vs. Compound Symmetry for the Test of Serial Correlation (σ² = 1)
6.4.5 Simulated Power vs. Type H for the Test of Serial Correlation (p=3)
6.4.6 Simulated Power vs. Type H for the Test of Serial Correlation (p=5)
6.4.7 Simulated Power vs. Type H for the Test of Serial Correlation (p=10)
6.5.1 Simulated Type I Error Rates for the Test of Independence of Sets of Variates
6.5.2 Simulated Power vs. Non-Independence for the Test of Independence of Sets of Variates
A.2.1 Combinations of d and D Used in the Simulations
A.3.1 Simulated Values of K
A.3.2 Simulated Type I Error Rates for the Test of Serial Correlation with Normally Distributed Data and p=3
A.4.1 Number of Data Sets Simulated for the Test of Independence of Sets of Variates for n=5 and (5,5)

LIST OF FIGURES

2.4.1 General Plot of f(ρ̂)
5.1.1 Distribution of the Test Statistic for Compound Symmetry
5.3.1 Distribution of the Test Statistic for Serial Correlation
A.2.1 Regions of Possible Combinations of d and D
A.3.1 Bias of the MLEs of ρ and σ²
A.3.2 Bias of the Variance Estimator and Its n/(n−1)-Scaled Version
A.3.3 Scatterplots of K and nK

CHAPTER 1

INTRODUCTION

Many statistical procedures, including repeated measures analysis, time-series analysis, structural equation modeling, and factor analysis, require an assessment of the structure of the covariance matrix of the measurements.
For example, consider a repeated measures experiment in which researchers are interested in the effect of various teaching strategies on reading. Throughout the course of the experiment, reading tests are given to children at various time periods and the multiple test scores are recorded for each student. Repeated measurements taken on subjects tend to be correlated; consequently, the assumption of independent observations required by a univariate analysis of variance (ANOVA) is violated. However, Huynh and Feldt (1970) showed that if the structure of the covariance matrix is of the same form as a type H matrix (described in Section 2.3), a univariate ANOVA can be used. If the covariance matrix is not type H, an alternate analysis must be employed. Therefore, it is necessary to determine the structure of the covariance matrix to know how to proceed with the analysis.

The classical parametric method of testing the hypothesis Σ = Σ₀, where Σ₀ is some hypothesized covariance structure, involves the use of a likelihood ratio test statistic that converges in distribution to a chi-squared random variable. This test has many limitations, including the need for very large sample sizes and the requirement of a random sample from a multivariate normal population. It is quite reasonable to think of many situations in which at least one of these conditions is violated. For example, in educational and medical studies, researchers frequently rely on volunteers, violating the assumption of a random sample; in psychological studies, responses are often recorded on Likert scales, violating the assumption of multivariate normality; and in studies in which experimental units are rare or costly, researchers are restricted to very small sample sizes. In situations in which only some or none of these assumptions are met, researchers could benefit from a nonparametric testing procedure.
In particular, permutation tests have no distributional assumptions, do not require random samples, and allow any sample size. The objectives of this research are to develop a permutation testing procedure to test the null hypothesis Σ = Σ₀ and to investigate the empirical type I error rates and power of this test against various alternative structures.

In the following chapters, I will present the motivation for developing such a test. Specifically, in Chapter 2, I will describe the parametric procedures for testing the structure of a covariance matrix, including a discussion of their benefits and limitations; in Chapter 3, I will briefly discuss the use of bootstrapping for estimating or testing the covariance structure; in Chapter 4, I will review the general history and development of permutation tests, including a description of the differences between permutation tests and bootstrapping; in Chapter 5, I will propose a permutation test for the structure of a covariance matrix and will argue as to why such a test would be appropriate and necessary; in Chapter 6, I will describe the evaluation of the proposed test using simulations; in Chapter 7, I will summarize the overall conclusions; and finally, in Chapter 8, I will list some future research questions.

CHAPTER 2

PARAMETRIC TESTS FOR THE STRUCTURE OF A COVARIANCE MATRIX

The classical approach to testing the structure of a covariance matrix involves the use of a likelihood ratio test statistic. Let xᵢ, i = 1, …, n, be p-component vectors distributed according to N_p(μ, Σ), where Σ is positive definite and n > p. The likelihood ratio criterion for testing H₀: Σ = Σ₀ versus Hₐ: Σ ≠ Σ₀ can be found by computing the ratio of the likelihood maximized under the null hypothesis (i.e., with respect to μ, with Σ = Σ₀) to the likelihood maximized under the alternative hypothesis (i.e., with respect to μ and an unrestricted positive definite Σ).
The likelihood function under the alternative hypothesis is given by

  L(μ, Σ) = (2π)^(−np/2) |Σ|^(−n/2) exp[ −(1/2) ∑ᵢ₌₁ⁿ (xᵢ − μ)′Σ⁻¹(xᵢ − μ) ]

and the corresponding log likelihood function is

  log L = −(np/2) log(2π) − (n/2) log|Σ| − (1/2) ∑ᵢ₌₁ⁿ (xᵢ − μ)′Σ⁻¹(xᵢ − μ),   (2.0.1)

where log is assumed base e. Since log L is an increasing function of L, the values that maximize (2.0.1) are equivalent to the values that maximize L. To maximize (2.0.1) with respect to μ and Σ, consider the following lemma given by Anderson (2003).

Lemma 2.0.1. Let xᵢ, i = 1, …, n, be p-component vectors, and let x̄ be the corresponding sample mean vector. Then for any vector b,

  ∑ᵢ₌₁ⁿ (xᵢ − b)(xᵢ − b)′ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ + n(x̄ − b)(x̄ − b)′.

Applying the properties of the trace of a matrix (tr(·)) and Lemma 2.0.1 to just the last term of (2.0.1) gives

  ∑ᵢ₌₁ⁿ (xᵢ − μ)′Σ⁻¹(xᵢ − μ)
   = tr[ Σ⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − μ)(xᵢ − μ)′ ]
   = tr[ Σ⁻¹ ( ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ + n(x̄ − μ)(x̄ − μ)′ ) ]
   = tr[ Σ⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ ] + n(x̄ − μ)′Σ⁻¹(x̄ − μ).

Therefore, (2.0.1) can be written as

  log L = −(np/2) log(2π) − (n/2) log|Σ| − (1/2) tr[ Σ⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ ] − (n/2)(x̄ − μ)′Σ⁻¹(x̄ − μ).   (2.0.2)

To maximize log L with respect to μ, it is only necessary to consider the last term of (2.0.2). Since Σ is positive definite, Σ⁻¹ is also positive definite. Therefore, −(n/2)(x̄ − μ)′Σ⁻¹(x̄ − μ) ≤ 0 and is maximized at 0 if and only if μ = x̄. Consequently, the maximum likelihood estimate (MLE) of μ is μ̂ = x̄. Substituting x̄ for μ, (2.0.2) simplifies to

  log L = −(np/2) log(2π) − (n/2) log|Σ| − (1/2) tr[ Σ⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ ].   (2.0.3)

To find the MLE of Σ, consider another lemma given by Anderson (2003).

Lemma 2.0.2. If D is positive definite of order p, the maximum of f(G) = −n log|G| − tr(G⁻¹D) with respect to positive definite matrices G exists and occurs at G = (1/n)D.

To maximize (2.0.3) with respect to Σ, it is only necessary to consider the second and third terms.
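Lemma 2.0.1 is easy to verify numerically. The following sketch (illustrative only; the data are simulated) checks the identity for an arbitrary vector b:

```python
import numpy as np

# Numerical check of Lemma 2.0.1 (illustrative; simulated data):
# sum_i (x_i - b)(x_i - b)' = sum_i (x_i - xbar)(x_i - xbar)' + n (xbar - b)(xbar - b)'
rng = np.random.default_rng(0)
n, p = 20, 4
x = rng.normal(size=(n, p))   # rows are the p-component vectors x_i
b = rng.normal(size=p)        # arbitrary vector b
xbar = x.mean(axis=0)

lhs = (x - b).T @ (x - b)     # sum of outer products about b
rhs = (x - xbar).T @ (x - xbar) + n * np.outer(xbar - b, xbar - b)
assert np.allclose(lhs, rhs)
```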
Applying Lemma 2.0.2 to these terms gives the MLE of Σ,

  Σ̂ = (1/n) ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′.

These MLEs for μ and Σ, along with the MLEs found under various null hypotheses, can be used to compute likelihood ratio statistics for parametrically testing the structure of a covariance matrix. Specific test statistics for various covariance structures are outlined in the sections to follow.

2.1 TESTS OF SPHERICITY

Consider first the test of sphericity proposed by Mauchly (1940). A p-variate population is called spherical if the variances of the variables are all equal and the pairwise correlations among the variables are all zero. Specifically, this is a test of the null hypothesis Σ_S = σ²I_p, where I_p is the p×p identity matrix and σ² is the hypothesized common variance among the variables. This hypothesis applies to many univariate procedures, such as ANOVA, in which it is assumed that a set of random variables are independent and have a common variance. To test this assumption, the likelihood ratio criterion given by

  λ_S = max_{μ, σ²} L(μ, σ²I_p) / max_{μ, Σ} L(μ, Σ)

can be computed. As shown previously, the MLE of μ does not depend on the specific form of Σ; therefore, the MLEs of μ and Σ, in both the numerator and denominator of λ_S, are given by

  μ̂ = x̄  and  Σ̂ = (1/n) ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′.

To find the MLE of σ², consider the following. Under the null hypothesis, the log likelihood function is

  log L_S = −(np/2) log(2π) − (np/2) log σ² − (1/(2σ²)) ∑ᵢ₌₁ⁿ (xᵢ − μ)′(xᵢ − μ)   (2.1.1)

and the partial derivative of (2.1.1) with respect to σ² is

  ∂ log L_S/∂σ² = −np/(2σ²) + (1/(2σ⁴)) ∑ᵢ₌₁ⁿ (xᵢ − μ)′(xᵢ − μ).   (2.1.2)

Substituting x̄ for μ and setting (2.1.2) equal to 0 gives the MLE of σ²,

  σ̂² = (1/(np)) ∑ᵢ₌₁ⁿ (xᵢ − x̄)′(xᵢ − x̄).
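The MLEs derived to this point can be computed directly. The following is an illustrative Python sketch (the dissertation's own appendix code is in R; here the data are simulated and NumPy/SciPy are assumed available). It also computes the ratio |Σ̂|/(σ̂²)^p and its corrected chi-squared approximation, anticipating the W statistic discussed below:

```python
import numpy as np
from scipy.stats import chi2

# Illustrative sketch (simulated data): MLEs under the alternative and under
# sphericity, and the criterion W = |Sigmahat| / (sigma2hat)^p.
rng = np.random.default_rng(1)
n, p = 40, 4
x = rng.normal(size=(n, p))                 # rows are the observations x_i

muhat = x.mean(axis=0)                      # MLE of mu
centered = x - muhat
Sigmahat = centered.T @ centered / n        # MLE of Sigma (divisor n, not n - 1)
sigma2hat = (centered**2).sum() / (n * p)   # MLE of sigma^2 under H0

# sigma2hat is the average diagonal element of Sigmahat: tr(Sigmahat)/p
assert np.isclose(sigma2hat, np.trace(Sigmahat) / p)

W = np.linalg.det(Sigmahat) / sigma2hat**p      # 0 < W <= 1 by the AM-GM inequality
C = 1 - (2 * p**2 + p + 2) / (6 * p * (n - 1))  # correction factor
stat = -(n - 1) * C * np.log(W)
df = p * (p + 1) / 2 - 1
p_value = chi2.sf(stat, df)                     # approximate p-value
```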
Then, the likelihood ratio criterion for testing Σ_S = σ²I_p becomes

  λ_S = [ (2π)^(−np/2) |σ̂²I_p|^(−n/2) exp(−np/2) ] / [ (2π)^(−np/2) |Σ̂|^(−n/2) exp(−np/2) ] = |Σ̂|^(n/2) / (σ̂²)^(np/2),

since

  tr[ (σ̂²I_p)⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ ] = (1/σ̂²) ∑ᵢ₌₁ⁿ (xᵢ − x̄)′(xᵢ − x̄) = np = tr[ Σ̂⁻¹ ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′ ].   (2.1.3)

λ_S^(2/n) is often called the W statistic in the literature and is usually expressed as

  W = λ_S^(2/n) = |Σ̂| / (σ̂²)^p = |Σ̂| / [ tr(Σ̂)/p ]^p = |Σ̂| / [ (1/p) ∑ⱼ₌₁ᵖ σ̂ⱼⱼ ]^p,   (2.1.4)

where σ̂ⱼⱼ is the jth diagonal element of Σ̂, which corresponds to the variance of the jth variable. To use W for hypothesis testing, it is necessary to know its distribution. Mauchly (1940) gave the exact distribution of W for p = 2 and Consul (1967a) for p = 3, 4, and 6. Nagarsenker and Pillai (1973a, 1973b) derived the exact distribution of W in series form and have published tables of 1% and 5% critical values for various combinations of p and n. However, due to the complexity of the exact distribution, the asymptotic distribution of W is most commonly used in practice. Similar to other likelihood ratio criteria, −n log W is asymptotically distributed as a chi-squared random variable with

  (1/2)p(p + 1) − 1   (2.1.5)

degrees of freedom. This approximation works well for large sample sizes, but performs poorly for small sample sizes. Therefore, Anderson (2003), using a method derived by Box (1949), found a correction factor such that −(n − 1)C log W is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

  C = 1 − (2p² + p + 2) / (6p(n − 1)).   (2.1.6)

Many authors have found, through Monte Carlo simulation, that Mauchly's (1940) test of sphericity has poor power and is not robust to non-normality. Box (1954) developed a measure of the degree to which a covariance matrix is spherical.
He called this measure ε, given by

  ε = ( ∑ⱼ₌₁ᵖ λⱼ )² / ( p ∑ⱼ₌₁ᵖ λⱼ² ),

where λⱼ, j = 1, …, p, are the eigenvalues of Σ. If Σ is spherical, all of the eigenvalues are equal and ε = 1. The further Σ departs from sphericity, the smaller the value of ε becomes, until ε reaches its minimum at 1/(p − 1). For p = 4, Boik (1975) found that ε must be as low as 0.644 for n = 18 and 0.828 for n = 36 before the power of Mauchly's test of sphericity is greater than 0.70. Cornell, et al. (1992) found similar results. For p = 3, ε was 0.51 for n = 10 and 0.77 for n = 30 before the power exceeded 0.70; and for p = 5, ε was 0.43 for n = 10 and 0.5 for n = 30 before the power exceeded 0.70. Therefore, it appears that this test does not have the ability to detect small departures from sphericity, to which the univariate ANOVA F-tests are susceptible (Boik, 1981; Box, 1954; Geisser & Greenhouse, 1958).

Other studies have explored the effects of non-normality on Mauchly's (1940) test of sphericity. Huynh & Mandeville (1979) simulated data from three different light-tailed distributions (the uniform distribution on (0,1), the convolution of two uniforms forming a triangular distribution, and the convolution of three uniforms forming a trapezoidal distribution) and five different heavy-tailed distributions (the distribution of the product of a uniform random variable and a standard normal random variable, the double exponential distribution, and three mixtures of two standard normal distributions). They found that Mauchly's (1940) test of sphericity is conservative in terms of the type I error rate for light-tailed distributions; however, the type I error rates are much larger than the respective nominal rates for heavy-tailed distributions. Also, as the sample size increases, the test becomes more conservative for light-tailed distributions and less conservative for heavy-tailed distributions. Another study by Keselman, et al.
(1980) presented simulated data from a chi-squared distribution with 3 degrees of freedom for which the type I error rate was 0.203, well above the nominal rate of 0.05.

One alternative parametric test of sphericity is the locally best invariant test developed by John (1971, 1972) and Sugiura (1972). The test statistic is

  V = tr(Σ̂²) / [tr(Σ̂)]²   (2.1.7)

and Sugiura showed that

  (np²/2)( V − 1/p )   (2.1.8)

is asymptotically distributed as a chi-squared random variable with (1/2)p(p + 1) − 1 degrees of freedom. This test has slightly greater power than Mauchly's (1940) test of sphericity, with the difference increasing as p approaches n. However, this test still suffers from a lack of power to detect small departures from sphericity (Carter & Srivastava, 1983; Cornell et al., 1992).

In addition to Mauchly's (1940) test of sphericity and the locally best invariant test, several other tests of sphericity exist. One such statistic, developed by Krishnaiah and Waikar (1972), consists of the ratio of the largest to smallest eigenvalues of Σ̂, and another family of test statistics is based on Roy's union-intersection principle (Khatri, 1978; Srivastava & Khatri, 1979; Venables, 1976). In each case, however, the power is smaller than for Mauchly's test of sphericity (Cornell et al., 1992). Therefore, the details of these tests will not be discussed here.

2.2 TEST OF COMPOUND SYMMETRY

Independence between variables is actually too restrictive an assumption for a valid univariate ANOVA. It has been shown that the compound symmetry covariance structure is sufficient (Box, 1950). This structure arises when the variances of the variables are all equal and the covariances (or pairwise correlations) of the variables are all equal. Wilks (1946) was the first to develop a test for compound symmetry structure.
In matrix notation, this is a test of the null hypothesis Σ_CS = σ²[(1 − ρ)I_p + ρ1_p1_p′], where σ² is the common variance, ρ is the common pairwise correlation, I_p is the p×p identity matrix, and 1_p is a p×1 vector of ones.

The derivation of this test is an extension of Mauchly's (1940) test of sphericity. The likelihood ratio criterion is given by

  λ_CS = max_{μ, σ², ρ} L(μ, Σ_CS) / max_{μ, Σ} L(μ, Σ)   (2.2.1)

and the MLEs of μ and Σ in both the numerator and denominator of λ_CS can be found as shown at the beginning of Chapter 2. That is, μ̂ = x̄ and Σ̂ = (1/n) ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′. To find the MLEs of σ² and ρ, it will be necessary to determine the inverse of the covariance matrix under the null hypothesis. Call this matrix Σ_CS⁻¹. Wilks (1946) showed that

  Σ_CS⁻¹ =
  [ A B ⋯ B
    B A ⋯ B
    ⋮ ⋮ ⋱ ⋮
    B B ⋯ A ],

where

  A = [1 + (p − 2)ρ] / [σ²(1 − ρ)(1 + (p − 1)ρ)]  and  B = −ρ / [σ²(1 − ρ)(1 + (p − 1)ρ)].

He also noted that

  |Σ_CS⁻¹| = (A − B)^(p−1) [A + (p − 1)B].

Therefore, the log likelihood, under the null hypothesis, becomes

  log L_CS = −(np/2) log(2π) + (n/2) log{ (A − B)^(p−1)[A + (p − 1)B] } − (1/2) ∑ᵢ₌₁ⁿ [ A ∑ⱼ₌₁ᵖ (x_ij − μⱼ)² + B ∑ⱼ₌₁ᵖ ∑_{k≠j} (x_ij − μⱼ)(x_ik − μₖ) ],

where x_ij and μⱼ are the jth elements of xᵢ and μ, respectively. The MLEs of A and B can be found by substituting x̄ for μ and solving the system of equations given by ∂ log L_CS/∂A = 0 and ∂ log L_CS/∂B = 0. This results in the following MLEs of A and B,

  Â = [1 + (p − 2)r] / [s²(1 − r)(1 + (p − 1)r)]  and  B̂ = −r / [s²(1 − r)(1 + (p − 1)r)],

where

  s_jk = (1/n) ∑ᵢ₌₁ⁿ (x_ij − x̄ⱼ)(x_ik − x̄ₖ),  r = ∑_{j≠k} s_jk / [ (p − 1) ∑ⱼ₌₁ᵖ s_jj ],  and  s² = (1/p) ∑ⱼ₌₁ᵖ s_jj.

Substituting these MLEs into (2.2.1) and applying a similar argument to that shown in (2.1.3), we obtain

  λ_CS^(2/n) = |Σ̂| / [ (s²)^p (1 − r)^(p−1) (1 + (p − 1)r) ].

Wilks (1946) determined the exact distribution of λ_CS^(2/n) for p = 2 and 3. However, the derivation of the exact distribution for larger values of p is too complex to be of practical use. Therefore, the asymptotic distribution is more commonly used. Specifically, −n log λ_CS^(2/n) is asymptotically distributed as a chi-squared random variable with (1/2)p(p + 1) − 2 degrees of freedom. As with other likelihood ratio tests, this is a good approximation for large sample sizes, but is very poor for small sample sizes. Therefore, the corrected likelihood ratio test derived by Box (1950) is preferred. Box found that −(n − 1)C log λ_CS^(2/n) is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

  C = 1 − [ p(p + 1)²(2p − 3) ] / [ 6(n − 1)(p − 1)(p² + p − 4) ].

2.3 TEST OF TYPE H STRUCTURE

Huynh and Feldt (1970) and Rouanet and Lepine (1970) showed independently that the conditions required for a valid univariate ANOVA are actually less stringent than the sphericity or compound symmetry conditions. Specifically, they found that if the covariance matrix is of the form

  Σ_TH = ( σ_ij )_{p×p},   (2.3.1)

where σ_ij = (1/2)(σ_ii + σ_jj) − λ for i ≠ j and some λ > 0, then the mean square ratios in the univariate ANOVA have exact F-distributions. Huynh and Feldt called this covariance form a type H matrix. (Notice that when the variances are equal in a type H matrix, the covariance matrix has compound symmetry.) More recently, type H structure has come to be known as spherical. However, since both forms will be discussed separately in this paper, the covariance structure of Section 2.1 will be referred to as spherical and that of this section will be referred to as type H.

Conveniently, Mauchly's (1940) test of sphericity described in Section 2.1 can be used to test whether a covariance matrix has the type H structure (Kuehl, 2000).
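This use of Mauchly's criterion can be illustrated numerically. The sketch below (illustrative Python; simulated data; NumPy/SciPy assumed) builds a normalized orthogonal contrast matrix by orthonormalizing the complement of the vector of ones, then applies the sphericity test to the transformed covariance matrix:

```python
import numpy as np
from scipy.stats import chi2

# Sketch of the type H test: apply Mauchly's criterion to C Sigmahat C',
# where C is a (p-1) x p matrix of normalized orthogonal contrasts.
rng = np.random.default_rng(3)
n, p = 40, 4
x = rng.normal(size=(n, p))
centered = x - x.mean(axis=0)
Sigmahat = centered.T @ centered / n

# Build C: an orthonormal basis of the space orthogonal to 1_p
Q, _ = np.linalg.qr(np.column_stack([np.ones(p), np.eye(p)[:, : p - 1]]))
C = Q[:, 1:].T                      # rows are normalized orthogonal contrasts
assert np.allclose(C @ np.ones(p), 0) and np.allclose(C @ C.T, np.eye(p - 1))

S = C @ Sigmahat @ C.T              # (p-1) x (p-1) transformed covariance
W = np.linalg.det(S) / (np.trace(S) / (p - 1)) ** (p - 1)

Cf = 1 - (2 * p**2 - 3 * p + 3) / (6 * (p - 1) * (n - 1))  # correction factor
stat = -(n - 1) * Cf * np.log(W)
df = p * (p - 1) / 2 - 1
p_value = chi2.sf(stat, df)
```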
Let C be a (p − 1)×p matrix whose rows are normalized orthogonal contrasts on the p repeated measures. If Σ is of type H, then Σ can be expressed as

  Σ = A + A′ + λI_p,

where the elements in the ith row of A are all equal to a_i = (1/2)(σ_ii − λ). Then,

  CΣC′ = CAC′ + CA′C′ + λCC′.

Since each row of A consists of equivalent elements and C is orthogonal, it can be shown that CAC′ = CA′C′ = 0 and CC′ = I_{p−1}. Therefore, CΣC′ = λI_{p−1}, and Mauchly's test of sphericity can be used to test H₀: CΣC′ = λI_{p−1} versus Hₐ: CΣC′ ≠ λI_{p−1}. Substituting p − 1 for p and CΣ̂C′ for Σ̂ in (2.1.4), (2.1.5), and (2.1.6), the test statistic is

  W = |CΣ̂C′| / [ tr(CΣ̂C′)/(p − 1) ]^(p−1)

and −n log W is asymptotically distributed as a chi-squared random variable with (1/2)p(p − 1) − 1 degrees of freedom; or, after applying a correction factor, −(n − 1)C log W is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

  C = 1 − (2p² − 3p + 3) / [ 6(p − 1)(n − 1) ].   (2.3.2)

Just as for the test of sphericity, there are alternative tests for type H covariance structure, including a locally best invariant test. Substituting p − 1 for p and CΣ̂C′ for Σ̂ in (2.1.7) and (2.1.8) yields the corresponding test statistic and asymptotic distribution. Krishnaiah and Waikar's (1972) test and the union-intersection tests described in Section 2.1 can also be adapted to test for type H structure. All of these tests, however, suffer from the same limitations as the tests of sphericity: they have poor power, especially to detect small departures from type H structure, and are not robust to non-normality.

2.4 TEST OF SERIAL CORRELATION

For designs, such as repeated measures, in which one of the factors is time, observations closer together temporally tend to be more highly correlated than those farther apart.
This covariance pattern is known as serial correlation, simplex, or autoregressive of order one, and has the form

  Σ_SC = [σ²/(1 − ρ²)] ·
  [ 1        ρ        ρ²       ⋯ ρ^(p−1)
    ρ        1        ρ        ⋯ ρ^(p−2)
    ρ²       ρ        1        ⋯ ρ^(p−3)
    ⋮        ⋮        ⋮        ⋱ ⋮
    ρ^(p−1)  ρ^(p−2)  ρ^(p−3)  ⋯ 1 ],   (2.4.1)

where σ²/(1 − ρ²) is the common variance of the p observations and ρ is the correlation between successive observations in time. Hearne et al. (1983) developed a likelihood ratio test for the null hypothesis Σ = Σ_SC. The derivation of this test is as follows. The likelihood ratio criterion is given by

  λ_SC = max_{μ, σ², ρ} L(μ, Σ_SC) / max_{μ, Σ} L(μ, Σ),   (2.4.2)

where the MLEs of μ and Σ in both the numerator and denominator are μ̂ = x̄ and Σ̂ = (1/n) ∑ᵢ₌₁ⁿ (xᵢ − x̄)(xᵢ − x̄)′, as shown at the beginning of Chapter 2. Before deriving the MLEs of σ² and ρ, note that it can be shown that

  |Σ_SC| = (σ²)^p / (1 − ρ²)  and  Σ_SC⁻¹ = (1/σ²)( I_p − ρC₁ + ρ²C₂ ),

where p, σ², and ρ are as defined previously, I_p is the p×p identity matrix, and C₁ and C₂ are given by

  C₁ =
  [ 0 1 0 ⋯ 0
    1 0 1 ⋯ 0
    ⋮ ⋱ ⋱ ⋱ ⋮
    0 ⋯ 1 0 1
    0 ⋯ 0 1 0 ]_{p×p}
  and  C₂ = diag(0, 1, …, 1, 0)_{p×p}.   (2.4.3)

Using this notation and substituting x̄ for μ, the log likelihood under the null hypothesis can be expressed as

  log L_SC = −(np/2) log(2π) − (n/2) log|Σ_SC| − (1/(2σ²)) ∑ᵢ₌₁ⁿ (xᵢ − x̄)′(I_p − ρC₁ + ρ²C₂)(xᵢ − x̄)
   = −(np/2) log(2π) − (np/2) log σ² + (n/2) log(1 − ρ²) − (1/(2σ²))( S₃ − ρS₁ + ρ²S₂ ),

where

  S₁ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)′C₁(xᵢ − x̄),  S₂ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)′C₂(xᵢ − x̄),  and  S₃ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)′(xᵢ − x̄).

Taking the partial derivatives of log L_SC with respect to σ² and ρ yields

  ∂ log L_SC/∂σ² = −np/(2σ²) + (1/(2σ⁴))( S₃ − ρS₁ + ρ²S₂ )

and

  ∂ log L_SC/∂ρ = −nρ/(1 − ρ²) + (1/(2σ²))( S₁ − 2ρS₂ ).

Setting these derivatives equal to zero and solving simultaneously results in

  σ̂² = (1/(np))( S₃ − ρ̂S₁ + ρ̂²S₂ )   (2.4.4)

and

  2S₂(p − 1)ρ̂³ + S₁(2 − p)ρ̂² − 2(pS₂ + S₃)ρ̂ + pS₁ = 0.   (2.4.5)

Note that the MLE of σ² is easy to obtain once the MLE of ρ has been determined; however, there are three possible solutions for ρ̂ to equation (2.4.5). To determine the appropriate solution, consider the following. Call the left-hand side of (2.4.5) f(ρ̂). Then,

  f(−1) = −2S₂(p − 1) + S₁(2 − p) + 2(pS₂ + S₃) + pS₁ = 2( S₁ + S₂ + S₃ ) > 0

and

  f(1) = 2S₂(p − 1) + S₁(2 − p) − 2(pS₂ + S₃) + pS₁ = 2( S₁ − S₂ − S₃ ) < 0,

and, consequently, there must be at least one solution in the interval (−1, 1). If there is only one solution in (−1, 1), then that is the only reasonable solution, since the MLE of the correlation between successive observations in time must be in (−1, 1). Now note that f(−∞) = −∞ and f(∞) = ∞. So, a general plot of f(ρ̂) would appear as shown in Figure 2.4.1 below, with one solution in each of (−∞, −1), (−1, 1), and (1, ∞). Therefore, there is one and only one solution in (−1, 1), which is the desired MLE of ρ.

Figure 2.4.1. General Plot of f(ρ̂).

Finally, after substituting these MLEs into (2.4.2) and applying an argument similar to (2.1.3), the likelihood ratio criterion becomes

  λ_SC^(2/n) = |Σ̂| (1 − ρ̂²) / (σ̂²)^p,

where −n log λ_SC^(2/n) is asymptotically distributed as a chi-squared random variable with (1/2)p(p + 1) − 2 degrees of freedom. A correction factor for this likelihood ratio test, similar to those for the tests of sphericity and compound symmetry, is not known to exist, and Hearne and Clark (1983) even go so far as to say that one is not tractable. Therefore, using simulation and simple linear regression, they derived an approximate correction factor Ĉ, a linear function of n and p with fitted coefficients 1.541, 1.017, and 0.414, such that −Ĉn log λ_SC^(2/n) is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above.

2.5 TEST OF INDEPENDENCE OF SETS OF VARIATES

In some situations it may be of interest to determine whether k groups of variables are mutually independent.
Let $x_i$, $i = 1, \ldots, n$, be partitioned into k subvectors with $p_1, p_2, \ldots, p_k$ components $\left(\sum_m p_m = p\right)$, so that $x_i = \left(x_i^{(1)\prime}, x_i^{(2)\prime}, \ldots, x_i^{(k)\prime}\right)'$. Also, let $\mu$ and $\Sigma$ be partitioned accordingly:

$$\mu = \begin{bmatrix}\mu^{(1)} \\ \mu^{(2)} \\ \vdots \\ \mu^{(k)}\end{bmatrix} \quad\text{and}\quad \Sigma = \begin{bmatrix}\Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1k} \\ \Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{k1} & \Sigma_{k2} & \cdots & \Sigma_{kk}\end{bmatrix},$$

where $\Sigma_{ij} = \Sigma_{ji}'$. The null hypothesis of interest is that the subvectors $x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(k)}$ are mutually independent. This is equivalent to testing $\Sigma = \Sigma_I$, where

$$\Sigma_I = \begin{bmatrix}\Sigma_{11} & 0 & \cdots & 0 \\ 0 & \Sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma_{kk}\end{bmatrix}.$$

Wilks (1935) is credited with developing a likelihood ratio criterion for testing this hypothesis. Consider the likelihood ratio criterion given by

$$\lambda_I = \frac{\max_{\mu,\,\Sigma_I} L(\mu,\Sigma_I)}{\max_{\mu,\,\Sigma} L(\mu,\Sigma)}.$$

As shown at the beginning of Chapter 2, the MLEs of $\mu$ and $\Sigma$ in the numerator and denominator of $\lambda_I$ are given by $\hat\mu = \bar{x}$ and $\hat\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})'$, where $\bar{x}$ is partitioned as $\bar{x} = \left(\bar{x}^{(1)\prime}, \bar{x}^{(2)\prime}, \ldots, \bar{x}^{(k)\prime}\right)'$. Under the null hypothesis, the likelihood function becomes

$$L_I = \prod_{m=1}^{k} L_m\left(\mu^{(m)}, \Sigma_{mm}\right),$$

where

$$L_m\left(\mu^{(m)}, \Sigma_{mm}\right) = (2\pi)^{-np_m/2}\,|\Sigma_{mm}|^{-n/2}\exp\left\{-\tfrac{1}{2}\sum_{i=1}^{n}\left(x_i^{(m)}-\mu^{(m)}\right)'\Sigma_{mm}^{-1}\left(x_i^{(m)}-\mu^{(m)}\right)\right\}.$$

Maximizing $L_I$ is equivalent to maximizing $\prod_{m=1}^{k} L_m\left(\mu^{(m)}, \Sigma_{mm}\right)$ and, since the likelihood function is strictly nonnegative,

$$\max_{\mu,\,\Sigma_I} \prod_{m=1}^{k} L_m\left(\mu^{(m)}, \Sigma_{mm}\right) = \prod_{m=1}^{k}\;\max_{\mu^{(m)},\,\Sigma_{mm}} L_m\left(\mu^{(m)}, \Sigma_{mm}\right).$$

Therefore, each $L_m\left(\mu^{(m)}, \Sigma_{mm}\right)$ can be maximized separately. Thus, the MLEs of $\mu^{(m)}$ and $\Sigma_{mm}$ can be found as shown at the beginning of Chapter 2. That is, $\hat\mu^{(m)} = \bar{x}^{(m)}$ and $\hat\Sigma_{mm} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^{(m)}-\bar{x}^{(m)}\right)\left(x_i^{(m)}-\bar{x}^{(m)}\right)'$. By an argument similar to (2.1.3), the likelihood ratio criterion then becomes

$$\lambda_I = \frac{\prod_{m=1}^{k}(2\pi)^{-np_m/2}\,|\hat\Sigma_{mm}|^{-n/2}\exp\left(-\tfrac{np_m}{2}\right)}{(2\pi)^{-np/2}\,|\hat\Sigma|^{-n/2}\exp\left(-\tfrac{np}{2}\right)} = \frac{|\hat\Sigma|^{n/2}}{\prod_{m=1}^{k}|\hat\Sigma_{mm}|^{n/2}}.$$
This can be further reduced by recognizing that each element of $\hat\Sigma$ (and consequently each element of $\hat\Sigma_{mm}$) can be expressed as $s_{ij} = r_{ij}\sqrt{s_{ii}s_{jj}}$, where $s_{ij}$ and $r_{ij}$ are the sample covariance and sample correlation, respectively, of the ith and jth variables. After calculating the determinants and canceling like terms in the numerator and denominator, $\lambda_I$ can be expressed entirely in terms of the sample correlation matrix, $\hat{R}$, as

$$\lambda_I = \frac{|\hat{R}|^{n/2}}{\prod_{m=1}^{k}|\hat{R}_{mm}|^{n/2}}.$$

Wilks (1935), Wald and Brookner (1941), and Consul (1967b) have determined the exact distribution of $\lambda_I$ for various values of k and $p_m$ ($m = 1, \ldots, k$). However, the asymptotic distribution determined by Box (1949) is much more practical and is applicable to any combination of k and $p_m$. Box (1949) found that $-n\log\lambda_I^{2/n}$ is asymptotically distributed as a chi-squared random variable with

$$\frac{1}{2}p(p+1) - \frac{1}{2}\sum_{m=1}^{k}p_m(p_m+1) = \frac{1}{2}\left(p^2 - \sum_{m=1}^{k}p_m^2\right)$$

degrees of freedom. As with other likelihood ratio tests, this approximation is very poor for small sample sizes. Consequently, Box (1949) derived a correction factor such that $-(n-1)\,C\log\lambda_I^{2/n}$ is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

$$C = 1 - \frac{2\left(p^3 - \sum_{m=1}^{k}p_m^3\right) + 3\left(p^2 - \sum_{m=1}^{k}p_m^2\right)}{6(n-1)\left(p^2 - \sum_{m=1}^{k}p_m^2\right)}.$$

2.6 FACTOR ANALYSIS / STRUCTURAL EQUATION MODELING

Factor analysis is a multivariate procedure in which one tries to account for the covariances among the observed variables by a smaller number of underlying hypothetical variables, called factors. Let $x_i$, $i = 1, \ldots, n$, be p-component vectors of observations from a population with mean $\mu$ and covariance matrix $\Sigma$. The factor analysis model is given by $x_i = \mu + \Lambda f + e$, where f is an $m\times 1$ ($m < p$) vector of underlying factors, $\Lambda$ is a $p\times m$ matrix of factor loadings, and e is a $p\times 1$ vector of residuals.
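Box's corrected statistic can be assembled directly from the sample correlation matrix. A hedged sketch (the helper name is mine, and NumPy is assumed; `sizes` gives the block sizes $p_1, \ldots, p_k$):

```python
import numpy as np

def box_independence_test(X, sizes):
    """Box's (1949) chi-squared approximation for Wilks' test that the
    groups of columns given by `sizes` are mutually independent.
    Returns the corrected test statistic and its degrees of freedom."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # log lambda^(2/n) = log |R| - sum_m log |R_mm| over diagonal blocks.
    idx = np.cumsum([0] + list(sizes))
    log_lam = np.log(np.linalg.det(R))
    for a, b in zip(idx[:-1], idx[1:]):
        log_lam -= np.log(np.linalg.det(R[a:b, a:b]))
    sum2 = sum(s**2 for s in sizes)
    sum3 = sum(s**3 for s in sizes)
    df = (p**2 - sum2) / 2
    C = 1 - (2 * (p**3 - sum3) + 3 * (p**2 - sum2)) / (6 * (n - 1) * (p**2 - sum2))
    return -(n - 1) * C * log_lam, df
```

Since $|\hat{R}| \le \prod_m |\hat{R}_{mm}|$ for any correlation matrix, the statistic is nonnegative, and for k groups of one variable each it reduces to Bartlett's familiar test of $R = I$.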
It is assumed that the underlying factors are independently and identically distributed with mean 0 and covariance matrix I, that the residuals are independently distributed with mean 0 and covariance matrix $\Psi$, and that f and e are independent. Therefore,

$$\Sigma = \mathrm{Cov}(x_i) = \mathrm{Cov}(\Lambda f) + \mathrm{Cov}(e) = \Lambda\,\mathrm{Cov}(f)\,\Lambda' + \Psi = \Lambda I \Lambda' + \Psi = \Lambda\Lambda' + \Psi.$$

In most applications, factor analysis is performed on the centered data, $x_i - \mu$, since $\mathrm{Cov}(x_i - \mu) = \mathrm{Cov}(x_i)$. Therefore, for the remainder of this section, the p-component vector $x_i$ will represent the centered data. In factor analysis, the researcher hypothesizes an adequate number of underlying factors, then chooses one of many methods to estimate $\Lambda$, based on the chosen number of factors. One such method of estimation is the maximum likelihood method. The advantage of using this procedure, assuming the data come from a multivariate normal population, is that it allows the computation of a likelihood ratio test statistic that can be used to test the goodness of fit of the chosen number of factors. This is a test of $H_0$: there are m underlying factors, or in matrix form, $H_0: \Sigma = \Sigma_F = \Lambda\Lambda' + \Psi$. The details of the derivation of this likelihood ratio test statistic can be found in Lawley and Maxwell (1971). Briefly, consider the likelihood ratio criterion given by

$$\lambda_F = \frac{\max_{\mu,\,\Sigma_F} L(\mu,\Sigma_F)}{\max_{\mu,\,\Sigma} L(\mu,\Sigma)} = \frac{(2\pi)^{-np/2}\,\left|\hat\Lambda\hat\Lambda' + \hat\Psi\right|^{-n/2}\exp\left\{-\tfrac{n}{2}\,\mathrm{tr}\left[\left(\hat\Lambda\hat\Lambda' + \hat\Psi\right)^{-1}\hat\Sigma\right]\right\}}{(2\pi)^{-np/2}\,|\hat\Sigma|^{-n/2}\exp\left\{-\tfrac{np}{2}\right\}},$$

where $\hat\Lambda$, $\hat\Psi$, and $\hat\Sigma$ are the MLEs of $\Lambda$, $\Psi$, and $\Sigma$, respectively. The MLE of $\Sigma$ can be found as shown at the beginning of Chapter 2; namely, $\hat\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})'$. Typically, the MLEs of $\Lambda$ and $\Psi$ are derived by maximizing the log-likelihood function with respect to $\Lambda$ and $\Psi$.
However, Lawley and Maxwell (1971) state that it is more convenient to minimize

$$-2\log\lambda_F^* = n\left[\log\left|\Lambda\Lambda' + \Psi\right| + \mathrm{tr}\left\{\left(\Lambda\Lambda' + \Psi\right)^{-1}S\right\} - \log|S| - p\right],$$

where the * indicates that the unbiased sample covariance matrix, S, is used in place of $\hat\Sigma$ in $\lambda_F$. Minimizing $-2\log\lambda_F^*$ instead of maximizing

$$\log L_F = -\tfrac{1}{2}np\log(2\pi) - \tfrac{1}{2}n\log\left|\Lambda\Lambda' + \Psi\right| - \tfrac{1}{2}n\,\mathrm{tr}\left\{\left(\Lambda\Lambda' + \Psi\right)^{-1}\hat\Sigma\right\}$$

is acceptable since they differ by a constant, $\tfrac{1}{2}np\log(2\pi)$, and a function of the data, $n\left(\log|S| + p\right)$, and the remaining terms of $\log L_F$ are just $-\tfrac{1}{2}$ times the corresponding terms of $-2\log\lambda_F^*$. The only other difference between $\log L_F$ and $-2\log\lambda_F^*$ is the use of S in $-2\log\lambda_F^*$ rather than $\hat\Sigma$. Since $\hat\Sigma = \frac{n-1}{n}S$, these matrices will be essentially equivalent for large n. To find $\hat\Lambda$ and $\hat\Psi$, let $\lambda_{ij}$ and $\sigma_{ij}$ be the elements in the ith row and jth column of $\Lambda$ and $\Sigma$, respectively. Also, let $\psi_i$ be the ith diagonal element of $\Psi$. (The nondiagonal elements of $\Psi$ are all zero, since the residuals are independently distributed.) Then

$$\sigma_{ii} = \sum_{k=1}^{m}\lambda_{ik}^2 + \psi_i \quad\text{and}\quad \sigma_{ij} = \sum_{k=1}^{m}\lambda_{ik}\lambda_{jk}, \quad i \neq j,$$

and the MLEs of $\Lambda$ and $\Psi$ can be found by setting the partial derivatives of $\log L_F$ with respect to each $\lambda_{ij}$ and each $\psi_i$ equal to zero (applying the chain rule through the elements $\sigma_{ij}$) and solving simultaneously. In most cases, these equations cannot be solved directly. Therefore, an iterative numerical procedure, such as Newton-Raphson, scoring, or steepest descent, must be used to find the MLEs. To perform this likelihood ratio test, we must know how the test statistic is distributed. Lawley and Maxwell (1971) found that $-2\log\lambda_F^*$ is asymptotically distributed as a chi-squared random variable with $\tfrac{1}{2}\left[(p-m)^2 - (p+m)\right]$ degrees of freedom. For many years, this likelihood ratio test was the primary criterion used to determine the goodness of fit of the hypothesized number of factors.
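The quantity being minimized is simply a discrepancy between the model-implied covariance matrix and S, evaluated at candidate estimates. A hedged sketch of that evaluation (function names are mine; this is the fit function, not the iterative optimizer that produces $\hat\Lambda$ and $\hat\Psi$):

```python
import numpy as np

def ml_fit_statistic(Lam, Psi, S, n):
    """-2 log lambda*_F for H0: Sigma = Lam Lam' + Psi (Lawley & Maxwell)."""
    p = S.shape[0]
    Sigma0 = Lam @ Lam.T + Psi
    _, logdet0 = np.linalg.slogdet(Sigma0)
    _, logdetS = np.linalg.slogdet(S)
    # log|Sigma0| + tr(Sigma0^{-1} S) - log|S| - p, scaled by n.
    return n * (logdet0 + np.trace(np.linalg.solve(Sigma0, S)) - logdetS - p)

def lrt_df(p, m):
    # Degrees of freedom for the goodness-of-fit test of m factors.
    return ((p - m) ** 2 - (p + m)) / 2
```

When the model-implied matrix reproduces S exactly, the trace term equals p and the log-determinants cancel, so the statistic is zero, its minimum.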
However, in the early 1980s researchers discovered through Monte Carlo simulation that in large samples good-fitting models were rejected too often, and in small samples the type I error rates were too large and power was very poor (Gerbing & Anderson, 1993).

CHAPTER 3

BOOTSTRAPPING

At least one nonparametric procedure has been applied to the problem of testing the structure of a covariance matrix. Specifically, bootstrapping has been used to estimate the distribution of the likelihood ratio test statistic used in structural equation modeling. Consider testing the hypothesis $H_0: \Sigma = \Sigma(\theta)$, where $\Sigma(\theta)$ is the hypothesized covariance structure expressed as a function of the vector of parameters, $\theta$. To perform a bootstrap test, the resampling must be done from a bootstrap population with covariance structure specified by the null hypothesis. Therefore, the observed data must be transformed as follows before resampling (Bollen & Stine, 1993). Let X be the $n\times p$ matrix of centered data, let $S = X'X/(n-1)$ denote the sample covariance matrix of X, and let $\hat\Sigma = \Sigma(\hat\theta)$ be the estimated hypothesized covariance structure. Also, let $M^{1/2}$ denote the lower triangular matrix resulting from the Cholesky decomposition of a positive definite matrix M, and write $S^{-1/2}$ for $\left(S^{1/2\,\prime}\right)^{-1}$. Then the sample covariance matrix of the transformed data $Y = XS^{-1/2}\hat\Sigma^{1/2\,\prime}$ is given by

$$\frac{1}{n-1}Y'Y = \hat\Sigma^{1/2}\left(S^{1/2}\right)^{-1}\left(\frac{1}{n-1}X'X\right)\left(S^{1/2\,\prime}\right)^{-1}\hat\Sigma^{1/2\,\prime} = \hat\Sigma^{1/2}\left(S^{1/2}\right)^{-1}S^{1/2}S^{1/2\,\prime}\left(S^{1/2\,\prime}\right)^{-1}\hat\Sigma^{1/2\,\prime} = \hat\Sigma^{1/2}\hat\Sigma^{1/2\,\prime} = \hat\Sigma.$$

Therefore, the transformed data matrix, Y, has sample covariance matrix $\hat\Sigma$. To find the bootstrap distribution of the likelihood ratio test statistic, resample the rows of the transformed data matrix, Y, with replacement; compute the sample covariance matrix $S^*$ of the resampled data; find the MLEs of the parameters based on the resampled data; and compute the likelihood ratio test statistic for the resampled data.
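The rescaling step can be verified directly. A small sketch of the Bollen-Stine transformation using Cholesky factors (the function name is mine; note the transposes needed so that the sample covariance of Y equals $\hat\Sigma$ exactly):

```python
import numpy as np

def bollen_stine_transform(X, Sigma0):
    """Rescale centered data X so its sample covariance equals Sigma0.

    With S = L_S L_S' and Sigma0 = L_0 L_0' (lower-triangular Cholesky
    factors), Y = X (L_S')^{-1} L_0' has sample covariance
    L_0 L_S^{-1} S (L_S')^{-1} L_0' = L_0 L_0' = Sigma0.
    """
    n = X.shape[0]
    X = X - X.mean(axis=0)              # ensure the data are centered
    S = X.T @ X / (n - 1)
    L_S = np.linalg.cholesky(S)
    L_0 = np.linalg.cholesky(Sigma0)
    return X @ np.linalg.inv(L_S).T @ L_0.T
```

Resampling rows of the returned matrix with replacement then draws bootstrap samples from a population whose covariance structure satisfies the null hypothesis.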
When the null hypothesis is true, the bootstrap distribution is approximately the same as the distribution of the likelihood ratio test statistic, and a p-value can be computed by dividing the number of bootstrap samples resulting in a test statistic value at least as large as the one resulting from the observed data by the total number of bootstrap samples.

CHAPTER 4

PERMUTATION TESTS

Permutation tests have long been studied as alternatives to parametric procedures when the assumptions of such procedures are violated. The idea behind permutation tests is to generate the sampling distribution of a test statistic from the values obtained by calculating the test statistic for all possible permutations of the data under the null hypothesis. Consider a simple example given by Edgington (1995). Suppose five subjects are randomly assigned to two treatments, A and B, with the following results.

A: 18, 30, 54
B: 6, 12

We wish to test the null hypothesis of no difference in the treatment means. Assuming this null hypothesis is true, we would expect each subject to have the same result regardless of their treatment assignment. Therefore, under the null hypothesis, there are ${}_5C_3 = 10$ possible arrangements of subjects to treatments, where ${}_nC_k$ is the number of combinations of k items chosen from a total of n items. These arrangements, as well as the corresponding test statistic values, are displayed in Table 4.0.1. In this example, the test statistic is the absolute value of the pooled t-test statistic (displayed as |t| in Table 4.0.1).

Table 4.0.1. All Possible Permutations of the Observed Data

Trt A        Trt B     |t|
18, 30, 54   6, 12     1.81
6, 12, 18    30, 54    3.00
6, 12, 30    18, 54    1.22
6, 12, 54    18, 30    0.00
6, 18, 30    12, 54    0.83
6, 18, 54    12, 30    0.25
6, 30, 54    12, 18    0.83
12, 18, 30   6, 54     0.52
12, 18, 54   6, 30     0.52
12, 30, 54   6, 18     1.22

The distribution of the test statistic values calculated from each possible permutation of the data is the sampling distribution of the test statistic.
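Table 4.0.1 can be reproduced by brute-force enumeration. A small sketch in Python (only the standard library is needed; the small tolerance guards against floating-point ties at the observed value):

```python
from itertools import combinations
from statistics import mean, variance  # variance = sample variance

def pooled_t(a, b):
    """Two-sample pooled t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

data = (18, 30, 54, 6, 12)                       # observed responses
observed = abs(pooled_t((18, 30, 54), (6, 12)))  # about 1.81
count = 0
for trt_a in combinations(data, 3):              # the 5C3 = 10 arrangements
    trt_b = list(data)
    for v in trt_a:
        trt_b.remove(v)
    if abs(pooled_t(trt_a, trt_b)) >= observed - 1e-9:
        count += 1
print(count / 10)                                # two-tailed permutation p-value
```

Two of the ten arrangements give $|t| \geq 1.81$, matching the bolded rows of Table 4.0.1.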
Consequently, a p-value can be found by dividing the number of test statistic values greater than or equal to the value obtained from the observed data by the total number of permutations of the data under the null hypothesis. In this example, the p-value is $2/10 = 0.20$, since there are two values of |t| that are greater than or equal to 1.81 (bolded numbers in Table 4.0.1), the value obtained from the observed data. Notice that the 1.81 corresponds to the most significant data configuration in which the mean for treatment A is greater than that of treatment B, and the 3.00 corresponds to the most significant data configuration in which the mean for treatment A is less than that of treatment B. Therefore, a one-tailed p-value is $1/10 = 0.10$. Fisher (1971) is often credited with developing the first permutation tests; however, Edgington (1995) claims that permutation tests based on the ranks of data have been used since the 1880s. Even if Fisher was not the first to develop permutation tests in general, he was the first to suggest permuting the raw data as opposed to the ranks of the data. He is also responsible for generating considerable interest in the merits of permutation tests, namely the lack of the distributional assumptions required by parametric tests. Fisher (1971) writes, "The utility of such nonparametric (permutation) tests consists in their being able to supply confirmation whenever, rightly or, more often, wrongly, it is suspected that the simpler (parametric) tests have been appreciably injured by departures from normality" (p. 48). Fisher (1936) even goes on to write in a later article that conclusions from parametric tests "have no justification beyond the fact that they agree with those which could have been arrived at by this elementary method (permutation tests)" (p. 59).
Although Fisher (1971) showed that permutation tests eliminate the need for normality, it was another statistician, Pitman (1937a, 1937b, 1938), who recognized that permutation tests also eliminate the need for random samples. In these three papers, Pitman developed much of the theory of permutation tests and showed that random sampling is not necessary for a valid test. Rather, random assignment of experimental units to treatments is sufficient. Given the benefits of permutation tests, one would assume that the majority of analyses would be performed utilizing these procedures. However, the ability to determine the test statistic value for all possible permutations of the observed data was virtually impossible (except for the smallest sample sizes) due to the lack of computer technology in Fisher’s and Pitman’s day. Consequently, permutation tests based on ranks continued to be developed, since these tests do not require the generation of a new 31 sampling distribution for each new set of observed data. Tables exhibiting critical values were readily available for such tests. It took significant improvements in technology before interest in permutation tests based on raw data was renewed; however, the computing time required to generate all possible permutations of the data was, and still is, prohibitive except for the smallest sample sizes. Finally, in 1957, Dwass proposed “the almost obvious procedure of examining a ‘random sample’ of permutations and making the decision to accept or reject (the null hypothesis) on the basis of those permutations only” (p. 182). He called this new class of tests randomization tests and found that the power of these tests is ‘close’ to the power of the corresponding permutation tests. In his 1957 paper, Dwass restricts attention to the two sample case, but indicates that these randomization tests can be applied in more general situations. 
In more recent years, Edgington (1995), Manly (1997), and Good (1994) have applied permutation and randomization tests to factorial designs, randomized block designs, and multivariate designs, among others. Several statisticians have even used permutation tests to test the equality of correlation or covariance matrices from multiple populations (Krzanowski, 1993; Shipley, 2000; Zhu, Ng, & Jing, 2003). However, neither permutation nor randomization tests have been applied to testing the structure of a covariance matrix. As a side note, permutation tests are very similar to bootstrapping described in Chapter 3. The primary difference is that in bootstrapping the resampling is done with replacement, whereas in permutation tests the resampling is done without replacement. Because of this, permutation tests are exact and unbiased, whereas Good (1994) writes, 32 “The bootstrap is neither exact nor conservative. Generally, but not always, a nonparametric bootstrap is less powerful than a permutation test… If the observations are independent and from distributions with identical values of the parameter of interest, then the bootstrap is asymptotically exact” (p. 20). 33 CHAPTER 5 PROPOSED TEST Recognizing the benefits of permutation tests and the limitations of parametric procedures for testing the structure of a covariance matrix, it is the purpose of this research to develop a permutation test for the structure of a covariance matrix. To develop such a test, it must be established that the observations are exchangeable under the null hypothesis. Good (2002) gives a simple definition of exchangeability. He writes that observations are considered exchangeable if, “under the (null) hypothesis, the joint distribution of the observations is invariant under permutations of the subscripts” (p. 243). He then goes on to say “It is easy to see that a set of i.i.d. variables is exchangeable. 
Or that the joint distribution of a set of normally distributed random variables whose covariance matrix is such that all diagonal elements have the same value $\sigma^2$ and all of the off-diagonal elements have the same value $\rho$ is invariant under permutations of the variable subscripts" (p. 244). Good (2002) focuses on permuting variable subscripts rather than the actual observations so as to include permutation tests in which residuals are permuted, but these conditions for exchangeability also apply to cases in which the raw data are permuted. It will be argued in the following sections that all of the proposed permutation tests satisfy at least one of the criteria for exchangeability given by Good. Before describing the permutation tests for the structure of a covariance matrix, note that covariance matrices are invariant to changes in location. Therefore, it will be assumed throughout this chapter that the variable means are all equal. If the variable means are unequal, or it is unknown whether the means are equal, the raw data can easily be centered by calculating $x_i - \mu$ or $x_i - \bar{x}$, depending on whether $\mu$ is assumed known or unknown, respectively. This centering is necessary to eliminate the effect of the mean vector when permuting the data. For example, consider a situation in which two variables are assumed to have equal variances, but one has a mean of 100 and the other a mean of 1. If the values were permuted between the variables, the assumption of equal variances would be violated, since the relatively 'large' values of the first variable would be combined with the relatively 'small' values of the second. This problem, however, can be remedied without affecting the variance or covariance assumptions by centering the raw data as described above.

5.1 PERMUTATION TESTS OF SPHERICITY AND COMPOUND SYMMETRY

Consider first a permutation test for compound symmetry. Let $x_i$, $i = 1, \ldots, n$, be identically distributed p-variate vectors of observations taken on n subjects.
We wish to test $H_0: \Sigma = \Sigma_{CS}$, where $\Sigma$ is the covariance matrix of the distribution of $x_i$,

$$\Sigma_{CS} = \sigma^2\left[(1-\rho)I_p + \rho\,1_p 1_p'\right], \qquad (5.1.1)$$

$\sigma^2$ is the common population variance, $\rho$ is the common pairwise correlation, $I_p$ is the $p\times p$ identity matrix, and $1_p$ is a $p\times 1$ vector of ones. Under the null hypothesis, the variances are assumed equal and the pairwise correlations are assumed equal, but no distributional assumptions have been made. To completely satisfy one of the conditions for exchangeability given previously, we would also need to assume joint normality. However, the simulation results shown in Chapter 6 indicate that this assumption might be too strict. It does not appear necessary to assume joint normality, but rather that each of the marginal distributions is from the same family of distributions, i.e., all uniform, all exponential, etc. Consequently, the values within each vector $x_i$ can be permuted without altering the covariance matrix. Before developing a test statistic, it is necessary to discuss the estimation of $\sigma^2$ and $\rho$. Consider using the MLEs. Call these

$$\bar{s}^2 = \frac{1}{p}\sum_{j=1}^{p}s_j^2 \quad\text{and}\quad \bar{r} = \frac{1}{p(p-1)/2}\sum_{j=1}^{p-1}\sum_{k=j+1}^{p}r_{jk},$$

where $s_j^2$ and $r_{jk}$ are the usual MLEs of $\sigma_j^2$ and $\rho_{jk}$, respectively. Since covariance matrices are symmetric, one possible test statistic can be computed by summing the absolute differences between the elements on or above the diagonal of the covariance matrix obtained from each possible permutation and the elements on or above the diagonal of the covariance matrix estimated as described above. In matrix notation this test statistic can be expressed as

$$D = 1_{p(p+1)/2}'\left|\mathrm{vec}\left(\hat\Sigma_{perm} - \bar{s}^2\left[(1-\bar{r})I_p + \bar{r}\,1_p 1_p'\right]\right)\right|,$$

where $\hat\Sigma_{perm}$ is the MLE of the covariance matrix obtained after permuting the data, $\mathrm{vec}(M)$ is a vector of the elements on or above the diagonal of a matrix M, and the absolute value is taken elementwise.
This test statistic is computed for each possible permutation of the data, and the proportion of test statistic values greater than or equal to the one obtained from the observed data is the corresponding p-value. This test for compound symmetry can also be used to test for sphericity by setting $\bar{r} = 0$.

Consider a very simple example of the proposed permutation test in which there are three measurements taken on each of three subjects, resulting in the data and sample statistics shown in Table 5.1.1 below. We are interested in testing whether the covariance matrix has the compound symmetry structure. This is equivalent to testing $H_0: \Sigma = \Sigma_{CS}$ versus $H_a: \Sigma \neq \Sigma_{CS}$, where $\Sigma_{CS}$ is of the form of (5.1.1), and $\bar{s}^2$ and $\bar{r}$ are the MLEs of $\sigma^2$ and $\rho$ given by

$$\bar{s}^2 = \tfrac{1}{3}(1.416 + 1.389 + 0.436) \approx 1.080 \quad\text{and}\quad \bar{r} = \tfrac{1}{3}(0.69 + 0.99 + 0.79) \approx 0.823.$$

In this case, since $\mu$ is unknown, the data will be centered by subtracting $\bar{x}$ from $x_i$ for each subject i. This results in the centered data shown in Table 5.1.1. Notice that the sample variances and correlations are invariant to centering.

Table 5.1.1. Observed Data

Raw Data              Centered Data
6.4   4.8   1.8        1.17   1.67   0.73
3.6   2.3   0.2       -1.63  -0.83  -0.87
5.7   2.3   1.2        0.47  -0.83   0.13

$\bar{x}_1 = 5.23$, $\bar{x}_2 = 3.13$, $\bar{x}_3 = 1.07$ (the centered means are all 0);
$s_1^2 = 1.416$, $s_2^2 = 1.389$, $s_3^2 = 0.436$ (unchanged by centering);

$$\hat{R} = \begin{bmatrix} 1 & 0.69 & 0.99 \\ 0.69 & 1 & 0.79 \\ 0.99 & 0.79 & 1 \end{bmatrix} \quad\text{(unchanged by centering).}$$

In this example, there are $p! = 3! = 6$ possible permutations of each row, resulting in $(p!)^n = 6^3 = 216$ possible permutations of the observed data. Four of these permutations, along with the corresponding test statistic values, D, are displayed in Table 5.1.2 below. The bolded data shown in Table 5.1.2 are the observed data. Table 5.1.2.
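The entire example can be reproduced in a few lines. The sketch below (NumPy-based; the helper names are mine) computes $\bar{s}^2$, $\bar{r}$, the observed statistic $D \approx 1.7614$, and the permutation p-value by enumerating all 216 within-row permutations:

```python
import numpy as np
from itertools import permutations, product

def cs_target(s2_bar, r_bar, p):
    """MLE of Sigma under compound symmetry: s2*[(1-r)I + r 11']."""
    return s2_bar * ((1 - r_bar) * np.eye(p) + r_bar * np.ones((p, p)))

def d_stat(data, target):
    """Sum of |differences| on or above the diagonal (the statistic D)."""
    S = np.cov(data, rowvar=False, bias=True)   # MLE (divide by n)
    iu = np.triu_indices(target.shape[0])
    return np.abs(S - target)[iu].sum()

X = np.array([[6.4, 4.8, 1.8], [3.6, 2.3, 0.2], [5.7, 2.3, 1.2]])
Xc = X - X.mean(axis=0)                         # center each variable
n, p = Xc.shape
S = np.cov(Xc, rowvar=False, bias=True)
s2_bar = S.diagonal().mean()
r_bar = np.corrcoef(Xc, rowvar=False)[np.triu_indices(p, k=1)].mean()
target = cs_target(s2_bar, r_bar, p)

d_obs = d_stat(Xc, target)
# Enumerate all (p!)^n = 216 within-row permutations of the centered data.
count = sum(
    d_stat(np.array(rows), target) >= d_obs - 1e-9
    for rows in product(*[list(permutations(row)) for row in Xc])
)
p_value = count / 6**n
```

The permutation corresponding to the observed data is included in the count, so the p-value can never be zero, one property that makes the test exact.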
Some Permutations of the Centered Observed Data

 1.17   1.67   0.73        0.73   1.67   1.17
-1.63  -0.83  -0.87       -1.63  -0.87  -0.83
 0.47  -0.83   0.13        0.47  -0.83   0.13
D = 1.761422 (observed)    D = 1.064711

 1.17   0.73   1.67        1.17   1.67   0.73
-0.83  -0.87  -1.63       -0.87  -1.63  -0.83
 0.13   0.47  -0.83        0.47   0.13  -0.83
D = 2.587289               D = 2.345689

The p-value can be found by computing the proportion of test statistic values that are greater than or equal to the one obtained from the observed data. The distribution of the test statistic values is shown in Figure 5.1.1. The vertical line at 1.761422 represents the value of the test statistic resulting from the observed data. In this example, 96 of the 216 possible permutations result in test statistic values greater than or equal to 1.761422. Therefore, the p-value is $96/216 \approx 0.4444$, and at any reasonable type I error rate there is not enough evidence to conclude that the covariance matrix does not have the compound symmetry structure.

Figure 5.1.1. Distribution of the Test Statistic for Compound Symmetry.

5.2 PERMUTATION TEST OF TYPE H STRUCTURE

The type H covariance structure does not satisfy either of the criteria for exchangeability mentioned previously. Therefore, the transformation described in Section 2.3 can be applied to the data so that the permutation test for sphericity described in Section 5.1 can be used. Specifically, assume we wish to test $H_0: \Sigma = \Sigma_{TH}$ versus $H_a: \Sigma \neq \Sigma_{TH}$, where $\Sigma_{TH}$ has the form of (2.3.1). Let C be a $(p-1)\times p$ matrix of normalized orthogonal contrasts on the p repeated measures, and let Y be an $n\times p$ matrix of centered data. Under $H_0$,

$$\mathrm{Var}(YC') = C\,\mathrm{Var}(Y)\,C' = C\,\Sigma_{TH}\,C' = \lambda I_{p-1},$$

as shown in Section 2.3 (here $\mathrm{Var}(\cdot)$ denotes the common covariance matrix of the rows). Therefore, the permutation test for sphericity can be applied to the transformed data, YC'. As an example, return to the sample given in Table 5.1.1. This time we wish to test $H_0: \Sigma = \Sigma_{TH}$.
The matrix of normalized orthogonal contrasts is given by

$$C = \begin{bmatrix} 0.7071068 & -0.7071068 & 0 \\ 0.4082483 & 0.4082483 & -0.8164966 \end{bmatrix}.$$

Postmultiplying the centered data shown in Table 5.1.1 by C' yields

$$YC' = \begin{bmatrix} 1.17 & 1.67 & 0.73 \\ -1.63 & -0.83 & -0.87 \\ 0.47 & -0.83 & 0.13 \end{bmatrix}\begin{bmatrix} 0.7071 & 0.4082 \\ -0.7071 & 0.4082 \\ 0 & -0.8165 \end{bmatrix} = \begin{bmatrix} -0.3536 & 0.5634 \\ -0.5657 & -0.2939 \\ 0.9192 & -0.2531 \end{bmatrix}. \qquad (5.2.1)$$

The MLE of the covariance matrix of this transformed data is then given by

$$\begin{bmatrix} 0.4300 & -0.0885 \\ -0.0885 & 0.1559 \end{bmatrix},$$

and the MLE of $\lambda I_{p-1}$ is given by

$$\hat\lambda I_{p-1} = \begin{bmatrix} 0.29295 & 0 \\ 0 & 0.29295 \end{bmatrix}.$$

The permutation test for sphericity is applied to the transformed data to test $H_0: C\Sigma C' = \lambda I_{p-1}$ versus $H_a: C\Sigma C' \neq \lambda I_{p-1}$ by finding all possible within-row permutations of the transformed data. The test statistic is then calculated as in Section 5.1 with p replaced by $p-1$:

$$D = 1_{(p-1)p/2}'\left|\mathrm{vec}\left(\hat\Sigma_{perm} - \hat\lambda I_{p-1}\right)\right|.$$

A p-value can be found by determining the proportion of test statistic values greater than or equal to the one resulting from the original set of transformed data. For the transformed data shown in (5.2.1), there are only $((p-1)!)^n = 2^3 = 8$ possible permutations, of which all 8 result in test statistic values greater than or equal to 0.3626011, the test statistic value resulting from the original set of transformed data. Therefore, the p-value is $8/8 = 1$, and at any reasonable type I error rate there is not enough evidence to conclude that the covariance matrix does not have the type H structure. One drawback of this permutation test is that the transformed data matrix has only $p-1$, as opposed to p, columns. For large combinations of n and p this is not a problem. However, if the combination of n and p is small, as in the example shown above, there may be too few possible permutations for the permutation test to be meaningful or even useful at all.
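Normalized orthogonal contrast matrices like the C above can be generated for any p. A sketch (the Helmert-style construction is one standard choice; the function name is mine):

```python
import numpy as np

def helmert_contrasts(p):
    """(p-1) x p matrix of normalized orthogonal contrasts: rows are
    orthonormal and each row sums to zero, e.g. (1,-1,0)/sqrt(2) and
    (1,1,-2)/sqrt(6) for p = 3."""
    C = np.zeros((p - 1, p))
    for j in range(1, p):
        C[j - 1, :j] = 1.0
        C[j - 1, j] = -j
        C[j - 1] /= np.sqrt(j * (j + 1))  # normalize the row
    return C
```

For p = 3 this reproduces the matrix used in the example, so postmultiplying the centered data by its transpose gives the transformed data of (5.2.1).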
5.3 PERMUTATION TEST OF ALL OTHER COVARIANCE STRUCTURES

Neither of the remaining covariance structures discussed in Chapter 2 (serial correlation and independence of sets of variates) satisfies either of the conditions for exchangeability described previously. Therefore, a data transformation is required to achieve exchangeability. The following theorem, given by Graybill (1983), can be used to transform a data set with covariance matrix $\Sigma$ to one with covariance matrix D, where D is a diagonal matrix. Finally, one additional calculation, described in the following paragraphs, enables any test of the structure of a covariance matrix to be accomplished by a test for sphericity.

Theorem 5.3.1. Let A be any $n\times n$ matrix. There exists an orthogonal matrix P such that $P'AP = D$, where D is a diagonal matrix, if and only if A is symmetric (p. 19).

Consider the linear model $Y = X\beta + e$, where Y is an $n\times p$ matrix of observations, X is a known matrix of constants, $\beta$ is a matrix of unknown parameters, and e is a matrix of unknown errors such that $\mathrm{Var}(e) = \Sigma$ (and consequently $\mathrm{Var}(Y) = \Sigma$, where $\mathrm{Var}(\cdot)$ again denotes the common covariance matrix of the rows). Covariance matrices are symmetric; therefore, by Theorem 5.3.1, there exists an orthogonal matrix P such that $P'\Sigma P = D$, where D is a diagonal matrix. Specifically, P consists of the eigenvectors of $\Sigma$, and the eigenvalues of $\Sigma$ are on the diagonal of D. Then, postmultiplying the data matrix, Y, by P yields

$$\mathrm{Var}(YP) = P'\,\mathrm{Var}(Y)\,P = P'\Sigma P = D.$$

Therefore, any test of $H_0: \Sigma = \Sigma_0$ is equivalent to testing $H_0: P'\Sigma P = D$, where the columns of P are the eigenvectors of $\Sigma_0$. Then the previously described permutation test for sphericity can be performed on the postmultiplied data, YP, after dividing each column of YP by the square root of the respective eigenvalue. To illustrate this test, return to the sample of data given in Table 5.1.1. This time we are interested in testing $H_0: \Sigma = \Sigma_{SC}$, where $\Sigma_{SC}$ has the serial correlation form shown in (2.4.1).
The problem with testing for this structure is that even though the variances are assumed to be equal, the covariances are not. Therefore, neither of the previously described criteria for exchangeability is satisfied. Consequently, the centered data must be transformed by applying Theorem 5.3.1 before permuting. Before transforming the data, consider the following preliminary calculations used to find the MLEs of $\sigma^2$ and $\rho$ in $\Sigma_{SC}$ as described in Section 2.4:

$$S_1 = \sum_{i=1}^{n}(x_i-\bar{x})'C_1(x_i-\bar{x}) = 4.166667, \quad S_2 = \sum_{i=1}^{n}(x_i-\bar{x})'C_2(x_i-\bar{x}) = 9.5, \quad S_3 = \sum_{i=1}^{n}(x_i-\bar{x})'(x_i-\bar{x}) = 9.72,$$

where $C_1$ and $C_2$ are as shown in (2.4.3). Then the MLE of $\rho$ can be found by substituting $S_1$, $S_2$, $S_3$, and $p = 3$ into (2.4.5) to get

$$16.666668\,\hat\rho^3 - 9.5\,\hat\rho^2 - 44.440002\,\hat\rho + 28.5 = 0.$$

The only solution to this equation in the interval $(-1, 1)$ is $\hat\rho = 0.6549899$. Substituting this value into (2.4.4) yields $\hat\sigma^2 = 0.5872383$. Therefore, under the null hypothesis the MLE of the covariance matrix is

$$\hat\Sigma_{SC} = \begin{bmatrix} 1.0284596 & 0.6736306 & 0.4412212 \\ 0.6736306 & 1.0284596 & 0.6736306 \\ 0.4412212 & 0.6736306 & 1.0284596 \end{bmatrix},$$

and the eigenvectors and eigenvalues of $\hat\Sigma_{SC}$ are

$$P = \begin{bmatrix} 0.5535 & 0.7071 & 0.4400 \\ 0.6223 & 5.44\times 10^{-17} & -0.7828 \\ 0.5535 & -0.7071 & 0.4400 \end{bmatrix} \quad\text{and}\quad \left(2.2269,\; 0.5872,\; 0.2712\right),$$

respectively. Postmultiplying the centered data matrix shown in Table 5.1.1 by the matrix of eigenvectors yields

$$YP = \begin{bmatrix} 1.17 & 1.67 & 0.73 \\ -1.63 & -0.83 & -0.87 \\ 0.47 & -0.83 & 0.13 \end{bmatrix} P = \begin{bmatrix} 2.0888 & 0.3064 & -0.4687 \\ -1.9024 & -0.5421 & -0.4477 \\ -0.1864 & 0.2357 & 0.9163 \end{bmatrix}; \qquad (5.3.1)$$

however, this data matrix still cannot be permuted, since the variances of the variables, given by the eigenvalues, are not equal. Therefore, each column of this data matrix must be divided by the square root of its respective eigenvalue before the data can be permuted. Refer to the data matrix found by dividing each column of (5.3.1) by the square root of its respective eigenvalue.
We will refer to this matrix as the matrix of transformed data. This matrix is given by

$$\begin{bmatrix} 1.3997 & 0.3999 & -0.9000 \\ -1.2748 & -0.7074 & -0.8596 \\ -0.1249 & 0.3076 & 1.7596 \end{bmatrix}. \qquad (5.3.2)$$

We can now perform a test for sphericity on the matrix of transformed data. This is done by finding all possible permutations of the transformed data such that the data are permuted within each row, and calculating the test statistic given by

$$D = 1_{p(p+1)/2}'\left|\mathrm{vec}\left(\hat\Sigma_{perm} - I_p\right)\right|.$$

A p-value can be found by determining the proportion of test statistic values greater than or equal to the one resulting from the original set of transformed data. For the transformed data shown in (5.3.2), there are $(p!)^n = 6^3 = 216$ possible permutations, of which only 6 result in test statistic values greater than or equal to 0.6095927, the value resulting from the original set of transformed data. Therefore, the p-value is $6/216 \approx 0.0278$, and at $\alpha = 0.05$ there is enough evidence to conclude that $\Sigma$ does not have the serial correlation structure. This conclusion is expected, since the sample correlation matrix shown in Table 5.1.1 does not suggest serial correlation. Figure 5.3.1 below shows the distribution of the test statistic for this set of data. The vertical line at 0.6095927 represents the test statistic value resulting from the original set of transformed data.

Figure 5.3.1. Distribution of the Test Statistic for Serial Correlation.

CHAPTER 6

SIMULATIONS

One thousand simulations were run using R version 2.3.1 for all combinations of n (= 5, 10, 25) and p (= 3, 5, 10). The R code for each test can be found in Appendix A.1. Due to the extremely large number of permutations required to perform the permutation tests described in Chapter 5 for any reasonable values of n and p, randomization tests were primarily used in the simulations.
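The eigenvector transformation of Theorem 5.3.1 is equally brief in code. A sketch (NumPy; the function names are mine) that builds $\hat\Sigma_{SC}$ from $\hat\sigma^2$ and $\hat\rho$, then rotates and rescales so that, under $H_0$, the transformed variables have identity covariance:

```python
import numpy as np

def serial_corr_sigma(sigma2, rho, p):
    """Sigma_SC of (2.4.1): common variance sigma2/(1 - rho^2), AR(1) decay."""
    idx = np.arange(p)
    return sigma2 / (1 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

def sphericity_transform(Yc, Sigma0):
    """Rotate centered data by the eigenvectors of Sigma0 and scale each
    column by 1/sqrt(eigenvalue), as in Section 5.3."""
    eigvals, P = np.linalg.eigh(Sigma0)
    return (Yc @ P) / np.sqrt(eigvals)
```

Note that `np.linalg.eigh` orders the eigenvalues ascending and fixes eigenvector signs arbitrarily, so the columns of the result may differ from (5.3.2) in order and sign; the test statistic D, being a sum of absolute deviations from the identity, is unaffected by either.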
Permutation tests were run only for the test of type H structure when n = 5 or 10 and p = 3 (see Section 6.3). Within each simulation, a p-variate data set was generated and the randomization test (RT) (or permutation test [PT] in the cases described above), likelihood ratio test (LRT), and corrected likelihood ratio test (CLRT) were all run for comparison. One thousand random permutations of the centered and/or transformed data were sampled for each RT. The number of randomly selected permutations was chosen according to the suggestions of Manly (1997). For the LRTs, the asymptotic chi-squared distributions described in Chapter 2 were used to determine approximate 5% critical values. Three different multivariate distributions (normal, uniform, and double exponential) were investigated. For the multivariate normal distribution, data were generated using the built-in R functions by specifying the desired covariance structure. For the multivariate double exponential distribution, data were generated using a procedure described in Vale and Maurelli (1983). For univariate data, this procedure first involves the generation of a random sample from a standard normal distribution. Each of these data values, X, is then substituted into the polynomial $Y = a + bX + cX^2 + dX^3$, where the constants a, b, c, and d are determined by expressing the first four moments of the desired nonnormal distribution in terms of the first four moments of the standard normal distribution and solving algebraically. Vale and Maurelli (1983) provide a system of equations that can be used to find these constants. In extending this to the multivariate case, there are issues with specifying the desired covariance structure. Initially, data can be generated from the $N_p(0, \Sigma)$ distribution; however, once the polynomial transformation is applied, the resulting data no longer have the same covariance structure.
Therefore, it is necessary to determine intermediate correlations to be used to generate the multivariate normal data that will result in multivariate double exponential data with the desired covariance structure. Again, Vale and Maurelli (1983) provide a system of equations that can be solved to determine these intermediate correlations. There exists a more recent extension of the Vale and Maurelli (1983) procedure, developed by Headrick (2002), in which the first six moments of the desired nonnormal distribution are used instead of just the first four. Headrick (2002) argues that specifying two additional moments results in much more accurate nonnormal distributions, but the inclusion of these additional moments places restrictions on the possible correlations that can be simulated. Specifically, once one of the correlations in the desired covariance matrix is specified, the remaining correlations cannot differ from the first too drastically, and the amount of allowable difference changes for each desired distribution. For example, in trying to simulate three-variate uniform data with an unstructured covariance matrix, the largest and smallest of the three correlations could not be varied by more than approximately 0.3. Differences larger than this resulted in intermediate correlations that were greater than one. This restriction is not a problem when generating data with a compound symmetry covariance structure; however, it severely limits the number of alternative covariance structures that can be explored when estimating power, especially as the number of variables is increased. Therefore, the method given by Vale and Maurelli (1983) was used to generate the multivariate double exponential data for the simulations in this chapter. Although convenient for generating data from many multivariate distributions, the Vale and Maurelli (1983) procedure cannot be used to generate multivariate uniform data.
This is due to the fact that this procedure restricts the lower bound of the kurtosis of the desired marginal distributions. Specifically, if the skewness of the desired marginal distribution is 0, the lower bound for kurtosis is −1.15132 (Headrick, 2002), whereas the kurtosis of the UNIF(a,b) distribution is −1.2. Therefore, to generate the multivariate uniform data, a procedure described in Falk (1999) was used. This procedure consists of generating a random sample, x_i, i = 1, ..., n, from the N_p(0, R̃) distribution, where R̃ = 2 sin(πR/6) and R is the desired correlation matrix. Then the standard normal CDF, Φ, is applied to the N_p(0, R̃) data so that Φ(x_i) has a multivariate UNIF(0_p, 1_p) distribution with correlation matrix R, where 0_p and 1_p are p×1 vectors of zeros and ones, respectively, that represent the lower and upper bounds of the marginal uniform distributions. To achieve the desired variances, note that the variance of the univariate UNIF(a,b) distribution is given by σ² = (b − a)²/12. Setting a = 0, we have σ² = b²/12, which implies that b = σ√12. Multiplying each column of the multivariate UNIF(0_p, 1_p) data by b_j = σ_jj√12, j = 1, ..., p, where σ_jj is the desired standard deviation, results in multivariate UNIF(0_p, b) data with covariance matrix Σ, where b = (b_1, ..., b_p)′. The type I error rate and power will be investigated for five different randomization tests: the tests of sphericity, compound symmetry, type H, serial correlation, and independence of sets of variates.
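The Falk (1999)-style uniform generation described above can be sketched as follows. This is Python rather than the dissertation's R code, and the function name is illustrative.

```python
import numpy as np
from scipy.stats import norm

def multivariate_uniform(n, R, sd, rng=None):
    """Generate n rows of multivariate uniform data with correlation
    matrix R and marginal standard deviations sd, via normal data with
    the intermediate correlation matrix R~ = 2 sin(pi R / 6)."""
    rng = np.random.default_rng(rng)
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    R_tilde = 2.0 * np.sin(np.pi * R / 6.0)      # intermediate correlations
    z = rng.multivariate_normal(np.zeros(p), R_tilde, size=n)
    u = norm.cdf(z)                              # UNIF(0,1) marginals, correlation R
    b = np.sqrt(12.0) * np.asarray(sd)           # upper bounds b_j = sd_j * sqrt(12)
    return u * b                                 # UNIF(0, b_j) marginals
```

Scaling by b_j = σ_j√12 leaves the correlations untouched while giving each margin the desired variance b_j²/12 = σ_j².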
The tests of sphericity and compound symmetry will be performed by permuting the raw data; the test of type H structure will be performed by first postmultiplying the data matrix by a matrix of normalized orthogonal contrasts and then running the randomization test for sphericity; and the tests of serial correlation and independence of sets of variates will be performed by first postmultiplying the data matrix by the eigenvectors of the estimated hypothesized covariance matrix, then dividing the columns of the resulting matrix by the square roots of the respective eigenvalues, and finally running the randomization test for sphericity.

The values of the various parameters used to simulate the type I error rates for the different tests are as follows. For sphericity, the covariance structure is given by Σ_S = σ²I_p, where σ² = 1, 9, or 25. For compound symmetry, the covariance structure is given by

    Σ_CS = σ²[(1 − ρ)I_p + ρ1_p1_p′]    (6.0.1)

where σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. For type H, the covariance structure is given by

    Σ_TH = λI_p + α1_p′ + 1_pα′,  α = (d/2)(1, 2, ..., p)′,    (6.0.2)

a type H matrix whose diagonal elements are id + λ, i = 1, ..., p, and whose (i, j) off-diagonal elements are (i + j)d/2, where d > 0 and λ > 0 (see Appendix A.2 for the exact parameter values and a description of how they were chosen). For serial correlation, the covariance structure is given by

    Σ_SC = σ² [ 1        ρ        ρ²       ...  ρ^(p−1)
                ρ        1        ρ        ...  ρ^(p−2)
                ρ²       ρ        1        ...  ρ^(p−3)
                ...
                ρ^(p−1)  ρ^(p−2)  ρ^(p−3)  ...  1 ]    (6.0.3)

where σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. For the test of independence of sets of variates, the covariance structure is the block diagonal matrix

    Σ_I = diag(Σ_11, Σ_22, ..., Σ_kk),

where the numbers of variates in the blocks Σ_mm, m = 1, ..., k, are (1, 2), (2, 3), (5, 5), or (3, 3, 4) depending on whether p = 3, 5, or 10, and each Σ_mm has the compound symmetry structure with σ² = 1 and ρ = 0.2, 0.5, or 0.8.
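For concreteness, the covariance structures above and the eigenvector transformation used for the serial correlation and independence tests can be sketched as follows. This is Python rather than the dissertation's R code in Appendix A.1, and the function names are illustrative.

```python
import numpy as np

def cs_cov(p, sigma2, rho):
    """Compound symmetry structure, as in (6.0.1)."""
    return sigma2 * ((1.0 - rho) * np.eye(p) + rho * np.ones((p, p)))

def sc_cov(p, sigma2, rho):
    """Serial (first-order autoregressive) correlation, as in (6.0.3)."""
    idx = np.arange(p)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def block_diag_cov(blocks):
    """Block diagonal structure for the independence-of-sets test."""
    sizes = [b.shape[0] for b in blocks]
    out = np.zeros((sum(sizes), sum(sizes)))
    start = 0
    for b in blocks:
        out[start:start + b.shape[0], start:start + b.shape[0]] = b
        start += b.shape[0]
    return out

def transform_for_sphericity(x, sigma0):
    """Postmultiply the data by the eigenvectors of the hypothesized
    covariance matrix and divide each column by the square root of the
    corresponding eigenvalue; under H0 the result is spherical."""
    vals, vecs = np.linalg.eigh(sigma0)
    return (x @ vecs) / np.sqrt(vals)
```

In the population, the transformed data have covariance Λ^(−1/2)V′Σ₀VΛ^(−1/2) = I_p when Σ₀ holds, which is why the sphericity randomization test can then be applied.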
For example, for p = 5, numbers of variates (2, 3), and ρ = 0.2, the simulated covariance structure is

    [ 1    0.2  0    0    0
      0.2  1    0    0    0
      0    0    1    0.2  0.2
      0    0    0.2  1    0.2
      0    0    0.2  0.2  1 ].

The various covariance structures used to investigate power are detailed in the sections to follow.

6.1 TEST OF SPHERICITY

Simulated type I error rates for the test of sphericity are displayed in Table 6.1.1. For normally distributed data, the CLRT performs better than the other two tests with respect to the simulated type I error rates. For uniform data, the CLRT underestimates the nominal type I error rate, and for double exponential data the CLRT overestimates it. These results are consistent with those of Huynh and Mandeville (1979), who performed a simulation study of Mauchly's (1940) test of sphericity and found that for light-tailed distributions the LRTs were conservative, while for heavy-tailed distributions the simulated type I error rates exceeded the nominal rate. The same pattern appears, more weakly, in the results of the RT; however, the simulated type I error rates of the RT appear to be converging to 0.05 as n increases, whereas those of the LRTs do not. The simulated type I error rates for the RT seem to be unaffected by changes in the variance, but they appear to increase as p increases. This latter pattern is also seen in the LRTs, though not as strongly as for the RT. One definite benefit of the RT is that it is applicable in situations for which the LRTs do not exist, specifically when p ≥ n. However, the simulated type I error rates for these cases are much too large. Overall, the RT appears to be a viable alternative when the data are not normally distributed, but it is not beneficial in small sample situations or in cases where n is close to p. Clearly, the RT is preferred over the LRTs for cases in which p ≥ n for the simple fact that a p-value exists for the RT when it does not for the LRTs.
However, the simulated type I error rates are much too large to be of any practical use. The CLRT does not appear to be a level α test in nonnormal situations but clearly outperforms the LRT. Similarly, the RT does not appear to be a level α test when p ≥ n. Therefore, all three tests will be largely ignored in these situations in the power discussions to follow. For completeness, all three tests were included in the simulations.

Table 6.1.1. Simulated Type I Error Rates for the Test of Sphericity

a. Normal
               p=3                       p=5                       p=10
 n        σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25
 5   RT   0.095  0.079  0.078     0.169  0.175  0.183     0.550  0.553  0.550
     LRT  0.303  0.309  0.320     NA     NA     NA        NA     NA     NA
     CLRT 0.054* 0.070  0.058*    NA     NA     NA        NA     NA     NA
 10  RT   0.070  0.075  0.079     0.100  0.122  0.093     0.204  0.202  0.183
     LRT  0.125  0.136  0.138     0.296  0.295  0.317     NA     NA     NA
     CLRT 0.043* 0.059* 0.056*    0.051* 0.066  0.071     NA     NA     NA
 25  RT   0.064  0.053* 0.044*    0.086  0.079  0.075     0.108  0.098  0.100
     LRT  0.085  0.078  0.073     0.130  0.091  0.117     0.327  0.317  0.322
     CLRT 0.062* 0.050* 0.049*    0.058* 0.050* 0.060*    0.079  0.060* 0.069

b. Uniform
               p=3                       p=5                       p=10
 n        σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25
 5   RT   0.095  0.095  0.098     0.166  0.158  0.160     0.449  0.471  0.476
     LRT  0.278  0.288  0.277     NA     NA     NA        NA     NA     NA
     CLRT 0.054* 0.057* 0.059*    NA     NA     NA        NA     NA     NA
 10  RT   0.056* 0.061* 0.067     0.082  0.095  0.088     0.170  0.150  0.146
     LRT  0.092  0.075  0.087     0.189  0.204  0.191     NA     NA     NA
     CLRT 0.035  0.032  0.021     0.046* 0.043* 0.037*    NA     NA     NA
 25  RT   0.048* 0.056* 0.050*    0.071  0.063* 0.063*    0.077  0.082  0.076
     LRT  0.036  0.037* 0.037*    0.061* 0.058* 0.054*    0.202  0.198  0.194
     CLRT 0.021  0.024  0.025     0.025  0.028  0.020     0.031  0.026  0.019

c.
Double Exponential
               p=3                       p=5                       p=10
 n        σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25     σ²=1   σ²=9   σ²=25
 5   RT   0.105  0.120  0.104     0.238  0.212  0.226     0.598  0.613  0.609
     LRT  0.422  0.440  0.407     NA     NA     NA        NA     NA     NA
     CLRT 0.087  0.126  0.096     NA     NA     NA        NA     NA     NA
 10  RT   0.083  0.081  0.083     0.112  0.122  0.113     0.239  0.238  0.226
     LRT  0.262  0.243  0.252     0.470  0.454  0.451     NA     NA     NA
     CLRT 0.131  0.126  0.134     0.152  0.148  0.143     NA     NA     NA
 25  RT   0.062* 0.067  0.064     0.070  0.080  0.072     0.102  0.111  0.103
     LRT  0.226  0.223  0.190     0.288  0.305  0.292     0.587  0.604  0.602
     CLRT 0.169  0.180  0.143     0.183  0.200  0.186     0.239  0.250  0.239
*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

Table 6.1.2 contains the simulated power of the test of sphericity versus non-homoscedasticity. Specifically, multivariate data were generated from distributions with covariance matrices having diagonal elements given by 1, 1+d/(p−1), 1+2d/(p−1), ..., 1+d and zero off-diagonal elements, where d = 4, 8, or 12 represents the difference between the first and last (or smallest and largest) diagonal elements. As expected, the power of both the RT and CLRT increases as d and/or n increases, and the power of both tests decreases as p approaches n. For normally distributed data the power of the CLRT is greater than that of the RT in most cases, but the RT does seem to perform fairly well, achieving a power of at least 0.75 in five of the nine cases when n=25. The true benefit of the RT is seen in the nonnormal cases. For these cases, it appears that the CLRT is outperforming the RT; however, recall from Table 6.1.1 that there is evidence that neither of the LRTs is an α-level test for nonnormal data. There does appear to be a distributional effect on the simulated power of the RT, with the greatest power resulting from uniformly distributed data and the least from double exponential data in most cases. For uniform data, the RT achieves a power of at least 0.75 in all nine of the cases when n=25, and for double exponential data, in one of the nine cases.

Table 6.1.2.
Simulated Power vs. Non-Homoscedasticity for the Test of Sphericity

a. Normal
               p=3                       p=5                       p=10
 n        d=4    d=8    d=12      d=4    d=8    d=12      d=4    d=8    d=12
 5   RT   0.178  0.191  0.217     0.255  0.326  0.313     0.604  0.627  0.622
     LRT  0.513  0.614  0.685     NA     NA     NA        NA     NA     NA
     CLRT 0.130  0.197  0.236     NA     NA     NA        NA     NA     NA
 10  RT   0.283  0.383  0.426     0.299  0.401  0.384     0.408  0.467  0.470
     LRT  0.513  0.778  0.899     0.599  0.803  0.887     NA     NA     NA
     CLRT 0.315  0.600  0.746     0.219  0.403  0.532     NA     NA     NA
 25  RT   0.730  0.915  0.967     0.654  0.803  0.866     0.616  0.749  0.817
     LRT  0.873  0.997  1.000     0.834  0.987  0.998     0.932  0.992  0.999
     CLRT 0.843  0.995  1.000     0.735  0.965  0.996     0.639  0.912  0.988

b. Uniform
               p=3                       p=5                       p=10
 n        d=4    d=8    d=12      d=4    d=8    d=12      d=4    d=8    d=12
 5   RT   0.195  0.243  0.282     0.299  0.299  0.375     0.580  0.608  0.588
     LRT  0.449  0.603  0.689     NA     NA     NA        NA     NA     NA
     CLRT 0.091  0.131  0.212     NA     NA     NA        NA     NA     NA
 10  RT   0.433  0.560  0.623     0.401  0.488  0.573     0.433  0.515  0.521
     LRT  0.442  0.796  0.932     0.534  0.743  0.874     NA     NA     NA
     CLRT 0.212  0.537  0.777     0.145  0.303  0.478     NA     NA     NA
 25  RT   0.957  0.988  1.000     0.868  0.965  0.989     0.787  0.896  0.920
     LRT  0.933  1.000  1.000     0.844  0.997  1.000     0.900  0.997  1.000
     CLRT 0.892  0.999  1.000     0.713  0.991  1.000     0.532  0.900  0.985

c. Double Exponential
               p=3                       p=5                       p=10
 n        d=4    d=8    d=12      d=4    d=8    d=12      d=4    d=8    d=12
 5   RT   0.165  0.220  0.244     0.263  0.338  0.320     0.634  0.629  0.614
     LRT  0.573  0.657  0.747     NA     NA     NA        NA     NA     NA
     CLRT 0.173  0.221  0.297     NA     NA     NA        NA     NA     NA
 10  RT   0.236  0.327  0.360     0.244  0.314  0.344     0.358  0.404  0.421
     LRT  0.554  0.777  0.880     0.688  0.823  0.889     NA     NA     NA
     CLRT 0.378  0.632  0.761     0.353  0.497  0.638     NA     NA     NA
 25  RT   0.470  0.665  0.756     0.457  0.601  0.667     0.429  0.578  0.638
     LRT  0.843  0.984  0.996     0.858  0.977  0.991     0.945  0.989  0.999
     CLRT 0.804  0.976  0.995     0.808  0.958  0.983     0.779  0.942  0.978

Table 6.1.3 contains the simulated power for the test of sphericity versus non-zero correlation. Data were generated from multivariate distributions with marginal variances of 1 and pairwise correlations of ρ = 0.3, 0.6, or 0.9.
As seen in Table 6.1.3, the power of the RT for sphericity versus non-zero correlation is very low, virtually indistinguishable from the simulated type I error rates shown in Table 6.1.1. The CLRT, however, has much greater power. For n=10, the CLRT has power greater than 0.9 when ρ = 0.9, and for n=25 when ρ = 0.6 or 0.9, but the CLRT seems to have trouble detecting a correlation of ρ = 0.3 even for samples as large as 25. The simulated power of the CLRT increases as p increases. This pattern is also evident for the RT, but the increase is not as extreme. The simulated power of the RT appears to be unaffected by increases in ρ, whereas the power of the CLRT clearly increases as ρ increases. Just as in Table 6.1.2, there appears to be a distributional effect on the simulated power of the RT. However, in Table 6.1.3 the pattern is reversed, with the greatest power resulting from double exponential data and the least from uniform data in most cases.

Table 6.1.3. Simulated Power vs. Non-Zero Correlation for the Test of Sphericity

a. Normal
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.075  0.044  0.061     0.128  0.093  0.072     0.338  0.156  0.101
     LRT  0.402  0.616  0.944     NA     NA     NA        NA     NA     NA
     CLRT 0.094  0.185  0.688     NA     NA     NA        NA     NA     NA
 10  RT   0.058  0.042  0.054     0.074  0.050  0.070     0.119  0.043  0.050
     LRT  0.288  0.722  0.999     0.540  0.911  1.000     NA     NA     NA
     CLRT 0.146  0.553  0.993     0.210  0.750  1.000     NA     NA     NA
 25  RT   0.039  0.040  0.046     0.045  0.041  0.045     0.054  0.062  0.053
     LRT  0.455  0.986  1.000     0.742  1.000  1.000     0.975  1.000  1.000
     CLRT 0.393  0.980  1.000     0.645  0.999  1.000     0.877  1.000  1.000

b.
Uniform
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.062  0.045  0.057     0.128  0.077  0.077     0.292  0.114  0.086
     LRT  0.338  0.590  0.936     NA     NA     NA        NA     NA     NA
     CLRT 0.080  0.207  0.696     NA     NA     NA        NA     NA     NA
 10  RT   0.047  0.028  0.059     0.053  0.040  0.059     0.093  0.035  0.029
     LRT  0.235  0.706  0.998     0.501  0.903  1.000     NA     NA     NA
     CLRT 0.125  0.541  0.996     0.178  0.709  1.000     NA     NA     NA
 25  RT   0.040  0.042  0.037     0.044  0.041  0.042     0.048  0.028  0.053
     LRT  0.392  0.975  1.000     0.714  0.999  1.000     0.960  1.000  1.000
     CLRT 0.338  0.966  1.000     0.603  0.999  1.000     0.813  1.000  1.000

c. Double Exponential
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.107  0.064  0.108     0.137  0.107  0.126     0.370  0.208  0.189
     LRT  0.501  0.645  0.959     NA     NA     NA        NA     NA     NA
     CLRT 0.134  0.243  0.721     NA     NA     NA        NA     NA     NA
 10  RT   0.073  0.063  0.077     0.077  0.061  0.082     0.133  0.074  0.098
     LRT  0.397  0.784  0.998     0.673  0.943  1.000     NA     NA     NA
     CLRT 0.231  0.619  0.997     0.325  0.795  0.998     NA     NA     NA
 25  RT   0.055  0.048  0.049     0.057  0.051  0.059     0.067  0.049  0.064
     LRT  0.568  0.986  1.000     0.835  0.998  1.000     0.991  1.000  1.000
     CLRT 0.516  0.981  1.000     0.747  0.997  1.000     0.933  1.000  1.000

The simulated power of the test of sphericity versus the type H structure seen in (6.0.2) is displayed in Tables 6.1.4 through 6.1.6. See Appendix A.2 for a description of how and why the values of d and λ were chosen. As expected, the power of both the RT and CLRT increases as d and/or n increases, but there does not appear to be much change in the power of the tests as λ increases. It is difficult to determine the effect of p on the simulated power of the tests because it was necessary to use radically different values of d and λ as p increased (see Appendix A.2), but there are two cases with equal parameter values: d = 0.1 and λ = 1 with p = 5 (Table 6.1.5) or p = 10 (Table 6.1.6). From these two cases, it appears that the power of both the RT and CLRT increases as p increases.
The CLRT is more powerful than the RT in most cases, but recall that the CLRT is not an α-level test for nonnormal data (see Table 6.1.1). Even with normally distributed data, the RT appears to have a slight edge over the CLRT with respect to power when d, n, and p are all small. Overall, the ability of the RT for sphericity to detect the type H structure is fairly good, with simulated power values greater than 0.75 when n=25 in fifteen of the 27 cases for normal data, 21 of the 27 cases for uniform data, and nine of the 27 cases for double exponential data. Just as in previous tables, there appears to be a distributional effect on the simulated power of the RT, but the pattern is again reversed from that seen in the previous table (Table 6.1.3), with the greatest power values resulting from uniform data and the lowest from double exponential data in most cases.

Table 6.1.4. Simulated Power vs. Type H for the Test of Sphericity ( p = 3 )

a. Normal
               λ=2                       λ=3                       λ=4
 n        d=1    d=2    d=3       d=2    d=3    d=4       d=3    d=4    d=5
 5   RT   0.153  0.182  0.212     0.240  0.195  0.235     0.292  0.245  0.265
     LRT  0.453  0.674  0.904     0.651  0.744  0.923     0.843  0.846  0.975
     CLRT 0.111  0.241  0.527     0.217  0.255  0.518     0.347  0.366  0.626
 10  RT   0.274  0.419  0.594     0.484  0.565  0.663     0.555  0.694  0.757
     LRT  0.437  0.806  0.991     0.800  0.921  1.000     0.982  0.993  1.000
     CLRT 0.261  0.663  0.975     0.622  0.806  0.992     0.936  0.961  0.999
 25  RT   0.690  0.943  0.993     0.930  0.995  1.000     0.994  1.000  1.000
     LRT  0.797  0.996  1.000     0.996  1.000  1.000     1.000  1.000  1.000
     CLRT 0.741  0.993  1.000     0.994  1.000  1.000     1.000  1.000  1.000

b.
Uniform
               λ=2                       λ=3                       λ=4
 n        d=1    d=2    d=3       d=2    d=3    d=4       d=3    d=4    d=5
 5   RT   0.185  0.204  0.285     0.284  0.260  0.303     0.350  0.308  0.319
     LRT  0.438  0.665  0.913     0.641  0.722  0.934     0.851  0.875  0.982
     CLRT 0.103  0.183  0.511     0.177  0.256  0.504     0.338  0.370  0.681
 10  RT   0.349  0.580  0.803     0.597  0.774  0.859     0.707  0.852  0.911
     LRT  0.354  0.804  0.999     0.816  0.939  0.999     0.983  0.995  1.000
     CLRT 0.180  0.606  0.979     0.616  0.833  0.993     0.940  0.971  1.000
 25  RT   0.894  0.999  1.000     0.998  1.000  1.000     1.000  1.000  1.000
     LRT  0.803  0.999  1.000     1.000  1.000  1.000     1.000  1.000  1.000
     CLRT 0.744  0.998  1.000     1.000  1.000  1.000     1.000  1.000  1.000

c. Double Exponential
               λ=2                       λ=3                       λ=4
 n        d=1    d=2    d=3       d=2    d=3    d=4       d=3    d=4    d=5
 5   RT   0.162  0.146  0.201     0.255  0.200  0.239     0.282  0.253  0.265
     LRT  0.509  0.701  0.924     0.713  0.766  0.932     0.854  0.878  0.973
     CLRT 0.141  0.262  0.535     0.254  0.353  0.537     0.378  0.430  0.712
 10  RT   0.216  0.332  0.455     0.385  0.441  0.561     0.508  0.561  0.642
     LRT  0.500  0.834  0.996     0.812  0.914  0.995     0.980  0.986  1.000
     CLRT 0.347  0.696  0.988     0.674  0.809  0.984     0.929  0.955  0.999
 25  RT   0.471  0.707  0.907     0.804  0.927  0.939     0.907  0.976  0.980
     LRT  0.806  0.996  1.000     0.992  1.000  1.000     1.000  1.000  1.000
     CLRT 0.766  0.994  1.000     0.991  1.000  1.000     1.000  1.000  1.000

Table 6.1.5. Simulated Power vs. Type H for the Test of Sphericity ( p = 5 )

a. Normal
               λ=1                       λ=1.25                    λ=1.5
 n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9     d=0.2  d=0.6  d=1
 5   RT   0.148  0.161  0.205     0.166  0.216  0.240     0.217  0.222  0.240
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.110  0.207  0.465     0.140  0.256  0.479     0.185  0.331  0.543
     LRT  0.430  0.874  0.999     0.379  0.855  0.996     0.504  0.844  1.000
     CLRT 0.122  0.615  0.980     0.086  0.553  0.973     0.137  0.549  0.980
 25  RT   0.105  0.524  0.952     0.146  0.664  0.957     0.331  0.823  0.989
     LRT  0.416  0.997  1.000     0.211  0.997  1.000     0.631  0.995  1.000
     CLRT 0.290  0.989  1.000     0.116  0.987  1.000     0.502  0.981  1.000

b.
Uniform
               λ=1                       λ=1.25                    λ=1.5
 n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9     d=0.2  d=0.6  d=1
 5   RT   0.153  0.162  0.287     0.173  0.201  0.315     0.222  0.233  0.311
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.100  0.318  0.681     0.138  0.386  0.667     0.211  0.479  0.722
     LRT  0.346  0.846  0.998     0.280  0.819  1.000     0.417  0.856  1.000
     CLRT 0.098  0.581  0.989     0.055  0.517  0.986     0.099  0.522  0.991
 25  RT   0.130  0.865  0.996     0.172  0.940  0.999     0.411  0.987  1.000
     LRT  0.307  0.993  1.000     0.119  0.995  1.000     0.516  0.998  1.000
     CLRT 0.186  0.986  1.000     0.063  0.981  1.000     0.367  0.989  1.000

c. Double Exponential
               λ=1                       λ=1.25                    λ=1.5
 n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9     d=0.2  d=0.6  d=1
 5   RT   0.185  0.178  0.225     0.234  0.180  0.245     0.263  0.241  0.271
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.096  0.154  0.333     0.147  0.248  0.362     0.189  0.280  0.399
     LRT  0.552  0.908  0.998     0.525  0.892  0.998     0.638  0.907  1.000
     CLRT 0.205  0.699  0.990     0.185  0.643  0.987     0.261  0.648  0.997
 25  RT   0.081  0.309  0.753     0.133  0.437  0.749     0.250  0.562  0.793
     LRT  0.587  0.997  1.000     0.468  0.989  1.000     0.770  0.999  1.000
     CLRT 0.485  0.994  1.000     0.333  0.983  1.000     0.658  0.997  1.000

Table 6.1.6. Simulated Power vs. Type H for the Test of Sphericity ( p = 10 )

a. Normal
               λ=0.5                     λ=0.75                    λ=1
 n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19    d=0.1  d=0.15 d=0.21
 5   RT   0.153  0.185  0.220     0.223  0.231  0.244     0.349  0.331  0.345
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.168  0.213  0.315     0.175  0.190  0.312     0.211  0.263  0.375
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.394  0.623  0.863     0.278  0.484  0.796     0.335  0.578  0.803
     LRT  1.000  1.000  1.000     1.000  1.000  1.000     0.992  0.999  1.000
     CLRT 1.000  1.000  1.000     0.999  0.999  1.000     0.937  0.997  1.000

b.
Uniform
               λ=0.5                     λ=0.75                    λ=1
 n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19    d=0.1  d=0.15 d=0.21
 5   RT   0.154  0.181  0.257     0.214  0.249  0.258     0.304  0.311  0.310
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.212  0.321  0.523     0.175  0.302  0.447     0.212  0.324  0.509
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.739  0.911  0.987     0.580  0.860  0.966     0.551  0.899  0.992
     LRT  1.000  1.000  1.000     1.000  1.000  1.000     0.996  1.000  1.000
     CLRT 1.000  1.000  1.000     0.999  1.000  1.000     0.929  0.995  1.000

c. Double Exponential
               λ=0.5                     λ=0.75                    λ=1
 n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19    d=0.1  d=0.15 d=0.21
 5   RT   0.212  0.241  0.282     0.327  0.301  0.295     0.393  0.395  0.386
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.135  0.179  0.244     0.135  0.175  0.242     0.216  0.261  0.268
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.202  0.309  0.558     0.186  0.269  0.454     0.242  0.344  0.494
     LRT  1.000  1.000  1.000     1.000  1.000  1.000     0.999  1.000  1.000
     CLRT 1.000  1.000  1.000     1.000  1.000  1.000     0.978  0.997  1.000

Table 6.1.7 contains the simulated power for the test of sphericity versus the serial correlation structure shown in (6.0.3). Since the serial correlation structure has equal variances, as does the sphericity structure, only one value was simulated for the marginal variances. Data were generated from multivariate distributions with marginal variances of 1 and serial correlations of ρ = 0.3, 0.6, or 0.9. As expected, the power of the CLRT increases as ρ increases, but there does not appear to be any relationship between the power of the RT and the value of ρ. On the other hand, the power of both tests increases as p and/or n increases. Overall, the power of the RT is very poor, beating the power of the CLRT in only two cases (normal, n=10, p=5, ρ=0.3 and uniform, n=5, p=3, ρ=0.3), and in both cases the power is much too low (0.122 and 0.076, respectively).
Again, there appears to be a distributional effect on the power of the RT, but the pattern is reversed from that of the previous tables (Tables 6.1.4 through 6.1.6): this time the greatest power values result from double exponential data and the lowest from uniform data in most cases.

Table 6.1.7. Simulated Power vs. Serial Correlation for the Test of Sphericity ( σ² = 1 )

a. Normal
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.075  0.065  0.062     0.199  0.158  0.117     0.554  0.528  0.289
     LRT  0.378  0.578  0.919     NA     NA     NA        NA     NA     NA
     CLRT 0.082  0.150  0.659     NA     NA     NA        NA     NA     NA
 10  RT   0.076  0.053  0.044     0.122  0.147  0.107     0.320  0.416  0.180
     LRT  0.243  0.674  0.996     0.432  0.882  1.000     NA     NA     NA
     CLRT 0.111  0.476  0.993     0.121  0.618  0.999     NA     NA     NA
 25  RT   0.078  0.054  0.071     0.142  0.118  0.088     0.312  0.533  0.121
     LRT  0.373  0.970  1.000     0.535  0.996  1.000     0.834  1.000  1.000
     CLRT 0.300  0.956  1.000     0.393  0.990  1.000     0.460  1.000  1.000

b. Uniform
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.076  0.069  0.061     0.175  0.157  0.113     0.495  0.460  0.251
     LRT  0.327  0.554  0.927     NA     NA     NA        NA     NA     NA
     CLRT 0.069  0.175  0.647     NA     NA     NA        NA     NA     NA
 10  RT   0.064  0.042  0.056     0.103  0.116  0.082     0.284  0.401  0.141
     LRT  0.198  0.626  0.998     0.382  0.858  1.000     NA     NA     NA
     CLRT 0.099  0.441  0.993     0.106  0.569  0.997     NA     NA     NA
 25  RT   0.072  0.043  0.054     0.135  0.127  0.062     0.343  0.575  0.100
     LRT  0.309  0.960  1.000     0.409  1.000  1.000     0.772  1.000  1.000
     CLRT 0.250  0.950  1.000     0.298  0.996  1.000     0.370  1.000  1.000

c.
Double Exponential
               p=3                       p=5                       p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.108  0.082  0.068     0.218  0.183  0.159     0.591  0.607  0.324
     LRT  0.469  0.614  0.941     NA     NA     NA        NA     NA     NA
     CLRT 0.128  0.214  0.682     NA     NA     NA        NA     NA     NA
 10  RT   0.068  0.070  0.088     0.142  0.138  0.094     0.324  0.460  0.211
     LRT  0.332  0.732  0.998     0.600  0.916  0.999     NA     NA     NA
     CLRT 0.180  0.542  0.994     0.272  0.680  0.999     NA     NA     NA
 25  RT   0.063  0.048  0.069     0.141  0.101  0.088     0.316  0.467  0.146
     LRT  0.492  0.981  1.000     0.715  0.998  1.000     0.930  1.000  1.000
     CLRT 0.421  0.967  1.000     0.576  0.997  1.000     0.697  1.000  1.000

6.2 TEST OF COMPOUND SYMMETRY

The simulated type I error rates for the test of compound symmetry are displayed in Tables 6.2.1 through 6.2.3. Varying σ² and/or ρ does not seem to have much, if any, effect on the simulated type I error rates of any of the tests of compound symmetry. For normally distributed data the CLRT clearly performs better than either of the other tests with respect to the simulated type I error rates; however, for uniform data this test is too conservative, and for double exponential data it results in rates that are much too large. This is the same pattern seen in the simulated type I error rates for the test of sphericity (Table 6.1.1). There also appears to be a distributional effect on the simulated type I error rates of the RT for compound symmetry. Specifically, these rates are generally highest for the double exponential data and lowest for uniform data. Unlike the LRTs, however, the simulated type I error rates of the RT appear to be converging to the nominal rate as n increases. Just as for the test of sphericity, the RT exists in cases for which the LRTs do not, that is, when p ≥ n. However, the simulated type I error rates in these situations are much too large, especially when p=10 and n=5 (Table 6.2.3).
The simulated type I error rates of the LRTs seem to increase as p increases, but this pattern is not seen in the type I error rates of the RT for compound symmetry as it was in the RT for sphericity. For normally distributed data, the CLRT is clearly the best choice with respect to type I error rates. However, in nonnormal situations, the RT performs very well, especially as n increases. Seeing that the LRTs are not α-level tests for nonnormally distributed data and the RT is not an α-level test when p ≥ n, these tests will be primarily excluded from those situations in the following power discussions. They were included in the simulations, however, for completeness.

Table 6.2.1. Simulated Type I Error Rates for the Test of Compound Symmetry ( p = 3 )

a. Normal
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.080  0.085  0.089     0.108  0.078  0.093     0.079  0.103  0.091
     LRT  0.342  0.320  0.351     0.374  0.313  0.338     0.311  0.345  0.338
     CLRT 0.059* 0.062* 0.081     0.058* 0.065  0.056*    0.053* 0.080  0.065
 10  RT   0.075  0.072  0.055*    0.070  0.061* 0.068     0.066  0.074  0.057*
     LRT  0.136  0.148  0.108     0.127  0.125  0.132     0.124  0.126  0.128
     CLRT 0.043* 0.063* 0.037     0.059* 0.049* 0.045*    0.045* 0.057* 0.050*
 25  RT   0.062* 0.067  0.048*    0.056* 0.054* 0.063*    0.056* 0.054* 0.058*
     LRT  0.069  0.067  0.065     0.075  0.082  0.077     0.073  0.058* 0.061*
     CLRT 0.048* 0.049* 0.041*    0.045* 0.051* 0.052*    0.045* 0.039* 0.044*

b.
Uniform
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.100  0.089  0.092     0.078  0.079  0.080     0.096  0.112  0.076
     LRT  0.262  0.304  0.297     0.254  0.254  0.300     0.272  0.272  0.276
     CLRT 0.053* 0.038* 0.058*    0.042* 0.041* 0.060*    0.053* 0.050* 0.047*
 10  RT   0.068  0.074  0.057*    0.068  0.063* 0.060*    0.057* 0.062* 0.054*
     LRT  0.078  0.095  0.099     0.076  0.090  0.114     0.073  0.084  0.093
     CLRT 0.026  0.037* 0.043*    0.033  0.030  0.049*    0.024  0.027  0.030
 25  RT   0.050* 0.058* 0.040*    0.037* 0.052* 0.050*    0.054* 0.056* 0.049*
     LRT  0.029  0.045* 0.060*    0.028  0.045* 0.050*    0.032  0.041* 0.049*
     CLRT 0.017  0.030  0.047*    0.014  0.032  0.040*    0.024  0.032  0.035

c. Double Exponential
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.125  0.113  0.103     0.109  0.098  0.113     0.111  0.104  0.110
     LRT  0.396  0.434  0.387     0.431  0.401  0.420     0.437  0.430  0.425
     CLRT 0.095  0.113  0.093     0.085  0.074  0.094     0.102  0.091  0.096
 10  RT   0.074  0.079  0.091     0.068  0.067  0.091     0.076  0.079  0.095
     LRT  0.231  0.266  0.287     0.247  0.270  0.274     0.256  0.277  0.285
     CLRT 0.112  0.132  0.159     0.121  0.140  0.146     0.137  0.150  0.147
 25  RT   0.057* 0.059* 0.053*    0.044* 0.058* 0.056*    0.067  0.065  0.061*
     LRT  0.227  0.237  0.243     0.200  0.244  0.254     0.244  0.233  0.252
     CLRT 0.183  0.189  0.205     0.156  0.198  0.200     0.192  0.190  0.215
*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

Table 6.2.2. Simulated Type I Error Rates for the Test of Compound Symmetry ( p = 5 )

a.
Normal
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.159  0.103  0.086     0.155  0.115  0.109     0.131  0.140  0.109
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.078  0.081  0.053*    0.085  0.080  0.074     0.088  0.085  0.076
     LRT  0.289  0.298  0.284     0.299  0.303  0.309     0.296  0.293  0.300
     CLRT 0.057* 0.064  0.056*    0.053* 0.068  0.076     0.069  0.057* 0.065
 25  RT   0.060* 0.070  0.046*    0.078  0.048* 0.063*    0.071  0.060* 0.061*
     LRT  0.104  0.097  0.102     0.115  0.093  0.107     0.111  0.107  0.089
     CLRT 0.055* 0.050* 0.044*    0.053* 0.050* 0.059*    0.055* 0.047* 0.046*

b. Uniform
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.137  0.120  0.100     0.139  0.129  0.088     0.126  0.114  0.119
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.069  0.075  0.059*    0.083  0.074  0.068     0.090  0.067  0.089
     LRT  0.202  0.273  0.360     0.218  0.264  0.334     0.231  0.267  0.346
     CLRT 0.038* 0.049* 0.094     0.031  0.042* 0.085     0.038* 0.052* 0.096
 25  RT   0.048* 0.058* 0.050*    0.066  0.061* 0.047*    0.065  0.059* 0.054*
     LRT  0.054* 0.076  0.150     0.049* 0.073  0.127     0.045* 0.071  0.131
     CLRT 0.023  0.037* 0.081     0.024  0.036  0.069     0.021  0.035  0.072

c. Double Exponential
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.182  0.163  0.161     0.180  0.163  0.166     0.189  0.166  0.155
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.104  0.090  0.082     0.094  0.087  0.093     0.085  0.095  0.091
     LRT  0.482  0.504  0.517     0.462  0.493  0.498     0.454  0.483  0.530
     CLRT 0.155  0.174  0.197     0.174  0.170  0.195     0.140  0.185  0.190
 25  RT   0.072  0.046* 0.062*    0.073  0.053* 0.068     0.055* 0.080  0.060*
     LRT  0.310  0.343  0.398     0.343  0.346  0.393     0.320  0.357  0.386
     CLRT 0.209  0.244  0.269     0.220  0.248  0.273     0.199  0.238  0.289
*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

Table 6.2.3. Simulated Type I Error Rates for the Test of Compound Symmetry ( p = 10 )

a.
Normal
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.344  0.217  0.186     0.338  0.225  0.143     0.325  0.208  0.171
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.138  0.082  0.084     0.129  0.126  0.080     0.122  0.099  0.075
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.096  0.053* 0.072     0.067  0.065  0.054*    0.067  0.060* 0.055*
     LRT  0.333  0.325  0.312     0.326  0.334  0.329     0.315  0.299  0.311
     CLRT 0.057* 0.067  0.050*    0.057* 0.076  0.063*    0.060* 0.059* 0.066

b. Uniform
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.334  0.211  0.152     0.298  0.226  0.129     0.314  0.178  0.136
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.115  0.081  0.059*    0.111  0.082  0.072     0.132  0.074  0.052*
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.066  0.057* 0.065     0.069  0.053* 0.080     0.072  0.053* 0.051*
     LRT  0.238  0.322  0.571     0.230  0.320  0.585     0.243  0.320  0.564
     CLRT 0.035  0.087  0.194     0.033  0.072  0.198     0.029  0.070  0.188

c. Double Exponential
               σ²=1                      σ²=9                      σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9
 5   RT   0.404  0.311  0.285     0.408  0.295  0.271     0.405  0.285  0.267
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 10  RT   0.189  0.124  0.126     0.165  0.133  0.123     0.232  0.140  0.142
     LRT  NA     NA     NA        NA     NA     NA        NA     NA     NA
     CLRT NA     NA     NA        NA     NA     NA        NA     NA     NA
 25  RT   0.080  0.067  0.061*    0.070  0.076  0.065     0.095  0.071  0.090
     LRT  0.620  0.704  0.748     0.625  0.727  0.788     0.645  0.697  0.760
     CLRT 0.272  0.350  0.417     0.268  0.357  0.454     0.275  0.310  0.432
*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

The simulated power of the test of compound symmetry versus the type H structure is shown in Tables 6.2.4 through 6.2.6. For these simulations, data were generated from distributions having type H covariance structures as shown in (6.0.2). See Appendix A.2 for a description of how and why the values of d and λ were chosen.
As expected, the power of all three tests increases as d and/or n increases, but D appears to have little, if any, effect on the power of the tests. It is difficult to determine the effect of increasing p on the power of the tests, since very different parameter values were simulated for the different values of p (see Appendix A.2). However, there are two cases for which the parameter values are equal: d=0.1, D=1, and p=5 (Table 6.2.5) or p=10 (Table 6.2.6). From these two cases, it appears that the power of both the RT and the CLRT increases as p increases. For normally distributed data there are many cases in which the RT is more powerful than the CLRT; specifically, in 25 of the 54 total cases the power of the RT is greater than or equal to that of the CLRT, typically when d is small and n is close to p. Overall, the power of the RT exceeds 0.75 in 54 of the 81 cases when n=25. Again, there appears to be a slight distributional effect on the power of the RT, with the greatest power values usually resulting from uniformly distributed data and the lowest from double exponential data. This relationship is the opposite of that seen in Tables 6.2.1 through 6.2.3.

Table 6.2.4. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 3)

a. Normal

 n  Test |  D=2: d=1  2  3  |  D=3: d=2  3  4  |  D=4: d=3  4  5
 5  RT   | 0.180 0.232 0.343 | 0.253 0.289 0.367 | 0.312 0.336 0.390
    LRT  | 0.471 0.614 0.860 | 0.690 0.766 0.915 | 0.864 0.878 0.977
    CLRT | 0.119 0.170 0.359 | 0.200 0.253 0.449 | 0.372 0.382 0.603
10  RT   | 0.269 0.472 0.682 | 0.424 0.577 0.704 | 0.549 0.649 0.734
    LRT  | 0.469 0.754 0.988 | 0.848 0.935 0.996 | 0.991 0.993 1.000
    CLRT | 0.266 0.565 0.951 | 0.675 0.817 0.987 | 0.956 0.964 0.997
25  RT   | 0.699 0.969 1.000 | 0.980 0.998 1.000 | 0.999 1.000 1.000
    LRT  | 0.829 0.996 1.000 | 1.000 1.000 1.000 | 1.000 1.000 1.000
    CLRT | 0.770 0.991 1.000 | 1.000 1.000 1.000 | 1.000 1.000 1.000

b.
Uniform

 n  Test |  D=2: d=1  2  3  |  D=3: d=2  3  4  |  D=4: d=3  4  5
 5  RT   | 0.204 0.264 0.380 | 0.277 0.333 0.405 | 0.327 0.388 0.424
    LRT  | 0.424 0.624 0.874 | 0.646 0.746 0.941 | 0.868 0.894 0.993
    CLRT | 0.082 0.155 0.367 | 0.178 0.238 0.449 | 0.351 0.371 0.632
10  RT   | 0.330 0.657 0.876 | 0.592 0.758 0.875 | 0.748 0.843 0.899
    LRT  | 0.394 0.786 0.994 | 0.846 0.962 1.000 | 0.994 0.996 1.000
    CLRT | 0.202 0.537 0.958 | 0.671 0.830 0.999 | 0.960 0.984 1.000
25  RT   | 0.896 0.997 1.000 | 1.000 1.000 1.000 | 1.000 1.000 1.000
    LRT  | 0.851 0.998 1.000 | 1.000 1.000 1.000 | 1.000 1.000 1.000
    CLRT | 0.789 0.993 1.000 | 1.000 1.000 1.000 | 1.000 1.000 1.000

c. Double Exponential
 5  RT   | 0.195 0.267 0.351 | 0.263 0.300 0.375 | 0.321 0.359 0.404
    LRT  | 0.559 0.671 0.878 | 0.730 0.793 0.938 | 0.889 0.888 0.977
    CLRT | 0.166 0.228 0.427 | 0.274 0.322 0.531 | 0.446 0.455 0.707
10  RT   | 0.244 0.402 0.614 | 0.409 0.491 0.627 | 0.523 0.567 0.660
    LRT  | 0.561 0.798 0.993 | 0.850 0.922 0.999 | 0.987 0.991 0.999
    CLRT | 0.389 0.632 0.960 | 0.707 0.813 0.990 | 0.942 0.962 0.999
25  RT   | 0.468 0.823 0.972 | 0.811 0.924 0.981 | 0.946 0.979 0.982
    LRT  | 0.832 0.985 1.000 | 0.993 1.000 1.000 | 1.000 1.000 1.000
    CLRT | 0.781 0.977 1.000 | 0.989 1.000 1.000 | 1.000 1.000 1.000

Table 6.2.5. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 5)

a. Normal

 n  Test |  D=1: d=0.1  0.4  0.8  |  D=1.25: d=0.1  0.5  0.9  |  D=1.5: d=0.2  0.6  1
 5  RT   | 0.179 0.222 0.360 | 0.204 0.233 0.363 | 0.223 0.288 0.382
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.105 0.284 0.620 | 0.109 0.335 0.629 | 0.174 0.409 0.634
    LRT  | 0.322 0.527 0.977 | 0.334 0.602 0.987 | 0.480 0.691 0.995
    CLRT | 0.079 0.181 0.784 | 0.085 0.230 0.826 | 0.149 0.294 0.891
25  RT   | 0.111 0.676 0.996 | 0.128 0.804 0.995 | 0.288 0.883 0.995
    LRT  | 0.155 0.709 1.000 | 0.201 0.865 1.000 | 0.568 0.954 1.000
    CLRT | 0.083 0.563 1.000 | 0.103 0.753 1.000 | 0.417 0.905 1.000

b.
Uniform

 n  Test |  D=1: d=0.1  0.4  0.8  |  D=1.25: d=0.1  0.5  0.9  |  D=1.5: d=0.2  0.6  1
 5  RT   | 0.150 0.231 0.412 | 0.166 0.271 0.416 | 0.201 0.302 0.422
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.112 0.391 0.796 | 0.121 0.469 0.800 | 0.201 0.543 0.802
    LRT  | 0.225 0.483 0.991 | 0.253 0.554 0.994 | 0.415 0.678 0.999
    CLRT | 0.043 0.135 0.839 | 0.048 0.176 0.878 | 0.099 0.241 0.953
25  RT   | 0.158 0.879 0.999 | 0.134 0.956 1.000 | 0.416 0.976 1.000
    LRT  | 0.098 0.659 1.000 | 0.122 0.843 1.000 | 0.487 0.959 1.000
    CLRT | 0.042 0.499 1.000 | 0.067 0.726 1.000 | 0.334 0.894 1.000

c. Double Exponential
 5  RT   | 0.205 0.239 0.372 | 0.219 0.268 0.372 | 0.258 0.295 0.381
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.133 0.268 0.498 | 0.140 0.314 0.511 | 0.178 0.370 0.531
    LRT  | 0.493 0.686 0.994 | 0.505 0.753 0.996 | 0.613 0.799 0.999
    CLRT | 0.148 0.291 0.892 | 0.164 0.367 0.917 | 0.250 0.448 0.963
25  RT   | 0.126 0.481 0.914 | 0.119 0.605 0.916 | 0.236 0.704 0.928
    LRT  | 0.374 0.820 1.000 | 0.415 0.914 1.000 | 0.730 0.953 1.000
    CLRT | 0.255 0.710 1.000 | 0.291 0.842 1.000 | 0.615 0.929 1.000

Table 6.2.6. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 10)

a. Normal

 n  Test |  D=0.5: d=0.1  0.13  0.17  |  D=0.75: d=0.1  0.14  0.19  |  D=1: d=0.1  0.15  0.21
 5  RT   | 0.296 0.330 0.389 | 0.358 0.367 0.388 | 0.403 0.418 0.437
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.277 0.369 0.518 | 0.269 0.358 0.492 | 0.286 0.386 0.507
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.550 0.810 0.965 | 0.453 0.748 0.947 | 0.491 0.776 0.960
    LRT  | 0.699 0.891 1.000 | 0.639 0.836 0.999 | 0.672 0.881 1.000
    CLRT | 0.288 0.533 0.998 | 0.246 0.462 0.950 | 0.257 0.514 0.974

b.
Uniform

 n  Test |  D=0.5: d=0.1  0.13  0.17  |  D=0.75: d=0.1  0.14  0.19  |  D=1: d=0.1  0.15  0.21
 5  RT   | 0.279 0.329 0.400 | 0.323 0.343 0.401 | 0.384 0.400 0.429
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.365 0.517 0.725 | 0.320 0.483 0.677 | 0.327 0.489 0.685
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.832 0.964 1.000 | 0.738 0.929 0.999 | 0.684 0.929 0.997
    LRT  | 0.747 0.922 1.000 | 0.603 0.860 0.997 | 0.550 0.863 0.998
    CLRT | 0.350 0.621 0.995 | 0.197 0.474 0.952 | 0.176 0.463 0.971

c. Double Exponential
 5  RT   | 0.387 0.407 0.462 | 0.399 0.434 0.488 | 0.478 0.476 0.514
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.257 0.337 0.443 | 0.261 0.334 0.436 | 0.276 0.366 0.469
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.405 0.570 0.835 | 0.379 0.551 0.807 | 0.423 0.625 0.845
    LRT  | 0.904 0.962 1.000 | 0.855 0.945 1.000 | 0.850 0.952 1.000
    CLRT | 0.643 0.818 1.000 | 0.551 0.741 0.984 | 0.536 0.758 0.996

Table 6.2.7 displays the simulated power of the test of compound symmetry versus the serial correlation structure shown in (6.0.3). Data were generated from distributions having the serial correlation covariance structure with σ² = 1 and ρ = 0.3, 0.6, or 0.9. Only one value of σ² was simulated since both the compound symmetry and serial correlation structures have equal variances. As expected, the power of the CLRT increases as ρ increases. However, the power of the RT is greatest when ρ = 0.6 in all but three cases (uniform, n=5, p=3; double exponential, n=5, p=3; and double exponential, n=5, p=10). The power of both tests increases as p increases. This is anticipated, since as p increases there are more observations with which to estimate ρ. For normally distributed data, the RT is more powerful than the CLRT in seven of the 27 cases. Most of these cases (five of the seven) occur when n=5 or 10 and p=3.
Even though the RT is more powerful in these situations, the power is still not very high, only reaching 0.398 in the most powerful case (n=25, p=10, ρ = 0.3). In fact, neither the CLRT nor the RT is very powerful except when n=25, p=10, and ρ = 0.6 or 0.9.

Table 6.2.7. Simulated Power vs. Serial Correlation for the Test of Compound Symmetry (σ² = 1)

a. Normal

 n  Test |  p=3: ρ=0.3  0.6  0.9  |  p=5: ρ=0.3  0.6  0.9  |  p=10: ρ=0.3  0.6  0.9
 5  RT   | 0.099 0.107 0.106 | 0.181 0.217 0.199 | 0.546 0.559 0.368
    LRT  | 0.349 0.371 0.415 | NA NA NA | NA NA NA
    CLRT | 0.055 0.074 0.085 | NA NA NA | NA NA NA
10  RT   | 0.085 0.100 0.093 | 0.163 0.189 0.151 | 0.295 0.520 0.351
    LRT  | 0.146 0.213 0.290 | 0.394 0.605 0.805 | NA NA NA
    CLRT | 0.055 0.085 0.145 | 0.114 0.221 0.462 | NA NA NA
25  RT   | 0.080 0.109 0.078 | 0.155 0.294 0.170 | 0.398 0.828 0.580
    LRT  | 0.142 0.298 0.505 | 0.278 0.808 0.983 | 0.787 0.996 1.000
    CLRT | 0.094 0.227 0.442 | 0.176 0.685 0.969 | 0.361 0.987 1.000

b. Uniform
 5  RT   | 0.116 0.101 0.107 | 0.167 0.185 0.161 | 0.507 0.512 0.373
    LRT  | 0.279 0.315 0.398 | NA NA NA | NA NA NA
    CLRT | 0.054 0.051 0.085 | NA NA NA | NA NA NA
10  RT   | 0.071 0.104 0.086 | 0.136 0.206 0.122 | 0.270 0.507 0.417
    LRT  | 0.083 0.163 0.253 | 0.293 0.535 0.784 | NA NA NA
    CLRT | 0.032 0.065 0.118 | 0.053 0.163 0.476 | NA NA NA
25  RT   | 0.088 0.143 0.097 | 0.188 0.426 0.215 | 0.392 0.907 0.809
    LRT  | 0.073 0.239 0.463 | 0.210 0.763 0.973 | 0.643 0.999 1.000
    CLRT | 0.049 0.187 0.382 | 0.119 0.641 0.939 | 0.258 0.985 1.000

c.
Double Exponential

 n  Test |  p=3: ρ=0.3  0.6  0.9  |  p=5: ρ=0.3  0.6  0.9  |  p=10: ρ=0.3  0.6  0.9
 5  RT   | 0.132 0.126 0.120 | 0.178 0.263 0.226 | 0.594 0.589 0.466
    LRT  | 0.430 0.462 0.446 | NA NA NA | NA NA NA
    CLRT | 0.104 0.098 0.117 | NA NA NA | NA NA NA
10  RT   | 0.082 0.107 0.092 | 0.156 0.160 0.180 | 0.329 0.523 0.309
    LRT  | 0.298 0.347 0.464 | 0.515 0.717 0.890 | NA NA NA
    CLRT | 0.165 0.192 0.267 | 0.207 0.353 0.612 | NA NA NA
25  RT   | 0.081 0.089 0.073 | 0.151 0.230 0.144 | 0.357 0.736 0.424
    LRT  | 0.311 0.465 0.670 | 0.543 0.887 0.992 | 0.895 1.000 1.000
    CLRT | 0.240 0.385 0.597 | 0.404 0.808 0.986 | 0.594 0.997 1.000

6.3 TEST OF TYPE H

The simulated type I error rates of the test of type H are shown in Tables 6.3.1 through 6.3.3. Recall from Section 2.3 that the data transformation required for the test of type H results in an n×(p−1) data matrix. Therefore, the numbers of permutations required to perform a permutation test (PT) for n=5 or 10 and p=3 are (2!)^5 = 32 and (2!)^10 = 1024, respectively. Since neither of these situations requires a very large number of permutations, PTs rather than RTs were performed in these cases. The simulated type I error rates of the PT are very low for n=5 and p=3, but due to the small number of possible permutations, the only p-values less than 0.05 are 0/32 = 0 and 1/32 = 0.03125. Therefore, we would expect lower type I error rates in these cases. The simulated type I error rates of the CLRT and PT/RT seem to be unaffected by increases in either d or D. However, they appear to increase as n approaches p and, in the case of the PT/RT, as n exceeds p. Just as with previous tests, the CLRT performs very well with respect to type I error rates for normally distributed data, but it is too conservative for uniformly distributed data, and its simulated type I error rates are too high for double exponential data. This pattern is also seen with the PT/RT, but unlike the CLRT, the type I error rates of the PT/RT seem to converge to 0.05 as n increases.
Also similar to previous tests, the RT exists in cases where the CLRT does not, specifically when p ≥ n, but the type I error rates of the RT are much too high in these cases for the RT to be of any practical use. Because the CLRT and the RT fail to maintain the nominal type I error rate in these cases, they are excluded from the power discussions to follow, although the simulation results are included in the tables for completeness.

Table 6.3.1. Simulated Type I Error Rates for the Test of Type H (p = 3)

a. Normal

 n  Test |  D=2: d=1  2  3  |  D=3: d=2  3  4  |  D=4: d=3  4  5
 5  †PT  | 0.015 0.014 0.014 | 0.012 0.010 0.020 | 0.011 0.015 0.012
    LRT  | 0.167 0.156 0.169 | 0.171 0.149 0.178 | 0.158 0.184 0.181
    CLRT | 0.055* 0.053* 0.045* | 0.054* 0.043* 0.057* | 0.047* 0.062* 0.066
10  †PT  | 0.051* 0.057* 0.059* | 0.069 0.066 0.062* | 0.060* 0.059* 0.056*
    LRT  | 0.077 0.091 0.099 | 0.097 0.076 0.094 | 0.080 0.082 0.082
    CLRT | 0.040* 0.042* 0.053* | 0.062* 0.033 0.053* | 0.039* 0.043* 0.051*
25  RT   | 0.046* 0.045* 0.046* | 0.044* 0.047* 0.045* | 0.049* 0.048* 0.050*
    LRT  | 0.067 0.066 0.067 | 0.068 0.067 0.064 | 0.064 0.063* 0.064
    CLRT | 0.056* 0.057* 0.055* | 0.055* 0.056* 0.052* | 0.054* 0.053* 0.053*

b. Uniform
 5  †PT  | 0.012 0.013 0.016 | 0.018 0.015 0.011 | 0.012 0.020 0.014
    LRT  | 0.128 0.163 0.160 | 0.145 0.150 0.135 | 0.145 0.127 0.130
    CLRT | 0.037* 0.049* 0.051* | 0.044* 0.054* 0.042* | 0.041* 0.036 0.043*
10  †PT  | 0.060* 0.065 0.062* | 0.044* 0.055* 0.063* | 0.068 0.052* 0.050*
    LRT  | 0.057* 0.073 0.057* | 0.050* 0.056* 0.072 | 0.053* 0.046* 0.074
    CLRT | 0.028 0.045* 0.032 | 0.022 0.026 0.039* | 0.027 0.026 0.041*
25  RT   | 0.057* 0.053* 0.056* | 0.056* 0.051* 0.046* | 0.053* 0.064 0.049*
    LRT  | 0.046* 0.051* 0.040* | 0.045* 0.051* 0.050* | 0.042* 0.032 0.044*
    CLRT | 0.038 0.035 0.027 | 0.036 0.043* 0.038* | 0.034 0.024 0.035

c.
Double Exponential

 n  Test |  D=2: d=1  2  3  |  D=3: d=2  3  4  |  D=4: d=3  4  5
 5  †PT  | 0.022 0.015 0.019 | 0.011 0.017 0.017 | 0.017 0.026 0.019
    LRT  | 0.214 0.233 0.213 | 0.241 0.215 0.224 | 0.185 0.224 0.238
    CLRT | 0.080 0.060* 0.073 | 0.084 0.069 0.082 | 0.064 0.083 0.067
10  †PT  | 0.066 0.084 0.081 | 0.067 0.077 0.070 | 0.072 0.065 0.075
    LRT  | 0.155 0.151 0.175 | 0.157 0.160 0.150 | 0.167 0.177 0.181
    CLRT | 0.090 0.084 0.109 | 0.105 0.101 0.095 | 0.096 0.106 0.111
25  RT   | 0.059* 0.046* 0.060* | 0.062* 0.051* 0.062* | 0.061* 0.063* 0.053*
    LRT  | 0.126 0.154 0.176 | 0.137 0.142 0.165 | 0.142 0.172 0.157
    CLRT | 0.109 0.123 0.146 | 0.114 0.116 0.142 | 0.121 0.149 0.124

*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)
†Permutation tests rather than randomization tests were run for n=5, 10 and p=3

Table 6.3.2. Simulated Type I Error Rates for the Test of Type H (p = 5)

a. Normal

 n  Test |  D=1: d=0.1  0.4  0.8  |  D=1.25: d=0.1  0.5  0.9  |  D=1.5: d=0.2  0.6  1
 5  RT   | 0.150 0.144 0.156 | 0.150 0.138 0.147 | 0.143 0.150 0.150
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.081 0.084 0.089 | 0.087 0.089 0.095 | 0.089 0.088 0.096
    LRT  | 0.199 0.208 0.190 | 0.209 0.207 0.190 | 0.191 0.204 0.192
    CLRT | 0.060* 0.063* 0.062* | 0.066 0.067 0.060* | 0.062* 0.064 0.060*
25  RT   | 0.067 0.068 0.065 | 0.073 0.071 0.070 | 0.066 0.077 0.062*
    LRT  | 0.081 0.093 0.097 | 0.087 0.096 0.097 | 0.096 0.098 0.095
    CLRT | 0.053* 0.051* 0.059* | 0.053* 0.054* 0.058* | 0.061* 0.057* 0.059*

b. Uniform
 5  RT   | 0.109 0.124 0.137 | 0.131 0.131 0.130 | 0.121 0.134 0.128
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.074 0.070 0.082 | 0.082 0.089 0.083 | 0.087 0.077 0.081
    LRT  | 0.151 0.193 0.161 | 0.138 0.174 0.154 | 0.154 0.152 0.151
    CLRT | 0.039* 0.058* 0.048* | 0.030 0.049* 0.048* | 0.036 0.034 0.043*
25  RT   | 0.076 0.062* 0.048* | 0.057* 0.059* 0.051* | 0.058* 0.061* 0.050*
    LRT  | 0.067 0.084 0.075 | 0.037* 0.081 0.072 | 0.040* 0.054* 0.067
    CLRT | 0.043* 0.052* 0.043* | 0.019 0.048* 0.039* | 0.022 0.028 0.038*

c.
Double Exponential

 n  Test |  D=1: d=0.1  0.4  0.8  |  D=1.25: d=0.1  0.5  0.9  |  D=1.5: d=0.2  0.6  1
 5  RT   | 0.161 0.168 0.178 | 0.156 0.173 0.170 | 0.164 0.160 0.171
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.082 0.087 0.108 | 0.076 0.093 0.108 | 0.085 0.093 0.108
    LRT  | 0.286 0.288 0.343 | 0.330 0.313 0.333 | 0.300 0.322 0.335
    CLRT | 0.114 0.117 0.139 | 0.108 0.123 0.136 | 0.105 0.125 0.130
25  RT   | 0.066 0.062* 0.072 | 0.081 0.058* 0.071 | 0.068 0.069 0.074
    LRT  | 0.206 0.226 0.300 | 0.257 0.225 0.297 | 0.236 0.240 0.297
    CLRT | 0.152 0.158 0.233 | 0.179 0.169 0.224 | 0.182 0.188 0.229

*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

Table 6.3.3. Simulated Type I Error Rates for the Test of Type H (p = 10)

a. Normal

 n  Test |  D=0.5: d=0.1  0.13  0.17  |  D=0.75: d=0.1  0.14  0.19  |  D=1: d=0.1  0.15  0.21
 5  RT   | 0.485 0.443 0.475 | 0.474 0.492 0.481 | 0.443 0.467 0.482
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.185 0.170 0.180 | 0.181 0.194 0.161 | 0.185 0.184 0.183
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.097 0.092 0.092 | 0.095 0.091 0.100 | 0.089 0.094 0.102
    LRT  | 0.273 0.280 0.279 | 0.272 0.278 0.277 | 0.277 0.278 0.275
    CLRT | 0.065 0.067 0.063* | 0.070 0.065 0.065 | 0.065 0.066 0.060*

b. Uniform
 5  RT   | 0.488 0.503 0.498 | 0.452 0.465 0.466 | 0.430 0.446 0.437
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.201 0.204 0.195 | 0.187 0.194 0.183 | 0.170 0.182 0.166
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.078 0.072 0.084 | 0.077 0.071 0.078 | 0.074 0.069 0.075
    LRT  | 0.379 0.381 0.364 | 0.284 0.308 0.305 | 0.198 0.221 0.243
    CLRT | 0.105 0.105 0.113 | 0.069 0.071 0.081 | 0.048* 0.051* 0.061*

c.
Double Exponential

 n  Test |  D=0.5: d=0.1  0.13  0.17  |  D=0.75: d=0.1  0.14  0.19  |  D=1: d=0.1  0.15  0.21
 5  RT   | 0.496 0.509 0.517 | 0.469 0.493 0.519 | 0.486 0.510 0.511
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.203 0.215 0.206 | 0.194 0.200 0.203 | 0.186 0.193 0.197
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
25  RT   | 0.112 0.114 0.114 | 0.110 0.108 0.107 | 0.103 0.102 0.106
    LRT  | 0.594 0.613 0.638 | 0.556 0.566 0.607 | 0.532 0.547 0.589
    CLRT | 0.279 0.313 0.350 | 0.244 0.269 0.319 | 0.227 0.256 0.300

*Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

The simulated power of the test of type H versus the serial correlation structure is displayed in Tables 6.3.4 through 6.3.6. For these simulations, data were generated from distributions with the serial correlation covariance structure given by (6.0.3) with σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. Again, PTs rather than RTs were performed when n=5 or 10 and p=3. The power of both the CLRT and the PT/RT increases as ρ increases but seems to be unaffected by an increase in σ². The power of both tests decreases as p approaches n. Overall the power of both tests is fairly low: the CLRT achieves power greater than 0.75 in ten of the 27 normally distributed cases, and the PT/RT achieves power greater than 0.75 in only sixteen of the 81 cases regardless of the distribution. All of these cases occur when n=25 and ρ = 0.6 or 0.9. For normally distributed data, the PT/RT is more powerful than the CLRT in only four of the 27 cases, all when n=10 and ρ = 0.3. However, the power of the PT/RT in these cases is extremely low, even though it exceeds that of the CLRT.

Table 6.3.4. Simulated Power vs. Serial Correlation for the Test of Type H (p = 3)

a.
Normal

 n  Test |  σ²=1: ρ=0.3  0.6  0.9  |  σ²=9: ρ=0.3  0.6  0.9  |  σ²=25: ρ=0.3  0.6  0.9
 5  †PT  | 0.011 0.011 0.022 | 0.015 0.013 0.020 | 0.011 0.016 0.017
    LRT  | 0.193 0.198 0.221 | 0.168 0.226 0.236 | 0.190 0.227 0.251
    CLRT | 0.064 0.062 0.071 | 0.059 0.068 0.086 | 0.062 0.083 0.084
10  †PT  | 0.065 0.075 0.102 | 0.056 0.072 0.108 | 0.071 0.077 0.094
    LRT  | 0.119 0.164 0.305 | 0.113 0.175 0.286 | 0.105 0.192 0.311
    CLRT | 0.063 0.108 0.199 | 0.065 0.106 0.178 | 0.065 0.113 0.216
25  RT   | 0.058 0.112 0.245 | 0.082 0.133 0.217 | 0.064 0.120 0.218
    LRT  | 0.129 0.339 0.613 | 0.131 0.328 0.613 | 0.138 0.338 0.624
    CLRT | 0.101 0.302 0.556 | 0.112 0.293 0.558 | 0.115 0.301 0.583

b. Uniform
 5  †PT  | 0.013 0.011 0.014 | 0.013 0.016 0.022 | 0.013 0.017 0.017
    LRT  | 0.168 0.235 0.309 | 0.156 0.213 0.313 | 0.161 0.229 0.333
    CLRT | 0.051 0.067 0.115 | 0.043 0.091 0.104 | 0.042 0.070 0.127
10  †PT  | 0.078 0.073 0.103 | 0.058 0.070 0.109 | 0.060 0.084 0.101
    LRT  | 0.118 0.199 0.368 | 0.097 0.191 0.345 | 0.103 0.201 0.356
    CLRT | 0.061 0.138 0.257 | 0.057 0.118 0.238 | 0.057 0.134 0.258
25  RT   | 0.072 0.114 0.184 | 0.058 0.113 0.172 | 0.072 0.135 0.185
    LRT  | 0.119 0.340 0.618 | 0.103 0.325 0.603 | 0.100 0.377 0.652
    CLRT | 0.096 0.298 0.576 | 0.081 0.284 0.566 | 0.087 0.338 0.624

c. Double Exponential
 5  †PT  | 0.015 0.013 0.014 | 0.014 0.014 0.018 | 0.015 0.023 0.019
    LRT  | 0.219 0.243 0.274 | 0.188 0.246 0.289 | 0.215 0.257 0.271
    CLRT | 0.067 0.088 0.097 | 0.073 0.097 0.094 | 0.071 0.093 0.092
10  †PT  | 0.073 0.081 0.095 | 0.054 0.075 0.123 | 0.068 0.073 0.102
    LRT  | 0.176 0.237 0.323 | 0.163 0.219 0.357 | 0.162 0.231 0.331
    CLRT | 0.110 0.160 0.223 | 0.097 0.163 0.239 | 0.103 0.170 0.237
25  RT   | 0.058 0.116 0.191 | 0.069 0.099 0.194 | 0.064 0.119 0.181
    LRT  | 0.194 0.377 0.594 | 0.198 0.402 0.612 | 0.198 0.370 0.607
    CLRT | 0.165 0.349 0.556 | 0.166 0.349 0.570 | 0.165 0.329 0.560

†Permutation tests rather than randomization tests were run for n=5, 10 and p=3

Table 6.3.5. Simulated Power vs.
Serial Correlation for the Test of Type H (p = 5)

a. Normal

 n  Test |  σ²=1: ρ=0.3  0.6  0.9  |  σ²=9: ρ=0.3  0.6  0.9  |  σ²=25: ρ=0.3  0.6  0.9
 5  RT   | 0.140 0.131 0.165 | 0.152 0.150 0.164 | 0.113 0.147 0.169
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.090 0.152 0.239 | 0.101 0.156 0.238 | 0.115 0.164 0.220
    LRT  | 0.237 0.502 0.771 | 0.297 0.540 0.788 | 0.288 0.526 0.795
    CLRT | 0.072 0.237 0.533 | 0.107 0.292 0.544 | 0.091 0.265 0.565
25  RT   | 0.136 0.321 0.528 | 0.139 0.334 0.522 | 0.133 0.319 0.508
    LRT  | 0.304 0.837 0.988 | 0.319 0.800 0.985 | 0.276 0.808 0.987
    CLRT | 0.203 0.751 0.975 | 0.219 0.738 0.975 | 0.213 0.725 0.975

b. Uniform
 5  RT   | 0.127 0.170 0.192 | 0.140 0.152 0.189 | 0.108 0.148 0.196
    LRT  | NA NA NA | NA NA NA | NA NA NA
    CLRT | NA NA NA | NA NA NA | NA NA NA
10  RT   | 0.110 0.165 0.232 | 0.105 0.157 0.209 | 0.089 0.172 0.222
    LRT  | 0.226 0.523 0.826 | 0.266 0.548 0.830 | 0.234 0.532 0.831
    CLRT | 0.077 0.247 0.623 | 0.087 0.272 0.623 | 0.087 0.283 0.619
25  RT   | 0.158 0.323 0.452 | 0.156 0.313 0.464 | 0.141 0.327 0.472
    LRT  | 0.254 0.799 0.989 | 0.232
Title  Permutation Test for the Structure of a Covariance Matrix 
Date  2007-05-01 
Author  Morris, Tracy Lynne 
Department  Statistics 
Full Text Type  Open Access 
Abstract  Many statistical procedures, such as repeated measures analysis, time-series analysis, structural equation modeling, and factor analysis, require an assessment of the structure of the underlying covariance matrix. The classical parametric method of testing such a hypothesis involves the use of a likelihood ratio test (LRT). These tests have many limitations, including the need for very large sample sizes and the requirement of a random sample from a multivariate normal population. The LRT is also undefined for cases in which the sample size is not greater than the number of repeated measures. In such situations, researchers could benefit from a nonparametric testing procedure. In particular, permutation tests have no distributional assumptions and do not require random samples of any particular size. This research involves the development and analysis of a permutation/randomization test for the structure of a covariance matrix. Samples of various sizes and numbers of measures on each subject were simulated from multiple distributions. In each case, the type I error rates and power were examined. Findings and conclusions: When testing for sphericity, compound symmetry, type H structure, and serial correlation, the LRT clearly performs best with regard to type I error rates for normally distributed data, but for uniform data it is too conservative, and for double exponential data it results in extremely large type I error rates. The randomization test, however, is consistent regardless of the data distribution and performs better than the LRT, in most cases, for non-normally distributed data. In most situations the LRT is more powerful than the randomization test, but the power of the randomization test is comparable to that of the LRT in many situations. 
Note  Dissertation 
Rights  © Oklahoma Agricultural and Mechanical Board of Regents 
Transcript  6. 
SIMULATIONS .......... 45
    6.1 Test of Sphericity .......... 50
    6.2 Test of Compound Symmetry .......... 63
    6.3 Test of Type H Structure .......... 73
    6.4 Test of Serial Correlation .......... 81
    6.5 Test of Independence of Sets of Variates .......... 92
7. CONCLUSIONS .......... 100
8. FUTURE WORK .......... 102
BIBLIOGRAPHY .......... 105
APPENDIX .......... 111
    A.1 R Code .......... 111
        A.1.1 Randomization Test of Sphericity .......... 111
        A.1.2 Randomization Test of Compound Symmetry .......... 118
        A.1.3 Permutation Test of Type H Structure .......... 125
        A.1.4 Randomization Test of Type H Structure .......... 132
        A.1.5 Randomization Test of Serial Correlation .......... 139
        A.1.6 Randomization Test of Independence of Sets of Variates .......... 149
        A.1.7 Randomization Test of Independence of Sets of Variates for n=5 and (5,5) .......... 157
    A.2 Combinations of d and D used in the Type H Simulations .......... 162
    A.3 Bias Correction for the Serial Correlation Parameters .......... 166
    A.4 Number of Simulated Data Sets for the Test of Independence of Sets of Variates for n=5 and (5,5) .......... 180

LIST OF TABLES

4.0.1 All Possible Permutations of the Observed Data .......... 29
5.1.1 Observed Data .......... 36
5.1.2 Some Permutations of the Centered Observed Data .......... 37
6.1.1 Simulated Type I Error Rates for the Test of Sphericity .......... 52
6.1.2 Simulated Power vs. Non-Homoscedasticity for the Test of Sphericity .......... 54
6.1.3 Simulated Power vs. Non-Zero Correlation for the Test of Sphericity .......... 56
6.1.4 Simulated Power vs. Type H for the Test of Sphericity (p=3) .......... 58
6.1.5 Simulated Power vs. Type H for the Test of Sphericity (p=5) .......... 59
6.1.6 Simulated Power vs. Type H for the Test of Sphericity (p=10) .......... 60
6.1.7 Simulated Power vs. Serial Correlation for the Test of Sphericity (σ² = 1) .......... 62
6.2.1 Simulated Type I Error Rates for the Test of Compound Symmetry (p=3) .......... 64
6.2.2 Simulated Type I Error Rates for the Test of Compound Symmetry (p=5) .......... 65
6.2.3 Simulated Type I Error Rates for the Test of Compound Symmetry (p=10) .......... 66
6.2.4 Simulated Power vs. Type H for the Test of Compound Symmetry (p=3) .......... 68
6.2.5 Simulated Power vs. Type H for the Test of Compound Symmetry (p=5) .......... 69
6.2.6 Simulated Power vs. Type H for the Test of Compound Symmetry (p=10) .......... 70
6.2.7 Simulated Power vs.
Serial Correlation for the Test of Compound Symmetry (σ² = 1) .......... 72
6.3.1 Simulated Type I Error Rates for the Test of Type H (p=3) .......... 74
6.3.2 Simulated Type I Error Rates for the Test of Type H (p=5) .......... 75
6.3.3 Simulated Type I Error Rates for the Test of Type H (p=10) .......... 76
6.3.4 Simulated Power vs. Serial Correlation for the Test of Type H (p=3) .......... 78
6.3.5 Simulated Power vs. Serial Correlation for the Test of Type H (p=5) .......... 79
6.3.6 Simulated Power vs. Serial Correlation for the Test of Type H (p=10) .......... 80
6.4.1 Simulated Type I Error Rates for the Test of Serial Correlation (p=3) .......... 83
6.4.2 Simulated Type I Error Rates for the Test of Serial Correlation (p=5) .......... 84
6.4.3 Simulated Type I Error Rates for the Test of Serial Correlation (p=10) .......... 85
6.4.4 Simulated Power vs. Compound Symmetry for the Test of Serial Correlation (σ² = 1) .......... 87
6.4.5 Simulated Power vs. Type H for the Test of Serial Correlation (p=3) .......... 89
6.4.6 Simulated Power vs. Type H for the Test of Serial Correlation (p=5) .......... 90
6.4.7 Simulated Power vs. Type H for the Test of Serial Correlation (p=10) .......... 91
6.5.1 Simulated Type I Error Rates for the Test of Independence of Sets of Variates .......... 96
6.5.2 Simulated Power vs.
NonIndependence for the Test of Independence of Sets of Variates........................................................................................................99 A.2.1 Combinations of d and D Used in the Simulations...............................................165 A.3.1 Simulated Values of K ........................................................................................172 A.3.2 Simulated Type I Error Rates for the Test of Serial Correlation with Normally Distributed Data and p=3...............................................................................174 A.4.1 Number of Data Sets Simulated for the Test of Independence of Sets of Variates for n=5 and (5,5) ............................................................................................180 viii LIST OF FIGURES Figure Page 2.4.1 General Plot of f (Hˆ ) ............................................................................................18 5.1.1 Distribution of the Test Statistic for Compound Symmetry ..................................38 5.3.1 Distribution of the Test Statistic for Serial Correlation.........................................44 A.2.1 Regions of Possible Combinations of d and D .....................................................164 A.3.1 Bias of the MLEs of H and 2 .............................................................................167 A.3.2 Bias of 2 * ˆ and 2 * ˆ 1 n n .......................................................................................169 A.3.3 Scatterplots of K and nK ...................................................................................173 1 CHAPTER 1 INTRODUCTION Many statistical procedures, including repeated measures analysis, timeseries, structural equation modeling, and factor analysis, require an assessment of the structure of the covariance matrix of the measurements. 
For example, consider a repeated measures experiment in which researchers are interested in the effect of various teaching strategies on reading. Throughout the course of the experiment, reading tests are given to children at various time periods and the multiple test scores are recorded for each student. Repeated measurements taken on subjects tend to be correlated; consequently, the assumption of independent observations required by a univariate analysis of variance (ANOVA) is violated. However, Huynh and Feldt (1970) showed that if the structure of the covariance matrix is of the same form as a type H matrix (described in Section 2.3), a univariate ANOVA can be used. If the covariance matrix is not type H, an alternate analysis must be employed. Therefore, it is necessary to determine the structure of the covariance matrix to know how to proceed with the analysis. The classical parametric method of testing the hypothesis Σ = Σ₀, where Σ₀ is some hypothesized covariance structure, involves the use of a likelihood ratio test statistic that converges in distribution to a chi-squared random variable. This test has many limitations, including the need for very large sample sizes and the requirement of a random sample from a multivariate normal population. It is quite reasonable to think of many situations in which at least one of these conditions is violated. For example, in educational and medical studies, researchers frequently rely on volunteers, violating the assumption of a random sample; in psychological studies, responses are often recorded on Likert scales, violating the assumption of multivariate normality; and in studies in which experimental units are rare or costly, researchers are restricted to very small sample sizes. In situations in which only some or none of these assumptions are met, researchers could benefit from a nonparametric testing procedure.
In particular, permutation tests have no distributional assumptions, do not require random samples, and allow any sample size. The objectives of this research are to develop a permutation testing procedure to test the null hypothesis Σ = Σ₀ and to investigate the empirical type I error rates and power of this test against various alternative structures. In the following chapters, I will present the motivation for developing such a test. Specifically, in Chapter 2, I will describe the parametric procedures for testing the structure of a covariance matrix, including a discussion of their benefits and limitations; in Chapter 3, I will briefly discuss the use of bootstrapping for estimating or testing the covariance structure; in Chapter 4, I will review the general history and development of permutation tests, including a description of the differences between permutation tests and bootstrapping; in Chapter 5, I will propose a permutation test for the structure of a covariance matrix and will argue as to why such a test would be appropriate and necessary; in Chapter 6, I will describe the evaluation of the proposed test using simulations; in Chapter 7, I will summarize the overall conclusions; and finally, in Chapter 8, I will list some future research questions.

CHAPTER 2

PARAMETRIC TESTS FOR THE STRUCTURE OF A COVARIANCE MATRIX

The classical approach to testing the structure of a covariance matrix involves the use of a likelihood ratio test statistic. Let xᵢ, i = 1, …, n, be p-component vectors distributed according to N_p(μ, Σ), where Σ is positive definite and n > p. The likelihood ratio criterion for testing H₀: Σ = Σ₀ versus Hₐ: Σ ≠ Σ₀ can be found by computing the ratio of the likelihood maximized under the null hypothesis (i.e., with respect to μ and Σ₀) to the likelihood maximized under the alternative hypothesis (i.e., with respect to μ and an unrestricted positive definite Σ).
The likelihood function under the alternative hypothesis is given by

$$ L = (2\pi)^{-np/2}\,|\Sigma|^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}_i-\boldsymbol{\mu})\right\} $$

and the corresponding log likelihood function is

$$ \log L = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}_i-\boldsymbol{\mu}), \qquad (2.0.1) $$

where log is assumed base e. Since log L is an increasing function of L, the values that maximize (2.0.1) are equivalent to the values that maximize L. To maximize (2.0.1) with respect to μ and Σ, consider the following lemma given by Anderson (2003).

Lemma 2.0.1. Let xᵢ, i = 1, …, n, be p-component vectors, and let x̄ be the corresponding sample mean vector. Then for any vector b,

$$ \sum_{i=1}^{n}(\mathbf{x}_i-\mathbf{b})(\mathbf{x}_i-\mathbf{b})' = \sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})' + n(\bar{\mathbf{x}}-\mathbf{b})(\bar{\mathbf{x}}-\mathbf{b})'. $$

Applying the properties of the trace of a matrix (tr(·)) and Lemma 2.0.1 to just the last term of (2.0.1) gives

$$ \sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}_i-\boldsymbol{\mu}) = \operatorname{tr}\left[\Sigma^{-1}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'\right] + n(\bar{\mathbf{x}}-\boldsymbol{\mu})'\Sigma^{-1}(\bar{\mathbf{x}}-\boldsymbol{\mu}). $$

Therefore, (2.0.1) can be written as

$$ \log L = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\operatorname{tr}\left[\Sigma^{-1}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'\right] - \frac{n}{2}(\bar{\mathbf{x}}-\boldsymbol{\mu})'\Sigma^{-1}(\bar{\mathbf{x}}-\boldsymbol{\mu}). \qquad (2.0.2) $$

To maximize log L with respect to μ, it is only necessary to consider the last term of (2.0.2). Since Σ is positive definite, Σ⁻¹ is also positive definite. Therefore, −(n/2)(x̄−μ)'Σ⁻¹(x̄−μ) ≤ 0 and is maximized at 0 if and only if μ = x̄. Consequently, the maximum likelihood estimate (MLE) of μ is μ̂ = x̄. Substituting x̄ for μ, (2.0.2) simplifies to

$$ \log L = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\operatorname{tr}\left[\Sigma^{-1}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'\right]. \qquad (2.0.3) $$

To find the MLE of Σ, consider another lemma given by Anderson (2003).

Lemma 2.0.2. If D is positive definite of order p, the maximum of f(G) = −n log|G| − tr(G⁻¹D) with respect to positive definite matrices G exists and occurs at G = (1/n)D.

To maximize (2.0.3) with respect to Σ it is only necessary to consider the second and third terms.
Applying Lemma 2.0.2 to these terms gives the MLE of Σ,

$$ \hat{\Sigma} = \frac{1}{n}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'. $$

These MLEs for μ and Σ, along with the MLEs found under various null hypotheses, can be used to compute likelihood ratio statistics for parametrically testing the structure of a covariance matrix. Specific test statistics for various covariance structures are outlined in the sections to follow.

2.1 TESTS OF SPHERICITY

Consider first the test of sphericity proposed by Mauchly (1940). A p-variate population is called spherical if the variances of the variables are all equal and the pairwise correlations among the variables are all zero. Specifically, this is a test of the null hypothesis Σ_S = σ²I_p, where I_p is the p × p identity matrix and σ² is the hypothesized common variance among the variables. This hypothesis applies to many univariate procedures, such as ANOVA, in which it is assumed that a set of random variables are independent and have a common variance. To test this assumption, the likelihood ratio criterion given by

$$ \lambda_S = \frac{\max_{\boldsymbol{\mu},\sigma^2} L(\boldsymbol{\mu},\,\sigma^2\mathbf{I}_p)}{\max_{\boldsymbol{\mu},\Sigma} L(\boldsymbol{\mu},\,\Sigma)} $$

can be computed. As shown previously, the MLE of μ does not depend on the specific form of Σ; therefore, the MLEs of μ and Σ, in both the numerator and denominator of λ_S, are given by μ̂ = x̄ and Σ̂ = (1/n)Σᵢ(xᵢ−x̄)(xᵢ−x̄)'. To find the MLE of σ², consider the following. Under the null hypothesis, the log likelihood function is

$$ \log L_S = -\frac{np}{2}\log(2\pi) - \frac{np}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'(\mathbf{x}_i-\boldsymbol{\mu}) \qquad (2.1.1) $$

and the partial derivative of (2.1.1) with respect to σ² is

$$ \frac{\partial \log L_S}{\partial \sigma^2} = -\frac{np}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'(\mathbf{x}_i-\boldsymbol{\mu}). \qquad (2.1.2) $$

Substituting x̄ for μ and setting (2.1.2) equal to 0 gives the MLE of σ²,

$$ \hat{\sigma}^2 = \frac{1}{np}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'(\mathbf{x}_i-\bar{\mathbf{x}}). $$
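The MLEs above can be checked numerically. The following sketch is illustrative only: it uses simulated normal data and NumPy, and the variable names are mine, not the dissertation's.

```python
import numpy as np

# Simulated data: n observations on a p-component vector (illustrative only)
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))

mu_hat = X.mean(axis=0)                 # MLE of mu is the sample mean vector
centered = X - mu_hat
Sigma_hat = centered.T @ centered / n   # MLE of Sigma (note divisor n, not n - 1)

# Under the sphericity null Sigma = sigma^2 I_p, the MLE of sigma^2 is
# (1/np) * sum_i (x_i - xbar)'(x_i - xbar), which equals tr(Sigma_hat) / p
sigma2_hat = np.trace(Sigma_hat) / p
```

Note the divisor n in Σ̂: the MLE is biased, differing from the usual unbiased sample covariance matrix by the factor (n−1)/n.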
Then, the likelihood ratio criterion for testing Σ_S = σ²I_p becomes

$$ \lambda_S = \frac{(2\pi)^{-np/2}\,|\hat{\sigma}^2\mathbf{I}_p|^{-n/2}\exp\left\{-\frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'(\mathbf{x}_i-\bar{\mathbf{x}})\right\}}{(2\pi)^{-np/2}\,|\hat{\Sigma}|^{-n/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[\hat{\Sigma}^{-1}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'\right]\right\}} = \frac{|\hat{\Sigma}|^{n/2}}{(\hat{\sigma}^2)^{np/2}} $$

since

$$ \operatorname{tr}\left[\hat{\Sigma}^{-1}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})'\right] = \operatorname{tr}\left(\hat{\Sigma}^{-1}\,n\hat{\Sigma}\right) = np = \frac{1}{\hat{\sigma}^2}\sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'(\mathbf{x}_i-\bar{\mathbf{x}}). \qquad (2.1.3) $$

λ_S^{2/n} is often called the W statistic in the literature and is usually expressed as

$$ W = \lambda_S^{2/n} = \frac{|\hat{\Sigma}|}{(\hat{\sigma}^2)^p} = \frac{|\hat{\Sigma}|}{\left(\frac{1}{p}\operatorname{tr}\hat{\Sigma}\right)^p} = \frac{|\hat{\Sigma}|}{\left(\frac{1}{p}\sum_{j=1}^{p}\hat{\sigma}_{jj}\right)^p}, \qquad (2.1.4) $$

where σ̂_jj is the jth diagonal element of Σ̂, which corresponds to the variance of the jth variable. To use W for hypothesis testing, it is necessary to know its distribution. Mauchly (1940) gave the exact distribution of W for p = 2 and Consul (1967a) for p = 3, 4, and 6. Nagarsenker and Pillai (1973a, 1973b) derived the exact distribution of W in series form and have published tables of 1% and 5% critical values for various combinations of p and n. However, due to the complexity of the exact distribution, the asymptotic distribution of W is most commonly used in practice. Similar to other likelihood ratio criteria, −n log W is asymptotically distributed as a chi-squared random variable with

$$ \frac{1}{2}p(p+1) - 1 \qquad (2.1.5) $$

degrees of freedom. This approximation works well for large sample sizes, but performs poorly for small sample sizes. Therefore, Anderson (2003), using a method derived by Box (1949), found a correction factor such that −(n−1)C log W is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

$$ C = 1 - \frac{2p^2 + p + 2}{6p(n-1)}. \qquad (2.1.6) $$

Many authors have found, through Monte Carlo simulation, that Mauchly's (1940) test of sphericity has poor power and is not robust to non-normality. Box (1954) developed a measure of the degree to which a covariance matrix is spherical.
He called this measure ε, given by

$$ \varepsilon = \frac{\left(\sum_{j=1}^{p}\lambda_j\right)^2}{p\sum_{j=1}^{p}\lambda_j^2}, $$

where λ_j, j = 1, …, p, are the eigenvalues of Σ. If Σ is spherical, all of the eigenvalues are equal and ε = 1. The further Σ departs from sphericity, the smaller the value of ε becomes, until ε reaches its minimum at 1/(p−1). For p = 4, Boik (1975) found that ε must be as low as 0.644 for n = 18 and 0.828 for n = 36 before the power of Mauchly's test of sphericity is greater than 0.70. Cornell, et al. (1992) found similar results. For p = 3, ε was 0.51 for n = 10 and 0.77 for n = 30 before the power exceeded 0.70; and for p = 5, ε was 0.43 for n = 10 and 0.5 for n = 30 before the power exceeded 0.70. Therefore, it appears that this test does not have the ability to detect small departures from sphericity, to which the univariate ANOVA F-tests are susceptible (Boik, 1981; Box, 1954; Geisser & Greenhouse, 1958). Other studies have explored the effects of non-normality on Mauchly's (1940) test of sphericity. Huynh & Mandeville (1979) simulated data from three different light-tailed distributions (the uniform distribution on (0,1), the convolution of two uniforms forming a triangular distribution, and the convolution of three uniforms forming a trapezoidal distribution) and five different heavy-tailed distributions (the distribution of the product of a uniform random variable and a standard normal random variable, the double exponential distribution, and three mixtures of two standard normal distributions). They found that Mauchly's (1940) test of sphericity is conservative in terms of the type I error rate for light-tailed distributions; however, the type I error rates are much larger than the respective nominal rates for heavy-tailed distributions. Also, as the sample size increases, the test becomes more conservative for light-tailed distributions and less conservative for heavy-tailed distributions. Another study by Keselman, et al.
(1980) presented simulated data from a chi-squared distribution with 3 degrees of freedom for which the type I error rate was 0.203, well above the nominal rate of 0.05. One alternative parametric test of sphericity is the locally best invariant test developed by John (1971, 1972) and Sugiura (1972). The test statistic is

$$ V = \frac{\operatorname{tr}(\hat{\Sigma}^2)}{\left(\operatorname{tr}\hat{\Sigma}\right)^2} \qquad (2.1.7) $$

and Sugiura showed that

$$ \frac{np^2}{2}\left(V - \frac{1}{p}\right) \qquad (2.1.8) $$

is asymptotically distributed as a chi-squared random variable with ½p(p+1) − 1 degrees of freedom. This test has slightly greater power than Mauchly's (1940) test of sphericity, with the difference increasing as p approaches n. However, this test still suffers from a lack of power to detect small departures from sphericity (Carter & Srivastava, 1983; Cornell et al., 1992). In addition to Mauchly's (1940) test of sphericity and the locally best invariant test, several other tests of sphericity exist. One such statistic, developed by Krishnaiah and Waikar (1972), consists of the ratio of the largest to smallest eigenvalues of Σ̂, and another family of test statistics is based on Roy's union-intersection principle (Khatri, 1978; Srivastava & Khatri, 1979; Venables, 1976). In each case, however, the power is smaller than for Mauchly's test of sphericity (Cornell et al., 1992). Therefore, the details of these tests will not be discussed here.

2.2 TEST OF COMPOUND SYMMETRY

Independence between variables is actually too restrictive an assumption for a valid univariate ANOVA. It has been shown that the compound symmetry covariance structure is sufficient (Box, 1950). This structure arises when the variances of the variables are all equal and the covariances (or pairwise correlations) of the variables are all equal. Wilks (1946) was the first to develop a test for compound symmetry structure.
In matrix notation, this is a test of the null hypothesis

$$ \Sigma_{CS} = \sigma^2\left[(1-\rho)\mathbf{I}_p + \rho\,\mathbf{1}_p\mathbf{1}_p'\right], $$

where σ² is the common variance, ρ is the common pairwise correlation, I_p is the p × p identity matrix, and 1_p is a p × 1 vector of ones. The derivation of this test is an extension of Mauchly's (1940) test of sphericity. The likelihood ratio criterion is given by

$$ \lambda_{CS} = \frac{\max_{\boldsymbol{\mu},\sigma^2,\rho} L(\boldsymbol{\mu},\,\Sigma_{CS})}{\max_{\boldsymbol{\mu},\Sigma} L(\boldsymbol{\mu},\,\Sigma)} \qquad (2.2.1) $$

and the MLEs of μ and Σ in both the numerator and denominator of λ_CS can be found as shown at the beginning of Chapter 2. That is, μ̂ = x̄ and Σ̂ = (1/n)Σᵢ(xᵢ−x̄)(xᵢ−x̄)'. To find the MLEs of σ² and ρ, it will be necessary to determine the inverse of the covariance matrix under the null hypothesis. Call this matrix Σ_CS⁻¹. Wilks (1946) showed that

$$ \Sigma_{CS}^{-1} = \begin{bmatrix} A & B & \cdots & B \\ B & A & \cdots & B \\ \vdots & \vdots & \ddots & \vdots \\ B & B & \cdots & A \end{bmatrix}, $$

where

$$ A = \frac{1+(p-2)\rho}{\sigma^2(1-\rho)\left[1+(p-1)\rho\right]} \quad\text{and}\quad B = \frac{-\rho}{\sigma^2(1-\rho)\left[1+(p-1)\rho\right]}. $$

He also noted that

$$ \left|\Sigma_{CS}^{-1}\right| = (A-B)^{p-1}\left[A+(p-1)B\right]. $$

Therefore, the log likelihood, under the null hypothesis, becomes

$$ \log L_{CS} = -\frac{np}{2}\log(2\pi) + \frac{n}{2}\log\left\{(A-B)^{p-1}\left[A+(p-1)B\right]\right\} - \frac{1}{2}\left[A\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij}-\mu_j)^2 + B\sum_{i=1}^{n}\sum_{j\neq k}(x_{ij}-\mu_j)(x_{ik}-\mu_k)\right], $$

where x_ij and μ_j are the jth elements of xᵢ and μ, respectively. The MLEs of A and B can be found by substituting x̄ for μ and solving the system of equations given by ∂log L_CS/∂A = 0 and ∂log L_CS/∂B = 0. This results in the following MLEs of A and B,

$$ \hat{A} = \frac{1+(p-2)r}{s^2(1-r)\left[1+(p-1)r\right]} \quad\text{and}\quad \hat{B} = \frac{-r}{s^2(1-r)\left[1+(p-1)r\right]}, $$

where

$$ s_{jk} = \frac{1}{n}\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k), \qquad r = \frac{\sum_{j\neq k}s_{jk}}{(p-1)\sum_{j=1}^{p}s_{jj}}, \qquad s^2 = \frac{1}{p}\sum_{j=1}^{p}s_{jj}. $$

Substituting these MLEs into (2.2.1) and applying a similar argument to that shown in (2.1.3), we obtain

$$ \lambda_{CS}^{2/n} = \frac{|\hat{\Sigma}|}{\left(s^2\right)^p(1-r)^{p-1}\left[1+(p-1)r\right]}. $$

Wilks (1946) determined the exact distribution of λ_CS^{2/n} for p = 2 and 3. However, the derivation of the exact distribution for larger values of p is too complex to be of practical use. Therefore, the asymptotic distribution is more commonly used. Specifically, −n log λ_CS^{2/n} is asymptotically distributed as a chi-squared random variable with ½p(p+1) − 2 degrees of freedom. As with other likelihood ratio tests, this is a good approximation for large sample sizes, but is very poor for small sample sizes. Therefore, the corrected likelihood ratio test derived by Box (1950) is preferred. Box found that −(n−1)C log λ_CS^{2/n} is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

$$ C = 1 - \frac{p(p+1)^2(2p-3)}{6(n-1)(p-1)(p^2+p-4)}. $$

2.3 TEST OF TYPE H STRUCTURE

Huynh and Feldt (1970) and Rouanet and Lepine (1970) showed independently that the conditions required for a valid univariate ANOVA are actually less stringent than the sphericity or compound symmetry conditions. Specifically, they found that if the covariance matrix is of the form

$$ \Sigma_{TH} = \left[\sigma_{ij}\right]_{p\times p}, \qquad (2.3.1) $$

where σ_ij = ½(σ_ii + σ_jj) − λ for i ≠ j and some λ > 0, then the mean square ratios in the univariate ANOVA have exact F-distributions. Huynh and Feldt called this covariance form a type H matrix. (Notice that when the variances are equal in a type H matrix, the covariance matrix has compound symmetry.) More recently, type H structure has come to be known as spherical. However, since both forms will be discussed separately in this paper, the covariance structure of Section 2.1 will be referred to as spherical and that of this section will be referred to as type H. Conveniently, Mauchly's (1940) test of sphericity described in Section 2.1 can be used to test whether a covariance matrix has the type H structure (Kuehl, 2000).
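As a quick numerical illustration of the type H condition, the sketch below (assumptions mine: a hypothetical set of variances, λ = 0.5, and Helmert-style contrasts) builds a matrix with σ_ij = ½(σ_ii + σ_jj) − λ for i ≠ j and verifies that pre- and post-multiplying by a normalized orthogonal contrast matrix reduces it to λ times the identity:

```python
import numpy as np

def helmert_contrasts(p):
    """(p-1) x p matrix whose rows are normalized orthogonal contrasts:
    each row sums to zero and C @ C.T = I_{p-1}."""
    C = np.zeros((p - 1, p))
    for i in range(1, p):
        C[i - 1, :i] = 1.0
        C[i - 1, i] = -float(i)
        C[i - 1] /= np.linalg.norm(C[i - 1])
    return C

# Hypothetical type H matrix: sigma_ij = (sigma_ii + sigma_jj)/2 - lam, i != j
lam = 0.5
variances = np.array([1.0, 2.0, 3.0, 4.0])
p = len(variances)
Sigma = (variances[:, None] + variances[None, :]) / 2 - lam
np.fill_diagonal(Sigma, variances)

C = helmert_contrasts(p)
reduced = C @ Sigma @ C.T   # equals lam * I_{p-1} exactly when Sigma is type H
```

Any normalized orthogonal contrast matrix would do here; the Helmert construction is just one convenient choice.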
Let C be a (p−1) × p matrix whose rows are normalized orthogonal contrasts on the p repeated measures. If Σ is of type H, then Σ can be expressed as

$$ \Sigma = \mathbf{A} + \mathbf{A}' + \lambda\mathbf{I}_p, $$

where the elements in the ith row of A are all equal to a_i = ½(σ_ii − λ). Then,

$$ \mathbf{C}\Sigma\mathbf{C}' = \mathbf{C}\mathbf{A}\mathbf{C}' + \mathbf{C}\mathbf{A}'\mathbf{C}' + \lambda\mathbf{C}\mathbf{C}'. $$

Since each row of A consists of equivalent elements and C is orthogonal, it can be shown that CAC' = CA'C' = 0 and CC' = I_{p−1}. Therefore, CΣC' = λI_{p−1}, and Mauchly's test of sphericity can be used to test H₀: CΣC' = λI_{p−1} versus Hₐ: CΣC' ≠ λI_{p−1}. Substituting p − 1 for p and CΣ̂C' for Σ̂ in (2.1.4), (2.1.5), and (2.1.6), the test statistic is

$$ W = \frac{\left|\mathbf{C}\hat{\Sigma}\mathbf{C}'\right|}{\left[\frac{1}{p-1}\operatorname{tr}\left(\mathbf{C}\hat{\Sigma}\mathbf{C}'\right)\right]^{p-1}} $$

and −n log W is asymptotically distributed as a chi-squared random variable with ½p(p−1) − 1 degrees of freedom, or, after applying a correction factor, −(n−1)C log W is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

$$ C = 1 - \frac{2p^2 - 3p + 3}{6(p-1)(n-1)}. \qquad (2.3.2) $$

Just as for the test of sphericity, there are alternative tests for type H covariance structure, including a locally best invariant test. Substituting p − 1 for p and CΣ̂C' for Σ̂ in (2.1.7) and (2.1.8) yields the corresponding test statistic and asymptotic distribution. Krishnaiah and Waikar's (1972) test and the union-intersection tests described in Section 2.1 can also be adapted to test for type H structure. All of these tests, however, suffer from the same limitations as the tests of sphericity. They have poor power, especially to detect small departures from type H structure, and are not robust to non-normality.

2.4 TEST OF SERIAL CORRELATION

For designs, such as repeated measures, in which one of the factors is time, observations closer together temporally tend to be more highly correlated than those farther apart.
This covariance pattern is known as serial correlation, simplex, or autoregressive of order one and has the form

$$ \Sigma_{SC} = \frac{\sigma^2}{1-\rho^2}\begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{p-1} \\ \rho & 1 & \rho & \cdots & \rho^{p-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{p-1} & \rho^{p-2} & \rho^{p-3} & \cdots & 1 \end{bmatrix}, \qquad (2.4.1) $$

where σ²/(1−ρ²) is the common variance of the p observations and ρ is the correlation between successive observations in time. Hearne et al. (1983) developed a likelihood ratio test for the null hypothesis Σ = Σ_SC. The derivation of this test is as follows. The likelihood ratio criterion is given by

$$ \lambda_{SC} = \frac{\max_{\boldsymbol{\mu},\sigma^2,\rho} L(\boldsymbol{\mu},\,\Sigma_{SC})}{\max_{\boldsymbol{\mu},\Sigma} L(\boldsymbol{\mu},\,\Sigma)}, \qquad (2.4.2) $$

where the MLEs of μ and Σ in both the numerator and denominator are μ̂ = x̄ and Σ̂ = (1/n)Σᵢ(xᵢ−x̄)(xᵢ−x̄)' as shown at the beginning of Chapter 2. Before deriving the MLEs of σ² and ρ, note that it can be shown that

$$ |\Sigma_{SC}| = \frac{\sigma^{2p}}{1-\rho^2} \quad\text{and}\quad \Sigma_{SC}^{-1} = \frac{1}{\sigma^2}\left(-\rho\mathbf{C}_1 + \rho^2\mathbf{C}_2 + \mathbf{I}_p\right), $$

where p, σ², and ρ are as defined previously, I_p is the p × p identity matrix, and C₁ and C₂ are the p × p matrices

$$ \mathbf{C}_1 = \begin{bmatrix} 0 & 1 & & & \\ 1 & 0 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & 0 & 1 \\ & & & 1 & 0 \end{bmatrix} \quad\text{and}\quad \mathbf{C}_2 = \operatorname{diag}(0,1,\ldots,1,0), \qquad (2.4.3) $$

that is, C₁ has ones on the first superdiagonal and subdiagonal and zeros elsewhere, and C₂ is diagonal with zeros in the first and last positions and ones in between. Using this notation and substituting x̄ for μ, the log likelihood under the null hypothesis can be expressed as

$$ \log L_{SC} = -\frac{np}{2}\log(2\pi) - \frac{np}{2}\log\sigma^2 + \frac{n}{2}\log\left(1-\rho^2\right) - \frac{1}{2\sigma^2}\left(S_3 - \rho S_1 + \rho^2 S_2\right), $$

where

$$ S_1 = \sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'\mathbf{C}_1(\mathbf{x}_i-\bar{\mathbf{x}}), \qquad S_2 = \sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'\mathbf{C}_2(\mathbf{x}_i-\bar{\mathbf{x}}), \qquad S_3 = \sum_{i=1}^{n}(\mathbf{x}_i-\bar{\mathbf{x}})'(\mathbf{x}_i-\bar{\mathbf{x}}). $$

Taking the partial derivatives of log L_SC with respect to σ² and ρ yields

$$ \frac{\partial \log L_{SC}}{\partial \sigma^2} = -\frac{np}{2\sigma^2} + \frac{1}{2\sigma^4}\left(S_3 - \rho S_1 + \rho^2 S_2\right) $$

and

$$ \frac{\partial \log L_{SC}}{\partial \rho} = -\frac{n\rho}{1-\rho^2} + \frac{1}{2\sigma^2}\left(S_1 - 2\rho S_2\right). $$

Setting these derivatives equal to zero and solving simultaneously results in

$$ \hat{\sigma}^2 = \frac{1}{np}\left(S_3 - \hat{\rho}S_1 + \hat{\rho}^2 S_2\right) \qquad (2.4.4) $$

and

$$ 2(p-1)S_2\,\hat{\rho}^3 - (p-2)S_1\,\hat{\rho}^2 - 2\left(pS_2 + S_3\right)\hat{\rho} + pS_1 = 0. \qquad (2.4.5) $$

Note that the MLE of σ² is easy to obtain once the MLE of ρ has been determined; however, there are three possible solutions for ρ̂ to equation (2.4.5).
To determine the appropriate solution consider the following. Call the left-hand side of (2.4.5) f(ρ̂). Then,

$$ f(-1) = -2(p-1)S_2 - (p-2)S_1 + 2\left(pS_2+S_3\right) + pS_1 = 2\left(S_1+S_2+S_3\right) > 0 $$

and

$$ f(1) = 2(p-1)S_2 - (p-2)S_1 - 2\left(pS_2+S_3\right) + pS_1 = -2\left(S_2+S_3-S_1\right) < 0, $$

and, consequently, there must be at least one solution in the interval (−1, 1). If there is only one solution in (−1, 1), then that is the only reasonable solution, since the MLE of the correlation between successive observations in time must be in (−1, 1). Now note that f(−∞) = −∞ and f(∞) = ∞. So a general plot of f(ρ̂) would appear as shown in Figure 2.4.1 below, with one solution in each of (−∞, −1), (−1, 1), and (1, ∞). Therefore, there is one and only one solution in (−1, 1), which is the desired MLE of ρ.

Figure 2.4.1. General plot of f(ρ̂), showing one root in each of (−∞, −1), (−1, 1), and (1, ∞).

Finally, after substituting these MLEs into (2.4.2) and applying an argument similar to (2.1.3), the likelihood ratio criterion becomes

$$ \lambda_{SC} = \frac{(2\pi)^{-np/2}\left(\dfrac{\hat{\sigma}^{2p}}{1-\hat{\rho}^2}\right)^{-n/2}\exp\left\{-\dfrac{np}{2}\right\}}{(2\pi)^{-np/2}\,|\hat{\Sigma}|^{-n/2}\exp\left\{-\dfrac{np}{2}\right\}} = \left[\frac{\left(1-\hat{\rho}^2\right)|\hat{\Sigma}|}{\hat{\sigma}^{2p}}\right]^{n/2}, $$

where −n log λ_SC^{2/n} is asymptotically distributed as a chi-squared random variable with ½p(p+1) − 2 degrees of freedom. A correction factor for this likelihood ratio test, similar to those for the tests of sphericity and compound symmetry, is not known to exist, and Hearne and Clark (1983) even go so far as to say that one is not tractable. Therefore, using simulation and simple linear regression, they derived an approximate correction factor Ĉ, a linear function of n and p with fitted coefficients 1.541, 1.017, and 0.414, such that −Ĉn log λ_SC^{2/n} is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above.

2.5 TEST OF INDEPENDENCE OF SETS OF VARIATES

In some situations it may be of interest to determine whether k groups of variables are mutually independent.
Let xᵢ, i = 1, …, n, be partitioned into k subvectors with p₁, p₂, …, p_k (Σ_m p_m = p) components, so that xᵢ = (xᵢ^{(1)′}, xᵢ^{(2)′}, …, xᵢ^{(k)′})′. Also, let μ and Σ be partitioned accordingly,

$$ \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}^{(1)} \\ \boldsymbol{\mu}^{(2)} \\ \vdots \\ \boldsymbol{\mu}^{(k)} \end{bmatrix} \quad\text{and}\quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1k} \\ \Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{k1} & \Sigma_{k2} & \cdots & \Sigma_{kk} \end{bmatrix}, $$

where Σ_ij = Σ_ji′. The null hypothesis of interest is that the subvectors xᵢ^{(1)}, xᵢ^{(2)}, …, xᵢ^{(k)} are mutually independent. This is equivalent to testing Σ = Σ_I, where

$$ \Sigma_I = \begin{bmatrix} \Sigma_{11} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \Sigma_{22} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \Sigma_{kk} \end{bmatrix}. $$

Wilks (1935) is credited with developing a likelihood ratio criterion for testing this hypothesis. Consider the likelihood ratio criterion given by

$$ \lambda_I = \frac{\max_{\boldsymbol{\mu},\Sigma_I} L(\boldsymbol{\mu},\,\Sigma_I)}{\max_{\boldsymbol{\mu},\Sigma} L(\boldsymbol{\mu},\,\Sigma)}. $$

As shown at the beginning of Chapter 2, the MLEs of μ and Σ in the numerator and denominator of λ_I are given by μ̂ = x̄ and Σ̂ = (1/n)Σᵢ(xᵢ−x̄)(xᵢ−x̄)', where x̄ is partitioned as x̄ = (x̄^{(1)′}, x̄^{(2)′}, …, x̄^{(k)′})′. Under the null hypothesis, the likelihood function becomes

$$ L_I = \prod_{m=1}^{k} L_m\left(\boldsymbol{\mu}^{(m)},\,\Sigma_{mm}\right), $$

where

$$ L_m\left(\boldsymbol{\mu}^{(m)},\,\Sigma_{mm}\right) = (2\pi)^{-np_m/2}\,|\Sigma_{mm}|^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\left(\mathbf{x}_i^{(m)}-\boldsymbol{\mu}^{(m)}\right)'\Sigma_{mm}^{-1}\left(\mathbf{x}_i^{(m)}-\boldsymbol{\mu}^{(m)}\right)\right\}. $$

Maximizing L_I is equivalent to maximizing the product of the L_m and, since the likelihood function is strictly nonnegative,

$$ \max_{\boldsymbol{\mu},\Sigma_I}\prod_{m=1}^{k} L_m\left(\boldsymbol{\mu}^{(m)},\,\Sigma_{mm}\right) = \prod_{m=1}^{k}\max_{\boldsymbol{\mu}^{(m)},\Sigma_{mm}} L_m\left(\boldsymbol{\mu}^{(m)},\,\Sigma_{mm}\right). $$

Therefore, each L_m(μ^{(m)}, Σ_mm) can be maximized separately. Thus, the MLEs of μ^{(m)} and Σ_mm can be found as shown at the beginning of Chapter 2. That is, μ̂^{(m)} = x̄^{(m)} and

$$ \hat{\Sigma}_{mm} = \frac{1}{n}\sum_{i=1}^{n}\left(\mathbf{x}_i^{(m)}-\bar{\mathbf{x}}^{(m)}\right)\left(\mathbf{x}_i^{(m)}-\bar{\mathbf{x}}^{(m)}\right)'. $$

By a similar argument to (2.1.3), the likelihood ratio criterion then becomes

$$ \lambda_I = \frac{\prod_{m=1}^{k}(2\pi)^{-np_m/2}\,|\hat{\Sigma}_{mm}|^{-n/2}\exp\{-np_m/2\}}{(2\pi)^{-np/2}\,|\hat{\Sigma}|^{-n/2}\exp\{-np/2\}} = \frac{|\hat{\Sigma}|^{n/2}}{\prod_{m=1}^{k}|\hat{\Sigma}_{mm}|^{n/2}}. $$
This can be further reduced by recognizing that each element of Σ̂ (and consequently each element of Σ̂_mm) can be expressed as s_ij = r_ij √(s_ii s_jj), where s_ij and r_ij are the sample covariance and sample correlation, respectively, of the ith and jth variables. After calculating the determinants and canceling like terms in the numerator and denominator, λ_I can be expressed entirely in terms of the sample correlation matrix, R̂, as

$$ \lambda_I^{2/n} = \frac{|\hat{\Sigma}|}{\prod_{m=1}^{k}|\hat{\Sigma}_{mm}|} = \frac{|\hat{\mathbf{R}}|}{\prod_{m=1}^{k}|\hat{\mathbf{R}}_{mm}|}. $$

Wilks (1935), Wald and Brookner (1941), and Consul (1967b) have determined the exact distribution of λ_I for various values of k and p_m (m = 1, …, k). However, the asymptotic distribution determined by Box (1949) is much more practical and is applicable to any combination of k and p_m. Box (1949) found that −n log λ_I^{2/n} is asymptotically distributed as a chi-squared random variable with

$$ \frac{1}{2}p(p+1) - \frac{1}{2}\sum_{m=1}^{k}p_m(p_m+1) = \frac{1}{2}\left(p^2 - \sum_{m=1}^{k}p_m^2\right) $$

degrees of freedom. As with other likelihood ratio tests, this approximation is very poor for small sample sizes. Consequently, Box (1949) derived a correction factor such that −(n−1)C log λ_I^{2/n} is asymptotically distributed as a chi-squared random variable with the same degrees of freedom as above, where

$$ C = 1 - \frac{1}{n-1}\left[\frac{3}{2} + \frac{p^3 - \sum_{m=1}^{k}p_m^3}{3\left(p^2 - \sum_{m=1}^{k}p_m^2\right)}\right]. $$

2.6 FACTOR ANALYSIS / STRUCTURAL EQUATION MODELING

Factor analysis is a multivariate procedure in which one tries to account for the covariances among the observed variables by a smaller number of underlying hypothetical variables, called factors. Let xᵢ, i = 1, …, n, be p-component vectors of observations from a population with mean μ and covariance matrix Σ. The factor analysis model is given by

$$ \mathbf{x}_i = \boldsymbol{\mu} + \Lambda\mathbf{f} + \mathbf{e}, $$

where f is an m × 1 (m < p) vector of underlying factors, Λ is a p × m matrix of factor loadings, and e is a p × 1 vector of residuals.
It is assumed that the underlying factors are independently and identically distributed with mean 0 and covariance matrix I, that the residuals are independently distributed with mean 0 and covariance matrix Ψ, and that f and e are independent. Therefore,

$$ \Sigma = \operatorname{Cov}(\mathbf{x}_i) = \operatorname{Cov}(\Lambda\mathbf{f}) + \operatorname{Cov}(\mathbf{e}) = \Lambda\operatorname{Cov}(\mathbf{f})\Lambda' + \Psi = \Lambda\mathbf{I}\Lambda' + \Psi = \Lambda\Lambda' + \Psi. $$

In most applications, factor analysis is performed on the centered data, xᵢ − μ, since Cov(xᵢ − μ) = Cov(xᵢ). Therefore, for the remainder of this section, the p-component vector xᵢ will represent the centered data. In factor analysis, the researcher hypothesizes an adequate number of underlying factors, then chooses one of many methods to estimate Λ, based on the chosen number of factors. One such method of estimation is the maximum likelihood method. The advantage of using this procedure is, assuming the data come from a multivariate normal population, that it allows the computation of a likelihood ratio test statistic that can be used to test the goodness of fit of the chosen number of factors. This is a test of H₀: there are m underlying factors, or, in matrix form, H₀: Σ_F = ΛΛ' + Ψ. The details of the derivation of this likelihood ratio test statistic can be found in Lawley & Maxwell (1971). Briefly, consider the likelihood ratio criterion given by

$$ \lambda_F = \frac{\max_{\boldsymbol{\mu},\Lambda,\Psi} L\left(\boldsymbol{\mu},\,\Lambda\Lambda'+\Psi\right)}{\max_{\boldsymbol{\mu},\Sigma} L(\boldsymbol{\mu},\,\Sigma)} = \frac{\left|\hat{\Lambda}\hat{\Lambda}'+\hat{\Psi}\right|^{-n/2}\exp\left\{-\frac{n}{2}\operatorname{tr}\left[\left(\hat{\Lambda}\hat{\Lambda}'+\hat{\Psi}\right)^{-1}\hat{\Sigma}\right]\right\}}{|\hat{\Sigma}|^{-n/2}\exp\left\{-\frac{np}{2}\right\}}, $$

where Λ̂, Ψ̂, and Σ̂ are the MLEs of Λ, Ψ, and Σ, respectively. The MLE of Σ can be found as shown at the beginning of Chapter 2; namely, Σ̂ = (1/n)Σᵢ(xᵢ−x̄)(xᵢ−x̄)'. Typically, the MLEs of Λ and Ψ are derived by maximizing the log likelihood function with respect to Λ and Ψ.
However, Lawley and Maxwell (1971) state that it is more convenient to minimize

$$ -2\log\lambda_F^* = n\left[\log\left|\Lambda\Lambda'+\Psi\right| + \operatorname{tr}\left(\left(\Lambda\Lambda'+\Psi\right)^{-1}\mathbf{S}\right) - \log|\mathbf{S}| - p\right], $$

where the * indicates that the unbiased sample covariance matrix, S, is used in place of Σ̂ in λ_F. Minimizing −2 log λ_F* instead of maximizing

$$ \log L_F = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log\left|\Lambda\Lambda'+\Psi\right| - \frac{n}{2}\operatorname{tr}\left[\left(\Lambda\Lambda'+\Psi\right)^{-1}\hat{\Sigma}\right] $$

is acceptable since they differ by a constant, −½np log(2π), and a function of the data alone involving log|S| and p, and the remaining terms of log L_F are just −½ times the corresponding terms of −2 log λ_F*. The only other difference between log L_F and −2 log λ_F* is in the use of S in −2 log λ_F* rather than Σ̂. Since S = [n/(n−1)]Σ̂, these matrices will be essentially equivalent for large n. To find Λ̂ and Ψ̂, let λ_ij and σ_ij be the elements in the ith row and jth column of Λ and Σ, respectively. Also, let ψ_i be the ith diagonal element of Ψ. (The nondiagonal elements of Ψ are all zero, since the residuals are independently distributed.) Then,

$$ \sigma_{ii} = \sum_{k=1}^{m}\lambda_{ik}^2 + \psi_i \quad\text{and}\quad \sigma_{ij} = \sum_{k=1}^{m}\lambda_{ik}\lambda_{jk}, \quad i\neq j, $$

and the MLEs of Λ and Ψ can be found by setting the following expressions equal to zero and solving simultaneously for λ_il and ψ_i:

$$ \frac{\partial \log L_F}{\partial \lambda_{il}} = \sum_{j=1}^{p}\frac{\partial \log L_F}{\partial \sigma_{ij}}\frac{\partial \sigma_{ij}}{\partial \lambda_{il}} \quad\text{and}\quad \frac{\partial \log L_F}{\partial \psi_i} = \frac{\partial \log L_F}{\partial \sigma_{ii}}\frac{\partial \sigma_{ii}}{\partial \psi_i}. $$

In most cases, these equations cannot be solved directly. Therefore, an iterative numerical procedure, such as Newton-Raphson, scoring, or steepest descent, must be used to find the MLEs. To perform this likelihood ratio test, we must know how the test statistic is distributed. Lawley and Maxwell (1971) found that −2 log λ_F* is asymptotically distributed as a chi-squared random variable with

$$ \frac{1}{2}\left[(p-m)^2 - (p+m)\right] $$

degrees of freedom. For many years, this likelihood ratio test was the primary criterion used to determine the goodness of fit of the hypothesized number of factors. However, in the early 1980s researchers discovered through Monte Carlo simulation that in large
However, in the early 1980s researchers discovered through Monte Carlo simulation that in large samples, good-fitting models were rejected too often, and in small samples the type I error rates were too large and power was very poor (Gerbing & Anderson, 1993).

CHAPTER 3 BOOTSTRAPPING

At least one nonparametric procedure has been applied to the problem of testing the structure of a covariance matrix. Specifically, bootstrapping has been used to estimate the distribution of the likelihood ratio test statistic used in structural equation modeling. Consider testing the hypothesis H0: Σ = Σ(θ), where Σ(θ) is the hypothesized covariance structure expressed as a function of the vector of parameters, θ. To perform a bootstrap test, the resampling must be done from a bootstrap population with covariance structure specified by the null hypothesis. Therefore, the observed data must be transformed as follows before resampling (Bollen & Stine, 1993). Let X be the n×p matrix of centered data, let S = X'X/(n − 1) denote the sample covariance matrix of X, and let Σ̂ = Σ(θ̂) be the estimated hypothesized covariance structure. Also, let M^(1/2) denote a Cholesky square root of a positive definite matrix M, taken here so that M = M^(1/2)' M^(1/2). Then, the sample covariance matrix of Y = X S^(−1/2) Σ̂^(1/2) is given by

  Y'Y/(n − 1) = Σ̂^(1/2)' S^(−1/2)' (X'X/(n − 1)) S^(−1/2) Σ̂^(1/2)
              = Σ̂^(1/2)' S^(−1/2)' S S^(−1/2) Σ̂^(1/2)
              = Σ̂^(1/2)' Σ̂^(1/2)
              = Σ̂.

Therefore, the transformed data matrix, Y, has sample covariance matrix Σ̂. To find the bootstrap distribution of the likelihood ratio test statistic, resample the rows of the transformed data matrix, Y, with replacement, compute the sample covariance matrix S* of the resampled data, find the MLEs of the parameters based on the resampled data, and compute the likelihood ratio test statistic for the resampled data.
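The Bollen–Stine rotation above can be sketched as follows; the function name is ours, and the Cholesky-factor convention is chosen so that the algebra above goes through exactly:

```python
import numpy as np

def bollen_stine_transform(X, Sigma_hat):
    """Rotate centered data so its sample covariance equals Sigma_hat."""
    n = X.shape[0]
    X = X - X.mean(axis=0)                    # ensure the columns are centered
    S = X.T @ X / (n - 1)                     # sample covariance matrix
    L_S = np.linalg.cholesky(S)               # lower triangular, S = L_S L_S'
    L_T = np.linalg.cholesky(Sigma_hat)       # Sigma_hat = L_T L_T'
    # Y = X (L_S')^{-1} L_T'  gives  Y'Y/(n-1) = L_T L_S^{-1} S (L_S')^{-1} L_T' = Sigma_hat
    return X @ np.linalg.inv(L_S).T @ L_T.T

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))               # toy data, n = 8, p = 3
Sigma_hat = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])       # e.g. a fitted compound-symmetry structure
Y = bollen_stine_transform(X, Sigma_hat)
```

Bootstrap resampling of the rows of Y then proceeds as described in the text.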
When the null hypothesis is true, the bootstrap distribution is approximately the same as the distribution of the likelihood ratio test statistic, and a p-value can be computed by dividing the number of bootstrap samples resulting in a test statistic value at least as large as the one resulting from the observed data by the total number of bootstrap samples.

CHAPTER 4 PERMUTATION TESTS

Permutation tests have long been studied as alternatives to parametric procedures when the assumptions of such procedures are violated. The idea behind permutation tests is to generate the sampling distribution of a test statistic from the values obtained by calculating the test statistic for all possible permutations of the data under the null hypothesis. Consider a simple example given by Edgington (1995). Suppose five subjects are randomly assigned to two treatments, A and B, with the following results.

  A: 18, 30, 54    B: 6, 12

We wish to test the null hypothesis of no difference in the treatment means. Assuming this null hypothesis is true, we would expect each subject to have the same result regardless of their treatment assignment. Therefore, under the null hypothesis, there are 5C3 = 10 possible arrangements of subjects to treatments, where nCk is the number of combinations of k items chosen from a total of n items. These arrangements, as well as the corresponding test statistic values, are displayed in Table 4.0.1. In this example, the test statistic is the absolute value of the pooled t-test statistic (displayed as t in Table 4.0.1).

Table 4.0.1. All Possible Permutations of the Observed Data

  Trt A         Trt B      t
  18, 30, 54    6, 12      1.81
  6, 12, 18     30, 54     3.00
  6, 12, 30     18, 54     1.22
  6, 12, 54     18, 30     0.00
  6, 18, 30     12, 54     0.83
  6, 18, 54     12, 30     0.25
  6, 30, 54     12, 18     0.83
  12, 18, 30    6, 54      0.52
  12, 18, 54    6, 30      0.52
  12, 30, 54    6, 18      1.22

The distribution of the test statistic values calculated from each possible permutation of the data is the sampling distribution of the test statistic.
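Edgington's example can be reproduced in a few lines; this sketch (variable names ours) enumerates all 5C3 = 10 assignments and computes the two-sided p-value:

```python
from itertools import combinations
from math import sqrt

data = [18, 30, 54, 6, 12]      # all five responses, pooled
n_a = 3                         # three subjects were assigned to treatment A

def abs_pooled_t(a, b):
    """Absolute value of the pooled two-sample t statistic."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    sp2 = ss / (len(a) + len(b) - 2)                         # pooled variance
    return abs(ma - mb) / sqrt(sp2 * (1 / len(a) + 1 / len(b)))

t_obs = abs_pooled_t([18, 30, 54], [6, 12])                  # observed assignment
stats = []
for idx in combinations(range(len(data)), n_a):              # 10 possible assignments
    a = [data[i] for i in idx]
    b = [data[i] for i in range(len(data)) if i not in idx]
    stats.append(abs_pooled_t(a, b))

p_two_sided = sum(t >= t_obs for t in stats) / len(stats)    # 2/10 = 0.20
```

Only the observed assignment (|t| = 1.81) and one other (|t| = 3.00) reach the observed value, giving the p-value 0.20 derived in the text.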
Consequently, a p-value can be found by dividing the number of test statistic values greater than or equal to the value obtained from the observed data by the total number of permutations of the data under the null hypothesis. In this example, the p-value is 2/10 = 0.20, since there are two values of t in Table 4.0.1 that are greater than or equal to 1.81, the value obtained from the observed data. Notice that the 1.81 corresponds to the most significant data configuration in which the value for treatment A is greater than that of treatment B, and the 3.00 corresponds to the most significant data configuration in which the value for treatment A is less than that of treatment B. Therefore, a one-tailed p-value is 1/10 = 0.10. Fisher (1971) is often credited with developing the first permutation tests; however, Edgington (1995) claims that permutation tests based on the ranks of data have been used since the 1880s. Even if Fisher was not the first to develop permutation tests in general, he was the first to suggest permuting the raw data as opposed to the ranks of the data. He is also responsible for generating considerable interest in the merits of permutation tests, namely the lack of distributional assumptions required by parametric tests. Fisher (1971) writes, “The utility of such nonparametric (permutation) tests consists in their being able to supply confirmation whenever, rightly or, more often, wrongly, it is suspected that the simpler (parametric) tests have been appreciably injured by departures from normality” (p. 48). Fisher (1936) even goes on to write in a later article that conclusions from parametric tests “have no justification beyond the fact that they agree with those which could have been arrived at by this elementary method (permutation tests)” (p. 59).
Although Fisher (1971) showed that permutation tests eliminate the need for normality, it was another statistician, Pitman (1937a, 1937b, 1938), who recognized that permutation tests also eliminate the need for random samples. In these three papers, Pitman developed much of the theory of permutation tests and showed that random sampling is not necessary for a valid test. Rather, random assignment of experimental units to treatments is sufficient. Given the benefits of permutation tests, one would assume that the majority of analyses would be performed utilizing these procedures. However, determining the test statistic value for all possible permutations of the observed data was virtually impossible (except for the smallest sample sizes) due to the lack of computer technology in Fisher's and Pitman's day. Consequently, permutation tests based on ranks continued to be developed, since these tests do not require the generation of a new sampling distribution for each new set of observed data; tables exhibiting critical values were readily available for such tests. It took significant improvements in technology before interest in permutation tests based on raw data was renewed; however, the computing time required to generate all possible permutations of the data was, and still is, prohibitive except for the smallest sample sizes. Finally, in 1957, Dwass proposed “the almost obvious procedure of examining a ‘random sample’ of permutations and making the decision to accept or reject (the null hypothesis) on the basis of those permutations only” (p. 182). He called this new class of tests randomization tests and found that the power of these tests is ‘close’ to the power of the corresponding permutation tests. In his 1957 paper, Dwass restricts attention to the two-sample case, but indicates that these randomization tests can be applied in more general situations.
In more recent years, Edgington (1995), Manly (1997), and Good (1994) have applied permutation and randomization tests to factorial designs, randomized block designs, and multivariate designs, among others. Several statisticians have even used permutation tests to test the equality of correlation or covariance matrices from multiple populations (Krzanowski, 1993; Shipley, 2000; Zhu, Ng, & Jing, 2003). However, neither permutation nor randomization tests have been applied to testing the structure of a covariance matrix. As a side note, permutation tests are very similar to the bootstrapping described in Chapter 3. The primary difference is that in bootstrapping the resampling is done with replacement, whereas in permutation tests the resampling is done without replacement. Because of this, permutation tests are exact and unbiased, whereas Good (1994) writes, “The bootstrap is neither exact nor conservative. Generally, but not always, a nonparametric bootstrap is less powerful than a permutation test… If the observations are independent and from distributions with identical values of the parameter of interest, then the bootstrap is asymptotically exact” (p. 20).

CHAPTER 5 PROPOSED TEST

Recognizing the benefits of permutation tests and the limitations of parametric procedures for testing the structure of a covariance matrix, it is the purpose of this research to develop a permutation test for the structure of a covariance matrix. To develop such a test, it must be established that the observations are exchangeable under the null hypothesis. Good (2002) gives a simple definition of exchangeability. He writes that observations are considered exchangeable if, “under the (null) hypothesis, the joint distribution of the observations is invariant under permutations of the subscripts” (p. 243). He then goes on to say, “It is easy to see that a set of i.i.d. variables is exchangeable.
Or that the joint distribution of a set of normally distributed random variables whose covariance matrix is such that all diagonal elements have the same value σ² and all of the off-diagonal elements have the same value ρσ² is invariant under permutations of the variable subscripts” (p. 244). Good (2002) focuses on permuting variable subscripts rather than the actual observations so as to include permutation tests in which residuals are permuted, but these conditions for exchangeability also apply to cases in which the raw data are permuted. It will be argued in the following sections that all of the proposed permutation tests satisfy at least one of the criteria for exchangeability given by Good.

Before describing the permutation tests for the structure of a covariance matrix, note that covariance matrices are invariant to changes in location. Therefore, it will be assumed throughout this chapter that the variable means are all equal. If the variable means are unequal, or it is unknown whether the means are equal, the raw data can easily be centered by calculating x_i − μ or x_i − x̄, depending on whether μ is assumed known or unknown, respectively. This centering is necessary to eliminate the effect of the mean vector when permuting the data. For example, consider a situation in which two variables are assumed to have equal variances, but one has a mean of 100 and the other a mean of 1. If the values were permuted between the variables, the assumption of equal variances would be violated, since the relatively ‘large’ values of the first variable would be combined with the relatively ‘small’ values of the second. This problem, however, can be remedied without affecting the variance or covariance assumptions by centering the raw data as described above.

5.1 PERMUTATION TESTS OF SPHERICITY AND COMPOUND SYMMETRY

Consider first a permutation test for compound symmetry. Let x_i, i = 1, …, n, be equally distributed, p-variate vectors of observations taken on n subjects.
We wish to test H0: Σ = Σ_CS, where Σ is the covariance matrix of the distribution of x_i,

  Σ_CS = σ²[ (1 − ρ) I_p + ρ 1_p 1_p' ],   (5.1.1)

σ² is the common population variance, ρ is the common pairwise correlation, I_p is the p×p identity matrix, and 1_p is a p×1 vector of ones. Under the null hypothesis, the variances are assumed equal and the pairwise correlations are assumed equal, but no distributional assumptions have been made. To completely satisfy one of the conditions for exchangeability given previously, we would also need to assume joint normality. However, the simulation results shown in Chapter 6 indicate that this assumption might be too strict. It does not appear necessary to assume joint normality, but rather that each of the marginal distributions is from the same family of distributions, i.e., all uniform, all exponential, etc. Consequently, the values within each vector x_i can be permuted without altering the covariance matrix. Before developing a test statistic, it is necessary to discuss the estimation of σ² and ρ. Consider using the MLEs. Call these

  s̄² = (1/p) Σ_{j=1}^{p} s_j²  and  r̄ = [p(p − 1)/2]^(−1) Σ_{j=1}^{p−1} Σ_{k=j+1}^{p} r_jk,

where s_j² and r_jk are the usual MLEs of σ_j² and ρ_jk, respectively. Since covariance matrices are symmetric, one possible test statistic can be computed by summing the absolute differences between the elements on or above the diagonal of the covariance matrix obtained from each possible permutation and the corresponding elements of the covariance matrix estimated as described above. In matrix notation this test statistic can be expressed as

  D = 1'_{p(p+1)/2} | vec( Σ_perm − s̄²[ (1 − r̄) I_p + r̄ 1_p 1_p' ] ) |,

where Σ_perm is the covariance matrix obtained after permuting the data and vec(M) is a vector of the elements on or above the diagonal of a matrix M.
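The statistic D and the full enumeration of within-row permutations can be sketched as follows, using the three-subject data of Table 5.1.1; names are ours, and the covariance of each permuted arrangement is taken to be the usual MLE recomputed from the permuted values:

```python
import numpy as np
from itertools import permutations, product

raw = np.array([[6.4, 4.8, 1.8],        # data of Table 5.1.1
                [3.6, 2.3, 0.2],
                [5.7, 2.3, 1.2]])
n, p = raw.shape
X = raw - raw.mean(axis=0)              # center each variable

def mle_cov(data):
    d = data - data.mean(axis=0)
    return d.T @ d / len(d)             # MLE: divide by n

S = mle_cov(X)
v = S.diagonal()
s2_bar = v.mean()                                                    # pooled variance
r_bar = (S / np.sqrt(np.outer(v, v)))[np.triu_indices(p, 1)].mean()  # pooled correlation
target = s2_bar * ((1 - r_bar) * np.eye(p) + r_bar * np.ones((p, p)))

def D(data):
    diff = np.abs(mle_cov(data) - target)
    return diff[np.triu_indices(p)].sum()       # on/above-diagonal elements

d_obs = D(X)
row_perms = [list(permutations(row)) for row in X]
stats = [D(np.array(rows)) for rows in product(*row_perms)]   # (3!)^3 = 216 arrangements
p_value = sum(d >= d_obs for d in stats) / len(stats)
```

Run on these data, d_obs is approximately 1.7614 and the p-value is close to the 96/216 reported in this section (small rounding differences can move individual permutations across the threshold).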
This test statistic is computed for each possible permutation of the data, and the proportion of test statistic values greater than or equal to the one obtained from the observed data is the corresponding p-value. This test for compound symmetry can also be used to test for sphericity by setting r̄ = 0.

Consider a very simple example of the proposed permutation test in which there are three measurements taken on each of three subjects, resulting in the data and sample statistics shown in Table 5.1.1 below. We are interested in testing whether the covariance matrix has the compound symmetry structure. This is equivalent to testing H0: Σ = Σ_CS versus Ha: Σ ≠ Σ_CS, where Σ_CS is of the form of (5.1.1) and s̄² and r̄ are the MLEs of σ² and ρ, given by

  s̄² = (1/p) Σ_j s_j² = (1/3)(1.416 + 1.389 + 0.436) ≈ 1.080

and

  r̄ = [p(p − 1)/2]^(−1) Σ_j Σ_{k>j} r_jk = (1/3)(0.69 + 0.99 + 0.79) ≈ 0.823.

In this case, since μ is unknown, the data will be centered by calculating x_i − x̄ for each subject, i. This results in the centered data shown in Table 5.1.1. Notice that the sample variances and correlations are invariant to centering.

Table 5.1.1. Observed Data

  Raw Data            Centered Data
  6.4  4.8  1.8        1.17   1.67   0.73
  3.6  2.3  0.2       -1.63  -0.83  -0.87
  5.7  2.3  1.2        0.47  -0.83   0.13

  x̄1 = 5.23, x̄2 = 3.13, x̄3 = 1.07;  after centering, x̄1 = x̄2 = x̄3 = 0
  s1² = 1.416, s2² = 1.389, s3² = 0.436  (unchanged by centering)

      [ 1     0.69  0.99 ]
  R = [ 0.69  1     0.79 ]  (unchanged by centering)
      [ 0.99  0.79  1    ]

In this example, there are p! = 3! = 6 possible permutations of each row, resulting in (p!)^n = 6³ = 216 possible permutations of the observed data. Four of these permutations, along with the corresponding test statistic values, D, are displayed in Table 5.1.2 below; the first is the observed data. Table 5.1.2.
Some Permutations of the Centered Observed Data

   1.17   1.67   0.73          0.73   1.67   1.17
  -1.63  -0.83  -0.87         -1.63  -0.87  -0.83
   0.47  -0.83   0.13          0.47  -0.83   0.13
  D = 1.761422 (observed)      D = 1.064711

   1.17   0.73   1.67          1.17   1.67   0.73
  -0.83  -0.87  -1.63         -0.87  -1.63  -0.83
   0.13   0.47  -0.83          0.47   0.13  -0.83
  D = 2.587289                 D = 2.345689

The p-value can be found by computing the proportion of test statistic values that are greater than or equal to the one obtained from the observed data. The distribution of the test statistic values is shown in Figure 5.1.1; the vertical line at 1.761422 represents the value of the test statistic resulting from the observed data. In this example, 96 of the 216 possible permutations result in test statistic values greater than or equal to 1.761422. Therefore, the p-value is 96/216 ≈ 0.4444, and at any reasonable type I error rate there is not enough evidence to conclude that the covariance matrix does not have the compound symmetry structure.

Figure 5.1.1. Distribution of the Test Statistic for Compound Symmetry. [Histogram of the 216 values of D (x-axis: test statistic, approximately 0.5 to 2.5; y-axis: relative frequency), with a vertical line at the observed value 1.761422.]

5.2 PERMUTATION TEST OF TYPE H STRUCTURE

The type H covariance structure does not satisfy either of the criteria for exchangeability mentioned previously. Therefore, the transformation described in Section 2.3 can be applied to the data so that the permutation test for sphericity described in Section 5.1 can be used. Specifically, assume we wish to test H0: Σ = Σ_TH versus Ha: Σ ≠ Σ_TH, where Σ_TH has the form of (2.3.1). Let C be a (p − 1)×p matrix of normalized orthogonal contrasts on the p repeated measures, and let Y be an n×p matrix of centered data. Under H0,

  Var(YC') = C Var(Y) C' = C Σ_TH C' = λ I_{p−1},

as shown in Section 2.3. Therefore, the permutation test for sphericity can be applied to the transformed data, YC'. As an example, return to the sample given in Table 5.1.1. This time we wish to test H0: Σ = Σ_TH.
The matrix of normalized orthogonal contrasts is given by

  C = [ 0.7071068  -0.7071068   0         ]
      [ 0.4082483   0.4082483  -0.8164966 ].

Postmultiplying the centered data shown in Table 5.1.1 by C' yields

        [ -0.3536   0.5634 ]
  YC' = [ -0.5657  -0.2939 ].   (5.2.1)
        [  0.9192  -0.2531 ]

The MLE of the covariance matrix of this transformed data is then given by

  [  0.4300  -0.0885 ]
  [ -0.0885   0.1559 ]

and the MLE of λ I_{p−1} is given by

  λ̂ I_{p−1} = [ 0.29295  0       ]
               [ 0        0.29295 ].

The permutation test for sphericity is applied to the transformed data to test H0: C Σ C' = λ I_{p−1} versus Ha: C Σ C' ≠ λ I_{p−1} by finding all possible within-row permutations of the transformed data. The test statistic is then calculated by

  D = 1' | vec( Σ_perm − λ̂ I_{p−1} ) |.

A p-value can be found by determining the proportion of test statistic values greater than or equal to the one resulting from the original set of transformed data. For the transformed data shown in (5.2.1), there are only ((p − 1)!)^n = 2³ = 8 possible permutations, all 8 of which result in test statistic values greater than or equal to 0.3626011, the test statistic value resulting from the original set of transformed data. Therefore, the p-value is 8/8 = 1, and at any reasonable type I error rate there is not enough evidence to conclude that the covariance matrix does not have the type H structure. One drawback of this permutation test is that the transformed data matrix has only p − 1 columns, as opposed to the p columns of the original data matrix. For large combinations of n and p this is not a problem. However, if the combination of n and p is small, as in the example shown above, there may be too few possible permutations for the permutation test to be meaningful or even useful at all.
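The contrast matrix above is a normalized Helmert-type basis that can be built for any p. The sketch below (our naming) reproduces C for p = 3 up to row signs and recovers the estimates quoted in this section:

```python
import numpy as np

def normalized_contrasts(p):
    """(p-1) x p matrix with orthonormal rows, each orthogonal to the ones vector."""
    C = np.zeros((p - 1, p))
    for i in range(1, p):
        C[i - 1, :i] = 1.0
        C[i - 1, i] = -i                      # Helmert-type contrast
        C[i - 1] /= np.linalg.norm(C[i - 1])  # normalize the row
    return C

C = normalized_contrasts(3)                   # rows (1,-1,0)/sqrt(2) and (1,1,-2)/sqrt(6)

raw = np.array([[6.4, 4.8, 1.8],
                [3.6, 2.3, 0.2],
                [5.7, 2.3, 1.2]])
Y = raw - raw.mean(axis=0)                    # centered data of Table 5.1.1
Z = Y @ C.T                                   # transformed data, n x (p-1)
Zc = Z - Z.mean(axis=0)
Sz = Zc.T @ Zc / len(Z)                       # MLE covariance of the transformed data
lam_hat = Sz.diagonal().mean()                # estimate of the common variance lambda
d_obs = np.abs(Sz - lam_hat * np.eye(2))[np.triu_indices(2)].sum()
```

The recovered values match the text: variances 0.4300 and 0.1559, |covariance| 0.0885, λ̂ = 0.29295, and observed statistic ≈ 0.3626.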
5.3 PERMUTATION TEST OF ALL OTHER COVARIANCE STRUCTURES

Neither of the remaining covariance structures discussed in Chapter 2 (serial correlation and independence of sets of variates) satisfies either of the conditions for exchangeability described previously. Therefore, a data transformation is required to achieve exchangeability. The following theorem, given by Graybill (1983), can be used to transform a data set with covariance matrix Σ to one with covariance matrix D, where D is a diagonal matrix. Finally, one additional calculation, described in the following paragraphs, enables any test of the structure of a covariance matrix to be accomplished by a test for sphericity.

Theorem 5.3.1. Let A be any n×n matrix. There exists an orthogonal matrix P such that P'AP = D, where D is a diagonal matrix, if and only if A is symmetric (p. 19).

Consider the linear model Y = Xβ + e, where Y is an n×p matrix of observations, X is a known matrix of constants, β is a matrix of unknown parameters, and e is a matrix of unknown errors such that Var(e) = Σ (and consequently, Var(Y) = Σ). Covariance matrices are symmetric; therefore, by Theorem 5.3.1, there exists an orthogonal matrix P such that P'ΣP = D, where D is a diagonal matrix. Specifically, P consists of the eigenvectors of Σ, and the eigenvalues of Σ are on the diagonal of D. Then, postmultiplying the data matrix, Y, by P yields

  Var(YP) = P' Var(Y) P = P'ΣP = D.

Therefore, any test of H0: Σ = Σ0 is equivalent to testing H0: P'ΣP = D, where the columns of P are the eigenvectors of Σ0. The previously described permutation test for sphericity can then be performed on the postmultiplied data, YP, after dividing each column of YP by the square root of the respective eigenvalue. To illustrate this test, return to the sample of data given in Table 5.1.1. This time we are interested in testing H0: Σ = Σ_SC, where Σ_SC has the serial correlation form shown in (2.4.1).
The problem with testing for this structure is that even though the variances are assumed to be equal, the covariances are not. Therefore, neither of the previously described criteria for exchangeability is satisfied. Consequently, the centered data must be transformed by applying Theorem 5.3.1 before permuting. Before transforming the data, consider the following preliminary calculations used to find the MLEs of σ² and ρ in Σ_SC, as described in Section 2.4:

  S1 = Σ_{i=1}^{n} (x_i − x̄)' C1 (x_i − x̄) = 4.166667,
  S2 = Σ_{i=1}^{n} (x_i − x̄)' C2 (x_i − x̄) = 9.5,  and
  S3 = Σ_{i=1}^{n} (x_i − x̄)' (x_i − x̄) = 9.72,

where C1 and C2 are as shown in (2.4.3). Then, the MLE of ρ can be found by substituting S1, S2, S3, and p = 3 into (2.4.5) to get

  16.666668 ρ̂³ − 9.5 ρ̂² − 44.440002 ρ̂ + 28.5 = 0.

The only solution to this equation in the interval (−1, 1) is ρ̂ = 0.6549899. Substituting this value into (2.4.4) yields σ̂² = 0.5872383. Therefore, under the null hypothesis the MLE of the covariance matrix is

           [ 1.0284596  0.6736306  0.4412212 ]
  Σ̂_SC =  [ 0.6736306  1.0284596  0.6736306 ]
           [ 0.4412212  0.6736306  1.0284596 ]

and the eigenvectors and eigenvalues of Σ̂_SC are

      [ 0.5535   0.7071         0.4400 ]
  P = [ 0.6223   5.44×10⁻¹⁷    -0.7828 ]    and    [2.2269  0.5872  0.2712],
      [ 0.5535  -0.7071         0.4400 ]

respectively. Postmultiplying the centered data matrix shown in Table 5.1.1 by the matrix of eigenvectors yields

  [  2.0888   0.3064  -0.4687 ]
  [ -1.9024  -0.5421  -0.4477 ];   (5.3.1)
  [ -0.1864   0.2357   0.9163 ]

however, this data matrix still cannot be permuted, since the variances of the variables, given by the eigenvalues, are not equivalent. Therefore, each column of this data matrix must be divided by the square root of its respective eigenvalue before the data can be permuted. Now consider the data matrix found by dividing each column of (5.3.1) by the square root of its respective eigenvalue.
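The eigen-decomposition and rescaling just described can be sketched as follows (variable names ours), rebuilding Σ̂_SC from the fitted ρ̂ and σ̂² and whitening the centered data with its eigenvectors and eigenvalues:

```python
import numpy as np

rho_hat, sig2_hat, p = 0.6549899, 0.5872383, 3     # MLEs from this example
idx = np.arange(p)
# AR(1)-type serial correlation structure: (sig2/(1-rho^2)) * rho^|i-j|
Sigma_sc = sig2_hat / (1 - rho_hat**2) * rho_hat ** np.abs(np.subtract.outer(idx, idx))

eigvals, P = np.linalg.eigh(Sigma_sc)              # Sigma_sc = P diag(eigvals) P'

raw = np.array([[6.4, 4.8, 1.8],
                [3.6, 2.3, 0.2],
                [5.7, 2.3, 1.2]])
Y = raw - raw.mean(axis=0)                         # centered data of Table 5.1.1
Z = (Y @ P) / np.sqrt(eigvals)                     # under H0, each column has unit variance
```

The sphericity permutation test of Section 5.1 (with target I_p) is then applied to Z; note that `eigh` may order the eigenvalues and flip eigenvector signs differently from the text, which does not affect the test.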
We will refer to this matrix as the matrix of transformed data. This matrix is given by

  [  1.3997   0.3999  -0.9000 ]
  [ -1.2748  -0.7074  -0.8596 ].   (5.3.2)
  [ -0.1249   0.3076   1.7596 ]

We can now perform a test for sphericity on the matrix of transformed data. This is done by finding all possible permutations of the transformed data such that the data are permuted within each row, and calculating the test statistic given by

  D = 1'_{p(p+1)/2} | vec( Σ_perm − I_p ) |.

A p-value can be found by determining the proportion of test statistic values greater than or equal to the one resulting from the original set of transformed data. For the transformed data shown in (5.3.2), there are (p!)^n = 6³ = 216 possible permutations, of which only 6 result in test statistic values greater than or equal to 0.6095927, the value resulting from the original set of transformed data. Therefore, the p-value is 6/216 ≈ 0.0278, and at α = 0.05 there is enough evidence to conclude that Σ does not have the serial correlation structure. This conclusion is expected, since the sample correlation matrix shown in Table 5.1.1 does not suggest serial correlation. Figure 5.3.1 below shows the distribution of the test statistic for this set of data. The vertical line at 0.6095927 represents the test statistic value resulting from the original set of transformed data.

Figure 5.3.1. Distribution of the Test Statistic for Serial Correlation. [Histogram of the 216 values of D (x-axis: test statistic; y-axis: relative frequency), with a vertical line at the observed value 0.6095927.]

CHAPTER 6 SIMULATIONS

One thousand simulations were run using R version 2.3.1 for all combinations of n (= 5, 10, 25) and p (= 3, 5, 10). The R code for each test can be found in Appendix A.1. Due to the extremely large number of permutations required to perform the permutation tests described in Chapter 5 for any reasonable values of n and p, randomization tests were primarily used in the simulations.
Permutation tests were only run for the test of type H structure when n = 5 or 10 and p = 3 (see Section 6.3). Within each simulation, a p-variate data set was generated and the randomization test (RT) (or permutation test [PT] in the cases described above), likelihood ratio test (LRT), and corrected likelihood ratio test (CLRT) were all run for comparison. One thousand random permutations of the centered and/or transformed data were sampled for each RT. The number of randomly selected permutations was chosen according to the suggestions of Manly (1997). For the LRTs, the asymptotic chi-squared distributions described in Chapter 2 were used to determine approximate 5% critical values. Three different multivariate distributions (normal, uniform, and double exponential) were investigated. For the multivariate normal distribution, data were generated using the built-in R functions by specifying the desired covariance structure. For the multivariate double exponential distribution, data were generated using a procedure described in Vale and Maurelli (1983). For univariate data, this procedure first involves the generation of a random sample from a standard normal distribution. Each of these data values, X, is then substituted into the polynomial Y = a + bX + cX² + dX³, where the constants a, b, c, and d are determined by expressing the first four moments of the desired nonnormal distribution in terms of the first four moments of the standard normal distribution and solving algebraically. Vale and Maurelli (1983) provide a system of equations that can be used to find these constants. In extending this to the multivariate case, there are issues with specifying the desired covariance structure. Initially, data can be generated from the N_p(0, Σ) distribution; however, once the polynomial transformation is applied, the resulting data no longer have the same covariance structure.
Therefore, it is necessary to determine intermediate correlations to be used to generate the multivariate normal data that will result in multivariate double exponential data with the desired covariance structure. Again, Vale and Maurelli (1983) provide a system of equations that can be solved to determine these intermediate correlations. There exists a more recent extension of the Vale and Maurelli (1983) procedure, developed by Headrick (2002), in which the first six moments of the desired nonnormal distribution are used instead of just the first four. Headrick (2002) argues that specifying two additional moments results in much more accurate nonnormal distributions, but the inclusion of these additional moments places restrictions on the possible correlations that can be simulated. Specifically, once one of the correlations in the desired covariance matrix is specified, the remaining correlations cannot differ from the first too drastically, and the amount of allowable difference changes for each desired distribution. For example, in trying to simulate three-variate uniform data with an unstructured covariance matrix, the largest and smallest of the three correlations could not be varied by more than approximately 0.3. Differences larger than this resulted in intermediate correlations that were greater than one. This restriction is not a problem when generating data with a compound symmetry covariance structure; however, it severely limits the number of alternative covariance structures that can be explored when estimating power, especially as the number of variables is increased. Therefore, the method given by Vale and Maurelli (1983) was used to generate the multivariate double exponential data for the simulations in this chapter. Although convenient for generating data from many multivariate distributions, the Vale and Maurelli (1983) procedure cannot be used to generate multivariate uniform data.
This is due to the fact that this procedure restricts the lower bound of the kurtosis of the desired marginal distributions. Specifically, if the skewness of the desired marginal distribution is 0, the lower bound for the (excess) kurtosis is −1.15132 (Headrick, 2002), whereas the kurtosis of the UNIF(a, b) distribution is −1.2. Therefore, to generate the multivariate uniform data, a procedure described in Falk (1999) was used. This procedure consists of generating a random sample, x_i, i = 1, …, n, from the N_p(0, R̃) distribution, where R̃ = 2 sin(πR/6) (applied elementwise) and R is the desired correlation matrix. Then the standard normal CDF, Φ, is applied to the N_p(0, R̃) data so that Φ(x_i) has a multivariate UNIF(0_p, 1_p) distribution with correlation matrix R, where 0_p and 1_p are p×1 vectors of zeros and ones, respectively, that represent the lower and upper bounds of the marginal uniform distributions. To achieve the desired variances, note that the variance of the univariate UNIF(a, b) distribution is given by σ² = (b − a)²/12. Setting a = 0, we have σ² = b²/12, which implies that b = √12 σ. Multiplying each column of the multivariate UNIF(0_p, 1_p) data by b_j = √12 σ_j, j = 1, …, p, where σ_j is the desired standard deviation, results in multivariate UNIF(0_p, b) data with covariance matrix Σ, where b = (b_1, …, b_p)'.

The type I error rate and power will be investigated for five different randomization tests: the tests of sphericity, compound symmetry, type H, serial correlation, and independence of sets of variates.
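Falk's uniform generator just described can be sketched as follows (function name ours); the checks below only verify that the sample moments land near their targets:

```python
import numpy as np
from math import erf, sqrt

def mv_uniform(n, R, sigmas, rng):
    """Falk (1999): multivariate uniform data with correlation matrix R."""
    p = R.shape[0]
    R_tilde = 2 * np.sin(np.pi * R / 6)              # intermediate normal correlations
    Z = rng.multivariate_normal(np.zeros(p), R_tilde, size=n)
    U = 0.5 * (1 + np.vectorize(erf)(Z / sqrt(2)))   # Phi(z): UNIF(0,1) marginals
    b = np.sqrt(12.0) * np.asarray(sigmas)           # UNIF(0,b) has sd b/sqrt(12)
    return U * b

rng = np.random.default_rng(1)
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
X = mv_uniform(20000, R, [1.0, 2.0, 3.0], rng)       # target sds 1, 2, 3
```

With a large sample, the empirical correlation matrix is close to R and the column standard deviations are close to the requested values.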
The tests of sphericity and compound symmetry will be performed by permuting the raw data; the test of type H structure will be performed by first postmultiplying the data matrix by a matrix of normalized orthogonal contrasts and then running the randomization test for sphericity; and the tests of serial correlation and independence of sets of variates will be performed by first postmultiplying the data matrix by the eigenvectors of the estimated hypothesized covariance matrix, then dividing the columns of the resulting matrix by the square roots of the respective eigenvalues, and finally running the randomization test for sphericity. The values of the various parameters used to simulate the type I error rates for the different tests are as follows. For sphericity, the covariance structure is given by Σ_S = σ² I_p, where σ² = 1, 9, or 25. For compound symmetry, the covariance structure is given by

  Σ_CS = σ²[ (1 − ρ) I_p + ρ 1_p 1_p' ],   (6.0.1)

where σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. For type H, the covariance structure is of the general type H form

  Σ_TH = α 1_p' + 1_p α' + λ I_p,   (6.0.2)

i.e., σ_ij = α_i + α_j for i ≠ j and σ_ii = 2α_i + λ, where the elements of α are determined by a spacing parameter d > 0 and λ > 0 (see Appendix A.2 for the exact parameter values and a description of how they were chosen). For serial correlation, the covariance structure is given by

  Σ_SC = [ σ²/(1 − ρ²) ] [ ρ^|i−j| ]_{i,j=1,…,p},   (6.0.3)

where σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. Finally, for the test of independence of sets of variates, the covariance structure is the block diagonal matrix

  Σ_I = diag( Σ_11, Σ_22, …, Σ_kk ),

where the numbers of variates in the blocks Σ_mm, m = 1, …, k, are (1, 2), (2, 3), (5, 5), or (3, 3, 4), depending on whether p = 3, 5, or 10, and each Σ_mm has the compound symmetry structure with σ² = 1 and ρ = 0.2, 0.5, or 0.8.
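The structures just listed can be assembled directly; the small sketch below (function names ours) also reproduces the p = 5 block-diagonal example shown below:

```python
import numpy as np

def cs_cov(p, sig2, rho):
    """Compound symmetry (6.0.1): sig2 * ((1 - rho) I_p + rho 1 1')."""
    return sig2 * ((1 - rho) * np.eye(p) + rho * np.ones((p, p)))

def sc_cov(p, sig2, rho):
    """Serial correlation (6.0.3): (sig2 / (1 - rho^2)) * rho^|i-j|."""
    idx = np.arange(p)
    return sig2 / (1 - rho**2) * rho ** np.abs(np.subtract.outer(idx, idx))

def indep_cov(sizes, sig2, rho):
    """Independent sets of variates: block diagonal, CS within each block."""
    p = sum(sizes)
    out = np.zeros((p, p))
    start = 0
    for m in sizes:
        out[start:start + m, start:start + m] = cs_cov(m, sig2, rho)
        start += m
    return out

Sigma_I = indep_cov([2, 3], 1.0, 0.2)    # the p = 5 example with rho = 0.2
```

As a cross-check, sc_cov with the MLEs of the Section 5.3 example reproduces the diagonal entry 1.0284596 of Σ̂_SC.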
For example, for p = 5, number of variates (2, 3), and ρ = 0.2, the simulated covariance structure is

  [ 1    0.2  0    0    0   ]
  [ 0.2  1    0    0    0   ]
  [ 0    0    1    0.2  0.2 ]
  [ 0    0    0.2  1    0.2 ]
  [ 0    0    0.2  0.2  1   ].

The various covariance structures used to investigate power are detailed in the sections to follow.

6.1 TEST OF SPHERICITY

Simulated type I error rates for the test of sphericity are displayed in Table 6.1.1. For normally distributed data, the CLRT performs better than the other two tests with respect to the simulated type I error rates. For uniform data, the CLRT underestimates the nominal type I error rate, and for double exponential data the CLRT overestimates the nominal type I error rate. These results are consistent with those of Huynh and Mandeville (1979), who performed a simulation study of Mauchly's (1940) test of sphericity and found that for light-tailed distributions the LRTs were conservative and for heavy-tailed distributions the simulated type I error rates exceeded the nominal rate. This same pattern is seen, less strongly, in the results of the RT; however, the simulated type I error rates of the RT appear to be converging to 0.05 as n increases, whereas the simulated type I error rates of the LRTs do not. The simulated type I error rates for the RT seem to be unaffected by changes in the variance; however, they appear to increase as p increases. This latter pattern is also seen in the LRTs, but not as strongly as for the RT. One definite benefit of the RT is that it is applicable in situations for which the LRTs do not exist, specifically when p ≥ n. However, the simulated type I error rates for these cases are much too large. Overall, the RT appears to be a viable alternative when the data are not normally distributed, but it is not beneficial in small sample situations or in cases where n is close to p. Clearly, the RT is preferred over the LRTs for cases in which p ≥ n, for the simple fact that a p-value exists for the RT when it does not for the LRTs.
However, the simulated type I error rates are much too large to be of any practical use. The CLRT does not appear to be a level-α test in non-normal situations, though it clearly outperforms the LRT; similarly, the RT does not appear to be a level-α test when p ≥ n. Therefore, all three tests will be largely ignored in these situations in the power discussions to follow, although for completeness all three were included in the simulations.

Table 6.1.1. Simulated Type I Error Rates for the Test of Sphericity

a. Normal
                p=3                      p=5                      p=10
 n        σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25
 5  RT    0.095  0.079  0.078    0.169  0.175  0.183    0.550  0.553  0.550
    LRT   0.303  0.309  0.320    NA     NA     NA       NA     NA     NA
    CLRT  0.054* 0.070  0.058*   NA     NA     NA       NA     NA     NA
10  RT    0.070  0.075  0.079    0.100  0.122  0.093    0.204  0.202  0.183
    LRT   0.125  0.136  0.138    0.296  0.295  0.317    NA     NA     NA
    CLRT  0.043* 0.059* 0.056*   0.051* 0.066  0.071    NA     NA     NA
25  RT    0.064  0.053* 0.044*   0.086  0.079  0.075    0.108  0.098  0.100
    LRT   0.085  0.078  0.073    0.130  0.091  0.117    0.327  0.317  0.322
    CLRT  0.062* 0.050* 0.049*   0.058* 0.050* 0.060*   0.079  0.060* 0.069

b. Uniform
                p=3                      p=5                      p=10
 n        σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25
 5  RT    0.095  0.095  0.098    0.166  0.158  0.160    0.449  0.471  0.476
    LRT   0.278  0.288  0.277    NA     NA     NA       NA     NA     NA
    CLRT  0.054* 0.057* 0.059*   NA     NA     NA       NA     NA     NA
10  RT    0.056* 0.061* 0.067    0.082  0.095  0.088    0.170  0.150  0.146
    LRT   0.092  0.075  0.087    0.189  0.204  0.191    NA     NA     NA
    CLRT  0.035  0.032  0.021    0.046* 0.043* 0.037*   NA     NA     NA
25  RT    0.048* 0.056* 0.050*   0.071  0.063* 0.063*   0.077  0.082  0.076
    LRT   0.036  0.037* 0.037*   0.061* 0.058* 0.054*   0.202  0.198  0.194
    CLRT  0.021  0.024  0.025    0.025  0.028  0.020    0.031  0.026  0.019

c.
Double Exponential
                p=3                      p=5                      p=10
 n        σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25    σ²=1   σ²=9   σ²=25
 5  RT    0.105  0.120  0.104    0.238  0.212  0.226    0.598  0.613  0.609
    LRT   0.422  0.440  0.407    NA     NA     NA       NA     NA     NA
    CLRT  0.087  0.126  0.096    NA     NA     NA       NA     NA     NA
10  RT    0.083  0.081  0.083    0.112  0.122  0.113    0.239  0.238  0.226
    LRT   0.262  0.243  0.252    0.470  0.454  0.451    NA     NA     NA
    CLRT  0.131  0.126  0.134    0.152  0.148  0.143    NA     NA     NA
25  RT    0.062* 0.067  0.064    0.070  0.080  0.072    0.102  0.111  0.103
    LRT   0.226  0.223  0.190    0.288  0.305  0.292    0.587  0.604  0.602
    CLRT  0.169  0.180  0.143    0.183  0.200  0.186    0.239  0.250  0.239

*Value is contained within 0.05 ± 1.96·√((0.05)(0.95)/1000)

Table 6.1.2 contains the simulated power of the test of sphericity versus non-homoscedasticity. Specifically, multivariate data were generated from distributions with covariance matrices having diagonal elements 1, 1 + d/(p−1), 1 + 2d/(p−1), …, 1 + d and zero off-diagonal elements, where d = 4, 8, or 12 represents the difference between the smallest and largest diagonal elements. As expected, the power of both the RT and CLRT increases as d and/or n increases, and the power of both tests decreases as p approaches n. For normally distributed data the power of the CLRT exceeds that of the RT in most cases, but the RT performs fairly well, achieving power of at least 0.75 in five of the nine cases when n = 25. The true benefit of the RT is seen in the non-normal cases. There, the CLRT appears to outperform the RT; recall from Table 6.1.1, however, that there is evidence that neither of the LRTs is an α-level test for non-normal data. There does appear to be a distributional effect on the simulated power of the RT, with the greatest power resulting from uniformly distributed data and the least from double exponential data in most cases. For uniform data the RT achieves power of at least 0.75 in all nine cases when n = 25; for double exponential data, in one of the nine.

Table 6.1.2.
Simulated Power vs. Non-Homoscedasticity for the Test of Sphericity

a. Normal
                p=3                      p=5                      p=10
 n        d=4    d=8    d=12     d=4    d=8    d=12     d=4    d=8    d=12
 5  RT    0.178  0.191  0.217    0.255  0.326  0.313    0.604  0.627  0.622
    LRT   0.513  0.614  0.685    NA     NA     NA       NA     NA     NA
    CLRT  0.130  0.197  0.236    NA     NA     NA       NA     NA     NA
10  RT    0.283  0.383  0.426    0.299  0.401  0.384    0.408  0.467  0.470
    LRT   0.513  0.778  0.899    0.599  0.803  0.887    NA     NA     NA
    CLRT  0.315  0.600  0.746    0.219  0.403  0.532    NA     NA     NA
25  RT    0.730  0.915  0.967    0.654  0.803  0.866    0.616  0.749  0.817
    LRT   0.873  0.997  1.000    0.834  0.987  0.998    0.932  0.992  0.999
    CLRT  0.843  0.995  1.000    0.735  0.965  0.996    0.639  0.912  0.988

b. Uniform
                p=3                      p=5                      p=10
 n        d=4    d=8    d=12     d=4    d=8    d=12     d=4    d=8    d=12
 5  RT    0.195  0.243  0.282    0.299  0.299  0.375    0.580  0.608  0.588
    LRT   0.449  0.603  0.689    NA     NA     NA       NA     NA     NA
    CLRT  0.091  0.131  0.212    NA     NA     NA       NA     NA     NA
10  RT    0.433  0.560  0.623    0.401  0.488  0.573    0.433  0.515  0.521
    LRT   0.442  0.796  0.932    0.534  0.743  0.874    NA     NA     NA
    CLRT  0.212  0.537  0.777    0.145  0.303  0.478    NA     NA     NA
25  RT    0.957  0.988  1.000    0.868  0.965  0.989    0.787  0.896  0.920
    LRT   0.933  1.000  1.000    0.844  0.997  1.000    0.900  0.997  1.000
    CLRT  0.892  0.999  1.000    0.713  0.991  1.000    0.532  0.900  0.985

c. Double Exponential
                p=3                      p=5                      p=10
 n        d=4    d=8    d=12     d=4    d=8    d=12     d=4    d=8    d=12
 5  RT    0.165  0.220  0.244    0.263  0.338  0.320    0.634  0.629  0.614
    LRT   0.573  0.657  0.747    NA     NA     NA       NA     NA     NA
    CLRT  0.173  0.221  0.297    NA     NA     NA       NA     NA     NA
10  RT    0.236  0.327  0.360    0.244  0.314  0.344    0.358  0.404  0.421
    LRT   0.554  0.777  0.880    0.688  0.823  0.889    NA     NA     NA
    CLRT  0.378  0.632  0.761    0.353  0.497  0.638    NA     NA     NA
25  RT    0.470  0.665  0.756    0.457  0.601  0.667    0.429  0.578  0.638
    LRT   0.843  0.984  0.996    0.858  0.977  0.991    0.945  0.989  0.999
    CLRT  0.804  0.976  0.995    0.808  0.958  0.983    0.779  0.942  0.978

Table 6.1.3 contains the simulated power for the test of sphericity versus non-zero correlation. Data were generated from multivariate distributions with marginal variances of 1 and pairwise correlations of ρ = 0.3, 0.6, or 0.9.
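Power simulations of this kind require generating multivariate data with a prescribed covariance matrix. One standard way to do this (a sketch, not necessarily the dissertation's implementation) is to transform iid draws by a Cholesky factor, shown here for the equicorrelation alternative just described:

```python
import numpy as np

def draw_with_cov(n, sigma, base=None, seed=0):
    """Generate n rows whose population covariance is `sigma`.
    `base` draws iid variates with mean 0 and variance 1; the default is
    standard normal, but suitably scaled uniform or double exponential
    draws can be substituted for the other simulated distributions."""
    rng = np.random.default_rng(seed)
    p = sigma.shape[0]
    if base is None:
        base = rng.standard_normal
    z = base((n, p))                       # iid, mean 0, variance 1
    return z @ np.linalg.cholesky(sigma).T # rows have covariance sigma

# equicorrelation with unit variances and rho = 0.6, as in Table 6.1.3
sigma = 0.4 * np.eye(3) + 0.6 * np.ones((3, 3))
x = draw_with_cov(10000, sigma)
```

With a large n, the empirical covariance of `x` should be close to `sigma`.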
As seen in Table 6.1.3, the power of the RT for sphericity versus non-zero correlation is very low, virtually indistinguishable from the simulated type I error rates shown in Table 6.1.1. The CLRT, however, has much greater power. For n = 10 the CLRT has power greater than 0.9 when ρ = 0.9, and for n = 25 when ρ = 0.6 or 0.9, but it has trouble detecting a correlation of ρ = 0.3 even for samples as large as 25. The simulated power of the CLRT increases as p increases. This pattern is also evident for the RT, but the increase is less extreme. The simulated power of the RT appears to be unaffected by increases in ρ, whereas the power of the CLRT clearly increases as ρ increases. Just as in Table 6.1.2, there appears to be a distributional effect on the simulated power of the RT; in Table 6.1.3, however, the pattern is reversed, with the greatest power resulting from double exponential data and the least from uniform data in most cases.

Table 6.1.3. Simulated Power vs. Non-Zero Correlation for the Test of Sphericity

a. Normal
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.075  0.044  0.061    0.128  0.093  0.072    0.338  0.156  0.101
    LRT   0.402  0.616  0.944    NA     NA     NA       NA     NA     NA
    CLRT  0.094  0.185  0.688    NA     NA     NA       NA     NA     NA
10  RT    0.058  0.042  0.054    0.074  0.050  0.070    0.119  0.043  0.050
    LRT   0.288  0.722  0.999    0.540  0.911  1.000    NA     NA     NA
    CLRT  0.146  0.553  0.993    0.210  0.750  1.000    NA     NA     NA
25  RT    0.039  0.040  0.046    0.045  0.041  0.045    0.054  0.062  0.053
    LRT   0.455  0.986  1.000    0.742  1.000  1.000    0.975  1.000  1.000
    CLRT  0.393  0.980  1.000    0.645  0.999  1.000    0.877  1.000  1.000

b.
Uniform
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.062  0.045  0.057    0.128  0.077  0.077    0.292  0.114  0.086
    LRT   0.338  0.590  0.936    NA     NA     NA       NA     NA     NA
    CLRT  0.080  0.207  0.696    NA     NA     NA       NA     NA     NA
10  RT    0.047  0.028  0.059    0.053  0.040  0.059    0.093  0.035  0.029
    LRT   0.235  0.706  0.998    0.501  0.903  1.000    NA     NA     NA
    CLRT  0.125  0.541  0.996    0.178  0.709  1.000    NA     NA     NA
25  RT    0.040  0.042  0.037    0.044  0.041  0.042    0.048  0.028  0.053
    LRT   0.392  0.975  1.000    0.714  0.999  1.000    0.960  1.000  1.000
    CLRT  0.338  0.966  1.000    0.603  0.999  1.000    0.813  1.000  1.000

c. Double Exponential
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.107  0.064  0.108    0.137  0.107  0.126    0.370  0.208  0.189
    LRT   0.501  0.645  0.959    NA     NA     NA       NA     NA     NA
    CLRT  0.134  0.243  0.721    NA     NA     NA       NA     NA     NA
10  RT    0.073  0.063  0.077    0.077  0.061  0.082    0.133  0.074  0.098
    LRT   0.397  0.784  0.998    0.673  0.943  1.000    NA     NA     NA
    CLRT  0.231  0.619  0.997    0.325  0.795  0.998    NA     NA     NA
25  RT    0.055  0.048  0.049    0.057  0.051  0.059    0.067  0.049  0.064
    LRT   0.568  0.986  1.000    0.835  0.998  1.000    0.991  1.000  1.000
    CLRT  0.516  0.981  1.000    0.747  0.997  1.000    0.933  1.000  1.000

The simulated power of the test of sphericity versus the type H structure in (6.0.2) is displayed in Tables 6.1.4 through 6.1.6. See Appendix A.2 for a description of how and why the values of d and λ were chosen. As expected, the power of both the RT and CLRT increases as d and/or n increases, but there does not appear to be much change in power as λ increases. It is difficult to determine the effect of p on the simulated power of the tests because radically different values of d and λ had to be used as p increased (see Appendix A.2), but there are two cases with equal parameter values: d = 0.1 and λ = 1 with p = 5 (Table 6.1.5) or p = 10 (Table 6.1.6). From these two cases, it appears that the power of both the RT and CLRT increases as p increases.
The CLRT is more powerful than the RT in most cases, but recall that the CLRT is not an α-level test for non-normal data (see Table 6.1.1). Even with normally distributed data, the RT appears to have a slight edge over the CLRT with respect to power when d, n, and p are all small. Overall, the ability of the RT for sphericity to detect the type H structure is fairly good, with simulated power greater than 0.75 when n = 25 in fifteen of the 27 cases for normal data, 21 of the 27 cases for uniform data, and nine of the 27 cases for double exponential data. Just as in previous tables, there appears to be a distributional effect on the simulated power of the RT, but the pattern is again reversed from that of the previous table (Table 6.1.3), with the greatest power resulting from uniform data and the lowest from double exponential data in most cases.

Table 6.1.4. Simulated Power vs. Type H for the Test of Sphericity (p = 3)

a. Normal
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.153  0.182  0.212    0.240  0.195  0.235    0.292  0.245  0.265
    LRT   0.453  0.674  0.904    0.651  0.744  0.923    0.843  0.846  0.975
    CLRT  0.111  0.241  0.527    0.217  0.255  0.518    0.347  0.366  0.626
10  RT    0.274  0.419  0.594    0.484  0.565  0.663    0.555  0.694  0.757
    LRT   0.437  0.806  0.991    0.800  0.921  1.000    0.982  0.993  1.000
    CLRT  0.261  0.663  0.975    0.622  0.806  0.992    0.936  0.961  0.999
25  RT    0.690  0.943  0.993    0.930  0.995  1.000    0.994  1.000  1.000
    LRT   0.797  0.996  1.000    0.996  1.000  1.000    1.000  1.000  1.000
    CLRT  0.741  0.993  1.000    0.994  1.000  1.000    1.000  1.000  1.000

b.
Uniform
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.185  0.204  0.285    0.284  0.260  0.303    0.350  0.308  0.319
    LRT   0.438  0.665  0.913    0.641  0.722  0.934    0.851  0.875  0.982
    CLRT  0.103  0.183  0.511    0.177  0.256  0.504    0.338  0.370  0.681
10  RT    0.349  0.580  0.803    0.597  0.774  0.859    0.707  0.852  0.911
    LRT   0.354  0.804  0.999    0.816  0.939  0.999    0.983  0.995  1.000
    CLRT  0.180  0.606  0.979    0.616  0.833  0.993    0.940  0.971  1.000
25  RT    0.894  0.999  1.000    0.998  1.000  1.000    1.000  1.000  1.000
    LRT   0.803  0.999  1.000    1.000  1.000  1.000    1.000  1.000  1.000
    CLRT  0.744  0.998  1.000    1.000  1.000  1.000    1.000  1.000  1.000

c. Double Exponential
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.162  0.146  0.201    0.255  0.200  0.239    0.282  0.253  0.265
    LRT   0.509  0.701  0.924    0.713  0.766  0.932    0.854  0.878  0.973
    CLRT  0.141  0.262  0.535    0.254  0.353  0.537    0.378  0.430  0.712
10  RT    0.216  0.332  0.455    0.385  0.441  0.561    0.508  0.561  0.642
    LRT   0.500  0.834  0.996    0.812  0.914  0.995    0.980  0.986  1.000
    CLRT  0.347  0.696  0.988    0.674  0.809  0.984    0.929  0.955  0.999
25  RT    0.471  0.707  0.907    0.804  0.927  0.939    0.907  0.976  0.980
    LRT   0.806  0.996  1.000    0.992  1.000  1.000    1.000  1.000  1.000
    CLRT  0.766  0.994  1.000    0.991  1.000  1.000    1.000  1.000  1.000

Table 6.1.5. Simulated Power vs. Type H for the Test of Sphericity (p = 5)

a. Normal
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.148  0.161  0.205    0.166  0.216  0.240    0.217  0.222  0.240
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.110  0.207  0.465    0.140  0.256  0.479    0.185  0.331  0.543
    LRT   0.430  0.874  0.999    0.379  0.855  0.996    0.504  0.844  1.000
    CLRT  0.122  0.615  0.980    0.086  0.553  0.973    0.137  0.549  0.980
25  RT    0.105  0.524  0.952    0.146  0.664  0.957    0.331  0.823  0.989
    LRT   0.416  0.997  1.000    0.211  0.997  1.000    0.631  0.995  1.000
    CLRT  0.290  0.989  1.000    0.116  0.987  1.000    0.502  0.981  1.000

b.
Uniform
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.153  0.162  0.287    0.173  0.201  0.315    0.222  0.233  0.311
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.100  0.318  0.681    0.138  0.386  0.667    0.211  0.479  0.722
    LRT   0.346  0.846  0.998    0.280  0.819  1.000    0.417  0.856  1.000
    CLRT  0.098  0.581  0.989    0.055  0.517  0.986    0.099  0.522  0.991
25  RT    0.130  0.865  0.996    0.172  0.940  0.999    0.411  0.987  1.000
    LRT   0.307  0.993  1.000    0.119  0.995  1.000    0.516  0.998  1.000
    CLRT  0.186  0.986  1.000    0.063  0.981  1.000    0.367  0.989  1.000

c. Double Exponential
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.185  0.178  0.225    0.234  0.180  0.245    0.263  0.241  0.271
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.096  0.154  0.333    0.147  0.248  0.362    0.189  0.280  0.399
    LRT   0.552  0.908  0.998    0.525  0.892  0.998    0.638  0.907  1.000
    CLRT  0.205  0.699  0.990    0.185  0.643  0.987    0.261  0.648  0.997
25  RT    0.081  0.309  0.753    0.133  0.437  0.749    0.250  0.562  0.793
    LRT   0.587  0.997  1.000    0.468  0.989  1.000    0.770  0.999  1.000
    CLRT  0.485  0.994  1.000    0.333  0.983  1.000    0.658  0.997  1.000

Table 6.1.6. Simulated Power vs. Type H for the Test of Sphericity (p = 10)

a. Normal
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.153  0.185  0.220    0.223  0.231  0.244    0.349  0.331  0.345
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.168  0.213  0.315    0.175  0.190  0.312    0.211  0.263  0.375
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.394  0.623  0.863    0.278  0.484  0.796    0.335  0.578  0.803
    LRT   1.000  1.000  1.000    1.000  1.000  1.000    0.992  0.999  1.000
    CLRT  1.000  1.000  1.000    0.999  0.999  1.000    0.937  0.997  1.000

b.
Uniform
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.154  0.181  0.257    0.214  0.249  0.258    0.304  0.311  0.310
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.212  0.321  0.523    0.175  0.302  0.447    0.212  0.324  0.509
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.739  0.911  0.987    0.580  0.860  0.966    0.551  0.899  0.992
    LRT   1.000  1.000  1.000    1.000  1.000  1.000    0.996  1.000  1.000
    CLRT  1.000  1.000  1.000    0.999  1.000  1.000    0.929  0.995  1.000

c. Double Exponential
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.212  0.241  0.282    0.327  0.301  0.295    0.393  0.395  0.386
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.135  0.179  0.244    0.135  0.175  0.242    0.216  0.261  0.268
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.202  0.309  0.558    0.186  0.269  0.454    0.242  0.344  0.494
    LRT   1.000  1.000  1.000    1.000  1.000  1.000    0.999  1.000  1.000
    CLRT  1.000  1.000  1.000    1.000  1.000  1.000    0.978  0.997  1.000

Table 6.1.7 contains the simulated power for the test of sphericity versus the serial correlation structure in (6.0.3). Since the serial correlation structure, like the sphericity structure, has equal variances, only one value was simulated for the marginal variances. Data were generated from multivariate distributions with marginal variances of 1 and serial correlations of ρ = 0.3, 0.6, or 0.9. As expected, the power of the CLRT increases as ρ increases, but there does not appear to be any relationship between the power of the RT and the value of ρ. On the other hand, the power of both tests increases as p and/or n increases. Overall, the power of the RT is very poor, beating the power of the CLRT in only two cases (normal, n = 10, p = 5, ρ = 0.3 and uniform, n = 5, p = 3, ρ = 0.3), and in both cases the power is much too low (0.122 and 0.076, respectively).
Again, there appears to be a distributional effect on the power of the RT, but the pattern is reversed from that of the previous tables (Tables 6.1.4 through 6.1.6): this time the greatest power results from double exponential data and the lowest from uniform data in most cases.

Table 6.1.7. Simulated Power vs. Serial Correlation for the Test of Sphericity (σ² = 1)

a. Normal
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.075  0.065  0.062    0.199  0.158  0.117    0.554  0.528  0.289
    LRT   0.378  0.578  0.919    NA     NA     NA       NA     NA     NA
    CLRT  0.082  0.150  0.659    NA     NA     NA       NA     NA     NA
10  RT    0.076  0.053  0.044    0.122  0.147  0.107    0.320  0.416  0.180
    LRT   0.243  0.674  0.996    0.432  0.882  1.000    NA     NA     NA
    CLRT  0.111  0.476  0.993    0.121  0.618  0.999    NA     NA     NA
25  RT    0.078  0.054  0.071    0.142  0.118  0.088    0.312  0.533  0.121
    LRT   0.373  0.970  1.000    0.535  0.996  1.000    0.834  1.000  1.000
    CLRT  0.300  0.956  1.000    0.393  0.990  1.000    0.460  1.000  1.000

b. Uniform
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.076  0.069  0.061    0.175  0.157  0.113    0.495  0.460  0.251
    LRT   0.327  0.554  0.927    NA     NA     NA       NA     NA     NA
    CLRT  0.069  0.175  0.647    NA     NA     NA       NA     NA     NA
10  RT    0.064  0.042  0.056    0.103  0.116  0.082    0.284  0.401  0.141
    LRT   0.198  0.626  0.998    0.382  0.858  1.000    NA     NA     NA
    CLRT  0.099  0.441  0.993    0.106  0.569  0.997    NA     NA     NA
25  RT    0.072  0.043  0.054    0.135  0.127  0.062    0.343  0.575  0.100
    LRT   0.309  0.960  1.000    0.409  1.000  1.000    0.772  1.000  1.000
    CLRT  0.250  0.950  1.000    0.298  0.996  1.000    0.370  1.000  1.000

c.
Double Exponential
                p=3                      p=5                      p=10
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.108  0.082  0.068    0.218  0.183  0.159    0.591  0.607  0.324
    LRT   0.469  0.614  0.941    NA     NA     NA       NA     NA     NA
    CLRT  0.128  0.214  0.682    NA     NA     NA       NA     NA     NA
10  RT    0.068  0.070  0.088    0.142  0.138  0.094    0.324  0.460  0.211
    LRT   0.332  0.732  0.998    0.600  0.916  0.999    NA     NA     NA
    CLRT  0.180  0.542  0.994    0.272  0.680  0.999    NA     NA     NA
25  RT    0.063  0.048  0.069    0.141  0.101  0.088    0.316  0.467  0.146
    LRT   0.492  0.981  1.000    0.715  0.998  1.000    0.930  1.000  1.000
    CLRT  0.421  0.967  1.000    0.576  0.997  1.000    0.697  1.000  1.000

6.2 TEST OF COMPOUND SYMMETRY

The simulated type I error rates for the test of compound symmetry are displayed in Tables 6.2.1 through 6.2.3. Varying σ² and/or ρ does not seem to have much, if any, effect on the simulated type I error rates of any of the tests of compound symmetry. For normally distributed data the CLRT clearly performs better than either of the other tests with respect to the simulated type I error rates; however, for uniform data this test is too conservative, and for double exponential data it produces rates that are much too large. This is the same pattern seen in the simulated type I error rates for the test of sphericity (Table 6.1.1). There also appears to be a distributional effect on the simulated type I error rates of the RT for compound symmetry: these rates are generally highest for double exponential data and lowest for uniform data. Unlike those of the LRTs, however, the simulated type I error rates of the RT appear to converge to the nominal rate as n increases. Just as for the test of sphericity, the RT exists in cases for which the LRTs do not, that is, when p ≥ n; however, the simulated type I error rates in these situations are much too large, especially when p = 10 and n = 5 (Table 6.2.3).
The simulated type I error rates of the LRTs seem to increase as p increases, but this pattern is not seen in the type I error rates of the RT for compound symmetry as it was for the RT for sphericity. For normally distributed data, the CLRT is clearly the best choice with respect to type I error rates. In non-normal situations, however, the RT performs very well, especially as n increases. Since the LRTs are not level-α tests for non-normally distributed data and the RT is not a level-α test when p ≥ n, these tests will be largely excluded from the power discussions to follow in those situations, although they were included in the simulations for completeness.

Table 6.2.1. Simulated Type I Error Rates for the Test of Compound Symmetry (p = 3)

a. Normal
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.080  0.085  0.089    0.108  0.078  0.093    0.079  0.103  0.091
    LRT   0.342  0.320  0.351    0.374  0.313  0.338    0.311  0.345  0.338
    CLRT  0.059* 0.062* 0.081    0.058* 0.065  0.056*   0.053* 0.080  0.065
10  RT    0.075  0.072  0.055*   0.070  0.061* 0.068    0.066  0.074  0.057*
    LRT   0.136  0.148  0.108    0.127  0.125  0.132    0.124  0.126  0.128
    CLRT  0.043* 0.063* 0.037    0.059* 0.049* 0.045*   0.045* 0.057* 0.050*
25  RT    0.062* 0.067  0.048*   0.056* 0.054* 0.063*   0.056* 0.054* 0.058*
    LRT   0.069  0.067  0.065    0.075  0.082  0.077    0.073  0.058* 0.061*
    CLRT  0.048* 0.049* 0.041*   0.045* 0.051* 0.052*   0.045* 0.039* 0.044*

b.
Uniform
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.100  0.089  0.092    0.078  0.079  0.080    0.096  0.112  0.076
    LRT   0.262  0.304  0.297    0.254  0.254  0.300    0.272  0.272  0.276
    CLRT  0.053* 0.038* 0.058*   0.042* 0.041* 0.060*   0.053* 0.050* 0.047*
10  RT    0.068  0.074  0.057*   0.068  0.063* 0.060*   0.057* 0.062* 0.054*
    LRT   0.078  0.095  0.099    0.076  0.090  0.114    0.073  0.084  0.093
    CLRT  0.026  0.037* 0.043*   0.033  0.030  0.049*   0.024  0.027  0.030
25  RT    0.050* 0.058* 0.040*   0.037* 0.052* 0.050*   0.054* 0.056* 0.049*
    LRT   0.029  0.045* 0.060*   0.028  0.045* 0.050*   0.032  0.041* 0.049*
    CLRT  0.017  0.030  0.047*   0.014  0.032  0.040*   0.024  0.032  0.035

c. Double Exponential
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.125  0.113  0.103    0.109  0.098  0.113    0.111  0.104  0.110
    LRT   0.396  0.434  0.387    0.431  0.401  0.420    0.437  0.430  0.425
    CLRT  0.095  0.113  0.093    0.085  0.074  0.094    0.102  0.091  0.096
10  RT    0.074  0.079  0.091    0.068  0.067  0.091    0.076  0.079  0.095
    LRT   0.231  0.266  0.287    0.247  0.270  0.274    0.256  0.277  0.285
    CLRT  0.112  0.132  0.159    0.121  0.140  0.146    0.137  0.150  0.147
25  RT    0.057* 0.059* 0.053*   0.044* 0.058* 0.056*   0.067  0.065  0.061*
    LRT   0.227  0.237  0.243    0.200  0.244  0.254    0.244  0.233  0.252
    CLRT  0.183  0.189  0.205    0.156  0.198  0.200    0.192  0.190  0.215

*Value is contained within 0.05 ± 1.96·√((0.05)(0.95)/1000)

Table 6.2.2. Simulated Type I Error Rates for the Test of Compound Symmetry (p = 5)

a.
Normal
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.159  0.103  0.086    0.155  0.115  0.109    0.131  0.140  0.109
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.078  0.081  0.053*   0.085  0.080  0.074    0.088  0.085  0.076
    LRT   0.289  0.298  0.284    0.299  0.303  0.309    0.296  0.293  0.300
    CLRT  0.057* 0.064  0.056*   0.053* 0.068  0.076    0.069  0.057* 0.065
25  RT    0.060* 0.070  0.046*   0.078  0.048* 0.063*   0.071  0.060* 0.061*
    LRT   0.104  0.097  0.102    0.115  0.093  0.107    0.111  0.107  0.089
    CLRT  0.055* 0.050* 0.044*   0.053* 0.050* 0.059*   0.055* 0.047* 0.046*

b. Uniform
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.137  0.120  0.100    0.139  0.129  0.088    0.126  0.114  0.119
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.069  0.075  0.059*   0.083  0.074  0.068    0.090  0.067  0.089
    LRT   0.202  0.273  0.360    0.218  0.264  0.334    0.231  0.267  0.346
    CLRT  0.038* 0.049* 0.094    0.031  0.042* 0.085    0.038* 0.052* 0.096
25  RT    0.048* 0.058* 0.050*   0.066  0.061* 0.047*   0.065  0.059* 0.054*
    LRT   0.054* 0.076  0.150    0.049* 0.073  0.127    0.045* 0.071  0.131
    CLRT  0.023  0.037* 0.081    0.024  0.036  0.069    0.021  0.035  0.072

c. Double Exponential
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.182  0.163  0.161    0.180  0.163  0.166    0.189  0.166  0.155
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.104  0.090  0.082    0.094  0.087  0.093    0.085  0.095  0.091
    LRT   0.482  0.504  0.517    0.462  0.493  0.498    0.454  0.483  0.530
    CLRT  0.155  0.174  0.197    0.174  0.170  0.195    0.140  0.185  0.190
25  RT    0.072  0.046* 0.062*   0.073  0.053* 0.068    0.055* 0.080  0.060*
    LRT   0.310  0.343  0.398    0.343  0.346  0.393    0.320  0.357  0.386
    CLRT  0.209  0.244  0.269    0.220  0.248  0.273    0.199  0.238  0.289

*Value is contained within 0.05 ± 1.96·√((0.05)(0.95)/1000)

Table 6.2.3. Simulated Type I Error Rates for the Test of Compound Symmetry (p = 10)

a.
Normal
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.344  0.217  0.186    0.338  0.225  0.143    0.325  0.208  0.171
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.138  0.082  0.084    0.129  0.126  0.080    0.122  0.099  0.075
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.096  0.053* 0.072    0.067  0.065  0.054*   0.067  0.060* 0.055*
    LRT   0.333  0.325  0.312    0.326  0.334  0.329    0.315  0.299  0.311
    CLRT  0.057* 0.067  0.050*   0.057* 0.076  0.063*   0.060* 0.059* 0.066

b. Uniform
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.334  0.211  0.152    0.298  0.226  0.129    0.314  0.178  0.136
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.115  0.081  0.059*   0.111  0.082  0.072    0.132  0.074  0.052*
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.066  0.057* 0.065    0.069  0.053* 0.080    0.072  0.053* 0.051*
    LRT   0.238  0.322  0.571    0.230  0.320  0.585    0.243  0.320  0.564
    CLRT  0.035  0.087  0.194    0.033  0.072  0.198    0.029  0.070  0.188

c. Double Exponential
                σ²=1                     σ²=9                     σ²=25
 n        ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9    ρ=0.3  ρ=0.6  ρ=0.9
 5  RT    0.404  0.311  0.285    0.408  0.295  0.271    0.405  0.285  0.267
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.189  0.124  0.126    0.165  0.133  0.123    0.232  0.140  0.142
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.080  0.067  0.061*   0.070  0.076  0.065    0.095  0.071  0.090
    LRT   0.620  0.704  0.748    0.625  0.727  0.788    0.645  0.697  0.760
    CLRT  0.272  0.350  0.417    0.268  0.357  0.454    0.275  0.310  0.432

*Value is contained within 0.05 ± 1.96·√((0.05)(0.95)/1000)

The simulated power of the test of compound symmetry versus the type H structure is shown in Tables 6.2.4 through 6.2.6. For these simulations, data were generated from distributions having the type H covariance structure shown in (6.0.2). See Appendix A.2 for a description of how and why the values of d and λ were chosen.
As expected, the power of all three tests increases as d and/or n increases, but there does not seem to be much effect, if any, on the power of the tests as λ increases. It is difficult to determine the effect of increasing p on the power of the tests, since very different parameter values were simulated for the different values of p (see Appendix A.2). However, there are two cases for which the parameter values are equal: d = 0.1 and λ = 1 with p = 5 (Table 6.2.5) or p = 10 (Table 6.2.6). From these two cases, it appears that the power of both the RT and CLRT increases as p increases. For normally distributed data there are many cases in which the RT is more powerful than the CLRT; specifically, in 25 of the 54 total cases the power of the RT is greater than or equal to that of the CLRT, typically when d is small and n is close to p. Overall, the power of the RT exceeds 0.75 in 54 of the 81 cases when n = 25. Again, there appears to be a slight distributional effect on the power of the RT, with the greatest power usually resulting from uniformly distributed data and the lowest from double exponential data. This relationship is the opposite of that seen in Tables 6.2.1 through 6.2.3.

Table 6.2.4. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 3)

a. Normal
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.180  0.232  0.343    0.253  0.289  0.367    0.312  0.336  0.390
    LRT   0.471  0.614  0.860    0.690  0.766  0.915    0.864  0.878  0.977
    CLRT  0.119  0.170  0.359    0.200  0.253  0.449    0.372  0.382  0.603
10  RT    0.269  0.472  0.682    0.424  0.577  0.704    0.549  0.649  0.734
    LRT   0.469  0.754  0.988    0.848  0.935  0.996    0.991  0.993  1.000
    CLRT  0.266  0.565  0.951    0.675  0.817  0.987    0.956  0.964  0.997
25  RT    0.699  0.969  1.000    0.980  0.998  1.000    0.999  1.000  1.000
    LRT   0.829  0.996  1.000    1.000  1.000  1.000    1.000  1.000  1.000
    CLRT  0.770  0.991  1.000    1.000  1.000  1.000    1.000  1.000  1.000

b.
Uniform
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.204  0.264  0.380    0.277  0.333  0.405    0.327  0.388  0.424
    LRT   0.424  0.624  0.874    0.646  0.746  0.941    0.868  0.894  0.993
    CLRT  0.082  0.155  0.367    0.178  0.238  0.449    0.351  0.371  0.632
10  RT    0.330  0.657  0.876    0.592  0.758  0.875    0.748  0.843  0.899
    LRT   0.394  0.786  0.994    0.846  0.962  1.000    0.994  0.996  1.000
    CLRT  0.202  0.537  0.958    0.671  0.830  0.999    0.960  0.984  1.000
25  RT    0.896  0.997  1.000    1.000  1.000  1.000    1.000  1.000  1.000
    LRT   0.851  0.998  1.000    1.000  1.000  1.000    1.000  1.000  1.000
    CLRT  0.789  0.993  1.000    1.000  1.000  1.000    1.000  1.000  1.000

c. Double Exponential
                λ=2                      λ=3                      λ=4
 n        d=1    d=2    d=3      d=2    d=3    d=4      d=3    d=4    d=5
 5  RT    0.195  0.267  0.351    0.263  0.300  0.375    0.321  0.359  0.404
    LRT   0.559  0.671  0.878    0.730  0.793  0.938    0.889  0.888  0.977
    CLRT  0.166  0.228  0.427    0.274  0.322  0.531    0.446  0.455  0.707
10  RT    0.244  0.402  0.614    0.409  0.491  0.627    0.523  0.567  0.660
    LRT   0.561  0.798  0.993    0.850  0.922  0.999    0.987  0.991  0.999
    CLRT  0.389  0.632  0.960    0.707  0.813  0.990    0.942  0.962  0.999
25  RT    0.468  0.823  0.972    0.811  0.924  0.981    0.946  0.979  0.982
    LRT   0.832  0.985  1.000    0.993  1.000  1.000    1.000  1.000  1.000
    CLRT  0.781  0.977  1.000    0.989  1.000  1.000    1.000  1.000  1.000

Table 6.2.5. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 5)

a. Normal
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.179  0.222  0.360    0.204  0.233  0.363    0.223  0.288  0.382
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.105  0.284  0.620    0.109  0.335  0.629    0.174  0.409  0.634
    LRT   0.322  0.527  0.977    0.334  0.602  0.987    0.480  0.691  0.995
    CLRT  0.079  0.181  0.784    0.085  0.230  0.826    0.149  0.294  0.891
25  RT    0.111  0.676  0.996    0.128  0.804  0.995    0.288  0.883  0.995
    LRT   0.155  0.709  1.000    0.201  0.865  1.000    0.568  0.954  1.000
    CLRT  0.083  0.563  1.000    0.103  0.753  1.000    0.417  0.905  1.000

b.
Uniform
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.150  0.231  0.412    0.166  0.271  0.416    0.201  0.302  0.422
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.112  0.391  0.796    0.121  0.469  0.800    0.201  0.543  0.802
    LRT   0.225  0.483  0.991    0.253  0.554  0.994    0.415  0.678  0.999
    CLRT  0.043  0.135  0.839    0.048  0.176  0.878    0.099  0.241  0.953
25  RT    0.158  0.879  0.999    0.134  0.956  1.000    0.416  0.976  1.000
    LRT   0.098  0.659  1.000    0.122  0.843  1.000    0.487  0.959  1.000
    CLRT  0.042  0.499  1.000    0.067  0.726  1.000    0.334  0.894  1.000

c. Double Exponential
                λ=1                      λ=1.25                   λ=1.5
 n        d=0.1  d=0.4  d=0.8    d=0.1  d=0.5  d=0.9    d=0.2  d=0.6  d=1
 5  RT    0.205  0.239  0.372    0.219  0.268  0.372    0.258  0.295  0.381
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.133  0.268  0.498    0.140  0.314  0.511    0.178  0.370  0.531
    LRT   0.493  0.686  0.994    0.505  0.753  0.996    0.613  0.799  0.999
    CLRT  0.148  0.291  0.892    0.164  0.367  0.917    0.250  0.448  0.963
25  RT    0.126  0.481  0.914    0.119  0.605  0.916    0.236  0.704  0.928
    LRT   0.374  0.820  1.000    0.415  0.914  1.000    0.730  0.953  1.000
    CLRT  0.255  0.710  1.000    0.291  0.842  1.000    0.615  0.929  1.000

Table 6.2.6. Simulated Power vs. Type H for the Test of Compound Symmetry (p = 10)

a. Normal
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.296  0.330  0.389    0.358  0.367  0.388    0.403  0.418  0.437
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.277  0.369  0.518    0.269  0.358  0.492    0.286  0.386  0.507
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.550  0.810  0.965    0.453  0.748  0.947    0.491  0.776  0.960
    LRT   0.699  0.891  1.000    0.639  0.836  0.999    0.672  0.881  1.000
    CLRT  0.288  0.533  0.998    0.246  0.462  0.950    0.257  0.514  0.974

b.
Uniform
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.279  0.329  0.400    0.323  0.343  0.401    0.384  0.400  0.429
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.365  0.517  0.725    0.320  0.483  0.677    0.327  0.489  0.685
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.832  0.964  1.000    0.738  0.929  0.999    0.684  0.929  0.997
    LRT   0.747  0.922  1.000    0.603  0.860  0.997    0.550  0.863  0.998
    CLRT  0.350  0.621  0.995    0.197  0.474  0.952    0.176  0.463  0.971

c. Double Exponential
                λ=0.5                    λ=0.75                   λ=1
 n        d=0.1  d=0.13 d=0.17   d=0.1  d=0.14 d=0.19   d=0.1  d=0.15 d=0.21
 5  RT    0.387  0.407  0.462    0.399  0.434  0.488    0.478  0.476  0.514
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
10  RT    0.257  0.337  0.443    0.261  0.334  0.436    0.276  0.366  0.469
    LRT   NA     NA     NA       NA     NA     NA       NA     NA     NA
    CLRT  NA     NA     NA       NA     NA     NA       NA     NA     NA
25  RT    0.405  0.570  0.835    0.379  0.551  0.807    0.423  0.625  0.845
    LRT   0.904  0.962  1.000    0.855  0.945  1.000    0.850  0.952  1.000
    CLRT  0.643  0.818  1.000    0.551  0.741  0.984    0.536  0.758  0.996

Table 6.2.7 displays the simulated power of the test of compound symmetry versus the serial correlation structure shown in (6.0.3). Data were generated from distributions having the serial correlation covariance structure with σ² = 1 and ρ = 0.3, 0.6, or 0.9. Only one value of σ² was simulated, since both the compound symmetry and serial correlation structures have equal variances. As expected, the power of the CLRT increases as ρ increases. The power of the RT, however, is greatest when ρ = 0.6 in all but three cases (uniform, n = 5, p = 3; double exponential, n = 5, p = 3; and double exponential, n = 5, p = 10). The power of both tests increases as p increases; this is anticipated, since as p increases there are more observations with which to estimate ρ. For normally distributed data, the RT is more powerful than the CLRT in seven of the 27 cases, most of these (five of the seven) when n = 5 or 10 and p = 3.
Even though the RT is more powerful in these situations, the power is still not very high, reaching only 0.398 in the most powerful case (n=25, p=10, ρ = 0.3). In fact, neither the CLRT nor the RT is very powerful except when n=25, p=10, and ρ = 0.6 or 0.9.

Table 6.2.7. Simulated Power vs. Serial Correlation for the Test of Compound Symmetry (σ² = 1)

a. Normal
                 p = 3                    p = 5                     p = 10
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  RT    0.099  0.107  0.106     0.181  0.217  0.199      0.546  0.559  0.368
     LRT   0.349  0.371  0.415     NA     NA     NA         NA     NA     NA
     CLRT  0.055  0.074  0.085     NA     NA     NA         NA     NA     NA
 10  RT    0.085  0.100  0.093     0.163  0.189  0.151      0.295  0.520  0.351
     LRT   0.146  0.213  0.290     0.394  0.605  0.805      NA     NA     NA
     CLRT  0.055  0.085  0.145     0.114  0.221  0.462      NA     NA     NA
 25  RT    0.080  0.109  0.078     0.155  0.294  0.170      0.398  0.828  0.580
     LRT   0.142  0.298  0.505     0.278  0.808  0.983      0.787  0.996  1.000
     CLRT  0.094  0.227  0.442     0.176  0.685  0.969      0.361  0.987  1.000

b. Uniform
                 p = 3                    p = 5                     p = 10
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  RT    0.116  0.101  0.107     0.167  0.185  0.161      0.507  0.512  0.373
     LRT   0.279  0.315  0.398     NA     NA     NA         NA     NA     NA
     CLRT  0.054  0.051  0.085     NA     NA     NA         NA     NA     NA
 10  RT    0.071  0.104  0.086     0.136  0.206  0.122      0.270  0.507  0.417
     LRT   0.083  0.163  0.253     0.293  0.535  0.784      NA     NA     NA
     CLRT  0.032  0.065  0.118     0.053  0.163  0.476      NA     NA     NA
 25  RT    0.088  0.143  0.097     0.188  0.426  0.215      0.392  0.907  0.809
     LRT   0.073  0.239  0.463     0.210  0.763  0.973      0.643  0.999  1.000
     CLRT  0.049  0.187  0.382     0.119  0.641  0.939      0.258  0.985  1.000

c.
Double Exponential
                 p = 3                    p = 5                     p = 10
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  RT    0.132  0.126  0.120     0.178  0.263  0.226      0.594  0.589  0.466
     LRT   0.430  0.462  0.446     NA     NA     NA         NA     NA     NA
     CLRT  0.104  0.098  0.117     NA     NA     NA         NA     NA     NA
 10  RT    0.082  0.107  0.092     0.156  0.160  0.180      0.329  0.523  0.309
     LRT   0.298  0.347  0.464     0.515  0.717  0.890      NA     NA     NA
     CLRT  0.165  0.192  0.267     0.207  0.353  0.612      NA     NA     NA
 25  RT    0.081  0.089  0.073     0.151  0.230  0.144      0.357  0.736  0.424
     LRT   0.311  0.465  0.670     0.543  0.887  0.992      0.895  1.000  1.000
     CLRT  0.240  0.385  0.597     0.404  0.808  0.986      0.594  0.997  1.000

6.3 TEST OF TYPE H

The simulated type I error rates of the test of type H are shown in Tables 6.3.1 through 6.3.3. Recall from Section 2.3 that the data transformation required for the test of type H results in an n×(p−1) data matrix. Therefore, the numbers of permutations required to perform a permutation test (PT) for n=5 and n=10 with p=3 are (2!)^5 = 32 and (2!)^10 = 1024, respectively. Since neither situation requires a very large number of permutations, PTs rather than RTs were performed in these cases. The simulated type I error rates of the PT are very low for n=5 and p=3; because of the small number of possible permutations, the only attainable p-values less than 0.05 are 0/32 = 0 and 1/32 = 0.03125, so lower type I error rates are expected in these cases. The simulated type I error rates of the CLRT and PT/RT appear to be unaffected by increases in either d or D. However, they appear to increase as p approaches n and, in the case of the PT/RT, as p exceeds n. Just as with previous tests, the CLRT performs very well with respect to type I error rates for normally distributed data, but it is too conservative for uniformly distributed data, and its simulated type I error rates are too high for double exponential data. The same pattern is seen with the PT/RT, but unlike the CLRT, the type I error rates of the PT/RT appear to converge to 0.05 as n increases.
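The permutation-count arithmetic above is easy to verify. The following is a minimal sketch, not code from the dissertation; the helper name is illustrative. Permuting the p−1 transformed columns independently within each of the n rows gives ((p−1)!)^n arrangements, and with K equally likely arrangements the attainable p-values are the multiples of 1/K.

```python
from math import factorial

def n_within_row_permutations(n, p):
    # Each of the n rows can arrange its p-1 transformed columns
    # in (p-1)! ways, independently of the other rows.
    return factorial(p - 1) ** n

# (2!)^5 = 32 and (2!)^10 = 1024, as stated in the text
print(n_within_row_permutations(5, 3))    # 32
print(n_within_row_permutations(10, 3))   # 1024

# With 32 equally likely permutations the attainable p-values are k/32;
# the only ones below 0.05 are 0/32 = 0 and 1/32 = 0.03125.
attainable = [k / 32 for k in range(33)]
print([p for p in attainable if p < 0.05])  # [0.0, 0.03125]
```

This also explains why the simulated type I error rates of the PT are so low at n=5, p=3: a test at nominal level 0.05 can only reject at the two attainable p-values 0 and 0.03125.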
Also similar to previous tests, the RT exists in cases where the CLRT does not, specifically when p ≥ n, but the type I error rates of the RT are much too high in these cases for it to be of any practical use. Because the CLRT and the RT fail to maintain the nominal type I error rate in these cases, both tests are excluded from the power discussions that follow, although the simulation results are included in the tables for completeness.

Table 6.3.1. Simulated Type I Error Rates for the Test of Type H (p = 3)

a. Normal
                 D = 2                    D = 3                     D = 4
  n        d=1    d=2    d=3       d=2    d=3    d=4        d=3    d=4    d=5
  5  †PT   0.015  0.014  0.014     0.012  0.010  0.020      0.011  0.015  0.012
     LRT   0.167  0.156  0.169     0.171  0.149  0.178      0.158  0.184  0.181
     CLRT  0.055* 0.053* 0.045*    0.054* 0.043* 0.057*     0.047* 0.062* 0.066
 10  †PT   0.051* 0.057* 0.059*    0.069  0.066  0.062*     0.060* 0.059* 0.056*
     LRT   0.077  0.091  0.099     0.097  0.076  0.094      0.080  0.082  0.082
     CLRT  0.040* 0.042* 0.053*    0.062* 0.033  0.053*     0.039* 0.043* 0.051*
 25  RT    0.046* 0.045* 0.046*    0.044* 0.047* 0.045*     0.049* 0.048* 0.050*
     LRT   0.067  0.066  0.067     0.068  0.067  0.064      0.064  0.063* 0.064
     CLRT  0.056* 0.057* 0.055*    0.055* 0.056* 0.052*     0.054* 0.053* 0.053*

b. Uniform
                 D = 2                    D = 3                     D = 4
  n        d=1    d=2    d=3       d=2    d=3    d=4        d=3    d=4    d=5
  5  †PT   0.012  0.013  0.016     0.018  0.015  0.011      0.012  0.020  0.014
     LRT   0.128  0.163  0.160     0.145  0.150  0.135      0.145  0.127  0.130
     CLRT  0.037* 0.049* 0.051*    0.044* 0.054* 0.042*     0.041* 0.036  0.043*
 10  †PT   0.060* 0.065  0.062*    0.044* 0.055* 0.063*     0.068  0.052* 0.050*
     LRT   0.057* 0.073  0.057*    0.050* 0.056* 0.072      0.053* 0.046* 0.074
     CLRT  0.028  0.045* 0.032     0.022  0.026  0.039*     0.027  0.026  0.041*
 25  RT    0.057* 0.053* 0.056*    0.056* 0.051* 0.046*     0.053* 0.064  0.049*
     LRT   0.046* 0.051* 0.040*    0.045* 0.051* 0.050*     0.042* 0.032  0.044*
     CLRT  0.038  0.035  0.027     0.036  0.043* 0.038*     0.034  0.024  0.035

c.
Double Exponential
                 D = 2                    D = 3                     D = 4
  n        d=1    d=2    d=3       d=2    d=3    d=4        d=3    d=4    d=5
  5  †PT   0.022  0.015  0.019     0.011  0.017  0.017      0.017  0.026  0.019
     LRT   0.214  0.233  0.213     0.241  0.215  0.224      0.185  0.224  0.238
     CLRT  0.080  0.060* 0.073     0.084  0.069  0.082      0.064  0.083  0.067
 10  †PT   0.066  0.084  0.081     0.067  0.077  0.070      0.072  0.065  0.075
     LRT   0.155  0.151  0.175     0.157  0.160  0.150      0.167  0.177  0.181
     CLRT  0.090  0.084  0.109     0.105  0.101  0.095      0.096  0.106  0.111
 25  RT    0.059* 0.046* 0.060*    0.062* 0.051* 0.062*     0.061* 0.063* 0.053*
     LRT   0.126  0.154  0.176     0.137  0.142  0.165      0.142  0.172  0.157
     CLRT  0.109  0.123  0.146     0.114  0.116  0.142      0.121  0.149  0.124

* Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)
† Permutation tests rather than randomization tests were run for n = 5, 10 and p = 3

Table 6.3.2. Simulated Type I Error Rates for the Test of Type H (p = 5)

a. Normal
                 D = 1                    D = 1.25                  D = 1.5
  n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9      d=0.2  d=0.6  d=1
  5  RT    0.150  0.144  0.156     0.150  0.138  0.147      0.143  0.150  0.150
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.081  0.084  0.089     0.087  0.089  0.095      0.089  0.088  0.096
     LRT   0.199  0.208  0.190     0.209  0.207  0.190      0.191  0.204  0.192
     CLRT  0.060* 0.063* 0.062*    0.066  0.067  0.060*     0.062* 0.064  0.060*
 25  RT    0.067  0.068  0.065     0.073  0.071  0.070      0.066  0.077  0.062*
     LRT   0.081  0.093  0.097     0.087  0.096  0.097      0.096  0.098  0.095
     CLRT  0.053* 0.051* 0.059*    0.053* 0.054* 0.058*     0.061* 0.057* 0.059*

b. Uniform
                 D = 1                    D = 1.25                  D = 1.5
  n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9      d=0.2  d=0.6  d=1
  5  RT    0.109  0.124  0.137     0.131  0.131  0.130      0.121  0.134  0.128
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.074  0.070  0.082     0.082  0.089  0.083      0.087  0.077  0.081
     LRT   0.151  0.193  0.161     0.138  0.174  0.154      0.154  0.152  0.151
     CLRT  0.039* 0.058* 0.048*    0.030  0.049* 0.048*     0.036  0.034  0.043*
 25  RT    0.076  0.062* 0.048*    0.057* 0.059* 0.051*     0.058* 0.061* 0.050*
     LRT   0.067  0.084  0.075     0.037* 0.081  0.072      0.040* 0.054* 0.067
     CLRT  0.043* 0.052* 0.043*    0.019  0.048* 0.039*     0.022  0.028  0.038*

c.
Double Exponential
                 D = 1                    D = 1.25                  D = 1.5
  n        d=0.1  d=0.4  d=0.8     d=0.1  d=0.5  d=0.9      d=0.2  d=0.6  d=1
  5  RT    0.161  0.168  0.178     0.156  0.173  0.170      0.164  0.160  0.171
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.082  0.087  0.108     0.076  0.093  0.108      0.085  0.093  0.108
     LRT   0.286  0.288  0.343     0.330  0.313  0.333      0.300  0.322  0.335
     CLRT  0.114  0.117  0.139     0.108  0.123  0.136      0.105  0.125  0.130
 25  RT    0.066  0.062* 0.072     0.081  0.058* 0.071      0.068  0.069  0.074
     LRT   0.206  0.226  0.300     0.257  0.225  0.297      0.236  0.240  0.297
     CLRT  0.152  0.158  0.233     0.179  0.169  0.224      0.182  0.188  0.229

* Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

Table 6.3.3. Simulated Type I Error Rates for the Test of Type H (p = 10)

a. Normal
                 D = 0.5                  D = 0.75                  D = 1
  n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19     d=0.1  d=0.15 d=0.21
  5  RT    0.485  0.443  0.475     0.474  0.492  0.481      0.443  0.467  0.482
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.185  0.170  0.180     0.181  0.194  0.161      0.185  0.184  0.183
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 25  RT    0.097  0.092  0.092     0.095  0.091  0.100      0.089  0.094  0.102
     LRT   0.273  0.280  0.279     0.272  0.278  0.277      0.277  0.278  0.275
     CLRT  0.065  0.067  0.063*    0.070  0.065  0.065      0.065  0.066  0.060*

b. Uniform
                 D = 0.5                  D = 0.75                  D = 1
  n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19     d=0.1  d=0.15 d=0.21
  5  RT    0.488  0.503  0.498     0.452  0.465  0.466      0.430  0.446  0.437
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.201  0.204  0.195     0.187  0.194  0.183      0.170  0.182  0.166
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 25  RT    0.078  0.072  0.084     0.077  0.071  0.078      0.074  0.069  0.075
     LRT   0.379  0.381  0.364     0.284  0.308  0.305      0.198  0.221  0.243
     CLRT  0.105  0.105  0.113     0.069  0.071  0.081      0.048* 0.051* 0.061*

c.
Double Exponential
                 D = 0.5                  D = 0.75                  D = 1
  n        d=0.1  d=0.13 d=0.17    d=0.1  d=0.14 d=0.19     d=0.1  d=0.15 d=0.21
  5  RT    0.496  0.509  0.517     0.469  0.493  0.519      0.486  0.510  0.511
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.203  0.215  0.206     0.194  0.200  0.203      0.186  0.193  0.197
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 25  RT    0.112  0.114  0.114     0.110  0.108  0.107      0.103  0.102  0.106
     LRT   0.594  0.613  0.638     0.556  0.566  0.607      0.532  0.547  0.589
     CLRT  0.279  0.313  0.350     0.244  0.269  0.319      0.227  0.256  0.300

* Value is contained within 0.05 ± 1.96√((0.05)(0.95)/1000)

The simulated power of the test of type H versus the serial correlation structure is displayed in Tables 6.3.4 through 6.3.6. For these simulations, data were generated from distributions with the serial correlation covariance structure given by (6.0.3) with σ² = 1, 9, or 25 and ρ = 0.3, 0.6, or 0.9. Again, PTs rather than RTs were performed when n=5 or 10 and p=3. The power of both the CLRT and the PT/RT increases as ρ increases but seems to be unaffected by an increase in σ². The power of both tests decreases as p approaches n. Overall, the power of both tests is fairly low: the CLRT achieves power greater than 0.75 in ten of the 27 normally distributed cases, and the PT/RT achieves power greater than 0.75 in only sixteen of the 81 cases regardless of distribution. All of these cases occur when n=25 and ρ = 0.6 or 0.9. For normally distributed data, the PT/RT is more powerful than the CLRT in only four of the 27 cases, all with n=10 and ρ = 0.3; even in these cases, the power of the PT/RT is extremely low despite exceeding that of the CLRT.

Table 6.3.4. Simulated Power vs. Serial Correlation for the Test of Type H (p = 3)

a.
Normal
                 σ² = 1                   σ² = 9                    σ² = 25
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  †PT   0.011  0.011  0.022     0.015  0.013  0.020      0.011  0.016  0.017
     LRT   0.193  0.198  0.221     0.168  0.226  0.236      0.190  0.227  0.251
     CLRT  0.064  0.062  0.071     0.059  0.068  0.086      0.062  0.083  0.084
 10  †PT   0.065  0.075  0.102     0.056  0.072  0.108      0.071  0.077  0.094
     LRT   0.119  0.164  0.305     0.113  0.175  0.286      0.105  0.192  0.311
     CLRT  0.063  0.108  0.199     0.065  0.106  0.178      0.065  0.113  0.216
 25  RT    0.058  0.112  0.245     0.082  0.133  0.217      0.064  0.120  0.218
     LRT   0.129  0.339  0.613     0.131  0.328  0.613      0.138  0.338  0.624
     CLRT  0.101  0.302  0.556     0.112  0.293  0.558      0.115  0.301  0.583

b. Uniform
                 σ² = 1                   σ² = 9                    σ² = 25
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  †PT   0.013  0.011  0.014     0.013  0.016  0.022      0.013  0.017  0.017
     LRT   0.168  0.235  0.309     0.156  0.213  0.313      0.161  0.229  0.333
     CLRT  0.051  0.067  0.115     0.043  0.091  0.104      0.042  0.070  0.127
 10  †PT   0.078  0.073  0.103     0.058  0.070  0.109      0.060  0.084  0.101
     LRT   0.118  0.199  0.368     0.097  0.191  0.345      0.103  0.201  0.356
     CLRT  0.061  0.138  0.257     0.057  0.118  0.238      0.057  0.134  0.258
 25  RT    0.072  0.114  0.184     0.058  0.113  0.172      0.072  0.135  0.185
     LRT   0.119  0.340  0.618     0.103  0.325  0.603      0.100  0.377  0.652
     CLRT  0.096  0.298  0.576     0.081  0.284  0.566      0.087  0.338  0.624

c. Double Exponential
                 σ² = 1                   σ² = 9                    σ² = 25
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  †PT   0.015  0.013  0.014     0.014  0.014  0.018      0.015  0.023  0.019
     LRT   0.219  0.243  0.274     0.188  0.246  0.289      0.215  0.257  0.271
     CLRT  0.067  0.088  0.097     0.073  0.097  0.094      0.071  0.093  0.092
 10  †PT   0.073  0.081  0.095     0.054  0.075  0.123      0.068  0.073  0.102
     LRT   0.176  0.237  0.323     0.163  0.219  0.357      0.162  0.231  0.331
     CLRT  0.110  0.160  0.223     0.097  0.163  0.239      0.103  0.170  0.237
 25  RT    0.058  0.116  0.191     0.069  0.099  0.194      0.064  0.119  0.181
     LRT   0.194  0.377  0.594     0.198  0.402  0.612      0.198  0.370  0.607
     CLRT  0.165  0.349  0.556     0.166  0.349  0.570      0.165  0.329  0.560

† Permutation tests rather than randomization tests were run for n = 5, 10 and p = 3

Table 6.3.5. Simulated Power vs.
Serial Correlation for the Test of Type H (p = 5)

a. Normal
                 σ² = 1                   σ² = 9                    σ² = 25
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  RT    0.140  0.131  0.165     0.152  0.150  0.164      0.113  0.147  0.169
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.090  0.152  0.239     0.101  0.156  0.238      0.115  0.164  0.220
     LRT   0.237  0.502  0.771     0.297  0.540  0.788      0.288  0.526  0.795
     CLRT  0.072  0.237  0.533     0.107  0.292  0.544      0.091  0.265  0.565
 25  RT    0.136  0.321  0.528     0.139  0.334  0.522      0.133  0.319  0.508
     LRT   0.304  0.837  0.988     0.319  0.800  0.985      0.276  0.808  0.987
     CLRT  0.203  0.751  0.975     0.219  0.738  0.975      0.213  0.725  0.975

b. Uniform
                 σ² = 1                   σ² = 9                    σ² = 25
  n        ρ=0.3  ρ=0.6  ρ=0.9     ρ=0.3  ρ=0.6  ρ=0.9      ρ=0.3  ρ=0.6  ρ=0.9
  5  RT    0.127  0.170  0.192     0.140  0.152  0.189      0.108  0.148  0.196
     LRT   NA     NA     NA        NA     NA     NA         NA     NA     NA
     CLRT  NA     NA     NA        NA     NA     NA         NA     NA     NA
 10  RT    0.110  0.165  0.232     0.105  0.157  0.209      0.089  0.172  0.222
     LRT   0.226  0.523  0.826     0.266  0.548  0.830      0.234  0.532  0.831
     CLRT  0.077  0.247  0.623     0.087  0.272  0.623      0.087  0.283  0.619
 25  RT    0.158  0.323  0.452     0.156  0.313  0.464      0.141  0.327  0.472
     LRT   0.254  0.799  0.989     0.232
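The starred entries in the type I error tables above are those falling within 0.05 ± 1.96√((0.05)(0.95)/1000), the usual normal-approximation binomial bound for a nominal 0.05 rate estimated from 1000 simulation runs. A minimal sketch of that check follows; the function name is illustrative, not from the dissertation.

```python
from math import sqrt

def within_nominal(rate, alpha=0.05, reps=1000, z=1.96):
    """True if a simulated rejection rate is consistent with the nominal
    level, using the bound alpha +/- z * sqrt(alpha * (1 - alpha) / reps)."""
    margin = z * sqrt(alpha * (1 - alpha) / reps)
    return alpha - margin <= rate <= alpha + margin

# For 1000 runs the bound is roughly 0.05 +/- 0.0135, i.e. (0.0365, 0.0635),
# so a simulated rate of 0.046 is starred while 0.097 is not.
print(within_nominal(0.046), within_nominal(0.097))  # True False
```

The same bound explains why rates such as 0.066 narrowly miss a star while 0.063 receives one.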