# Nonparametric Statistics Test 4

• The Mann-Whitney Test

H null: Farm boys do not tend to be more fit, physically, than town boys

H1: Farm boys tend to be more fit than town boys
• The Kruskal-Wallis Test

H null: The four methods are equivalent

H1: Some methods of growing corn tend to furnish higher yields than others.
• A Test for Equal Variances

H null: Both machines have equal variability

H1: The new machine has a smaller variance
• Spearman's Rho

H null: GPAs are independent of GMAT scores

H1: High GPAs tend to be associated with high GMAT scores.
• Kendall's Tau

H null: X and Y are independent

H1: Pairs of observations either tend to be concordant or tend to be discordant.
• The Wilcoxon Signed Ranks Test

H null: The firstborn twin does not tend to be more aggressive than the other (E(Xi) <= E(Yi))

H1: The firstborn twin tends to be more aggressive than the second twin (E(Xi) > E(Yi))
• Friedman's Test

H null: Each ranking of the random variables within a block is equally likely (the treatments have identical effects)

H1: At least one of the treatments tends to yield larger observed values than at least one other treatment
• The Kolmogorov Goodness-of-Fit Test

H null: F(x) = F*(x)

H1: F(x) not equal to F*(x)
• Lilliefors Test for Normality

H null: The random sample comes from a population with the normal distribution, with unknown mean and standard deviation

H1: The distribution function of the Xi's is nonnormal
• Lilliefors Test for the Exponential Distribution

H null: The random sample has the exponential distribution

H1: The distribution of X is not exponential
• The Shapiro-Wilk Test for Normality

H null:F(x) is a normal distribution function with unspecified mean and variance

H1: F(x) is nonnormal
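Several of the tests listed above are implemented in `scipy.stats`. A minimal sketch of a few of the calls, using made-up data purely for illustration:

```python
# Hedged sketch: a few of the tests above, as implemented in scipy.stats.
# All data below is invented purely for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
farm = rng.normal(10, 2, 15)   # hypothetical fitness scores, farm boys
town = rng.normal(9, 2, 15)    # hypothetical fitness scores, town boys

# Mann-Whitney: H1 says farm boys tend to be more fit -> one-sided test
u, p = stats.mannwhitneyu(farm, town, alternative="greater")

# Kruskal-Wallis for four (hypothetical) corn-growing methods
yields = [rng.normal(m, 1, 10) for m in (5.0, 5.0, 5.5, 6.0)]
h, p_kw = stats.kruskal(*yields)

# Spearman's rho and Kendall's tau for paired GPA/GMAT-type scores
gpa = rng.normal(3.0, 0.3, 20)
gmat = 100 * gpa + rng.normal(0, 10, 20)
rho, p_rho = stats.spearmanr(gpa, gmat)
tau, p_tau = stats.kendalltau(gpa, gmat)
```

`alternative="greater"` matches the directional H1; the default in SciPy is two-sided.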
• Chapter 5: Some Methods Based on Ranks
• Chapter 6: Statistics of the Kolmogorov-Smirnov Type

5.1 Two Independent Samples
6.1 The Kolmogorov Goodness-of-Fit Test

Let S(x) be the empirical distribution function based on the random sample of Xs. The test statistic is defined differently for the three different sets of hypotheses, A, B, and C. Let F*(x) be a completely specified hypothesized distribution function.

Two Sided Test

Let the test statistic T be the greatest (denoted by 'sup' for supremum) vertical distance between S(x) and F*(x). In symbols we say T = supx |F*(x) - S(x)|, which is read "T equals the supremum over all x of the absolute value of the difference F*(x) - S(x)."
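A minimal sketch of computing T by hand, assuming a fully specified continuous F* (here the uniform cdf, chosen only for illustration). Since S(x) is a step function, the supremum is attained at a jump point, so both S(xi) = i/n and its left limit (i-1)/n must be checked:

```python
# Sketch: T = sup_x |F*(x) - S(x)| for a fully specified continuous F*.
# S(x) is a step function, so the supremum occurs at a jump point;
# both S(x_i) = i/n and the left limit (i-1)/n are checked.
import numpy as np
from scipy import stats

def ks_two_sided(x, cdf):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = cdf(x)                       # F*(x_i)
    s_hi = np.arange(1, n + 1) / n   # S(x_i)
    s_lo = np.arange(0, n) / n       # S just below x_i
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

x = [0.1, 0.3, 0.5, 0.7, 0.9]
t = ks_two_sided(x, stats.uniform.cdf)   # hypothesized F* = uniform(0,1)
```

For this sample the result agrees with `scipy.stats.kstest(x, "uniform").statistic`.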

One-Sided Test

Denote this test statistic by T+ = supx [F*(x) - S(x)], which is similar to T except that we consider only the greatest difference where the function F*(x) is above the function S(x).

One-Sided Test

For this test use the test statistic T-, defined as the greatest vertical distance attained by S(x) above F*(x). Formally this becomes T- = supx [S(x) - F*(x)].
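The one-sided statistics can be sketched the same way, using the same jump-point reasoning as in the two-sided case (again with an illustrative uniform F*):

```python
# Sketch of the one-sided statistics: T+ looks for F* above S,
# T- for S above F*.  Same jump-point reasoning as the two-sided case.
import numpy as np
from scipy import stats

def ks_one_sided(x, cdf):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = cdf(x)
    t_plus = np.max(f - np.arange(0, n) / n)        # sup [F*(x) - S(x)]
    t_minus = np.max(np.arange(1, n + 1) / n - f)   # sup [S(x) - F*(x)]
    return t_plus, t_minus

tp, tm = ks_one_sided([0.1, 0.3, 0.5, 0.7, 0.9], stats.uniform.cdf)
# max(tp, tm) reproduces the two-sided statistic T
```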
5.2 Several Independent Samples
6.2 Goodness of Fit Tests for Families of Distributions

Obtaining a test statistic: Ordinarily the test statistic is the usual two-sided Kolmogorov test statistic, defined as the maximum vertical distance between the empirical distribution function of the X values and the normal distribution function with mean (xbar) and standard deviation (s) as given by Equations 1 and 2. However, the following method of computing the test statistic is slightly easier and is equivalent to the method indicated.

Draw a graph of the standard normal distribution function and call it F*(x). Actually, only the values of F*(x) at the observed Zs are needed. Table A1 may be of assistance. Also draw a graph of the empirical distribution function of the normalized sample, the Z values defined by Equation 3, using the same set of coordinates as just used for F*(x). Find the maximum vertical distance between the two graphs, F*(x) and the empirical distribution function of the Zs, which we will call S(x). This distance is the test statistic. That is, the Lilliefors test statistic T1 is defined as T1 = supx |F*(x) - S(x)|

The difference between T1 and the Kolmogorov test statistic is that the empirical distribution function S(x) in Equation 4 was obtained from the normalized sample, while S(x) in the Kolmogorov test was based on the original unadjusted observations.
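The procedure above can be sketched as follows (a hand-rolled illustrative version; `statsmodels` also ships a `lilliefors` function that supplies p-values):

```python
# Sketch of the Lilliefors statistic T1: normalize the sample
# (Equation-3 style), then take the sup distance between the standard
# normal cdf and the edf of the Z values.
import numpy as np
from scipy import stats

def lilliefors_stat(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)   # normalized sample Z_i
    f = stats.norm.cdf(z)                # F*(x) at the observed Zs
    s_hi = np.arange(1, n + 1) / n       # S at each Z_i
    s_lo = np.arange(0, n) / n           # S just below each Z_i
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

# made-up sample, purely for illustration
t1 = lilliefors_stat([4.1, 5.2, 6.3, 4.8, 5.5, 5.0, 4.4, 6.0])
```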

For the Lilliefors test for the exponential distribution, the empirical distribution function S(x) based on the Z values is first plotted on a graph. On the same graph the function F*(x) = 1 - e^(-x) is plotted for x > 0; actually, only values at n points need to be determined, the points being at x = Z1, x = Z2, and so on. Tables are available for evaluating e^(-x); calculators that have this function may also be used. The maximum vertical distance between the two functions

T2 = supx |F*(x) - S(x)|

is the test statistic. Although this is only the two-sided version of the test, one-sided versions are presented by Durbin along with tables.
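A sketch of computing T2, assuming the normalized values are Zi = Xi / Xbar (the usual scaling for this test when the exponential mean is estimated from the sample):

```python
# Sketch of T2 for the exponential case.  Assumption: the normalized
# values are Z_i = X_i / Xbar, so the edf of the Zs can be compared
# against F*(x) = 1 - e^(-x).
import numpy as np

def lilliefors_exp_stat(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = x / x.mean()                 # Z_i = X_i / Xbar
    f = 1.0 - np.exp(-z)             # F*(x) = 1 - e^(-x), x > 0
    s_hi = np.arange(1, n + 1) / n
    s_lo = np.arange(0, n) / n
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

# made-up positive sample, purely for illustration
t2 = lilliefors_exp_stat([0.3, 1.1, 0.7, 2.4, 0.2, 1.8, 0.5, 0.9])
```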

The Shapiro-Wilk Test for Normality

First compute the denominator D of the test statistic: D = sum from i=1 to n (Xi - Xbar)^2

where Xbar is the sample mean. Then order the sample from smallest to largest,

X(1) <= X(2) <= ... <= X(n)

and let X(i) denote the ith order statistic. From Table A16, for the observed sample size n, obtain the coefficients a1, a2, ..., ak, where k is approximately n/2.

The test statistic is T3 = (1/D) [sum from i=1 to k ai (X(n-i+1) - X(i))]^2

Note that this test statistic is often denoted by W, and the test is often called the W test.
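Because the coefficients ai come from Table A16 (not reproduced here), in practice `scipy.stats.shapiro` is a convenient way to obtain the W (= T3) statistic and a p-value:

```python
# scipy.stats.shapiro computes the Shapiro-Wilk W statistic directly,
# avoiding the Table A16 coefficient lookup.  Data is made up for
# illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(50, 5, 30)       # hypothetical measurements
w, p = stats.shapiro(x)         # W statistic and p-value
```

A large p-value is consistent with H null (normality); a small one supports H1.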
5.3 A Test for Equal Variances
6.3 Tests on Two Independent Samples
5.4 Measures of Rank Correlation
5.7 The One Sample or Matched Pairs Case
5.8 Several Related Samples