# Nonparametric Statistics Test 4

• The Mann-Whitney Test

H null: Farm boys do not tend to be more fit, physically, than town boys

H1: Farm boys tend to be more fit than town boys
• The Kruskal-Wallis Test

H null: The four methods are equivalent

H1: Some methods of growing corn tend to furnish higher yields than others.
• A Test for Equal Variances

H null: Both machines have equal variability

H1: The new machine has a smaller variance
• Spearman's Rho

H null: GPAs are independent of GMAT scores

H1: High GPAs tend to be associated with high GMAT scores.
• Kendall's Tau

H null: X and Y are independent

H1: Pairs of observations either tend to be concordant or tend to be discordant.
• The Wilcoxon Signed Ranks Test

H null: The firstborn twin does not tend to be more aggressive than the other (E(Xi) <= E(Yi))

H1: The firstborn twin tends to be more aggressive than the second twin (E(Xi) > E(Yi))
• Friedman's Test

H null: Each ranking of the random variables within a block is equally likely (the treatments have identical effects)

H1: At least one of the treatments tends to yield larger observed values than at least one other treatment
• The Kolmogorov Goodness-of-Fit Test

H null: F(x) = F*(x)

H1: F(x) not equal to F*(x)
• Lilliefors Test for Normality

H null: The random sample comes from a population with the normal distribution, with unknown mean and standard deviation

H1: The distribution function of the Xi's is nonnormal
• Lilliefors Test for the Exponential Distribution

H null: The random sample has the exponential distribution

H1: The distribution of X is not exponential
• The Shapiro-Wilk Test for Normality

H null:F(x) is a normal distribution function with unspecified mean and variance

H1: F(x) is nonnormal
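Several of the tests listed above are implemented in `scipy.stats`. A minimal sketch of a few of the calls, using made-up data purely for illustration:

```python
# Hedged sketch: a few of the tests above, as implemented in scipy.stats.
# All data below is invented purely for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
farm = rng.normal(10, 2, 15)   # hypothetical fitness scores, farm boys
town = rng.normal(9, 2, 15)    # hypothetical fitness scores, town boys

# Mann-Whitney: H1 says farm boys tend to be more fit -> one-sided test
u, p = stats.mannwhitneyu(farm, town, alternative="greater")

# Kruskal-Wallis for four (hypothetical) corn-growing methods
yields = [rng.normal(m, 1, 10) for m in (5.0, 5.0, 5.5, 6.0)]
h, p_kw = stats.kruskal(*yields)

# Spearman's rho and Kendall's tau for paired GPA/GMAT-type scores
gpa = rng.normal(3.0, 0.3, 20)
gmat = 100 * gpa + rng.normal(0, 10, 20)
rho, p_rho = stats.spearmanr(gpa, gmat)
tau, p_tau = stats.kendalltau(gpa, gmat)
```

`alternative="greater"` matches the directional H1; the default in SciPy is two-sided.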
• Chapter 5: Some Methods Based on Ranks
• Chapter 6: Statistics of the Kolmogorov-Smirnov Type

5.1 Two Independent Samples
6.1 The Kolmogorov Goodness-of-Fit Test

Let S(x) be the empirical distribution function based on the random sample of Xs. The test statistic is defined differently for the three different sets of hypotheses, A, B, and C. Let F*(x) be a completely specified hypothesized distribution function.

Two Sided Test

Let the test statistic T be the greatest (denoted by 'sup' for supremum) vertical distance between S(x) and F*(x). In symbols we say T = supx |F*(x) - S(x)|, which is read "T equals the supremum over all x of the absolute value of the difference F*(x) - S(x)."
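A minimal sketch of computing T by hand, assuming a fully specified continuous F* (here the uniform cdf, chosen only for illustration). Since S(x) is a step function, the supremum is attained at a jump point, so both S(xi) = i/n and its left limit (i-1)/n must be checked:

```python
# Sketch: T = sup_x |F*(x) - S(x)| for a fully specified continuous F*.
# S(x) is a step function, so the supremum occurs at a jump point;
# both S(x_i) = i/n and the left limit (i-1)/n are checked.
import numpy as np
from scipy import stats

def ks_two_sided(x, cdf):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = cdf(x)                       # F*(x_i)
    s_hi = np.arange(1, n + 1) / n   # S(x_i)
    s_lo = np.arange(0, n) / n       # S just below x_i
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

x = [0.1, 0.3, 0.5, 0.7, 0.9]
t = ks_two_sided(x, stats.uniform.cdf)   # hypothesized F* = uniform(0,1)
```

For this sample the result agrees with `scipy.stats.kstest(x, "uniform").statistic`.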

One-Sided Test

Denote this test statistic by T+ = supx [F*(x) - S(x)], which is similar to T except that we consider only the greatest difference where the function F*(x) is above the function S(x).

One-Sided Test

For this test use the test statistic T-, defined as the greatest vertical distance attained by S(x) above F*(x). Formally this becomes T- = supx [S(x) - F*(x)].
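The one-sided statistics can be sketched the same way, using the same jump-point reasoning as in the two-sided case (again with an illustrative uniform F*):

```python
# Sketch of the one-sided statistics: T+ looks for F* above S,
# T- for S above F*.  Same jump-point reasoning as the two-sided case.
import numpy as np
from scipy import stats

def ks_one_sided(x, cdf):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = cdf(x)
    t_plus = np.max(f - np.arange(0, n) / n)        # sup [F*(x) - S(x)]
    t_minus = np.max(np.arange(1, n + 1) / n - f)   # sup [S(x) - F*(x)]
    return t_plus, t_minus

tp, tm = ks_one_sided([0.1, 0.3, 0.5, 0.7, 0.9], stats.uniform.cdf)
# max(tp, tm) reproduces the two-sided statistic T
```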
5.2 Several Independent Samples
6.2 Goodness of Fit Tests for Families of Distributions

Obtaining a test statistic: Ordinarily the test statistic is the usual two-sided Kolmogorov test statistic, defined as the maximum vertical distance between the empirical distribution function of the X values and the normal distribution function with mean (xbar) and standard deviation (s) as given by Equations 1 and 2. However, the following method of computing the test statistic is slightly easier and is equivalent to the method indicated.

Draw a graph of the standard normal distribution function and call it F*(x). Actually, only the values of F*(x) at the observed Zs are needed. Table A1 may be of assistance. Also draw a graph of the empirical distribution function of the normalized sample, the Z values defined by Equation 3, using the same set of coordinates as just used for F*(x). Find the maximum vertical distance between the two graphs, F*(x) and the empirical distribution function of the Zs, which we will call S(x). This distance is the test statistic. That is, the Lilliefors test statistic T1 is defined as T1 = supx |F*(x) - S(x)|

The difference between T1 and the Kolmogorov test statistic is that the empirical distribution function S(x) in Equation 4 was obtained from the normalized sample, while S(x) in the Kolmogorov test was based on the original unadjusted observations.
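The procedure above can be sketched as follows (a hand-rolled illustrative version; `statsmodels` also ships a `lilliefors` function that supplies p-values):

```python
# Sketch of the Lilliefors statistic T1: normalize the sample
# (Equation-3 style), then take the sup distance between the standard
# normal cdf and the edf of the Z values.
import numpy as np
from scipy import stats

def lilliefors_stat(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)   # normalized sample Z_i
    f = stats.norm.cdf(z)                # F*(x) at the observed Zs
    s_hi = np.arange(1, n + 1) / n       # S at each Z_i
    s_lo = np.arange(0, n) / n           # S just below each Z_i
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

# made-up sample, purely for illustration
t1 = lilliefors_stat([4.1, 5.2, 6.3, 4.8, 5.5, 5.0, 4.4, 6.0])
```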

For the Lilliefors test for the exponential distribution, the empirical distribution function S(x) based on the Z values is first plotted on a graph. On the same graph the function F*(x) = 1 - e^(-x) is plotted for x > 0; actually, only values at n points need to be determined, the points being at x = Z1, x = Z2, and so on. Tables are available for evaluating e^(-x); calculators that have this function may also be used. The maximum vertical distance between the two functions

T2 = supx |F*(x) - S(x)|

is the test statistic. Although this is only the two-sided version of the test, one-sided versions are presented by Durbin along with tables.
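A sketch of computing T2, assuming the normalized values are Zi = Xi / Xbar (the usual scaling for this test when the exponential mean is estimated from the sample):

```python
# Sketch of T2 for the exponential case.  Assumption: the normalized
# values are Z_i = X_i / Xbar, so the edf of the Zs can be compared
# against F*(x) = 1 - e^(-x).
import numpy as np

def lilliefors_exp_stat(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = x / x.mean()                 # Z_i = X_i / Xbar
    f = 1.0 - np.exp(-z)             # F*(x) = 1 - e^(-x), x > 0
    s_hi = np.arange(1, n + 1) / n
    s_lo = np.arange(0, n) / n
    return max(np.max(np.abs(f - s_hi)), np.max(np.abs(f - s_lo)))

# made-up positive sample, purely for illustration
t2 = lilliefors_exp_stat([0.3, 1.1, 0.7, 2.4, 0.2, 1.8, 0.5, 0.9])
```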

The Shapiro-Wilk Test for Normality

First compute the denominator D of the test statistic: D = sum from i=1 to n (Xi - Xbar)^2

where Xbar is the sample mean. Then order the sample from smallest to largest,

X(1) <= X(2) <= ... <= X(n)

and let X(i) denote the ith order statistic. From Table A16, for the observed sample size n, obtain the coefficients a1, a2, ..., ak, where k is approximately n/2.

The test statistic is T3 = (1/D) [sum from i=1 to k ai (X(n-i+1) - X(i))]^2

Note that this test statistic is often denoted by W, and the test is often called the W test.
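Because the coefficients ai come from Table A16 (not reproduced here), in practice `scipy.stats.shapiro` is a convenient way to obtain the W (= T3) statistic and a p-value:

```python
# scipy.stats.shapiro computes the Shapiro-Wilk W statistic directly,
# avoiding the Table A16 coefficient lookup.  Data is made up for
# illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(50, 5, 30)       # hypothetical measurements
w, p = stats.shapiro(x)         # W statistic and p-value
```

A large p-value is consistent with H null (normality); a small one supports H1.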
5.3 A Test for Equal Variances
6.3 Tests on Two Independent Samples
5.4 Measures of Rank Correlation
5.7 The One Sample or Matched Pairs Case
5.8 Several Related Samples