The One Sample or Matched Pairs Case

This section presents another rank test. It covers two cases: a single random sample, and a random sample of matched pairs that is reduced to a single sample by considering differences. Each matched pair (X, Y) is treated as a single observation on a bivariate random variable.

In Section 3.4 we learned about the sign test, which analyzes matched pairs by reducing each pair to a plus, a minus, or a tie. Once the problem has been reduced in that way, the binomial test (which requires that each observation have only two possible outcomes) can be applied to the resulting single sample of +s and -s. Similarly, the test of this section reduces each matched pair (X, Y) to a single observation by taking the difference Y - X. The analysis is then performed on the differences as a sample of single observations. Whereas the sign test simply notes whether each difference is positive, negative, or zero, the test of this section also notes the sizes of the positive differences relative to the negative differences.

The model resembles that of the sign test. The important difference is an additional assumption that the distribution of the differences is symmetric. We should therefore clarify what symmetry means for a distribution and discuss how it affects the scale of measurement.

Symmetry is easy to define if the distribution is discrete (its possible values are countable): the distribution is symmetric if the left half of the graph of the probability function is the mirror image of the right half. For example, the binomial distribution is symmetric if p = 1/2, and the discrete uniform distribution is always symmetric. For distributions that are not discrete, a more abstract definition of symmetry is needed.

The distribution of a random variable X is symmetric about the line x = c, for some constant c, if the probability of X <= c - x equals the probability of X >= c + x for each possible value of x.
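
The definition can be checked numerically for a discrete distribution. The following is a minimal sketch, not a general-purpose routine: the function names binomial_pmf and is_symmetric_about are hypothetical, and the check only visits the nonnegative offsets x at which either probability can change, which is enough for a distribution with finitely many possible values.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {k: P(X = k)} for a binomial(n, p) variable, using exact fractions."""
    p = Fraction(p)
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def is_symmetric_about(pmf, c):
    """Check P(X <= c - x) == P(X >= c + x) at every offset x = |value - c|."""
    c = Fraction(c)
    # The two probabilities are step functions of x; they can only change
    # at offsets equal to the distance of a possible value from c.
    offsets = {abs(Fraction(v) - c) for v in pmf}
    for x in offsets:
        left = sum(pr for v, pr in pmf.items() if v <= c - x)
        right = sum(pr for v, pr in pmf.items() if v >= c + x)
        if left != right:
            return False
    return True


# Binomial with p = 1/2 is symmetric about n/2; with p = 3/10 it is not.
fair = binomial_pmf(4, Fraction(1, 2))
skewed = binomial_pmf(4, Fraction(3, 10))
# Discrete uniform on 1..6 is symmetric about 7/2.
uniform = {v: Fraction(1, 6) for v in range(1, 7)}
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.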

If a distribution is symmetric, the mean (when it exists) coincides with the median because both are located exactly in the middle of the distribution, at the line of symmetry. The assumption of symmetry also changes the required scale of measurement from ordinal to interval. With an ordinal scale of measurement, two observations of the random variable need only be distinguished on the basis of which is larger and which is smaller. It is not necessary to know which one is farther from the median, for example when the two observations are on opposite sides of the median. If the assumption of symmetry is a meaningful one, then the distance between two observations is a meaningful measurement. The scale of measurement is therefore more than just ordinal; it is interval.

A test presented by Wilcoxon in 1945 is designed to test whether a particular sample came from a population with a specified mean or median. It may also be used in situations where observations are paired, such as "before" and "after" observations on each of several subjects, to see whether the second random variable in the pair has the same mean as the first. Because the mean equals the median in a symmetric distribution, the two terms can be used interchangeably here.
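
To make the contrast with the sign test concrete, the sketch below reduces matched pairs to differences Y - X, then computes both the sign counts and the rank sums of the absolute differences. It assumes the common form of the signed-ranks procedure (zero differences dropped, tied absolute differences given average ranks); the function names and the data are made up for illustration, and no p-value is computed.

```python
def sign_counts(x, y):
    """Sign test reduction: count positive and negative differences Y - X."""
    diffs = [yi - xi for xi, yi in zip(x, y)]
    return sum(d > 0 for d in diffs), sum(d < 0 for d in diffs)


def signed_rank_sums(x, y):
    """Return (T_plus, T_minus): rank sums of |Y - X| split by sign of Y - X."""
    # Reduce each pair to one difference; drop pairs with no difference.
    diffs = [yi - xi for xi, yi in zip(x, y) if yi != xi]
    # Indices of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus


# Hypothetical "before" (x) and "after" (y) measurements on five subjects.
x = [10, 12, 9, 14, 11]
y = [13, 11, 14, 16, 11]
print(sign_counts(x, y))       # (3, 1)
print(signed_rank_sums(x, y))  # (9.0, 1.0)
```

The sign test sees only three pluses and one minus, while the rank sums also record that the single negative difference is the smallest in absolute size.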

Several Related Samples

We presented the Kruskal-Wallis rank test for several independent samples, which is an extension of the Mann-Whitney test for two independent samples. In this section we consider the problem of analyzing several related samples, which is an extension of the problem of matched pairs, or two related samples. First we will present the Friedman test, which is an extension of the sign test. Then we will present the Quade test, which is an extension of the Wilcoxon signed-ranks test. The Friedman test is the better known of the two and requires fewer assumptions, but it suffers from a lack of power when there are only three treatments, just as the sign test has less power than the Wilcoxon signed-ranks test when there are only two treatments. When there are four or five treatments, the Friedman test has about the same power as the Quade test, but when the number of treatments is six or more, the Friedman test has more power than the Quade test.

The problem of several related samples arises in an experiment that is designed to detect differences among k treatments, where k is at least two. The observations are arranged in blocks, which are groups of k experimental units that are similar to one another in some respects. Within each block, the k treatments are assigned at random to the k units, so that each treatment is administered exactly once per block. The treatments are then compared with one another. The experimental arrangement described here is usually called a randomized complete block design. This design may be contrasted with the incomplete block design, in which the blocks do not contain enough experimental units for all the treatments to be applied in every block, so each treatment appears in some blocks but not in others.
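
The assignment of treatments to units in such a design can be sketched as follows. This is only an illustration: the function name randomize_complete_blocks, the treatment labels, and the seed argument are hypothetical, and each block is assumed to contain exactly as many units as there are treatments.

```python
import random


def randomize_complete_blocks(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block.

    layout[i][j] is the treatment given to unit j of block i, so every
    treatment appears exactly once in every block.
    """
    rng = random.Random(seed)  # local generator so the layout can be reproduced
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # random assignment within the block
        layout.append(order)
    return layout


# Three hypothetical treatments applied in four blocks.
for block_number, assignment in enumerate(randomize_complete_blocks("ABC", 4, seed=1), start=1):
    print(block_number, assignment)
```

In an incomplete block design, by contrast, each inner list would hold only a subset of the treatments.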

The usual parametric method of testing the null hypothesis of no treatment differences in a randomized complete block design is the two-way analysis of variance. The following nonparametric method depends only on the ranks of the observations within each block, so it may be considered a two-way analysis of variance on ranks. The test is named after its inventor, the noted economist Milton Friedman.
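
The idea of ranking within each block can be sketched in a few lines. The code below assumes the classical chi-square form of the Friedman statistic, 12 / (b k (k + 1)) times the sum of squared treatment rank sums, minus 3 b (k + 1), for b blocks and k treatments, with no correction for ties; other forms of the statistic exist and may be the ones intended in these notes. The function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def friedman_statistic(data):
    """data[i][j] is the observation for treatment j in block i.

    Returns (rank_sums, statistic) using the classical form without tie correction.
    """
    b = len(data)
    k = len(data[0])
    ranks = [average_ranks(block) for block in data]  # rank within each block only
    rank_sums = [sum(block_ranks[j] for block_ranks in ranks) for j in range(k)]
    statistic = 12.0 / (b * k * (k + 1)) * sum(r**2 for r in rank_sums) - 3 * b * (k + 1)
    return rank_sums, statistic


# Made-up measurements: three blocks (rows), three treatments (columns).
data = [
    [4.1, 5.0, 6.2],
    [3.9, 4.4, 7.0],
    [2.0, 2.5, 3.1],
]
rank_sums, statistic = friedman_statistic(data)
```

Because only within-block ranks enter the calculation, shifting all observations in one block by a constant leaves the statistic unchanged. For a large number of blocks, this form of the statistic is usually compared with a chi-square distribution with k - 1 degrees of freedom.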