Tests to Compare k Treatment and b Block Means for a Randomized Block Design

Problem

A supermarket advertisement in the Gainesville Sun states: “You’ll save up to 21% with Albertson’s lower price.” To substantiate the claim, Albertson’s supermarket compared its prices for 49 grocery items with the prices at three competing supermarkets on a given day. The survey results for 7 items randomly selected from the 49 are shown in the following table. Determine whether the mean prices of grocery items differ among the four supermarkets. Test using α = 0.05.

Grocery Item	Albertson’s	Kash’n Karry	Publix	Food 4 Less
Cheerios Cereal	1.1	1.18	1.39	1.18
Jell-O gelatin	0.24	0.24	0.31	0.26
Dial soap	0.52	0.6	0.63	0.55
Crisco oil	1.26	1.7	2.27	1.29
Kleenex	0.67	0.7	0.79	0.7
Star-Kist tuna	0.63	0.66	0.79	0.63
Del Monte peas	0.43	0.47	0.65	0.47

Solution

We need to conduct an analysis of variance for a randomized block design. The columns of the table above correspond to the k = 4 treatments (supermarkets) and the rows correspond to the b = 7 blocks (grocery items), each consisting of 4 observations. The observations within a block are matched because all prices within a block are for the same item on the same day. (A randomized block design is necessary to ensure that the same items are compared at the four supermarkets.)

Since the supermarkets represent the treatments, we want to test

H₀: μ₁ = μ₂ = μ₃ = μ₄

Hₐ: At least two of the treatment means differ

where μ₁ = mean price charged at Albertson’s, μ₂ = mean price at Kash’n Karry, μ₃ = mean price at Publix, and μ₄ = mean price at Food 4 Less.

The SAS program and output are shown below.

The test statistic, F = MST/MSE, is found by substituting the values MST = 0.1117 and MSE = 0.0246 obtained from the SAS output:

F = MST/MSE = 0.1117/0.0246 = 4.540

The F statistic has numerator degrees of freedom (k-1) = 3 (df for MST) and denominator degrees of freedom (n-b-k+1) = 18 (df for MSE), where n = 28 is the total number of observations. The tabulated value of F_0.05 with 3 and 18 df is 3.16, so we reject H₀ if the calculated value of F exceeds 3.16. Since the computed value of the test statistic, F = 4.54, exceeds 3.16, we have sufficient evidence to reject H₀ at α = 0.05. There appears to be a significant difference among the mean prices of grocery items at the four supermarkets.
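As a check on these degrees of freedom, the total sum of squares and its degrees of freedom in a randomized block design split into treatment, block, and error parts. The numbers below are taken from the SAS output for this example and rounded to five decimals; SST, SSB, and SSE are shorthand used here for the market, item, and error sums of squares.

```latex
\begin{align*}
SS(\text{Total}) &= SST + SSB + SSE, &
6.00961 &= 0.33518 + 5.23109 + 0.44334, \\
n - 1 &= (k-1) + (b-1) + (n-b-k+1), &
27 &= 3 + 6 + 18, \\
MST &= \frac{SST}{k-1} = \frac{0.33518}{3} \approx 0.1117, &
MSE &= \frac{SSE}{n-b-k+1} = \frac{0.44334}{18} \approx 0.0246.
\end{align*}
```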

The F statistic for testing block means is F = MSB/MSE. Substituting the values of MSB and MSE found in the SAS output, we have

F = MSB/MSE = 0.8718/0.0246 = 35.40

The F statistic will have numerator degrees of freedom (b-1) = 6, and the denominator degrees of freedom will be the df associated with MSE – namely, 18. Therefore, the rejection region for the test is

Reject H₀ if F > F_0.05 = 2.66

Since the F value of 35.40 falls well within the rejection region, there is sufficient evidence at α = 0.05 to conclude that the block (item) means differ. It appears that blocking was effective in removing the item-to-item variation in prices.
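The tabulated critical values and the p-values can also be obtained in SAS rather than from a printed F table. The following is a minimal sketch, not part of the original program: it assumes the mean squares are typed in by hand from the SAS output, and the variable names (f_trt, crit_trt, and so on) are made up for illustration. FINV returns a quantile of the F distribution and PROBF its cumulative probability.

```sas
/* Sketch: F statistics, critical values and p-values for both tests.   */
/* Mean squares are copied from the ANOVA output; names are illustrative. */
data _null_;
  mse   = 0.02463016;                  /* error mean square, 18 df      */
  f_trt = 0.11172738 / mse;            /* treatments (market), 3 df     */
  f_blk = 0.87184762 / mse;            /* blocks (item), 6 df           */

  crit_trt = finv(0.95, 3, 18);        /* upper 5% point of F(3, 18)    */
  crit_blk = finv(0.95, 6, 18);        /* upper 5% point of F(6, 18)    */

  p_trt = 1 - probf(f_trt, 3, 18);     /* upper-tail probabilities      */
  p_blk = 1 - probf(f_blk, 6, 18);

  put f_trt= crit_trt= p_trt=;         /* results are written to the log */
  put f_blk= crit_blk= p_blk=;
run;
```

The values written to the log should agree with the tabulated critical values quoted above (3.16 and 2.66) and with the Pr > F column of the SAS output.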

SAS program: Randomized_Block.SAS

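options pageno=1;

*--- Read the data into SAS;
data grocery;
  /* Item name is in columns 1-15; the trailing @ holds the line for the price reads. */
  input @1 item $1-15 @;
  /* MARKET takes its length (11) from the first value, so KASH'N KARRY is stored as KASH'N KARR. */
  do market="ALBERTSON'S","KASH'N KARRY","PUBLIX","FOOD 4 LESS";
    input price @;
    output;
  end;
cards;
Cheerios_Cereal 1.1 1.18 1.39 1.18
Jell-O_geletin  .24 .24 .31 .26
Dial_soap       .52 .6 .63 .55
Crisco_oil      1.26 1.7 2.27 1.29
Kleenex         .67 .7 .79 .7
Star-Kist_tuna  .63 .66 .79 .63
Del_Monte_peas  .43 .47 .65 .47
;
run;

/* List the data: one row per item and market. */
proc print data=grocery;
  title2 "Supermarket Survey Results";
run;

/* Randomized block ANOVA: market = treatments, item = blocks. */
proc anova data=grocery;
  title2 "Analysis of Variance";
  class market item;
  model price=market item;
  means market/bon;
run;
quit;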

Notes

The ANOVA procedure is used to conduct a parametric analysis of variance.
The CLASS statement identifies the sources of variation for the experiment.
The sources of variation are specified to the right of the equals sign (=) in the MODEL statement, the dependent variable to the left.
The MEANS statement produces a multiple comparisons analysis of the means of the specified source. The BON option selects the Bonferroni multiple comparisons procedure.
The output from this SAS program is shown below.

SAS Output

Supermarket Survey Results

Obs    item               market         price

  1    Cheerios_Cereal    ALBERTSON'S     1.10
  2    Cheerios_Cereal    KASH'N KARR     1.18
  3    Cheerios_Cereal    PUBLIX          1.39
  4    Cheerios_Cereal    FOOD 4 LESS     1.18
  5    Jell-O_geletin     ALBERTSON'S     0.24
  6    Jell-O_geletin     KASH'N KARR     0.24
  7    Jell-O_geletin     PUBLIX          0.31
  8    Jell-O_geletin     FOOD 4 LESS     0.26
  9    Dial_soap          ALBERTSON'S     0.52
 10    Dial_soap          KASH'N KARR     0.60
 11    Dial_soap          PUBLIX          0.63
 12    Dial_soap          FOOD 4 LESS     0.55
 13    Crisco_oil         ALBERTSON'S     1.26
 14    Crisco_oil         KASH'N KARR     1.70
 15    Crisco_oil         PUBLIX          2.27
 16    Crisco_oil         FOOD 4 LESS     1.29
 17    Kleenex            ALBERTSON'S     0.67
 18    Kleenex            KASH'N KARR     0.70
 19    Kleenex            PUBLIX          0.79
 20    Kleenex            FOOD 4 LESS     0.70
 21    Star-Kist_tuna     ALBERTSON'S     0.63
 22    Star-Kist_tuna     KASH'N KARR     0.66
 23    Star-Kist_tuna     PUBLIX          0.79
 24    Star-Kist_tuna     FOOD 4 LESS     0.63
 25    Del_Monte_peas     ALBERTSON'S     0.43
 26    Del_Monte_peas     KASH'N KARR     0.47
 27    Del_Monte_peas     PUBLIX          0.65
 28    Del_Monte_peas     FOOD 4 LESS     0.47

Analysis of Variance

The ANOVA Procedure

Class Level Information

Class     Levels    Values
market         4    ALBERTSON'S FOOD 4 LESS KASH'N KARR PUBLIX
item           7    Cheerios_Cereal Crisco_oil Del_Monte_peas Dial_soap Jell-O_geletin Kleenex Star-Kist_tuna

Number of observations 28

Analysis of Variance

The ANOVA Procedure

Dependent Variable: price

                              Sum of
Source             DF        Squares    Mean Square    F Value    Pr > F
Model               9     5.56626786     0.61847421      25.11    <.0001
Error              18     0.44334286     0.02463016
Corrected Total    27     6.00961071

R-Square    Coeff Var    Root MSE    price Mean
0.926228     19.69664    0.156940      0.796786

Source    DF      Anova SS    Mean Square    F Value    Pr > F
market     3    0.33518214     0.11172738       4.54    0.0155
item       6    5.23108571     0.87184762      35.40    <.0001

Analysis of Variance

The ANOVA Procedure

Bonferroni (Dunn) t Tests for price

NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha 0.05

Error Degrees of Freedom 18

Error Mean Square 0.02463

Critical Value of t 2.96273

Minimum Significant Difference 0.2485

Means with the same letter are not significantly different.

Bon Grouping       Mean    N    market
          A     0.97571    7    PUBLIX
     B    A     0.79286    7    KASH'N KARR
     B          0.72571    7    FOOD 4 LESS
     B          0.69286    7    ALBERTSON'S

Overview of PROC ANOVA

The ANOVA procedure performs analysis of variance (ANOVA) for balanced data from a wide variety of experimental designs. In analysis of variance, a continuous response variable, known as a dependent variable, is measured under experimental conditions identified by classification variables, known as independent variables. The variation in the response is assumed to be due to effects in the classification, with random error accounting for the remaining variation.

The ANOVA procedure is designed to handle balanced data (that is, data with equal numbers of observations for every combination of the classification factors), whereas the GLM procedure can analyze both balanced and unbalanced data. Because PROC ANOVA takes into account the special structure of a balanced design, it is faster and uses less storage than PROC GLM for balanced data.
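One way to see whether a data set is balanced in this sense is to cross-tabulate the classification variables and inspect the cell counts. The following is a minimal sketch, assuming the grocery data set created by the program above is available in the current SAS session. For this example, every cell of the item by market table should show one observation.

```sas
/* Sketch: count observations in each item-by-market cell.           */
/* Equal counts in every cell indicate a balanced two-factor layout. */
proc freq data=grocery;
  tables item*market / nopercent norow nocol;
run;
```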

Use PROC ANOVA for the analysis of balanced data only, with the following exceptions: one-way analysis of variance, Latin square designs, certain partially balanced incomplete block designs, completely nested (hierarchical) designs, and designs with cell frequencies that are proportional to each other and are also proportional to the background population. These exceptions have designs in which the factors are all orthogonal to each other. PROC ANOVA works for designs with block diagonal X'X matrices where the elements of each block all have the same value. The procedure partially tests this requirement by checking for equal cell means. However, this test is imperfect: some designs that cannot be analyzed correctly may pass the test, and designs that can be analyzed correctly may not pass. If your design does not pass the test, PROC ANOVA produces a warning message to tell you that the design is unbalanced and that the ANOVA analyses may not be valid; if your design is not one of the special cases described here, then you should use PROC GLM instead. Complete validation of designs is not performed in PROC ANOVA since this would require the whole X'X matrix; if you're unsure about the validity of PROC ANOVA for your design, you should use PROC GLM.

Caution: If you use PROC ANOVA for analysis of unbalanced data, you must assume responsibility for the validity of the results.
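If PROC GLM is preferred, the same randomized block model can be fitted with nearly identical statements. This is a sketch of the substitution, not part of the original program, and it again assumes the grocery data set from the program above. Because these data are balanced, the sums of squares reported by PROC GLM should match the Anova SS column shown earlier.

```sas
/* Sketch: the same model fitted with PROC GLM instead of PROC ANOVA. */
proc glm data=grocery;
  title2 "Analysis of Variance (PROC GLM)";
  class market item;
  model price = market item;
  means market / bon;      /* Bonferroni comparisons of market means */
run;
quit;
```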