Comparing Multinomial Proportions:
Problem
According to research reported in the Journal of the National Cancer
Institute (Apr. 1991), eating foods high in fiber may help protect against
breast cancer. The researchers randomly divided 120 laboratory rats into four
groups of 30 each. All rats were injected with a drug that causes breast
cancer; then each rat was fed a diet of fat and fiber for 15 weeks. However,
the levels of fat and fiber varied from group to group. At the end of the feeding
period, the number of rats with cancer tumors was determined for each group.
The data are summarized in the contingency table.
Contingency Table
| Cancer Tumors | High Fat / No Fiber | High Fat / Fiber | Low Fat / No Fiber | Low Fat / Fiber | Total |
| Yes           | 27                  | 20               | 19                 | 14              | 80    |
| No            | 3                   | 10               | 11                 | 16              | 40    |
| Totals        | 30                  | 30               | 30                 | 30              | 120   |
Question
Solution
The following SAS program generates all the statistics needed to solve this
problem.
*---Create SAS data set from the contingency table ;
data cancer;
input tumor $ diet $ count;
cards;
YES HF_NF 27
YES HF_F 20
YES LF_NF 19
YES LF_F 14
NO HF_NF 3
NO HF_F 10
NO LF_NF 11
NO LF_F 16
;
run;
proc freq data=cancer;
tables tumor*diet / expected chisq ;
weight count;
run;
Note:
The output from the above SAS program is shown below:
The FREQ Procedure

Table of tumor by diet

tumor      diet

Frequency
Expected
Percent
Row Pct
Col Pct     HF_F    HF_NF     LF_F    LF_NF    Total
NO            10        3       16       11       40
              10       10       10       10
            8.33     2.50    13.33     9.17    33.33
           25.00     7.50    40.00    27.50
           33.33    10.00    53.33    36.67
YES           20       27       14       19       80
              20       20       20       20
           16.67    22.50    11.67    15.83    66.67
           25.00    33.75    17.50    23.75
           66.67    90.00    46.67    63.33
Total         30       30       30       30      120
           25.00    25.00    25.00    25.00   100.00
Statistics for Table of tumor by diet

Statistic                      DF      Value     Prob
Chi-Square                      3    12.9000   0.0049
Likelihood Ratio Chi-Square     3    14.1827   0.0027
Mantel-Haenszel Chi-Square      1     1.9040   0.1676
Phi Coefficient                       0.3279
Contingency Coefficient               0.3116
Cramer's V                            0.3279

Sample Size = 120
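The Pearson chi-square statistic in the output can be checked by hand: each expected count is (row total × column total) / sample size, and the statistic is the sum of (observed - expected)^2 / expected over all eight cells. The DATA step below is only a sketch of that arithmetic. The data set name chisq_check and all variable names are illustrative assumptions, not part of the original program, and the counts are typed in directly in the order HF_NF, HF_F, LF_NF, LF_F.

```sas
*---Recompute the Pearson chi-square from the observed counts ;
data chisq_check;
  array obs_yes[4] _temporary_ (27 20 19 14);  /* tumor = YES, by diet */
  array obs_no[4]  _temporary_ (3 10 11 16);   /* tumor = NO, by diet  */
  n = 120;        /* sample size            */
  row_yes = 80;   /* row total, tumor = YES */
  row_no = 40;    /* row total, tumor = NO  */
  chisq = 0;
  do j = 1 to 4;
    col_total = obs_yes[j] + obs_no[j];
    /* expected count = row total * column total / n */
    exp_yes = row_yes * col_total / n;
    exp_no = row_no * col_total / n;
    chisq = chisq + (obs_yes[j] - exp_yes)**2 / exp_yes
                  + (obs_no[j] - exp_no)**2 / exp_no;
  end;
  df = (2 - 1) * (4 - 1);            /* (rows - 1) * (columns - 1) */
  p_value = 1 - probchi(chisq, df);  /* upper-tail probability     */
  keep chisq df p_value;
run;
proc print data=chisq_check;
run;
```

If the sketch is right, the printed chisq and df should agree with the Chi-Square line of the PROC FREQ output (12.9000 with 3 DF).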
Answer:
| Cancer Tumors | High Fat / No Fiber | High Fat / Fiber | Low Fat / No Fiber | Low Fat / Fiber |
| Yes           | 20                  | 20               | 20                 | 20              |
| No            | 10                  | 10               | 10                 | 10              |
p1 (the proportion of rats on the high fat/no fiber diet with cancer) =
27/30 = 0.9
p2 (the proportion of rats on the high fat/fiber diet with cancer) = 20/30
= 0.6667
Hence
95% confidence interval = (p1-p2) ± 1.96*sqrt(p1(1-p1)/n1 +
p2(1-p2)/n2)
= (0.9-0.667) ± 1.96*sqrt((0.9*0.1)/30 + (0.667*0.333)/30)
= 0.233 ± 0.200
= (0.033, 0.433)
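The interval arithmetic above can also be carried out in a short DATA step. This is a minimal sketch, assuming the large-sample (Wald) interval used in the hand calculation. The data set name ci and the variable names are hypothetical, and probit(0.975) is used in place of the rounded value 1.96.

```sas
*---Large-sample 95% confidence interval for p1 - p2 ;
data ci;
  n1 = 30;                /* rats on the high fat/no fiber diet */
  n2 = 30;                /* rats on the high fat/fiber diet    */
  p1 = 27 / n1;           /* proportion with tumors, group 1    */
  p2 = 20 / n2;           /* proportion with tumors, group 2    */
  diff = p1 - p2;
  se = sqrt(p1*(1 - p1)/n1 + p2*(1 - p2)/n2);
  z = probit(0.975);      /* standard normal quantile, about 1.96 */
  lower = diff - z*se;
  upper = diff + z*se;
run;
proc print data=ci;
run;
```

The printed lower and upper limits should come out close to 0.03 and 0.43. Small differences from the hand calculation reflect rounding of p2 and of the 1.96 multiplier.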