Describing Quantitative Data (2)
Problem
An automated system for marking large numbers of student computer
programs, called AUTOMARK, has been used successfully at
AUTOMARK GRADE x | INSTRUCTOR GRADE y | AUTOMARK GRADE x | INSTRUCTOR GRADE y | AUTOMARK GRADE x | INSTRUCTOR GRADE y
12.2             | 10                 | 18.2             | 15                 | 19.3             | 17
10.6             | 11                 | 15.1             | 16                 | 19.5             | 17
15.1             | 12                 | 17.2             | 16                 | 19.7             | 17
16.2             | 12                 | 17.5             | 16                 | 18.6             | 18
16.6             | 12                 | 18.6             | 16                 | 19               | 18
16.6             | 13                 | 18.8             | 16                 | 19.2             | 18
17.2             | 14                 | 17.8             | 17                 | 19.4             | 18
17.6             | 14                 | 18               | 17                 | 19.6             | 18
18.2             | 14                 | 18.2             | 17                 | 20.1             | 18
16.5             | 15                 | 18.4             | 17                 | 19.2             | 19
17.2             | 15                 | 18.6             | 17                 | 19.3             | 17
12.2             | 10                 | 19               | 17                 | 19.5             | 17
Question
Answer Using SAS
*--- SAS program: DESCRIBING_QUANTITATIVE_DATA_2.SAS ;
options nodate pageno=1;
*---Create SAS data set;
data automark;
input automark_grade instructor_grade @@;
cards;
12.2 10 18.2 15 19.3 17
10.6 11 15.1 16 19.5 17
15.1 12 17.2 16 19.7 17
16.2 12 17.5 16 18.6 18
16.6 12 18.6 16 19 18
16.6 13 18.8 16 19.2 18
17.2 14 17.8 17 19.4 18
17.6 14 18 17 19.6 18
18.2 14 18.2 17 20.1 18
16.5 15 18.4 17 19.2 19
17.2 15 18.6 17 19.3 17
12.2 10 19 17 19.5 17
;
run;
*---Run PROC UNIVARIATE on automark_grade and instructor_grade;
proc univariate data=automark;
title 'Univariate Descriptive Statistics on automark_grade and instructor_grade';
var automark_grade instructor_grade;
run;
SAS Output (description of variables of interest)

Univariate Descriptive Statistics on automark_grade and instructor_grade

The UNIVARIATE Procedure
Variable: automark_grade

Moments
N                          36    Sum Weights                36
Mean               17.6111111    Sum Observations          634
Std Deviation      2.21537198    Variance           4.90787302
Skewness           -1.7230407    Kurtosis           2.87688939
Uncorrected SS       11337.22    Corrected SS       171.775556
Coeff Variation    12.5793993    Std Error Mean     0.36922866

Basic Statistical Measures
Location               Variability
Mean      17.61111     Std Deviation          2.21537
Median    18.20000     Variance               4.90787
Mode      17.20000     Range                  9.50000
                       Interquartile Range    2.30000

NOTE: The mode displayed is the smallest of 3 modes with a count of 3.

Tests for Location: Mu0=0
Test           -Statistic-     -----p Value------
Student's t    t  47.69703     Pr > |t|     <.0001
Sign           M        18     Pr >= |M|    <.0001
Signed Rank    S       333     Pr >= |S|    <.0001

Quantiles (Definition 5)
Quantile       Estimate
100% Max           20.1
99%                20.1
95%                19.7
90%                19.5
75% Q3             19.2
50% Median         18.2
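
As a rough check on how the Moments entries for automark_grade relate to one another, the figures in the listing can be combined as below. This is a sketch that assumes the usual sample formulas with divisor n - 1 (the PROC UNIVARIATE default); the symbols n, \bar{x}, s, SS, SE and CV are shorthand introduced here, not names from the SAS output.

```latex
\begin{align*}
\bar{x} &= \frac{\sum x}{n} = \frac{634}{36} \approx 17.6111 \\
SS_{\text{corrected}} &= \sum x^2 - \frac{\left(\sum x\right)^2}{n}
  = 11337.22 - \frac{634^2}{36} \approx 171.7756 \\
s^2 &= \frac{SS_{\text{corrected}}}{n - 1} = \frac{171.775556}{35} \approx 4.9079 \\
s &= \sqrt{s^2} \approx 2.2154 \\
SE(\bar{x}) &= \frac{s}{\sqrt{n}} = \frac{2.21537}{6} \approx 0.3692 \\
CV &= 100 \cdot \frac{s}{\bar{x}} = 100 \cdot \frac{2.21537}{17.6111} \approx 12.58 \\
t &= \frac{\bar{x} - \mu_0}{SE(\bar{x})} = \frac{17.6111 - 0}{0.3692} \approx 47.70
\end{align*}
```

Each result agrees with the corresponding entry in the listing to the precision shown.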

The UNIVARIATE Procedure
Variable: instructor_grade

Moments
N                          36    Sum Weights                36
Mean               15.5833333    Sum Observations          561
Std Deviation      2.43046145    Variance           5.90714286
Skewness           -0.9607591    Kurtosis           -0.0398179
Uncorrected SS           8949    Corrected SS           206.75
Coeff Variation    15.5965441    Std Error Mean     0.40507691

Basic Statistical Measures
Location               Variability
Mean      15.58333     Std Deviation          2.43046
Median    16.50000     Variance               5.90714
Mode      17.00000     Range                  9.00000
                       Interquartile Range    3.00000

Tests for Location: Mu0=0
Test           -Statistic-     -----p Value------
Student's t    t  38.47006     Pr > |t|     <.0001
Sign           M        18     Pr >= |M|    <.0001
Signed Rank    S       333     Pr >= |S|    <.0001

Quantiles (Definition 5)
Quantile       Estimate
100% Max           19.0
99%                19.0
95%                18.0
90%                18.0
75% Q3             17.0
50% Median         16.5
25% Q1             14.0
10%                12.0
5%                 10.0
1%                 10.0
0% Min             10.0

Extreme Observations
----Lowest----     ----Highest---
Value    Obs       Value    Obs
   10     34          18     18
   10      1          18     21
   11      4          18     24
   12     13          18     27
   12     10          19     30
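
The Sign and Signed Rank statistics are identical for the two variables. A sketch of why, assuming the definitions given in the PROC UNIVARIATE documentation and that no observation equals Mu0 = 0 (true here, since the minimum grades are 10.6 and 10). The symbols n^+, n^-, n_t and r_i^+ (count above Mu0, count below Mu0, count not equal to Mu0, and rank of the absolute deviation) are shorthand introduced here.

```latex
\begin{align*}
M &= \frac{n^{+} - n^{-}}{2} = \frac{36 - 0}{2} = 18 \\
S &= \sum_{x_i > \mu_0} r_i^{+} - \frac{n_t (n_t + 1)}{4}
   = \frac{36 \cdot 37}{2} - \frac{36 \cdot 37}{4}
   = 666 - 333 = 333
\end{align*}
```

Both statistics depend only on the signs and ranks of the deviations from Mu0, so any 36 strictly positive values give M = 18 and S = 333.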
*---Construct a scattergram using PROC PLOT;
proc plot data=automark;
title 'Scattergram';
plot automark_grade*instructor_grade;
quit;
SAS Output (there appears to be a positive correlation between the variables)
Scattergram
Plot of automark_grade*instructor_grade. Legend: A = 1 obs, B = 2 obs, etc.
[Text scatter plot: automark_grade on the vertical axis (10 to 20) against instructor_grade on the horizontal axis (10 to 19).]
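
PROC PLOT produces a line-printer plot. Where ODS Graphics is available, a higher-resolution version of the same scattergram could be sketched with PROC SGPLOT. This is an alternative that the original program does not use, and the fitted line drawn by the REG statement is an added illustration rather than part of the original analysis.

```sas
*---Alternative scattergram using PROC SGPLOT (assumes ODS Graphics is available);
proc sgplot data=automark;
title 'Scattergram';
*---Same axes as the PROC PLOT request: automark_grade vertical, instructor_grade horizontal;
scatter x=instructor_grade y=automark_grade;
*---Overlay a least-squares line (NOMARKERS avoids drawing the points twice);
reg x=instructor_grade y=automark_grade / nomarkers;
run;
```

Unlike PROC PLOT, PROC SGPLOT is not interactive, so a RUN statement is enough and no QUIT is needed.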
*---Run PROC CORR to examine the correlation between automark_grade and instructor_grade;
proc corr data=automark;
title 'Correlation between automark_grade and instructor_grade';
var automark_grade instructor_grade;
run;
SAS Output (correlation coefficient = 0.86)

Correlation between automark_grade and instructor_grade

The CORR Procedure
2 Variables: automark_grade instructor_grade

Simple Statistics
Variable             N        Mean     Std Dev          Sum     Minimum     Maximum
automark_grade      36    17.61111     2.21537    634.00000    10.60000    20.10000
instructor_grade    36    15.58333     2.43046    561.00000    10.00000    19.00000

Pearson Correlation Coefficients, N = 36
Prob > |r| under H0: Rho=0
                    automark_grade    instructor_grade
automark_grade             1.00000             0.86051
                                                <.0001
instructor_grade           0.86051             1.00000
                            <.0001
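
The p-value printed under each off-diagonal coefficient comes from a test of zero correlation. As a sketch, assuming the usual t statistic with n - 2 degrees of freedom for a Pearson correlation (r, n and t are shorthand introduced here, with x the AUTOMARK grade and y the instructor grade), the listing's r = 0.86051 gives:

```latex
\begin{align*}
r &= \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}
          {\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} = 0.86051 \\
r^2 &\approx 0.7405 \\
t &= \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}
   = \frac{0.86051 \sqrt{34}}{\sqrt{1 - 0.7405}} \approx 9.85
\end{align*}
```

With 34 degrees of freedom, a t value near 9.85 is consistent with the reported p-value of <.0001. An r^2 of about 0.74 means that a straight-line relationship with the instructor grade accounts for roughly 74% of the variance in the AUTOMARK grade.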