Selected Essays And Book Reviews

COUN 585 - Introduction To Research Methods

Lesson 15. Overview of Instrumentation {903 words}

1. What are two characteristics of all "good" instruments? The two characteristics are reliability and validity. Reliability describes how consistently the instrument measures an attribute, and validity describes how well the instrument measures the attribute it is intended to measure.

2. How does error affect reliability and validity? Random error (which bears on reliability) comes from changes in the person being tested, the test itself, or the testing environment. Systematic error (which bears on validity) involves test bias (members of certain cultures might consistently score lower or higher than others) and the appropriateness of the test (giving a 6th-grade math test to 4th graders would not be appropriate).

3. How are reliability and validity related? First, an instrument may be reliable but not valid: a teacher who gives better grades to shorter students is using height, which can be measured very reliably but is not a valid measure of achievement. Second, an instrument cannot be valid if it is not reliable.

4. What are the major methods for establishing reliability? Some ways to establish reliability are:

(1) Test/retest, which yields the coefficient of stability (a type of correlation). The problems with this approach are that the reliability estimate is compromised if subjects remember questions from the first administration, and that people change over time, which affects a retest in the more distant future.

(2) Equivalent/alternate forms, which yields the coefficient of equivalence. The researcher gives an "A" and a "B" version of the test, but fatigue can be a factor with this approach, as can differences between the two forms.

(3) Split-half, which yields a coefficient of internal consistency. The researcher scores two half-tests (for example, the odd items against the even items) and steps the half-test correlation up to full length with the Spearman-Brown formula. The problems are once again fatigue and the length of the test.

(4) Inter-item consistency, which yields the coefficient of homogeneity (Kuder-Richardson or Cronbach's alpha). This looks at how well each question is answered in relation to the others, to determine whether some questions are consistently answered either correctly or incorrectly.

(5) Inter-rater consistency. Hiring people to observe and grade the subjects raises a concern about the consistency of the raters, so multiple raters are needed. (A sketch of methods 3 and 4 follows this list.)
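Below is a minimal sketch of how two of these coefficients can be computed. The item scores are made up for illustration (not data from the lesson), and NumPy is assumed:

import numpy as np

# Hypothetical scores: 6 subjects x 4 test items (1 = correct, 0 = wrong).
items = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
], dtype=float)

# Split-half: correlate odd-item totals with even-item totals...
odd = items[:, 0::2].sum(axis=1)
even = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]

# ...then step the half-test correlation up to full length with
# Spearman-Brown: r_full = 2r / (1 + r).
r_split_half = 2 * r_half / (1 + r_half)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).
k = items.shape[1]
item_var = items.var(axis=0, ddof=1).sum()
total_var = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_var / total_var)

print(f"split-half (Spearman-Brown) = {r_split_half:.2f}, alpha = {alpha:.2f}")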

5. What are some considerations in interpreting the reliability coefficient? A reliability coefficient is: (1) a function of the length of the test, (2) a function of the heterogeneity of the subjects, (3) a function of the ability of the subjects who take the test, (4) a function of the specific technique used to estimate it, and (5) a function of the nature of the variable being measured.
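To illustrate the first point, the Spearman-Brown prophecy formula predicts how reliability changes as a test is lengthened. A small sketch with assumed numbers:

# Spearman-Brown prophecy formula: predicted reliability when a test
# is lengthened by a factor n, given its current reliability r.
def prophecy(r, n):
    return (n * r) / (1 + (n - 1) * r)

# Assumed example: a test with reliability .70, doubled in length.
print(prophecy(0.70, 2))  # ~0.82; longer tests are generally more reliable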

6. How are the standard error of measurement and reliability related? The standard error of measurement defines an interval of plus or minus some number of points around an observed score. If there is not much standard deviation in the scores, there will not be much variability in the interval. A higher standard deviation means a higher standard error, but the standard error has an inverse relationship with reliability: as reliability rises, the standard error shrinks (SEM = SD x sqrt(1 - reliability)).
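A minimal sketch of that relationship, using assumed numbers (a test SD of 10 points and a reliability of .91):

import math

# Standard error of measurement: SEM = SD * sqrt(1 - reliability).
sd, reliability = 10.0, 0.91
sem = sd * math.sqrt(1 - reliability)        # = 3.0 points
lo, hi = 75 - 1.96 * sem, 75 + 1.96 * sem    # ~95% band around a score of 75
print(f"SEM = {sem:.1f}; observed 75 -> roughly {lo:.1f} to {hi:.1f}")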

7. What are the evidences for validity? Validity is a unitary concept that is no longer broken into different types of validity; either the instrument is valid for its purpose or it is not. Some types of evidence for validity are:

(1) Content-related evidence: how well the sample of questions on a test represents the universe of all possible questions. A test with 95% of its questions coming from Lesson 6 and 5% from the other lessons is not valid; word math problems that require both verbal skills and math skills are not valid for testing math skills alone.

(2) Criterion-related evidence: one test is validated against another. The problem is that both tests may be valid yet not correlate well, or both tests may be bad yet correlate well, again leading to faulty conclusions.

(3) Construct-related evidence: how well the instrument measures the elements of the construct while eliminating unrelated constructs (purity). The researcher must ask all the relevant questions and none of the irrelevant ones. One quick check is a comparison between two well-defined groups (depressed vs. not depressed people). The problems with this kind of evidence are the lack of purity and disagreement among experts on the nature of the constructs.

Criterion-related evidence consists of two types of testing: concurrent and predictive. Concurrent testing shows how well scores correlate with another instrument administered at about the same time (give an established, valid test, then the one being validated). Predictive testing shows how well scores now correlate with a future criterion (a 2nd-grade test of musical aptitude, later verified against those students' actual musical achievement). Construct-related evidence consists of one type of test, intra-test analysis, which shows how well the items of the test hang together.
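A sketch of the concurrent case, with made-up scores and NumPy assumed: the validity coefficient is simply the correlation between the new instrument and an established one.

import numpy as np

# Concurrent criterion-related evidence as a Pearson correlation:
# the new instrument vs. an established, valid one, given at about
# the same time. Scores below are invented for illustration.
new_test    = np.array([12, 15, 9, 20, 17, 11, 14])
established = np.array([48, 55, 40, 70, 62, 45, 52])
r = np.corrcoef(new_test, established)[0, 1]
print(f"concurrent validity coefficient r = {r:.2f}")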

Assessing validity is as much a function of how the test is used as of what the test intends to measure. If a test is administered incorrectly or to the wrong group, the results are invalidated.

				Tom of Bethany

"He that hath the Son hath life; and he that hath not the Son of God hath not life." (I John 5:12)

"And ye shall seek me, and find me, when ye shall search for me with all your heart." (Jeremiah 29:13)

 
