Quiz Over Chapter 4 of the 6th Edition
Reliability and Validity

The Meaning of Reliability and the Reliability Coefficient
What is a really good one-word synonym for "reliable"?
Reliability coefficients can extend anywhere between ___ and ___ .

Different Approaches to Reliability
Whereas the number produced by test-retest reliability is called the "coefficient of ______," the number produced by parallel-forms reliability is called the "coefficient of ______ ."
What two other terms are sometimes used in place of the term "parallel-forms reliability"?
Name three reliability procedures that assess a test's "internal consistency."
(T/F) In estimating a test's split-half reliability, half of the examinees respond to the odd-numbered items while the other half respond to the even-numbered items.
When the K-R #20 reliability method is used, examinees will be tested ___ time(s).
Who invented the reliability procedure that's often called "alpha" or "coefficient alpha"?
Which internal consistency reliability procedure--K-R #20 OR Cronbach's alpha--fits a situation where data are collected on a 5-point scale that goes from "Strongly Agree" to "Strongly Disagree"?
Which of the reliability procedures that focus on internal consistency makes use of the Spearman-Brown formula?
(T/F) If applied to the same right/wrong (i.e., 0 or 1) data from a test, the split-half reliability coefficient will always turn out the same as the K-R #20 reliability coefficient.

Interrater Reliability
The coefficient of concordance is symbolized by the letter ___ .
Interrater reliability will turn out equal to ____ if all raters are in full agreement.
What name is associated with the interrater reliability technique that produces a coefficient of concordance? This technique uses what kind of data?
What name is associated with the interrater reliability technique that produces a kappa coefficient? This technique uses what kind of data?
What do the letters ICC stand for?
(T/F) Although many techniques can be used to assess interrater reliability, Pearson's r is not one of them.

The Standard Error of Measurement
The size of the SEM is ____ (directly/indirectly) related to the amount of reliability present in the data.
If a student's score on a test is 82 and if the SEM = 4, that student's 68% confidence band would extend from __ to __. Now, change 68% to 95% and reanswer this question.
Which assessment of consistency--reliability OR SEM--is expressed "in" the same units as the scores around which confidence bands are built?

Warnings About Reliability
(T/F) High test-retest reliability implies high internal consistency reliability; conversely, low test-retest reliability implies low internal consistency reliability.
(T/F) Reliability "resides" in the measuring instrument itself, not in the scores obtained by using the measuring instrument.
Measures of internal consistency will be too ___ (high/low) if a test is administered under highly speeded conditions.
Which of these statements is better: (a) We estimated the reliability of our data. (b) We determined the reliability of our data.

Validity
What is a really good one-word synonym for "valid"?

The Relationship Between Reliability and Validity
(T/F) If reliability is very, very high . . . then validity must also be very, very high.
(T/F) If validity is very, very high . . . then reliability must also be very, very high.

Different Kinds of Validity
Content validity normally ____ (is/is not) expressed by means of a numerical coefficient.
The term "criterion-related validity" covers two approaches: predictive and _____.
A validity coefficient normally takes the form of a ______ . (a) mean (b) SD (c) correlation
Which of these is a construct: (a) Height (b) Hair color (c) Happiness (d) Date of birth
(T/F) To support the convergent and discriminant validity of a new test, correlation coefficients must turn out to be positive and negative in sign, respectively.
Which of the main validity procedures (content, criterion-related, or construct) is sometimes dealt with by the statistical technique of factor analysis?

Warnings About Reliability and Validity
(T/F) Reliability is a necessary but not sufficient condition for validity.
Where does the validity of a new test reside, in the test itself or in the scores produced by an administration of the test?
What might cause an honest researcher to claim that his/her test has high content validity when in fact it has very little content validity?
What might cause an honest researcher to claim that his/her test has high criterion-related validity when in fact it has very little criterion-related validity?

Two Final Comments
How high should reliability and validity coefficients be before we can confidently call them "high enough?" (a) .5 (b) .75 (c) .90 (d) .95 (e) It depends
(T/F) If a researcher conducts a study wherein the data are perfectly reliable and valid, it's still possible for the researcher's data-based conclusions to be utterly worthless, even if it's the case that an important research question was under investigation.
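
Study aid: a few of the quiz items involve small computations (confidence bands built from the SEM, the Spearman-Brown formula, and coefficient alpha). The Python sketch below is one possible way to try these out on made-up numbers. The function names and all data are hypothetical, chosen only for illustration. It assumes the common rule of thumb that a 68% band is the score plus or minus 1 SEM and a 95% band is roughly plus or minus 2 SEM (1.96 would be more exact).

```python
from statistics import pvariance


def confidence_band(score, sem, n_sem=1.0):
    """Return (low, high) = score -/+ n_sem * SEM.

    Assumption: n_sem=1 gives roughly a 68% band and n_sem=2 roughly a 95% band.
    """
    return (score - n_sem * sem, score + n_sem * sem)


def spearman_brown(r_half):
    """Step up the correlation between two test halves to a full-length estimate."""
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(rows):
    """Coefficient alpha for a list of rows, one row of item scores per examinee."""
    k = len(rows[0])  # number of items
    # Variance of each item (column) across examinees.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    # Variance of the examinees' total scores.
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical examinee: observed score 70 on a test with SEM = 3.
print(confidence_band(70, 3))           # roughly 68% band
print(confidence_band(70, 3, n_sem=2))  # roughly 95% band

# Hypothetical correlation of 0.5 between odd-item and even-item half scores.
print(round(spearman_brown(0.5), 3))

# Hypothetical responses of four examinees to three 5-point items.
likert = [[5, 4, 5], [3, 3, 4], [1, 2, 1], [4, 4, 5]]
print(round(cronbach_alpha(likert), 3))
```

The same `cronbach_alpha` function also accepts right/wrong (0 or 1) item scores, so it can be used to experiment with dichotomous data as well as rating-scale data.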
