IRT readings: Chapter 2: The new rules of measurement

Rule 1: The standard error of measurement (SEM)

Old Rule 1: The standard error of measurement applies to all scores in a particular population
New Rule 1: The standard error of measurement differs across scores but generalizes across populations.

In CTT the SEM is constant across scores, whereas in IRT it varies with score level.

CTT

Standard error of measurement for CTT:

$SEM = (1 - r_{tt})^{1/2} \sigma$

The estimated true score is a linear transformation of the raw score, and the confidence intervals are also represented as straight lines.
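A minimal sketch of the CTT calculation, showing that the band has the same width at every score. The function names and the example values (standard deviation 10, reliability 0.91) are hypothetical and chosen only for illustration; the band is centred on whatever score is passed in.

```python
import math


def ctt_sem(sd: float, reliability: float) -> float:
    """Classical SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(score: float, sd: float, reliability: float, z: float = 1.96):
    """Symmetric band around a score; the half-width does not depend on the score."""
    half_width = z * ctt_sem(sd, reliability)
    return score - half_width, score + half_width


# Hypothetical test: sd = 10, reliability = 0.91
for score in (20.0, 50.0, 80.0):
    low, high = confidence_band(score, sd=10.0, reliability=0.91)
    print(score, round(low, 2), round(high, 2))
```

Because the SEM depends only on the population standard deviation and the reliability, every score gets the same band width, which is what Old Rule 1 describes.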

IRT

The relationship between raw score and transformed score is non-linear. In addition, the confidence interval becomes wider at the extreme values.

Rule 2: Test length and reliability

Old Rule 2: Longer tests are more reliable than shorter tests
New Rule 2: Shorter tests can be more reliable than longer tests

CTT

Spearman-Brown prophecy formula:

Given that r_tt is the reliability of the original test and n is the number of parallel parts:

$r_{nn} = \frac{n r_{tt}}{1 + (n - 1) r_{tt}}$
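The Spearman-Brown prophecy formula, r_nn = n * r_tt / (1 + (n - 1) * r_tt), can be sketched in a few lines. The function name and the example reliability of 0.6 are hypothetical.

```python
def spearman_brown(r_tt: float, n: float) -> float:
    """Projected reliability when a test is lengthened by a factor n with parallel parts."""
    return n * r_tt / (1.0 + (n - 1.0) * r_tt)


# Hypothetical original reliability of 0.6
for n in (0.5, 1.0, 2.0, 4.0):
    print(n, round(spearman_brown(0.6, n), 3))
```

Under the formula, reliability can only go up as n increases, which is Old Rule 2. The formula holds only if the added parts are parallel to the original ones.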

An adaptive test, by nature, fails to meet the assumption of parallel parts because the test difficulties vary substantially.

Rule 3: Interchangeable Test Forms

Old Rule 3: Comparing test scores across multiple forms is optimal when test forms are parallel
New Rule 3: Comparing test scores across multiple forms is optimal when test difficulty levels vary between persons.

Gulliksen (1950) defined strict conditions for test parallelism in his exposition of CTT:

Equality of means and variances across test forms
Equality of covariance with external variables

Rule 4: Unbiased Assessment of Item Properties

Old Rule 4: Unbiased assessment of item properties depends on having representative samples
New Rule 4: Unbiased estimates of item properties may be obtained from unrepresentative samples

CTT

Item difficulty is the p-value, i.e., the proportion of examinees passing the item
Item discrimination is the item-total correlation (e.g., biserial correlation)
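A small sketch of the two CTT item statistics on a made-up response matrix. Note the assumptions: the correlation here is the plain Pearson (point-biserial) item-total correlation with the item included in the total, not the biserial correlation mentioned above, and the data are hypothetical.

```python
import math


def pearson(x, y) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_difficulty(responses):
    """p-value per item: proportion of persons scoring 1."""
    n_persons = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_persons for j in range(n_items)]


def item_total_correlation(responses):
    """Uncorrected item-total (point-biserial) correlation per item."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson([row[j] for row in responses], totals) for j in range(n_items)]


# Hypothetical data: rows are persons, columns are items (1 = pass, 0 = fail)
full_sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
high_scorers = full_sample[:2]  # an unrepresentative subsample

print(item_difficulty(full_sample))
print(item_difficulty(high_scorers))
print([round(r, 3) for r in item_total_correlation(full_sample)])
```

The same items get different p-values in the high-scoring subsample than in the full sample, which is the sample dependence that Old Rule 4 refers to.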

Rule 5: Establishing Meaningful Scale Scores

Old Rule 5: Test scores obtain meaning by comparing their position in a norm group
New Rule 5: Test scores obtain meaning by comparing their distance from items.

Rule 6: Establishing Scale Properties

Old Rule 6: Interval scale properties are achieved by obtaining normal score distributions
New Rule 6: Interval scale properties are achieved by applying justifiable measurement models

Rule 7: Mixing Item Formats

Old Rule 7: Mixed item formats lead to unbalanced impact on test total scores.
New Rule 7: Mixed item formats can yield optimal test scores.

CTT

Z score

Rule 8: The Meaning of Change Scores

Old Rule 8: Change scores cannot be meaningfully compared when initial score levels differ.
New Rule 8: Change scores can be meaningfully compared when initial score levels differ.

$X_{J,\text{Change}} = X_{J2} - X_{J1}$

Bereiter (1963) identified three fundamental problems with change scores:

Paradoxical reliabilities, such that the lower the pretest to posttest correlations, the higher the change score reliability
Spurious negative correlations between initial status and change (due to the subtraction)
Different meaning from different initial levels
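The first problem in the list above can be illustrated with the classical formula for the reliability of a difference score. This sketch assumes equal pretest and posttest variances, in which case the reliability is ((r_pre + r_post) / 2 - r_prepost) / (1 - r_prepost). The function name and the reliabilities of 0.8 are hypothetical.

```python
def change_score_reliability(r_pre: float, r_post: float, r_prepost: float) -> float:
    """Reliability of posttest minus pretest, assuming equal variances at both occasions."""
    return ((r_pre + r_post) / 2.0 - r_prepost) / (1.0 - r_prepost)


# Hypothetical pretest and posttest reliabilities of 0.8
for r_prepost in (0.2, 0.5, 0.7):
    print(r_prepost, round(change_score_reliability(0.8, 0.8, r_prepost), 3))
```

With the test reliabilities held fixed, a lower pretest-posttest correlation gives a higher change score reliability, which is the paradox.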

Rule 9: Factor Analysis of Binary Items

Old Rule 9: Factor analysis on binary items produces artifacts rather than factors
New Rule 9: Factor analysis on raw item data yields a full information factor analysis.

Phi correlation
Tetrachoric correlation
Full information factor analysis
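A small sketch of one reason the phi correlation can produce artifacts with binary items: its maximum is limited when the two items differ in difficulty. The 2x2 counts below are hypothetical, and the function name is made up for this note.

```python
import math


def phi_coefficient(both_pass: int, first_only: int, second_only: int, both_fail: int) -> float:
    """Phi correlation from the four cell counts of a 2x2 table of two binary items."""
    a, b, c, d = both_pass, first_only, second_only, both_fail
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Hypothetical counts: everyone who passes the hard item also passes the easy item,
# but the easy item is passed by 50% of persons and the hard item by only 20%.
print(phi_coefficient(both_pass=20, first_only=30, second_only=0, both_fail=50))

# Same perfect ordering with equal difficulties (50% pass each item)
print(phi_coefficient(both_pass=50, first_only=0, second_only=0, both_fail=50))
```

In the first table the items are perfectly ordered, yet phi stays well below 1 only because the difficulties differ. Factoring a matrix of such correlations can therefore pick up difficulty differences rather than content.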

Rule 10: Importance of Item Stimulus Features

Old Rule 10: Item stimulus features are unimportant compared to psychometric properties
New Rule 10: Item stimulus features can be directly related to psychometric properties.

This chapter should be at the end of the book.

Wednesday, February 4, 2009
