UPSC Mains | Psychology Paper I | 2012 | 30 Marks | 400 Words
Q9.

In which way is IRT an improvement over classical test theory? Compare the two approaches and critically evaluate Rasch's model of IRT.

How to Approach

This question requires a comparative analysis of Classical Test Theory (CTT) and Item Response Theory (IRT), culminating in a critical evaluation of Rasch's model within IRT. The answer should begin by defining both CTT and IRT, highlighting their core principles. A detailed comparison should focus on their assumptions, strengths, and weaknesses. The evaluation of Rasch's model should cover its specific characteristics, advantages, and limitations. Structure the answer into an introduction, a comparative body, a focused section on Rasch's model, and a conclusion.

Model Answer


Introduction

Psychometrics, the science of psychological measurement, relies heavily on robust statistical models to ensure the validity and reliability of tests. Classical Test Theory (CTT) has historically been the dominant paradigm, but in recent decades, Item Response Theory (IRT) has emerged as a powerful alternative. CTT focuses on the overall test score, while IRT focuses on the relationship between an individual's latent trait and their performance on each item. This shift represents a move from a test-centric to an item-centric approach, offering more nuanced and informative insights into test-taker abilities. This answer will compare and contrast these two approaches, with a specific focus on evaluating Rasch's model, a prominent example of IRT.

Classical Test Theory (CTT): A Test-Centric Approach

CTT assumes that an observed test score (X) is the sum of two components: a true score (T), representing the individual's actual standing on the attribute, and random error (E), so that X = T + E. Key quantities in CTT include test reliability (e.g., Cronbach's alpha, test-retest reliability) and item difficulty, defined as the proportion of test-takers answering an item correctly. CTT relies heavily on sample-dependent statistics, meaning that estimates of reliability and item difficulty are specific to the particular sample tested.
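
As a rough illustration (not part of the model answer, dummy data and illustrative variable names only), the sketch below computes the two workhorse CTT quantities named above, item difficulty as the proportion correct and Cronbach's alpha as an internal-consistency estimate of reliability; the sample-dependence of both should be kept in mind.

```python
# Minimal CTT sketch on illustrative 0/1 data (rows = test-takers, columns = items).
import numpy as np

responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
])

# Item difficulty in CTT: proportion of this sample answering correctly
# (a more able sample would yield higher p-values).
item_difficulty = responses.mean(axis=0)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total scores).
k = responses.shape[1]
item_vars = responses.var(axis=0, ddof=1)
total_var = responses.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print("item difficulty (p-values):", item_difficulty)
print("Cronbach's alpha:", round(alpha, 3))
```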

Item Response Theory (IRT): An Item-Centric Approach

IRT, in contrast, views test performance as a function of both the individual's latent trait (e.g., intelligence, anxiety) and the characteristics of the test item. IRT models estimate item parameters (difficulty, discrimination, guessing) that are assumed to be sample-independent. This means these parameters are believed to be consistent across different populations. IRT allows for the creation of adaptive tests, where items are selected based on the test-taker's performance, leading to more efficient and precise measurement.
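
For concreteness, the hedged sketch below shows the three-parameter logistic (3PL) item response function, in which the difficulty (b), discrimination (a), and guessing (c) parameters mentioned above jointly determine the probability of a correct response at a given trait level theta. The function name and parameter values are illustrative, not drawn from the source.

```python
# 3PL item response function: P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b))).
import math

def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model."""
    return c + (1 - c) * (1 / (1 + math.exp(-a * (theta - b))))

# Example: an item of average difficulty (b = 0), moderate discrimination (a = 1.2),
# and a guessing floor of 0.2 (roughly what a five-option multiple-choice item implies).
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(p_correct_3pl(theta, a=1.2, b=0.0, c=0.2), 3))
```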

Comparing CTT and IRT

The following table summarizes the key differences between CTT and IRT:

Feature | Classical Test Theory (CTT) | Item Response Theory (IRT)
Focus | Overall test score | Individual items and the latent trait
Error | A single random error term affecting the total score | Item-specific; precision varies with ability level
Item Parameters | Item difficulty (sample-dependent) | Difficulty, discrimination, guessing (sample-independent)
Sample Dependence | High | Low
Adaptive Testing | Not directly supported | Directly supported
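
To make the adaptive-testing row concrete, here is a deliberately simplified toy sketch (an illustration, not a description of any operational system; item names and the step rule are assumptions): each new item is chosen to match the current ability estimate, and the estimate is nudged up or down after each response. Real computerized adaptive tests use maximum-information item selection and maximum-likelihood or Bayesian ability estimation instead of a fixed step.

```python
# Toy adaptive-testing loop over a tiny item bank (item name -> difficulty in logits).
item_bank = {"i1": -1.5, "i2": -0.5, "i3": 0.0, "i4": 0.8, "i5": 1.6}

def next_item(ability: float, administered: set) -> str:
    """Pick the unused item whose difficulty is closest to the current ability estimate."""
    remaining = {name: b for name, b in item_bank.items() if name not in administered}
    return min(remaining, key=lambda name: abs(remaining[name] - ability))

ability, administered = 0.0, set()
for correct in (True, True, False):           # simulated responses
    item = next_item(ability, administered)
    administered.add(item)
    ability += 0.5 if correct else -0.5       # crude fixed-step update (stand-in for a real estimator)
    print(item, "->", "correct" if correct else "wrong", "| ability estimate:", ability)
```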

Rasch Model: A Core IRT Model

The Rasch model, developed by the Danish mathematician Georg Rasch, is a one-parameter logistic IRT model in which item difficulty is the only item parameter; discrimination is held constant across items and no guessing parameter is included. It posits that the probability of a correct response is determined solely by the difference between the test-taker's ability and the item's difficulty. A key principle of the Rasch model is specific objectivity: comparisons between persons do not depend on which items are used, and comparisons between items do not depend on which persons attempt them. Related "separation" indices describe how well the test spreads out individuals of different ability levels. When the data fit the model, it yields interval-level (logit) measurements, allowing meaningful comparisons of ability across individuals and items.
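
The Rasch item response function is compact enough to state directly. The short sketch below (illustrative values, assumed function name) evaluates P(correct) = exp(theta - b) / (1 + exp(theta - b)), showing that the probability depends only on the difference between ability and item difficulty.

```python
# Rasch (one-parameter logistic) item response function.
import math

def p_correct_rasch(theta: float, b: float) -> float:
    """P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1 / (1 + math.exp(-(theta - b)))

print(round(p_correct_rasch(theta=0.0, b=0.0), 3))   # ability equals difficulty -> 0.5
print(round(p_correct_rasch(theta=1.0, b=0.0), 3))   # ability 1 logit above -> ~0.73
print(round(p_correct_rasch(theta=-1.0, b=0.0), 3))  # ability 1 logit below -> ~0.27
```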

Strengths of the Rasch Model

  • Simplicity: Its single parameter makes it relatively easy to understand and implement.
  • Fundamental Measurement: It aims to provide a true interval (logit) scale of measurement, unlike CTT, whose raw scores are at best ordinal (a numerical illustration follows this list).
  • Sample Independence: Item difficulty estimates are less affected by the specific sample tested.
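
A small numerical check (an illustration, not from the source) of the fundamental-measurement claim above: because Rasch measures sit on a logit scale, a one-logit gain in ability multiplies the odds of success by the same factor (e, roughly 2.72) wherever it occurs on the scale.

```python
# Equal logit differences yield equal odds ratios under the Rasch model.
import math

def odds(theta: float, b: float) -> float:
    return math.exp(theta - b)   # odds of success for ability theta on an item of difficulty b

for theta in (-2.0, 0.0, 2.0):
    ratio = odds(theta + 1.0, b=0.0) / odds(theta, b=0.0)
    print(f"theta {theta:+.1f} -> +1 logit multiplies the odds by {ratio:.2f}")
```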

Limitations of the Rasch Model

  • Unidimensionality: The Rasch model assumes that the test measures a single underlying trait. Violations of this assumption can lead to inaccurate results.
  • Local Independence: Responses to items should be independent of each other, given the test-taker's ability. Item clustering or content overlap can violate this assumption.
  • Limited Information: The single-parameter model captures less about each item than more complex IRT models (e.g., 2PL, 3PL), which additionally estimate discrimination and guessing; the sketch after this list illustrates the contrast.
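
One way to see the "limited information" point, using the standard item information formula for logistic models (an aside, not from the source; function names are assumed): with discrimination fixed at a common value, every Rasch item contributes an information curve with the same peak height, whereas a 2PL item with higher discrimination can contribute far more information near its difficulty.

```python
# Item information under the 2PL: I(theta) = a^2 * P(theta) * (1 - P(theta)).
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    return 1 / (1 + math.exp(-a * (theta - b)))

def information(theta: float, a: float, b: float) -> float:
    p = p_2pl(theta, a, b)
    return a**2 * p * (1 - p)

print(round(information(theta=0.0, a=1.0, b=0.0), 3))  # Rasch-like item (a = 1): peak 0.25
print(round(information(theta=0.0, a=2.0, b=0.0), 3))  # high-discrimination 2PL item: peak 1.0
```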

Conclusion

In conclusion, IRT represents a significant advancement over CTT by shifting the focus from the test to the individual item and latent trait. Rasch's model, as a foundational IRT model, offers simplicity and the potential for fundamental measurement, but its limitations regarding unidimensionality and limited information necessitate careful consideration. While more complex IRT models address some of these limitations, the Rasch model remains a valuable tool for test development and analysis, particularly when the assumption of unidimensionality is reasonably met. The choice between CTT and IRT, and among different IRT models, depends on the specific research question and the characteristics of the test being used.

Answer Length

This is a comprehensive model answer for learning purposes and may exceed the word limit. In the exam, always adhere to the prescribed word count.

Additional Resources

Key Definitions

Latent Trait
A latent trait is a psychological construct that cannot be directly observed, but is inferred from observable behaviors or test responses. Examples include intelligence, anxiety, and personality.
Item Discrimination
Item discrimination refers to the extent to which an item differentiates between individuals with high and low levels of the latent trait. A highly discriminating item is answered correctly more often by those with high ability than by those with low ability.
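
As a hedged illustration (dummy data, not from the source), one common classical index of discrimination is the corrected item-total correlation computed below: how strongly an item's 0/1 score tracks the rest-of-test score. In IRT, the analogous quantity is the item's discrimination (a) parameter.

```python
# Corrected item-total (point-biserial) correlation for one item, on illustrative data.
import numpy as np

responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
])

item = 1                                              # index of the item being examined
rest_score = responses.sum(axis=1) - responses[:, item]  # total score excluding that item
r = np.corrcoef(responses[:, item], rest_score)[0, 1]
print("corrected item-total correlation:", round(r, 3))
```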

Key Statistics

In their standard treatment of the topic, Embretson and Reise (2000) argue that IRT models generally provide more accurate and efficient measurement than CTT, particularly for high-stakes testing.

Source: Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

According to Hambleton, Swaminathan, and Rogers (1991), IRT-based adaptive tests can reduce test length by about 50% while maintaining the same level of measurement precision as conventional CTT-based tests.

Source: Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of Item Response Theory. Newbury Park, CA: Sage Publications.

Examples

GRE Test

The Graduate Record Examinations (GRE) General Test uses IRT in its section-level adaptive design: the difficulty of the second Verbal Reasoning and Quantitative Reasoning sections is selected on the basis of performance on the first, yielding a more precise estimate of ability than a fixed-form test of the same length. (The Analytical Writing section, being essay-based, is not adaptive.)

Frequently Asked Questions

What is the role of sample size in IRT?

While IRT item parameters are theoretically sample-independent, large sample sizes are still crucial for obtaining stable and accurate estimates of these parameters. Smaller samples can lead to parameter instability and reduced precision.

Topics Covered

Psychometrics, Statistics, Test Construction, Item Analysis, Reliability, Validity