
Item purification versus adjustments for multiple comparisons in DIF detection

Publication at Central Library of Charles University, Faculty of Mathematics and Physics | 2017

Abstract

Most classical Differential Item Functioning (DIF) detection methods rely on the basic principle of testing for DIF one item after another, which can have a strong impact on DIF identification in terms of power, rejection rate, and Type I error. In an extensive simulation study we compared six scenarios of controlling the Type I error, including item purification and multiple comparison adjustment methods (Holm's procedure and the Benjamini-Hochberg procedure).
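
To illustrate how such adjustments operate on item-by-item DIF tests, the minimal sketch below applies Holm's and the Benjamini-Hochberg corrections to a vector of per-item p-values using statsmodels; the p-values shown are a hypothetical input for illustration, not results from the study.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical per-item p-values from some item-by-item DIF test
pvals = np.array([0.001, 0.012, 0.034, 0.20, 0.47, 0.65])

# Holm's step-down adjustment (controls the family-wise error rate)
reject_holm, p_holm, _, _ = multipletests(pvals, alpha=0.05, method="holm")

# Benjamini-Hochberg step-up adjustment (controls the false discovery rate)
reject_bh, p_bh, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

print("Holm flags:", reject_holm)
print("BH flags:  ", reject_bh)
```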

Three DIF detection methods were selected: the Mantel-Haenszel test, logistic regression, and Lord's chi-square test based on the 2PL IRT model. Examining empirical rates as well as fitting beta regression models, early results suggest that the effect of the correction methods differs across DIF detection methods.
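
As a concrete example of one of these item-by-item tests, the following is a minimal sketch of a logistic regression DIF test for a single binary item: a likelihood-ratio test comparing a model with the total score only against a model that also includes the group indicator (uniform DIF). The function name `lr_dif_pvalue` and the use of statsmodels are illustrative assumptions, not the implementation used in the study.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def lr_dif_pvalue(item_resp, total_score, group):
    """Likelihood-ratio test for uniform DIF on one binary item.
    item_resp: 0/1 responses; total_score: matching criterion;
    group: 0/1 reference/focal indicator."""
    X0 = sm.add_constant(np.column_stack([total_score]))         # score only
    X1 = sm.add_constant(np.column_stack([total_score, group]))  # score + group
    m0 = sm.Logit(item_resp, X0).fit(disp=0)
    m1 = sm.Logit(item_resp, X1).fit(disp=0)
    lr_stat = 2 * (m1.llf - m0.llf)   # likelihood-ratio statistic
    return chi2.sf(lr_stat, df=1)     # p-value, 1 df for the group effect
```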

Generally, the adjustment procedures decreased both rejection and power rates. In contrast, purification led to a slight increase in power, while rejection rates were largely unaffected.
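
For comparison with the adjustment procedures above, here is a minimal sketch of an item purification loop: items flagged as DIF are dropped from the matching score and all items are re-tested until the flagged set stabilises. The helper `dif_test` (for instance the `lr_dif_pvalue` sketch above) is an assumed interface returning one p-value per item call, not part of the original study code.

```python
import numpy as np

def purify(responses, group, dif_test, alpha=0.05, max_iter=10):
    """Iterative item purification: recompute the matching score without
    currently flagged items, re-test all items, and stop when the flagged
    set no longer changes (or after max_iter iterations)."""
    n_items = responses.shape[1]
    flagged = np.zeros(n_items, dtype=bool)
    pvals = np.ones(n_items)
    for _ in range(max_iter):
        score = responses[:, ~flagged].sum(axis=1)   # purified total score
        pvals = np.array([dif_test(responses[:, j], score, group)
                          for j in range(n_items)])
        new_flagged = pvals < alpha
        if np.array_equal(new_flagged, flagged):
            break
        flagged = new_flagged
    return flagged, pvals
```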

Our results can be summarized as recommendations on the circumstances under which item purification, multiple comparison correction methods, or their combination should be used.