Document Type: Research Paper

Authors

1 Psychometrics, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran

2 Statistics, Faculty of Mathematical Sciences and Computer, Allameh Tabataba'i University, Tehran, Iran

3 Psychometrics, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran

Abstract

Structural equation modeling (SEM) is a powerful multivariate statistical approach for assessing complex relationships among latent variables in many human and behavioral sciences. A common challenge in estimating structural equation models, which is based on hypothesis testing, is the presence of missing data. The usual way of handling missing data is to delete every subject with a missing value on any item (listwise deletion), which leads to biased estimators and discards a considerable amount of sample information as the percentage of missing values increases. When estimating SEM with missing values, one can instead apply the full information maximum likelihood (FIML) approach, which makes maximal use of all available data from every subject in the sample. In this paper, the performance of FIML is investigated in a simulation study under three missing-data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Two confirmatory factor analysis models are considered, with data generated under each of the three mechanisms, and the impact of two factors, sample size (100, 500) and percentage of missing values (2%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%), is evaluated using the root mean square error of approximation (RMSEA). The results show that SEM estimated with FIML generally performs better than SEM estimated without it in terms of goodness of fit.
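As a rough illustration of the design described in the abstract, the following Python sketch generates data from a one-factor model, imposes missingness on one item under each of the three mechanisms (missing completely at random, missing at random, missing not at random), and applies listwise deletion. It is not the authors' simulation code. The model (four items, loading 0.7), the deterministic "largest scores go missing" rule, and the function names `simulate_one_factor`, `impose_missing`, and `rmsea` are all assumptions made for this example. The RMSEA function uses one common textbook form; some software divides by N rather than N - 1. FIML estimation itself is not implemented here, since it would normally be done in SEM software.

```python
import numpy as np


def simulate_one_factor(n, rng, loading=0.7, n_items=4):
    """Draw n cases from a hypothetical one-factor model x_j = loading * eta + e_j."""
    eta = rng.standard_normal((n, 1))
    resid_sd = np.sqrt(1.0 - loading ** 2)  # keeps each item's variance at 1
    return loading * eta + resid_sd * rng.standard_normal((n, n_items))


def impose_missing(x, rate, mechanism, rng, target=0, predictor=1):
    """Return a copy of x with a share `rate` of the target item set to NaN.

    MCAR: missingness is unrelated to any variable.
    MAR:  missingness depends only on a fully observed item (predictor).
    MNAR: missingness depends on the unobserved value of the target itself.
    """
    n = x.shape[0]
    if mechanism == "MCAR":
        score = rng.random(n)
    elif mechanism == "MAR":
        score = x[:, predictor]
    elif mechanism == "MNAR":
        score = x[:, target]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    k = int(round(rate * n))
    drop = np.argsort(score)[n - k:]  # the k cases with the largest scores
    out = x.copy()
    out[drop, target] = np.nan
    return out


def rmsea(chi2, df, n):
    """RMSEA in one common form: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return float(np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1))))


rng = np.random.default_rng(0)
x = simulate_one_factor(500, rng)
for mech in ("MCAR", "MAR", "MNAR"):
    xm = impose_missing(x, 0.20, mech, rng)
    complete = ~np.isnan(xm).any(axis=1)  # rows kept by listwise deletion
    print(mech, "complete cases:", int(complete.sum()),
          "mean of item 1 after deletion:", round(float(np.nanmean(xm[:, 0])), 3))
```

Comparing the mean of the affected item across mechanisms gives a quick sense of why deletion can bias estimates when missingness depends on the data, whereas the number of complete cases depends only on the missing rate.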

Keywords

 