Allen, M. J., & Yen, W. M. (1375). Introduction to measurement theory (A. Delavar, Trans.). Tehran: SAMT.
Delavar, A. (1388). Applied probability and statistics in psychology and educational sciences. Tehran: Roshd Publications.
Delavar, A. (1384). Research methods in psychology and educational sciences. Tehran: Virayesh.
Thorndike, R. L. (1375). Applied psychometrics (H. A. Hooman, Trans.). Tehran: University of Tehran Press. (Original work published 1982).
Seif, A. A. (1385). Assessment, measurement, and evaluation. Tehran: Doran Publishing.
Sharifi, H. P. (1384). Principles of psychometrics and psychological testing. Tehran: Roshd Publications.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory (M. R. Falsafinejad, Trans., 1389). Tehran: Allameh Tabataba'i University Press.
Hooman, H. A. (1380). Psychological and educational measurement and test construction techniques. Tehran: Parsa Publishing.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
Braun, H. I., & Holland, P. W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 9-49). New York: Academic Press.
Brennan, R. L. (2010). Assumptions about true-scores and populations in equating. Measurement: Interdisciplinary Research and Perspectives, 8(1), 1-3.
Cui, Z., & Kolen, M. J. (2008). Comparison of parametric and nonparametric bootstrap methods for estimating random error in equipercentile equating. Applied Psychological Measurement, 32(4), 334-347.
Dorans, N. J. (1990). Equating methods and sampling designs. Applied Measurement in Education, 3, 3-17.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
Hanson, B. (2004, May 18). Equating Error. (Z. Cui, Ed.) Iowa City, IA, US: CASMA.
Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3-24.
Hanson, B. A., & Zeng, L. (revised by Cui, Z.) (2004). PIE. A computer program for IRT equating.
Harris, D. J., & Kolen, M. J. (1990). A comparison of two equipercentile equating methods for common item equating. Educational and Psychological Measurement, 50, 61-71.
Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 187-220). Westport, CT: Praeger.
Holland, P. W., & Thayer, D. T. (2000). Univariate and bivariate loglinear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25(2), 133-183.
Holland, P. W., Sinharay, S., von Davier, A. A., & Han, N. (2008). An approach to evaluating the missing data assumptions of the chain and post-stratification equating methods for the NEAT design. Journal of Educational Measurement, 45(1), 17-43.
Kendall, M., Stuart, A., & Ord, J. K. (1994). Kendall's Advanced Theory of Statistics, volume 2: Distribution Theory (6th ed.). A Hodder Arnold Publication.
Kim, D. I., Brennan, R. L., & Kolen, M. J. (2005). A comparison of IRT equating and beta 4 equating. Journal of Educational Measurement, 42(1), 77-99.
Kim, S., von Davier, A. A., & Haberman, S. (2008). Small-sample equating using a synthetic linking function. Journal of Educational Measurement, 45(4), 325-342.
Kim, S., Walker, M. E., & McHale, F. (2010). Comparisons among designs for equating mixed-format tests in large-scale assessments. Journal of Educational Measurement, 47(1), 36-53.
Klein, L. W., & Jarjoura, D. (1985). The importance of content representation for common-item equating with nonrandom groups. Journal of Educational Measurement, 22(3), 197-206.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York, NY: Springer-Verlag.
Kolen, M. J., & Harris, D. J. (1990). Comparison of item preequating and random groups equating using IRT and equipercentile methods. Journal of Educational Measurement, 27(1), 27-39.
Kolen, M. J., Hanson, B. A., & Brennan, R. L. (1992). Conditional standard errors of measurement for scale scores. Journal of Educational Measurement, 29, 285-307.
Livingston, S. A., & Kim, S. (2010). Random-groups equating with samples of 50 to 400 test takers. Journal of Educational Measurement, 47(2), 175-185.
Livingston, S. A., Dorans, N. J., & Wright, N. K. (1990). What combination of sampling and equating methods works best? Applied Measurement in Education, 3, 73-95.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Menlo Park, CA: Addison-Wesley.
Lord, F. M., & Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings.” Applied Psychological Measurement, 8(4), 453-461.
Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, norming, and equating. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 221-262). New York: Macmillan.
Phillips, S. E. (1985). Quantifying equating errors with item response theory methods. Applied Psychological Measurement, 9(1), 59-71.
Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer.
Rosenbaum, P. R., & Thayer, D. (1987). Smoothing the joint and marginal distributions of scored two-way contingency tables in test equating. British Journal of Mathematical and Statistical Psychology, 40, 43-49.
Sinharay, S., & Holland, P. W. (2007). Is it necessary to make anchor tests mini-versions of the tests being equated or can some restrictions be relaxed? Journal of Educational Measurement, 44(3), 249-275.
Tong, Y., & Kolen, M. J. (2005). Assessing equating results on different equating criteria. Applied Psychological Measurement, 29(6), 418-432.
van der Linden, W. J. (2005). Linear models for optimal test design. New York, NY: Springer-Verlag.
van der Linden, W. J. (2006a). Equating error in observed-score equating. Applied Psychological Measurement, 30(5), 355-378.
van der Linden, W. J. (2010). Local observed-score equating. In A. A. von Davier (Ed.), Statistical models for equating. New York: Springer.
van der Linden, W. J., & Wiberg, M. (2010). Local observed-score equating with anchor-test designs. Applied Psychological Measurement, 34(8), 620-640.
von Davier, A. A., Holland, P. W., Livingston, S. A., Casabianca, J., Grant, M. C., & Martin, K. (2006). An evaluation of the kernel equating method: A special study with pseudotests constructed from real test data. ETS Research Report. Princeton, NJ: Educational Testing Service.
Wang, T., Hanson, B. A., & Harris, D. J. (2000). The effectiveness of circular equating as a criterion for evaluating equating. Applied Psychological Measurement, 24(3), 195-210.
Wang, T., Lee, W.-C., Brennan, R. L., & Kolen, M. J. (2008). A comparison of the frequency estimation and chained equipercentile methods under the common-item nonequivalent groups design. Applied Psychological Measurement, 32(8), 632-651.
Zeng, L. & Hanson, B. (2005, Oct. 5). RAGE-RGEQUATE. (Z. Cui, Ed.) Iowa City, IA, US: CASMA.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items [Computer software and manual]. Chicago: Scientific Software International.