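National Organization for Educational Testing (1395). Statistical report of the 1395 national university entrance examination. Tehran: National Organization for Educational Testing Publications (Office of Planning and Statistics). [In Persian]
Naghizadeh, Sima (1394). Composite scoring of the national university entrance examination in the mathematical and technical sciences test group for 1391 based on the actual score distribution, and its comparison with the current method. Tehran: Center for Research on Evaluation, Accreditation and Quality Assurance of Higher Education (National Organization for Educational Testing). [In Persian]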
Allen, M. J., & Yen, W. M. (1979). Introduction to Measurement Theory. Monterey, CA: Brooks/Cole Publishing Company.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: APA.
Brennan, R. L., & Lee, W. C. (1999). Conditional scale-score standard errors of measurement under binomial and compound binomial assumptions. Educational and Psychological Measurement, 59(1), 5-24.
Brooks, G. P., & Johnson, G. A. (2003). TAP: Test Analysis Program. Applied Psychological Measurement, 27(4), 303-304.
Brooks, G. P., & Johnson, G. A. (2014). TAP: Test Analysis Program (Version 14.7.4) [Computer software]. Retrieved from http://www.ohio.edu/people/brooksg/software.htm
Chang, S. W. (2006). Methods in Scaling the Basic Competence Test. Educational and Psychological Measurement, 66(6), 907-929.
Dorans, N. J., Pommerich, M., & Holland, P. W. (2007). A Framework and History for Score Linking. In N. J. Dorans, M. Pommerich, & P. W. Holland (Eds.), Linking and Aligning Scores and Scales (pp. 5-30). New York: Springer.
Feldt, L. S., & Brennan, R. L. (1989). Reliability. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 105-146). New York, NY: Macmillan.
Feldt, L. S., & Qualls, A. L. (1996). Estimation of measurement error variance at specific score levels. Journal of Educational Measurement, 33, 141-156.
Gulliksen, H. (1950). Theory of mental tests. New York: John Wiley & Sons.
Haertel, E. H. (2006). Reliability. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 65-86). Westport, CT: American Council on Education and Praeger.
Iowa Assessment. (2016). Iowa Test of Basic Skills. Iowa City: Author. Retrieved from itp.education.uiowa.edu
Kolen, M. J., Hanson, B. A., & Brennan, R. L. (1992). Conditional standard errors of measurement of scale scores. Journal of Educational Measurement, 29, 285-307.
Kolen, M. J., & Hanson, B. A. (1989). Scaling the ACT Assessment. In R. L. Brennan (Ed.), Methodology used in scaling the ACT Assessment and P-ACT+ (pp. 35-55). Iowa City, IA: American College Testing Program.
Kolen, M. J. (1991). Smoothing methods for estimating test score distributions. Journal of Educational Measurement, 28, 257-282.
Kolen, M. J., & Brennan, R. L. (2004). Test Equating, Scaling and Linking (2nd ed.). New York: Springer.
Kolen, M. J., Wang, T., & Lee, W. C. (2012). Conditional Standard Errors of Measurement for Composite Scores Using IRT. International Journal of Testing, 12, 1-20.
Kolen, M. J., & Brennan, R. L. (2014). Test Equating, Scaling and Linking (3rd ed.). New York: Springer.
Lee, W. C., Brennan, R. L., & Kolen, M. J. (2000). Estimators of Conditional Scale-Score Standard Errors of Measurement: A Simulation Study. Journal of Educational Measurement, 37, 1-20.
Lord, F. M. (1965). A strong true-score theory with applications. Psychometrika, 30, 239-270.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Lord, F. M. (1969). Estimating true-score distributions in psychological testing (An empirical Bayes estimation problem). Psychometrika, 34, 259-299.
Mood, A. M., Graybill, F. A., & Boes, D. C. (2008). Introduction to the Theory of Statistics. CA: McGraw-Hill.
Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, norming, and equating. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 221-262). New York: American Council on Education and Macmillan.
The SAT. (2016). SAT technical manual. New York: Author. Retrieved from collegereadiness.collegeboard.org
The ACT. (2014). ACT assessment technical manual. Iowa City: Author. Retrieved from http://www.act.org/research/researchers/techmanuals.html
Woodruff, D., Traynor, A., Cui, Z., & Fang, Y. (2013). A Comparison of Three Methods for Computing Scale Score Conditional Standard Errors of Measurement. ACT Research Report Series, 2013 (7). ACT, Inc.