Pertanika Journal of Social Science and Humanities, Volume J, Issue J, January J
Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5-31. https://doi.org/10.1007/s11092-008-9068-5
Carrillo-de-la-Pena, M. T., & Perez, J. (2012). Continuous assessment improved academic achievement and satisfaction of psychology students in Spain. Teaching of Psychology, 39(1), 45-47. https://doi.org/10.1177/0098628311430312
Council of Europe. (2009). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.
Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2(3), 197-221. https://doi.org/10.1207/s15434311laq0203_2
Engelhard, G. (1994). Examining rater errors in the assessment of written composition with a many-faceted Rasch model. Journal of Educational Measurement, 31(2), 93-112. https://doi.org/10.1111/j.1745-3984.1994.tb00436.x
Erguvan, I. D., & Dunya, B. A. (2020). Analyzing rater severity in a freshman composition course using Many-Facet Rasch measurement. Language Testing in Asia, 10(1), 1-20. https://doi.org/10.1186/s40468-020-0098-3
Fahim, M., & Bijani, H. (2011). The effects of rater training on raters’ severity and bias in second language writing assessment. Iranian Journal of Language Testing, 1(1), 1-16.
Hack, S. (2019). How do examiners mark? An investigation of marking processes used in the assessment of extended written responses [Unpublished Doctoral dissertation]. University of Surrey.
Han, T., & Huang, J. (2017). Examining the impact of scoring methods on the institutional EFL writing assessment: A Turkish perspective. PASAA: Journal of Language Teaching and Learning in Thailand, 53, 112-147.
He, T. (2019). The impact of computers on marking behaviors and assessment: A many-facet Rasch measurement analysis of essays by EFL college students. SAGE Open, 9(2), 1-17. https://doi.org/10.1177/2158244019846692
Jiminez, C. E. (2015). Middle school students’ perceptions of fairness and trust in assessment scenarios [Doctoral dissertation]. University of South Carolina.
Kayapınar, U. (2014). Measuring essay assessment: Intra-rater and inter-rater reliability. Eurasian Journal of Educational Research, 57, 113-136. https://doi.org/10.14689/ejer.2014.57.2
Lang, W. S., & Wilkerson, J. R. (2008, February 7-10). Accuracy vs. validity, consistency vs. reliability, and fairness vs. absence of bias: A call for quality [Paper presentation]. Annual Meeting of the American Association of Colleges of Teacher Education (AACTE), New Orleans, LA.
Levey, D. (2020). Strategies and analyses of language and communication in multilingual and international contexts. Cambridge Scholars Publishing.
Linacre, J. M. (2014). A user guide to Facets, Rasch-model computer programs. Winsteps.com.
Mahshanian, A., & Shahnazari, M. (2020). The effect of raters’ fatigue on scoring EFL writing tasks. Indonesian Journal of Applied Linguistics, 10(1), 1-13. https://doi.org/10.17509/ijal.v10i1.24956
McNamara, T., Knoch, U., Fan, J., & Rossner, R. (2019). Fairness, justice & language assessment. Oxford University Press.
Meadows, M., & Billington, L. (2005). A review of the literature on marking reliability. National Assessment Agency.
Mikre, F. (2010). The roles of assessment in curriculum practice and enhancement of learning. Ethiopian Journal of Education and Sciences, 5(2), 101-114. https://doi.org/10.4314/ejesc.v5i2.65376
Morin, C., Black, B., Howard, E., & Holmes, S. D. (2018). A study of hard-to-mark responses: Why is there low mark agreement on some responses? Ofqual Publishing.
Nisbet, I., & Shaw, S. (2020). Is assessment fair? SAGE Publications Ltd.
Park, Y. S. (2011). Rater drift in constructed response scoring via latent class signal detection theory and item response theory [Doctoral dissertation]. Columbia University.
Prieto, G., & Nieto, E. (2014). Analysis of rater severity on written expression exam using many-faceted Rasch measurement. Psicológica, 35, 385-397.
Sundqvist, P., Sandlund, E., Skar, G. B., & Tengberg, M. (2020). Effects of rater training on the assessment of L2 English oral proficiency. Nordic Journal of Modern Language Methodology, 8(10), 3-29. https://doi.org/10.46364/njmlm.v8i1.605
Tierney, R. D. (2016). Fairness in educational assessment. In M. A. Peters (Ed.), Encyclopedia of Educational Philosophy and Theory (pp. 1-6). Springer Science+Business Media. https://doi.org/10.1007/978-981-287-532-7_400-1
Walde, G. S. (2016). Assessment of the implementation of continuous assessment: The case of Mettu University. European Journal of Science and Mathematics Education, 4(4), 534-544. https://doi.org/10.30935/scimath/9492
Willey, K., & Gardner, A. (2010, November 18-19). Improving the standard and consistency of multi-tutor grading in large classes [Paper presented]. ATN Assessment Conference 2010. University of Technology Sydney, Australia.
Yan, X. (2014). An examination of rater performance on a local oral English proficiency test: A mixed-methods approach. Language Testing, 31(4), 501-527. https://doi.org/10.1177/0265532214536171
ISSN 0128-7702
e-ISSN 2231-8534