Wednesday, March 11, 2026

Raw Scores vs. Criterion-Referenced Evaluation vs. Norm-Referenced Evaluation

In educational assessment, the choice between raw scores, criterion-referenced evaluation, and norm-referenced evaluation depends on the instructor's objectives, available resources, and the need for explainability.

Raw Scores

Raw scores are the direct numerical values measured during an assessment before any interpretation or conversion into grades occurs.

  • Pros: They provide a precise, granular measurement of an individual's performance on a specific task without the potential bias introduced by grading algorithms.
  • Cons: Raw scores alone are hard to interpret. They do not tell learners or instructors what level of competence was reached or what needs to improve.

Criterion-Referenced Evaluation

This scheme translates performance into absolute rating labels (e.g., Excellent, A) based on a predetermined rubric or fixed standard.

  • Pros: It ensures that grades reflect a student's mastery of specific content regardless of how their peers perform. It provides a clear, absolute standard that is often easier for stakeholders to understand.
  • Cons: It works best when the examination covers every content topic, which typically means significantly longer exams and more resources for checking answers. It is difficult to apply when the assessment is not comprehensive.
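
As a rough illustration of the fixed-standard idea, the sketch below maps a raw score to a label using predetermined cutoffs. The cutoff values, the labels, and the function name `criterion_grade` are assumptions made for this example only, not a recommended rubric.

```python
# Hypothetical cutoffs (minimum score, label), listed from highest to lowest.
# A real rubric would define its own values.
CUTOFFS = [(80.0, "A"), (70.0, "B"), (60.0, "C"), (50.0, "D")]


def criterion_grade(score, cutoffs=CUTOFFS, fail_label="F"):
    """Return the label of the first cutoff that the score meets."""
    for minimum, label in cutoffs:
        if score >= minimum:
            return label
    return fail_label


# Each grade depends only on the student's own score, never on peers.
scores = {"s1": 85, "s2": 79.5, "s3": 42}
grades = {name: criterion_grade(s) for name, s in scores.items()}
print(grades)
```

Because the cutoffs are fixed before the exam, a score of 79.5 maps to the same label no matter how the rest of the class performs.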

Norm-Referenced Evaluation

This scheme converts scores into relative ranking labels by comparing an individual’s performance to the performance of their peers.

  • Pros: It is highly efficient for large classes or courses where instructors must meet strict time constraints and save on grading resources. It is the preferred "choice of necessity" when exams cannot comprehensively assess all topics due to limited resources. It inherently reflects the relative quality of performance within a specific group.
  • Cons: It can be difficult to explain the reasoning behind grade boundaries, leading to disputes between learners and instructors when scores are close but result in different grades. Because it lacks predefined absolute criteria, it is more susceptible to bias and concerns regarding fairness.
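
One common way to implement relative grading is to standardize each score against the group mean and standard deviation (a z-score) and assign labels by z-score band. The sketch below takes that approach. The band boundaries, labels, and the function name `norm_grades` are assumptions for illustration; other norm-referenced schemes, such as percentile ranks or fixed quotas per grade, are equally possible.

```python
from statistics import mean, pstdev

# Hypothetical z-score bands (minimum z, label), highest first.
Z_BOUNDARIES = [(1.0, "A"), (0.0, "B"), (-1.0, "C")]


def norm_grades(scores, boundaries=Z_BOUNDARIES, lowest="D"):
    """Grade each score relative to the group's mean and spread."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the group
    grades = {}
    for name, score in scores.items():
        # If everyone has the same score, treat each z-score as 0.
        z = 0.0 if sigma == 0 else (score - mu) / sigma
        grades[name] = next(
            (label for min_z, label in boundaries if z >= min_z), lowest
        )
    return grades


# Two students one point apart fall on opposite sides of the group mean.
class_a = {"s1": 90, "s2": 70, "s3": 69, "s4": 50}
print(norm_grades(class_a))

# The same raw score of 70 earns a different label in a stronger group.
class_b = {"a": 70, "b": 95, "c": 90, "d": 85}
print(norm_grades(class_b)["a"])
```

In `class_a`, the scores 70 and 69 receive different labels because they sit just above and just below the group mean, which is the kind of boundary that is hard to justify to learners. In `class_b`, the same raw score of 70 receives the lowest label because the peers scored higher.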

Comparison Summary

Feature        | Raw Scores              | Criterion-Referenced  | Norm-Referenced
Primary Focus  | Numerical data          | Mastery of content    | Relative ranking
Standards      | None                    | Absolute/Predefined   | Relative/Group-based
Best Use Case  | Initial data collection | Certifying competency | Large-scale ranking
Major Drawback | Lack of context         | Resource intensive    | Hard to justify boundaries