Friday, March 13, 2026 (B.E. 2569)

Minimum-cost flow

Minimum-cost flow is an optimization problem that finds the cheapest way to send a specific amount of "flow" (material, data, or goods) from source nodes to sink nodes through a network. It minimizes total transportation costs, ensuring flow does not exceed edge capacities.

Key Aspects of Min-Cost Flow:
  • Capacity Constraints: Every edge has a maximum capacity that
     cannot be exceeded.
  • Cost per Unit: Each unit of flow on an edge has an associated cost.
  • Flow Conservation: For every node, flow in must equal flow out, except for supply nodes (sources) and demand nodes (sinks).
  • Goal: Minimize the total cost, i.e., the sum over all edges of cost(e) × flow(e), while satisfying all supply/demand needs.
Common Applications:
  • Logistics & Supply Chain: Shipping goods from factories to consumers at minimum cost.
  • Telecommunications: Routing data packets to reduce latency and maximize bandwidth utilization.
  • Energy Distribution: Transporting electricity or liquids through pipelines.
  • Assignment Problems: Matching tasks to workers efficiently.
Solving Techniques:
The problem is typically solved with variations of the Successive Shortest Path algorithm, which uses Bellman-Ford (or Dijkstra's algorithm with potentials) to find the cheapest augmenting path. In practice it is often handled by off-the-shelf solvers such as Gurobi or Google OR-Tools.
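As an illustration, here is a minimal successive-shortest-path sketch in plain Python, using Bellman-Ford on the residual graph. It assumes no negative-cost cycles in the input; the function name and the edge-list format are choices for this sketch, not from any particular library:

```python
def min_cost_flow(n, edges, s, t, target):
    """Successive Shortest Path min-cost flow.

    n: number of nodes; edges: list of (u, v, capacity, cost);
    s, t: source and sink; target: amount of flow to send.
    Returns (flow_sent, total_cost).
    """
    INF = float("inf")
    # Residual graph: each entry is [to, residual_cap, cost, rev_index].
    graph = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])  # residual edge

    flow_sent, total_cost = 0, 0
    while flow_sent < target:
        # Bellman-Ford: cheapest path from s in the residual graph.
        dist = [INF] * n
        dist[s] = 0
        parent = [None] * n  # (prev_node, edge_index) along the path
        updated = True
        while updated:
            updated = False
            for u in range(n):
                if dist[u] == INF:
                    continue
                for i, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        parent[v] = (u, i)
                        updated = True
        if dist[t] == INF:
            break  # no augmenting path left: demand cannot be fully met
        # Bottleneck capacity along the cheapest path.
        push, v = target - flow_sent, t
        while v != s:
            u, i = parent[v]
            push = min(push, graph[u][i][1])
            v = u
        # Augment: reduce forward capacities, grow residual ones.
        v = t
        while v != s:
            u, i = parent[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        flow_sent += push
        total_cost += push * dist[t]
    return flow_sent, total_cost
```

For example, with two paths from node 0 to node 3 (one costing 2 per unit, one costing 4 per unit, each with capacity 2), sending 3 units fills the cheap path first and tops up with the expensive one.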

Wednesday, March 11, 2026 (B.E. 2569)

Raw scores vs Criterion-referenced evaluation vs Norm-referenced evaluation

In human performance assessment, the choice between raw scores, criterion-referenced evaluation, and norm-referenced evaluation depends on the instructor's objectives, available resources, and the need for explainability.

Raw Scores

Raw scores are the direct numerical values measured during an assessment before any interpretation or conversion into grades occurs.

  • Pros: They provide a precise, granular measurement of an individual's performance on a specific task without the potential bias introduced by grading algorithms.
  • Cons: Raw scores alone are difficult to interpret, as they do not inherently tell learners or instructors the actual level of learning competence or the improvements needed. For example, a raw score of 50/100 could mean either excellent performance on a very difficult test or poor performance on an easy one.

Criterion-Referenced Evaluation

This scheme translates performance into absolute rating labels (e.g., Excellent, A) based on a predetermined rubric or fixed standard.

  • Pros: It ensures that grades reflect a student's mastery of specific content regardless of how their peers perform. It provides a clear, absolute standard that is often easier for stakeholders to understand.
  • Cons: It is most suitable for examinations that cover all content topics, which typically requires significantly longer exam-taking times and more resources for checking answers. It can be difficult to apply if the assessment is not comprehensive. 

Norm-Referenced Evaluation

This scheme converts scores into relative ranking labels by comparing an individual’s performance to the performance of their peers.

  • Pros: It is highly efficient for large classes or courses where instructors must meet strict time constraints and save on grading resources. It is the preferred "choice of necessity" when exams cannot comprehensively assess all topics due to limited resources. It inherently reflects the relative quality of performance within a specific group.
  • Cons: It can be difficult to explain the reasoning behind grade boundaries, leading to disputes between learners and instructors when scores are close but result in different grades. Because it lacks predefined absolute criteria, it is more susceptible to bias and concerns regarding fairness.
In both schemes, labeling or grading implies an interpretation of the raw score.
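One way to make the relative comparison concrete is a percentile rank: the percentage of peers scoring strictly below a given learner. A minimal sketch (the function name and the strictly-below, exclude-self convention are assumptions, not a standard; it needs at least two scores):

```python
def percentile_ranks(scores):
    """Percent of peers (excluding self) scoring strictly below each score."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / (n - 1) for s in scores]
```

For scores [50, 80, 90] this yields ranks [0.0, 50.0, 100.0]: the middle learner outscored half of their peers.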

Comparison Summary

Feature | Raw Scores | Criterion-Referenced Grading | Norm-Referenced Grading
Primary Focus | Direct measurement without interpretation | Interpretation as mastery of content | Interpretation as relative ranking
Standards | None | Absolute/predefined | Relative/group-based
Best Use Case | Raw ranking (e.g., the TCAS exam) | Certifying competency | Large-scale ranking
Major Drawback | Lack of context | Resource intensive | Hard to justify boundaries

Another problem with criterion-referenced grading is that someone may not agree with your criteria: why must a score be 80 points or higher to earn an A? Why does F have the widest score range, 0–50?

The following points explain why these criteria can be problematic and how they contrast with the alternative methods discussed in the sources:

  • Fixed Percent Ranges: Criterion-referenced grading typically maps a learning score to a predefined percent range for a specific grade (e.g., 80% for an A). This means the standards are set before the assessment begins and do not change regardless of the actual distribution of student performance.
  • Lack of Explainable Discrimination: A core difficulty with fixed boundaries is the "explainability" of the grade. In these systems, a student scoring just below a threshold (like 79 vs. 80) may receive a different grade without a data-driven justification for that specific cut-off. The sources suggest that it is difficult for instructors to resolve disputes when learners score contiguously but fall into different predefined boundaries.
  • Arbitrary Nature of Absolute Standards: Because these criteria are absolute, they may not reflect the relative quality of an individual’s performance compared to their peers. If an exam is "overly difficult" or "too easy," all learners with similar scores might receive the same grade (say, a C), which fails to differentiate the true learning competence of the group.
  • Contrast with Data-Driven Gaps: To address the problem of arbitrary cut-offs, the sources propose norm-referenced heuristic methods like the Widest-Gap-First algorithm. Instead of using a predefined number like 80, this method identifies the widest score gaps in the actual data to define boundaries. This provides a "simple and clear-cut justification": a student receives a certain grade because their score is closer to others in that group than to the group above.
  • Fairness Concerns: When unique grade symbols represent unequal score intervals (such as F covering 0–50 while A covers only 80–100), it can be seen as providing unequal chances for students to receive certain grades. The sources note that "fair" grading should ideally maintain uniform intervals or use widest score gaps to prevent two learners with similar competence from receiving different grades.
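The widest-gap idea can be sketched as follows. This is a simple interpretation of the heuristic; the exact tie-breaking and boundary placement in the cited Widest-Gap-First method may differ (here, ties between equally wide gaps favor the higher gap, and each boundary sits at the gap's midpoint):

```python
from bisect import bisect_right

def widest_gap_boundaries(scores, num_grades):
    """Place num_grades - 1 cut points at the widest gaps between
    consecutive distinct scores, each boundary at the gap's midpoint."""
    distinct = sorted(set(scores))
    gaps = [(distinct[i + 1] - distinct[i], (distinct[i] + distinct[i + 1]) / 2)
            for i in range(len(distinct) - 1)]
    widest = sorted(gaps, reverse=True)[:num_grades - 1]
    return sorted(mid for _, mid in widest)  # ascending cut points

def assign_grades(scores, boundaries, labels):
    """labels ordered worst to best; len(labels) == len(boundaries) + 1."""
    return [labels[bisect_right(boundaries, s)] for s in scores]
```

For scores [95, 92, 70, 68, 40] and three grades, the two widest gaps lie between 40 and 68 and between 70 and 92, so the grouping (A: 92–95, B: 68–70, C: 40) follows directly from the data rather than from a preset threshold.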

Stanine

In educational assessment, a stanine (short for STAndard NINE) is a method of scaling test scores on a nine-point standard scale with a mean of 5 and a standard deviation of 2.

It is designed to simplify the interpretation of test results by grouping scores into broad categories rather than looking at precise raw scores or percentiles.

How Stanines Work

The scale converts a normal distribution of scores into nine units. Because it follows a bell curve, most students fall into the middle stanines (4, 5, and 6), while very few fall into the extreme ends (1 or 9).

Stanine | Percentage of Cases | Performance Level
9 | 4% | Highest (Top)
8 | 7% | Well Above Average
7 | 12% | Above Average
6 | 17% | High Average
5 | 20% | Average
4 | 17% | Low Average
3 | 12% | Below Average
2 | 7% | Well Below Average
1 | 4% | Lowest (Bottom)
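The percentage bands above can be turned into a lookup from percentile rank to stanine by accumulating the band widths from the bottom up. A small sketch (function and constant names are assumptions for illustration):

```python
# Band widths from the table above, ordered from stanine 1 up to stanine 9.
STANINE_BANDS = [4, 7, 12, 17, 20, 17, 12, 7, 4]

def stanine_from_percentile(p):
    """Map a percentile rank in (0, 100] to a stanine via cumulative bands."""
    cumulative = 0
    for s, width in enumerate(STANINE_BANDS, start=1):
        cumulative += width
        if p <= cumulative:
            return s
    return 9  # guard for p > 100
```

A learner at the 50th percentile lands in stanine 5, while only the top 4% reach stanine 9.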

Key Characteristics

  • Coarseness: Because it only has nine points, it "smooths out" small, insignificant differences between students. For example, two students with slightly different raw scores might both be a "Stanine 6," preventing over-interpretation of minor score gaps.

  • Comparison: It allows educators to compare a student’s performance across different subjects (e.g., comparing a Stanine 7 in Math to a Stanine 5 in Reading) using a single, unified metric.

  • Simplicity: It is often easier for parents and students to understand than complex z-scores or T-scores.

Mathematical Context

If you are working with standard normal distributions, the stanine (S) can be calculated from a z-score (z) using the following linear transformation:

S = 2z + 5

The result is then rounded to the nearest whole number and clamped to the range 1–9.
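The transformation, rounding, and clamping fit in one line (the function name is an assumption; Python's `round` uses banker's rounding at exact halves, which a stricter implementation might replace):

```python
def stanine(z):
    """S = 2z + 5, rounded to the nearest integer and clamped to [1, 9]."""
    return min(9, max(1, round(2 * z + 5)))
```

For instance, a z-score of 0 (the mean) maps to stanine 5, and anything beyond two standard deviations is pinned to 1 or 9.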

Use in Pedagogy

In the context of Outcome-Based Education (OBE) or curriculum design, stanines are frequently used to identify groups of students who may need additional support or advanced enrichment, as they provide a clear snapshot of where a student sits relative to a peer group.


--Gemini