Benchmarking involves measuring and comparing the performance of systems, models, or processes against a standard or across different systems. This process helps to evaluate how well a model or system performs in relation to known standards or to other models. In machine learning, the benchmark is usually set by the performance of the baseline or other leading models.
Benchmarking uses baselines: The baseline serves as the initial point of comparison in benchmarking. When evaluating a model or system, the baseline provides the first performance standard. If a model performs better than the baseline, it’s an indicator that the model has some value, and further benchmarking against other models can help assess its true effectiveness.
Baseline establishes expectations: Without a baseline, benchmarking would lack a clear starting point. By defining what "acceptable" or "expected" performance looks like, the baseline enables meaningful benchmarking comparisons.