## Introduction

The measured values of biological quantities provided by clinical laboratories supply essential information that guides clinical orientation, optimises the patient's healthcare process, and leads to appropriate therapeutic, diagnostic, or healthcare actions. Therefore, these values must be reliable (accurate) and comparable with those obtained at other times and in other places (traceable) (*1*, *2*).

Metrological concepts such as measurement uncertainty (MU) and metrological traceability (MT) make it possible to know the degree of accuracy of the measured values that a clinical laboratory provides, and the comparability or transferability of these results over time and space. These two concepts are now so important that the estimation of MU and knowledge of MT are required of clinical laboratories accredited to ISO 15189:2012 (*3*).

Measurement uncertainty complements a measured value of a biological quantity, indicating the magnitude of the doubt about this value and providing a quantitative indication of its quality and reliability (*4*). Nowadays, there are two main approaches for estimating MU: so-called *bottom-up* and *top-down*. The *bottom-up* approach is based on a comprehensive categorisation of the measurement where each potential uncertainty source is identified and quantified. The estimates of uncertainty expressed as standard deviations (standard uncertainties) are assigned to individual components of the procedure, which are then mathematically combined using propagation rules to provide a combined standard uncertainty. Finally, an expanded uncertainty is estimated, multiplying the combined uncertainty by an appropriate coverage factor (*1*).

Conversely, the *top-down* approach considers uncertainty as a whole. First, the most significant uncertainty sources are identified and grouped. Then, their standard uncertainties are estimated using available information on the performance of laboratory tests, such as measurement procedure validation or verification data, and intra-laboratory or inter-laboratory data (*e.g.* internal and external quality control data). Subsequently, the combined uncertainty is obtained from the standard uncertainties and, finally, the expanded uncertainty is estimated (*1*).

Furthermore, MT is defined as the property of a measurement result whereby the result can be related to a reference through a documented unbroken chain of calibrations, each contributing to the measurement uncertainty (*4*). In other words, to achieve comparability of results over space and time, it is essential to link all the individual measurement results to some common reference. In this way, results can be compared through their relationship to that reference. Ideally, this reference should be an International System (SI) unit of measurement materialised by a primary reference measurement procedure and a primary measurement standard (*5*).

At present, there are several guidelines and publications that clinical laboratories could use to estimate the MU, but there is still no consensus on how they should calculate it (*6*-*18*). By contrast, there does seem to be agreement on how laboratories should describe the MT of the measurement results they provide. Nevertheless, only a small number of clinical laboratories know the MT of their results, even though they are aware of the lack of comparability that currently exists for patients' results (*13*, *17*-*19*). Thus, to facilitate the task of clinical laboratories, this review proposes a way to estimate the measurement uncertainty in clinical laboratories, with examples that show what information and which formulae they should use to calculate the MU. Practical suggestions are also provided to enable laboratories to describe the MT of their results.

## Measurement uncertainty estimation

The *top-down* approach is particularly well suited to measuring systems commonly encountered in clinical laboratories. So, the MU should be estimated using this approach and taking into account the following steps (*6*):

### Specification of the measurand

A measurand is defined as the quantity intended to be measured (*4*). So, the measurand must be unequivocally defined and the measurement procedure used must be exhaustively detailed; otherwise, an insufficient specification of the measurand may itself be a significant uncertainty source (definitional or intrinsic uncertainty) that could be difficult to estimate. To specify the measurand, it is necessary to include at least the following information (*6*, *20*):

- The name of the biological system containing the component (analyte), *e.g.* blood, plasma, serum, urine, *etc.*
- The name of the biological component (so-called analyte), *e.g.* glucose, sodium ion, sirolimus, troponin T, *etc.*
- The kind-of-property, *e.g.* substance concentration, mass concentration, number concentration, catalytic concentration, *etc.*
- The measurement unit, *e.g.* mmol/L, mg/L, entities/L, µkat/L, *etc.*

Sometimes, it is also necessary to include additional information such as the measuring system (or the measurement method or the measurement principle) used to measure the quantity, and the conditions under which the measurements are performed (*e.g.*, the temperature for enzymes).
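As a minimal sketch, the elements of a measurand specification listed above could be held in a small record. The class name `Measurand`, its field names, and the output format of `describe` are hypothetical choices made for illustration, not a standard notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurand:
    """Hypothetical record holding the minimum information that specifies a measurand."""
    system: str                 # biological system, e.g. plasma
    component: str              # analyte, e.g. glucose
    kind_of_property: str       # e.g. substance concentration
    unit: str                   # e.g. mmol/L
    extra: Optional[str] = None  # optional details such as measuring system or temperature

    def describe(self) -> str:
        # Illustrative text layout only; it does not follow any formal syntax.
        text = f"{self.system}, {self.component}; {self.kind_of_property} ({self.unit})"
        if self.extra:
            text += f" [{self.extra}]"
        return text


glucose = Measurand("plasma", "glucose", "substance concentration", "mmol/L")
print(glucose.describe())
```

Keeping the four mandatory fields separate from the optional `extra` field mirrors the distinction made above between the minimum specification and the additional information that is only sometimes needed.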

### Identification of the uncertainty sources

According to different clinical laboratory guidelines, the most significant contributions to the overall MU are captured by the uncertainties related to the assigned value of the end-user calibrator (*u*_{cal}), the long-term intermediate precision (*u*_{Rw}), and the bias (*u*_{b}) (*6*, *7*). Thus, it would be sufficient for clinical laboratories to consider only these three uncertainty sources to provide reasonable estimates of MU that help ensure that patient results are fit for medical use.

### Estimation of the standard uncertainties

#### Uncertainty related to the assigned value of the end-user calibrator

A correct estimate of MU is not possible without the *u*_{cal} because it includes all the uncertainty contributions accumulated across the entire traceability chain of a measurement result (*13*). Thus, clinical laboratories should include the *u*_{cal} in the uncertainty budget when they estimate the MU. The *in vitro* diagnostic (IVD) manufacturers are required to comply with European Regulation 2017/746 on *in vitro* diagnostic medical devices and must provide this information to clinical laboratories (*21*). Usually, manufacturers present this information as the calibration material assigned value (x_{cal}) together with its expanded uncertainty (*U*_{cal} or %*U*_{rel(cal)}) using a coverage factor (*k*) equal to 2. So, the *u*_{cal} can be obtained as:

$$u_{cal} = \frac{U_{cal}}{k} \tag{1}$$

or

$$u_{cal} = \frac{\%U_{rel(cal)} \cdot x_{cal}}{100 \cdot k} \tag{2}$$

Instead, when clinical laboratories prepare their calibration materials, they are entirely responsible for estimating the *u*_{cal}. In these cases, the *u*_{cal} can be calculated taking into account all information used to prepare the calibration materials, and statistically combining the uncertainties associated with each one of the sequential value assignment steps utilising the *law for the propagation of uncertainty* (*8*, *9*, *22*).
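The conversion from a manufacturer's expanded uncertainty to the standard uncertainty *u*_{cal} can be sketched as follows. The function names and the numerical values are hypothetical, and the second function assumes that the relative expanded uncertainty is expressed as a percentage of the assigned value x_{cal}.

```python
def u_cal_from_expanded(U_cal: float, k: float = 2.0) -> float:
    """Standard uncertainty from an absolute expanded uncertainty and its coverage factor."""
    return U_cal / k


def u_cal_from_relative(U_rel_percent: float, x_cal: float, k: float = 2.0) -> float:
    """Standard uncertainty from a percent relative expanded uncertainty.

    Assumption: U_rel_percent is a percentage of the assigned value x_cal.
    """
    return (U_rel_percent / 100.0) * x_cal / k


# Hypothetical calibrator: assigned value 5.0 mmol/L, U = 0.1 mmol/L (that is, 2 %), k = 2
print(u_cal_from_expanded(0.1))       # absolute route
print(u_cal_from_relative(2.0, 5.0))  # relative route, same calibrator
```

Both routes should give the same standard uncertainty when the absolute and relative expanded uncertainties describe the same calibrator.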

#### Uncertainty related to the long-term intermediate imprecision

Most of the components of the MU are included in the long-term intermediate imprecision. This imprecision can be calculated from internal quality control (IQC) data (*6*).

When clinical laboratories estimate the *u*_{Rw}, there are different considerations that they should take into account (*6*, *7*, *13*):

- The IQC materials used for estimating the *u*_{Rw} should comply with specific attributes or characteristics. For example, the materials should be commutable and different from those used to check the correct alignment of the measuring systems.
- The IQC material data must be collected over a sufficiently long time interval to reflect most of the sources of variability influencing the measurement process.
- Different IQC material levels with mean values close to important medical decision limits should be used to know the *u*_{Rw} behaviour across the measuring interval of the measuring systems.
- A precision study (*e.g.* comparing variances using the *F*-test) of representative human samples and IQC materials should be performed to verify that the magnitude of imprecision for both materials is similar. An example of how to assess this type of study was published by Fuentes-Arderiu *et al.* (*23*).
- In clinical laboratories, it is common to measure a biological quantity interchangeably with more than one identical measuring system (or different modules of the same measuring system). Therefore, it would be advisable to obtain an estimate of MU that includes the variation across all measuring systems.
- To avoid the effect of IQC material lot changes on the uncertainty estimate, as well as for practical reasons, the use of a single IQC material lot during the estimation study would be advisable (*6*, *8*).

As far as possible, clinical laboratories should comply with most of these considerations to perform an adequate *u*_{Rw} estimation, as well as to avoid a possible over-estimate of the *u*_{Rw}.

When only one calibrator lot, one IQC lot, and a single measuring system are used during a specified time interval, the *u*_{Rw} can be calculated as the classical standard deviation (s):

$$u_{Rw} = s = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}}{n - 1}} \tag{3}$$

where x_{i} represents the IQC values obtained in a specified time interval, n the number of IQC replicate measurements in that time interval, and x̄ the IQC mean value obtained in that time interval.

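Furthermore, when two or more lots of calibration or IQC materials are involved in a specified time interval, or when two or more identical measuring systems are used to measure the same biological quantity, the *u*_{Rw} can be calculated as a pooled standard deviation (s_{p}) (*22*):

$$u_{Rw} = s_{p} = \sqrt{\frac{\sum_{i} (n_{i} - 1)\, s_{i}^{2} + \sum_{i} n_{i}\, (\bar{x}_{i} - \bar{x}_{p})^{2}}{N - 1}} \tag{4}$$

with a pooled mean (x̄_{p}) given by:

$$\bar{x}_{p} = \frac{\sum_{i} n_{i}\, \bar{x}_{i}}{N} \tag{5}$$
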
where n_{i} is the number of IQC replicate measurements using the calibrator lot i (or the IQC lot i, or the measuring system i), s_{i} is the standard deviation obtained using the calibrator lot i (or the IQC lot i, or the measuring system i), x̄_{i} is the IQC mean value obtained using the calibrator lot i (or the IQC lot i, or the measuring system i), x̄_{p} is the IQC pooled mean calculated using all calibrator lots (or all IQC lots, or all measuring systems), and N is the total number of IQC replicate measurements.
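The two situations can be sketched in Python. The pooled form used here is an assumption: it adds the within-group variances to the spread of the group means around the pooled mean, which is the form suggested by the symbols defined above. The function names and the IQC values are hypothetical.

```python
import math
from statistics import mean, stdev


def pooled_mean(groups):
    """Mean of all IQC values, weighting each group mean by its number of replicates."""
    total = sum(len(g) for g in groups)
    return sum(len(g) * mean(g) for g in groups) / total


def pooled_sd(groups):
    """Pooled SD across lots or measuring systems (assumed form, see the text above).

    Each group needs at least two values so that its own SD is defined.
    """
    total = sum(len(g) for g in groups)
    x_p = pooled_mean(groups)
    within = sum((len(g) - 1) * stdev(g) ** 2 for g in groups)
    between = sum(len(g) * (mean(g) - x_p) ** 2 for g in groups)
    return math.sqrt((within + between) / (total - 1))


# Hypothetical IQC values (mmol/L) obtained with two calibrator lots
lot_a = [5.0, 5.2, 5.1]
lot_b = [5.4, 5.3, 5.5, 5.6]

print(stdev(lot_a))               # single lot: classical standard deviation
print(pooled_sd([lot_a, lot_b]))  # two lots: pooled standard deviation
```

With this form, the pooled value equals the classical standard deviation of all the values taken together, so a shift in the mean between lots is counted as part of the long-term imprecision rather than being averaged away.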

#### Uncertainty related to the bias

At present, how to deal with the bias in clinical measurements and how to calculate the bias component of uncertainty continue to be a matter of debate. Some authors firmly state that the bias (or its uncertainty) must not be included in the uncertainty budget because the bias component is already part of the *u*_{cal} (*13*). In other words, IVD manufacturers are expected to ensure the traceability of their measuring systems to the highest-order available references. This statement is only partially correct because it is known that, in several cases, IVD manufacturers continue to prepare their calibration materials *in-house* without any traceability to higher-order metrological references, although they are required to comply with European Regulation 2017/746.

In contrast, other authors hold that when a significant bias is detected it should be eliminated (*24*, *25*). If the bias cannot be eliminated, there are two ways of proceeding: 1) to correct the bias by applying a correction factor and incorporating its uncertainty into the uncertainty budget, or 2) to include the bias itself in the uncertainty budget. It should be noted that the first option is only applicable when the bias is assessed using a certified reference material (CRM) and the traceability declared by the IVD manufacturer is to the same CRM used in the bias study (*24*, *25*).

Regarding the *u*_{b}, different procedures allow the measuring system bias to be estimated, the one based on the use of reference materials being the most widely used. The reference material can be a CRM, an IQC material (with or without an associated IQC inter-laboratory scheme), or a control material belonging to an external quality assurance service (EQAS) (*6*, *7*, *25*). Of these, CRM or commutable IQC or EQAS control materials with values assigned by an international conventional or primary measurement procedure should be used whenever possible. In the absence of such CRM or commutable control materials, inter-laboratory IQC materials, followed by control materials from an EQAS, can be used (*14*).

When a CRM is used to estimate the bias (b), the b and its uncertainty (*u*_{b}) can be calculated as (*24*):

where x̄ designates the mean value obtained after measuring the CRM in a specific time interval, μ is the CRM assigned value, and *u*_{μ} is the uncertainty associated with the CRM assigned value. Note that Eq. 5 should be used to calculate the x̄ value if more than one calibrator lot or measuring system is used.
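A minimal sketch of a CRM bias study follows. The bias is taken as the mean measured value minus the assigned value. The expression for *u*_{b} is an assumption that combines the standard error of the mean with *u*_{μ} in quadrature, a form commonly used for CRM studies; it is not necessarily the exact equation intended here. The function name and the values are hypothetical.

```python
import math
from statistics import mean, stdev


def crm_bias(values, mu, u_mu):
    """Return (b, u_b) for replicate measurements of a CRM.

    b   : mean measured value minus the CRM assigned value mu
    u_b : ASSUMED form, sqrt(s^2 / n + u_mu^2), where s is the SD of the replicates
    """
    n = len(values)
    b = mean(values) - mu
    u_b = math.sqrt(stdev(values) ** 2 / n + u_mu ** 2)
    return b, u_b


# Hypothetical CRM with assigned value 5.00 mmol/L and standard uncertainty 0.03 mmol/L
b, u_b = crm_bias([5.1, 5.2, 5.0, 5.1], mu=5.0, u_mu=0.03)
print(b, u_b)
```

Under this assumption *u*_{b} can never be smaller than *u*_{μ}, so increasing the number of replicates reduces only the part of the uncertainty that comes from the laboratory's own measurements.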

Bias studies using IQC materials can be performed following the Farrance *et al.* recommendations (*22*):

When two or more lots of calibrator or IQC materials are involved in a specified time interval, or when two or more identical measuring systems are used to measure the same biological quantity, the bias can be calculated as a weighted mean value of bias (b_{w}) (*24*):

and its uncertainty (*u*_{bw}) as:

The IQC manufacturer must provide the *μ*_{i,k}. Otherwise, if the IQC material has an associated IQC inter-laboratory scheme, it can be estimated as (*26*):

In the previous equations, R represents the total pool size (total number of replicate measurements, *i.e.*, of IQC values), N the number of IQC material levels used, m the number of calibrator lots (or IQC material lots or measuring systems) used, n_{i,k} the number of replicate measurements using the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k), b_{i} the mean bias over n_{i} replicates using the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k), x_{i} the pooled mean value obtained using the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k), μ_{i,k} the reference value using the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k), *u*_{μi,k} the uncertainty associated with the reference value μ_{i,k}, s_{Labsi,k} the robust peer-group standard deviation obtained using the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k), and q_{i,k} the number of peer-group laboratories participating in the IQC material level i for the calibrator lot k (or IQC material k, or measuring system k). The reference value μ_{i,k} can be the value assigned by the manufacturer of the IQC material or a conventional value calculated as the mean of the arithmetic means of the peer-group laboratories participating in an inter-laboratory IQC programme (*e.g.* UNITY from Bio-Rad Laboratories).

When only one calibrator lot, IQC lot, and a unique measuring system are used during a specified time interval, equations become:

The IQC manufacturer must provide the μ_{i}. Otherwise, if the IQC material has an associated IQC inter-laboratory scheme, it can be estimated as (*26*):

where N represents the number of IQC material levels used, n_{i} the number of replicate measurements using the IQC material level i, M the total number of replicate measurements, b_{i} the bias over n_{i} replicates using the IQC material level i, x_{i} the mean value obtained using the IQC material level i, μ_{i} the reference value for the IQC material level i (this value corresponds to the conventional value calculated as the mean of the arithmetic means of the peer-group laboratories participating in an inter-laboratory IQC programme, *e.g.* UNITY from Bio-Rad Laboratories), *u*_{μi} the uncertainty associated with the mean reference value μ_{i}, s_{Labsi} the robust peer-group standard deviation obtained for the IQC material level i, and q_{i} the number of peer-group laboratories participating in the IQC material level i.
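When the uncertainty of a peer-group reference value has to be derived from the inter-laboratory scheme, one widely used convention is that of ISO 13528 for a consensus value obtained by robust statistics: 1.25 times the robust standard deviation divided by the square root of the number of participants. The sketch below assumes that this convention applies; the function name and the values are hypothetical.

```python
import math


def u_reference_from_peer_group(s_labs: float, q: int) -> float:
    """Standard uncertainty of a peer-group consensus value.

    ASSUMPTION: ISO 13528 convention, u = 1.25 * s_labs / sqrt(q), where
    s_labs is the robust peer-group SD and q the number of participating laboratories.
    """
    if q < 1:
        raise ValueError("at least one peer-group laboratory is required")
    return 1.25 * s_labs / math.sqrt(q)


# Hypothetical IQC level: robust peer-group SD 0.20 mmol/L from 25 laboratories
print(u_reference_from_peer_group(0.20, 25))
```

The factor 1.25 reflects the lower efficiency of a robust estimate compared with the arithmetic mean. With few participants, the uncertainty of the reference value can become a dominant term in the bias uncertainty.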

The bias can also be estimated from EQAS. In these cases, the bias and its uncertainty can be calculated as described above. Thus:

When more than one measuring system is used to measure the same quantity, a mean bias (b) can be calculated as (*24*):

and its uncertainty (*u*_{b}) as:

The EQAS manufacturer must provide the *u*_{μi} or, if not, it can be calculated as (*26*):

where R is the total pool size (total number of measurements including all measuring systems and EQAS participations), N is the number of EQAS participations, e_{i,k} is the measurement error for the EQAS participation i and the measuring system k, x_{i,k} is the measured value obtained for the EQAS participation i and the measuring system k, μ_{i} is the reference value assigned by the EQAS manufacturer for the EQAS participation i, *u*_{μi} is the uncertainty associated with the reference value μ_{i}, s_{Labsi} is the robust peer-group standard deviation provided by the EQAS manufacturer for participation i, and q_{i} is the number of peer-group laboratories for the EQAS participation i.

When only one measuring system is used, the bias (b) and its uncertainty (*u*_{b}) can be calculated as:

where N represents the number of EQAS participations, e_{i} the measurement error for the EQAS participation i, x_{i} the measured value obtained for the EQAS participation i, μ_{i} the reference value assigned by the EQAS manufacturer for the EQAS participation i, *u*_{μi} the uncertainty associated with the reference value μ_{i}, s_{Labsi} the robust peer-group standard deviation provided by the EQAS manufacturer for participation i, and q_{i} the number of peer-group laboratories for the EQAS participation i.

Once the bias and its uncertainty have been estimated, metrological compatibility studies can be carried out to determine whether the biases are statistically significant. A bias is considered significant if its absolute value is higher than its expanded uncertainty, *i.e.* if |b| > 2 × *u*_{b} (*4*).

If a significant bias is detected, its treatment should differ depending on the kind of reference material used. If a CRM is used, the bias should be eliminated by applying a correction factor to every individual measured value obtained; this factor is calculated by dividing the assigned value of the CRM by the mean value obtained in the bias study. Also, the uncertainty associated with this correction factor (*u*_{cf}), calculated in the same way as the *u*_{b}, should be included in the uncertainty budget. By contrast, if IQC or EQAS control materials are used, it is not recommended to apply a correction factor to eliminate the bias, and the bias itself should be included in the uncertainty budget (*6*, *22*, *25*).
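The significance criterion and the CRM-based correction can be sketched as follows. The function names and the numerical values are hypothetical. The coverage factor of 2 follows the criterion |b| > 2 × *u*_{b} stated above.

```python
def bias_is_significant(b: float, u_b: float, k: float = 2.0) -> bool:
    """Metrological compatibility check: significant if |b| exceeds k * u_b."""
    return abs(b) > k * u_b


def correction_factor(mu: float, x_mean: float) -> float:
    """CRM assigned value divided by the mean value obtained in the bias study."""
    return mu / x_mean


def apply_correction(x: float, cf: float) -> float:
    """Apply the multiplicative correction factor to an individual measured value."""
    return x * cf


# Hypothetical CRM study: assigned value 5.00, observed mean 5.10, u_b = 0.04 (mmol/L)
b = 5.10 - 5.00
if bias_is_significant(b, u_b=0.04):
    cf = correction_factor(mu=5.00, x_mean=5.10)
    print(apply_correction(7.65, cf))  # a patient result after correction
```

The correction is multiplicative, so it implicitly assumes that the bias is proportional across the measuring interval. That is one reason to restrict it to CRM-based studies rather than applying it to IQC or EQAS materials.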

### Calculation of the combined standard uncertainty

When the individual contribution of each standard uncertainty source has been estimated, the combined standard uncertainty can be calculated by combining the standard uncertainties considered above, according to one of the following equations:

Clinical laboratories should use Eq. 20 when the compatibility study shows that the bias is not statistically significant. Equation 21 should be used when a CRM is used to estimate the bias, the bias is significant, and it has been “eliminated” by applying a correction factor. Equation 22 should be used if IQC or EQAS materials are used to estimate the bias and the laboratory cannot “eliminate” the bias.
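The combination step can be sketched as a root sum of squares, which is the usual propagation rule for uncorrelated standard uncertainties. Which terms enter each of the three cases is an assumption made here for illustration, inferred from the description above, and should be checked against the equations themselves. All names and numbers are hypothetical.

```python
import math


def combine(*components: float) -> float:
    """Root sum of squares of uncorrelated standard uncertainty components."""
    return math.sqrt(sum(u ** 2 for u in components))


# Hypothetical components, all in mmol/L
u_cal, u_rw, u_cf, b = 0.03, 0.05, 0.02, 0.06

# ASSUMED composition of the three cases described in the text:
u_c_no_bias = combine(u_cal, u_rw)          # bias not statistically significant
u_c_corrected = combine(u_cal, u_rw, u_cf)  # CRM used, bias corrected, u_cf added
u_c_uncorrected = combine(u_cal, u_rw, b)   # IQC/EQAS used, bias itself added

print(u_c_no_bias, u_c_corrected, u_c_uncorrected)
```

Because the terms add in quadrature, the largest component dominates the result. Halving a small component barely changes the combined standard uncertainty.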

### Calculation of the expanded uncertainty

Expanded uncertainty (*U*) is calculated by multiplying the *u*_{c} by a coverage factor *k*:

$$U = k \cdot u_{c} \tag{23}$$

This *k*-value depends on the type of probability distribution, the level of statistical significance selected and the number of independent measurements made to obtain the *u*_{c}. Under typical clinical laboratory working conditions, it is acceptable to use a *k*-value of 2 (*6*, *7*).

### Comparison of the expanded uncertainty obtained with the maximum allowable expanded uncertainty

Finally, to know whether a *U* value is acceptable, it must be compared with the maximum allowable (permissible) expanded uncertainty (*U*_{max}). A *U* value is considered acceptable if it is lower than or equal to the *U*_{max} previously selected by the laboratory.
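The last two steps, expansion and acceptance, amount to one multiplication and one comparison. A minimal sketch with hypothetical function names and values:

```python
def expanded_uncertainty(u_c: float, k: float = 2.0) -> float:
    """Expanded uncertainty U = k * u_c (k = 2 under typical laboratory conditions)."""
    return k * u_c


def is_acceptable(U: float, U_max: float) -> bool:
    """U is acceptable when it does not exceed the maximum allowable value."""
    return U <= U_max


# Hypothetical values in mmol/L
U = expanded_uncertainty(0.06)
print(U, is_acceptable(U, U_max=0.15))
```

Both values must be expressed in the same way, either both absolute or both percent relative, before they are compared.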

Another point of controversy is how the *U*_{max} should be established. Measurement uncertainty requirements for defining fitness-for-purpose limits may be based on clinical outcome studies, biological variation, or the state of the art; those based on biological variation, despite their limitations, are generally accepted and used (*27*-*30*). However, it should be noted that unless a country has established legal metrological requirements (*e.g.* the German RiliBÄK), the selection of one type of requirement or another is a matter of consensus and depends on the clinical laboratory itself.

So, although there are several ways to select the *U*_{max}, we show here a procedure based on the state of the art that calculates the *U*_{max} using the RiliBÄK concept named “root mean square of measurement error” (∆) (*31*-*33*):

where ∆_{max} represents the maximum allowable absolute root mean square of measurement error, %∆_{rel(max)} the maximum allowable percent relative root mean square of measurement error, μ_{a} the reference value for which the requirement has been established, CV_{max} the maximum allowable coefficient of variation, and %b_{rel(max)} the maximum allowable percent relative bias.

The %∆_{rel(max)} values can be selected directly from RiliBÄK (*31*). Otherwise, they can be calculated from the CV_{max} and %b_{rel(max)} using biological variation data, state-of-the-art data, or data from different organisations such as CLIA, the National Cholesterol Education Program for lipid-related quantities, or the European Medicines Agency (EMA) for drugs, among others (*34*-*38*).
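When the requirement is not taken directly from RiliBÄK, the calculation from CV_{max} and %b_{rel(max)} can be sketched as below. The quadrature combination is an assumption consistent with the root-mean-square concept. The conversion to an absolute value assumes a simple percentage of μ_{a}. How *U*_{max} is then derived from ∆_{max} is not covered by this sketch. The function names and the values are hypothetical.

```python
import math


def delta_rel_max(cv_max: float, b_rel_max: float) -> float:
    """Percent relative root mean square of measurement error.

    ASSUMPTION: imprecision and bias limits (both in %) are combined in quadrature.
    """
    return math.sqrt(cv_max ** 2 + b_rel_max ** 2)


def delta_abs_max(delta_rel_percent: float, mu_a: float) -> float:
    """Absolute limit at reference value mu_a, assuming a simple percentage of mu_a."""
    return delta_rel_percent / 100.0 * mu_a


# Hypothetical limits: CV_max = 3 %, b_rel(max) = 4 %, requirement set at 10 mmol/L
d_rel = delta_rel_max(3.0, 4.0)
print(d_rel, delta_abs_max(d_rel, 10.0))
```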

To illustrate the proposal for the estimation of MU, some biological quantities that are routinely measured in clinical laboratories using both “commercial” (*i.e.*, those with CE marking) and “*in-house*” validated measurement procedures have been selected (see Supplementary material 1). Table 1 shows the MU budget and the maximum allowable relative expanded uncertainty. In addition, Supplementary materials 2, 3 and 4 contain spreadsheets for calculating the primary measurement uncertainty sources (*u*_{cal}, *u*_{Rw} and *u*_{b}), the *u*_{c}, and the *U*. They also include a study to determine whether the *U* obtained is acceptable compared with the *U*_{max}, and show an example of how to specify the measurand. Each supplementary material considers the use of the three kinds of material to estimate the bias: CRM, IQC materials (with an associated IQC inter-laboratory scheme), and EQAS materials.

## Metrological traceability description

As noted above, the description of MT in clinical laboratories is a less controversial matter than the estimation of MU and can be made simply on the basis of ISO 17511:2020 (*5*). All the information needed for its description can be obtained from the manufacturers of the reagents or calibration materials, as well as from certificates of analysis of CRM issued by international or national metrology institutes, and from the Joint Committee for Traceability in Laboratory Medicine (JCTLM) database (*39*). For each biological quantity, the strategy to follow can be based on:

1. Obtaining the MT declared by the manufacturers. If this information is missing from the brochures or is incomplete, it can be requested directly from the manufacturers or, in some cases, obtained from the websites of government agencies, such as the Food and Drug Administration (FDA) (*40*).
2. Obtaining additional information related to the references (units of measurement, measurement procedures, or reference materials) to describe the calibration hierarchies and the sequence of result assignments up to the point at which metrological traceability begins. This information can be obtained from the reagent manufacturers, CRM certificates, or the JCTLM database (*39*).
3. Preparing a table or flow chart from all the information previously collected to describe the metrological traceability chain and the calibration hierarchy of the measurement results.

As an example, Table 2 and Supplementary Material 5 show the MT description for some biological quantities.

## Conclusions

This review provides practical suggestions on how clinical laboratories could estimate the MU and describe the MT of the results of biological quantities, to help and motivate them to: 1) conduct this type of study, 2) incorporate information regarding uncertainty and traceability in their reports, and 3) gain a greater understanding of the importance of these concepts in laboratory medicine. Also, in the “clinical laboratory accreditation era”, this review could help laboratories to meet the ISO 15189 requirements related to these two metrological concepts.