3. Interlaboratory Studies: New FDA Approach

A random variable χ is a functional variable if it takes values in a functional space F (a complete normed or semi-normed space). A particular case occurs when the functional variable χ = {χ(t) : t ∈ T}, where T ⊂ R is an interval, belongs to a Hilbert space, as is the case for continuous functions on an interval.

A set of functional data χ₁, …, χ_n is the observation of n functional variables χ₁, …, χ_n with the same distribution as χ, where χ is usually assumed to be an element of:

L₂(T) = {f : T → R, ∫_T f(t)² dt < ∞}

with the inner product ⟨f, g⟩ = ∫_T f(t)g(t) dt.

The norm of χ(t), with T = [a, b], is defined by:

$$||\chi(t)||=\left(\int_a^b \chi(t)^2 dt\right)^{\frac{1}{2}}$$
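
In practice a curve is observed on a finite grid, so the integral in the norm has to be approximated. The following is a minimal sketch using the trapezoidal rule; `l2_norm` is a hypothetical helper written for illustration and is not a function of the ILS package.

```r
# Trapezoidal approximation of the L2 norm of a curve observed on a grid.
# `l2_norm` is a hypothetical helper, not part of the ILS package.
l2_norm <- function(x, t) {
  stopifnot(length(x) == length(t), length(t) >= 2)
  dt <- diff(t)                    # spacing between consecutive grid points
  f  <- x^2                        # integrand chi(t)^2
  sqrt(sum(dt * (f[-length(f)] + f[-1]) / 2))
}

t_grid <- seq(0, 1, length.out = 1001)   # T = [0, 1]
chi    <- sin(2 * pi * t_grid)           # example curve
l2_norm(chi, t_grid)                     # the exact norm is sqrt(1/2)
```

Any other quadrature rule could be used instead; the trapezoidal rule is chosen here only because it works directly on an unevenly spaced grid.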

In this context, the ILS package is used to apply consistency tests (outlier detection) in an Interlaboratory Study. For this purpose, the TG dataset, composed of the Thermogravimetric (TG) curves described in Examples of Interlaboratory Studies, is used.

In this vignette, we use the ILS package to estimate and plot the statistics H(t), K(t), d_H and d_K, with the aim of performing an r&R study for the functional datasets TG and DSC, which are also included in the ILS package.

3.1. Hypothesis of reproducibility and repeatability

In an ILS, each laboratory tests n samples experimentally, obtaining n different curves {X₁^l(t), …, X_n^l(t)} for each laboratory l = 1, …, L. The functional statistics H_l(t) and K_l(t) are calculated for each laboratory under the corresponding null hypothesis that there are no statistically significant differences between the measurements of the laboratories.

The null hypothesis of reproducibility states that:

H₀ : μ₁(t) = μ₂(t) = ⋯ = μ_L(t)

Where μ_l(t), l = 1, …, L is the functional mean of the population for each laboratory l. To evaluate the reproducibility of the laboratory results, the H(t) statistic is calculated as follows:

$$H_l(t)=\frac{\bar{X}_l(t)-\bar{X}(t)}{S_l(t)}; l=1,\ldots,L$$

Where X̄_l(t) and S_l(t) are the functional mean and the point-to-point functional standard deviation calculated for laboratory l, and X̄(t) is the overall functional mean.

The null hypothesis of repeatability can be defined by:

H₀ : σ₁²(t) = σ₂²(t) = … = σ_L²(t)

Where σ_l²(t), l = 1, …, L are the theoretical functional variances corresponding to each laboratory l. The repeatability test is based on the statistic K(t), expressed as:

$$K_l(t)=\frac{S_l(t)}{\sqrt{\bar{S}^2(t)}}; l=1,\ldots,L $$

Where $\bar{S}^2(t)=\frac{1}{L}\displaystyle\sum_{l=1}^LS_l^2(t)$.
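
The pointwise statistics can be illustrated in base R on simulated curves. This is only a sketch under stated assumptions: the data are simulated, the numerator of H_l(t) is taken to be the laboratory mean curve minus the overall mean curve, S_l(t) is taken to be the pointwise standard deviation within laboratory l, and none of the object names below come from the ILS package.

```r
set.seed(1)
L <- 4; n <- 5; m <- 50                 # laboratories, curves per lab, grid points
t_grid <- seq(0, 1, length.out = m)

# curves[[l]] is an n x m matrix holding one simulated curve per row.
# The last laboratory is shifted upwards to mimic a biased laboratory.
curves <- lapply(seq_len(L), function(l) {
  shift <- if (l == L) 0.5 else 0
  t(replicate(n, sin(2 * pi * t_grid) + shift + rnorm(m, sd = 0.1)))
})

lab_mean   <- t(sapply(curves, colMeans))                     # L x m, X-bar_l(t)
lab_sd     <- t(sapply(curves, function(x) apply(x, 2, sd)))  # L x m, S_l(t)
grand_mean <- colMeans(lab_mean)                              # X-bar(t)
pooled_var <- colMeans(lab_sd^2)                              # S-bar^2(t)

# One row per laboratory, one column per grid point
H <- sweep(lab_mean, 2, grand_mean, "-") / lab_sd
K <- sweep(lab_sd, 2, sqrt(pooled_var), "/")

matplot(t_grid, t(H), type = "l", xlab = "t", ylab = "H_l(t)")
```

By construction, the squared K_l(t) curves average to 1 over the laboratories at every t, which gives a quick sanity check on an implementation.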

On the other hand, to test the reproducibility hypothesis, the test statistic d_H is defined as:

$$d_l^H=||H_l(t)||=\left(\int_a^b H_l(t)^2 dt\right)^{\frac{1}{2}}$$

Larger values of d_H correspond to non-consistent laboratories. For the repeatability hypothesis, we define d_l^K = ||K_l(t)|| and, likewise, large values of d_K correspond to non-consistent laboratories.
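
As a small illustration of how the test statistics reduce each functional statistic to a single number per laboratory, the sketch below computes the norm of every row of a matrix of statistic curves. The helper `curve_norms` and the toy matrix `H_toy` are hypothetical and chosen only so that the result is easy to check by hand; they are not output of the ILS package, and the critical values c_H and c_K are not computed here.

```r
# Norm of each row of `stat` (one statistic curve per laboratory),
# approximated with the trapezoidal rule on the grid `t`.
curve_norms <- function(stat, t) {
  stopifnot(ncol(stat) == length(t))
  dt <- diff(t)
  apply(stat, 1, function(f) {
    g <- f^2
    sqrt(sum(dt * (g[-length(g)] + g[-1]) / 2))
  })
}

t_grid <- seq(40, 850, length.out = 1000)

# Toy H_l(t) curves: constant in t, so the norm is |constant| * sqrt(850 - 40)
H_toy <- rbind(Lab1 = rep(0.5, 1000),
               Lab2 = rep(-0.5, 1000),
               Lab3 = rep(2, 1000))

d_H <- curve_norms(H_toy, t_grid)
d_H
names(which.max(d_H))   # laboratory with the largest d_H
```

The same helper applies unchanged to a matrix of K_l(t) curves to obtain d_K.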

3.2. ILS: Thermogravimetric Study

The techniques developed to detect inconsistent laboratories, through outliers in either the within-laboratory or the between-laboratory variability, have been implemented in the ILS package. As mentioned above, laboratories 1, 5 and 6 have provided results that differ from those of the remaining laboratories and should be detected as outliers. We use the datasets described in 2.2: the TG dataset contains Thermogravimetric test results from 7 laboratories, while the DSC dataset contains results from 6 laboratories (excluding laboratory 1). First, the functional statistics H(t) and K(t) are estimated with the function mandel.fqcs(); then, the corresponding graphs are drawn in the defined functional space.

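library(ILS)
data(TG, package = "ILS")
# Grid of 1000 evaluation points between 40 and 850
delta <- seq(from = 40, to = 850, length.out = 1000)
# Build the functional data object for the p = 7 laboratories
fqcdata <- ils.fqcdata(TG, p = 7, argvals = delta)
# Estimate H(t), K(t), d_H and d_K
mandel.tg <- mandel.fqcs(fqcdata, nb = 10)
# Laboratories 1 to 5 share one colour; laboratories 6 and 7 use another
plot(mandel.tg, legend = TRUE, col = c(rep(3, 5), 1, 1))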

Figure 7: TG dataset. The right panels show the functional statistics H(t) (top) and K(t) (bottom) for each laboratory, whereas the left panels show the d_H (top) and d_K (bottom) test statistics for each laboratory.

Figure 7 shows both the K(t) and H(t) statistics for each laboratory, as well as the d_K and d_H test statistics. The control limits, drawn as short-dashed lines, are constructed at a significance level α = 0.01 and correspond to the critical values c_K and c_H. The following code applies the ILS package to the DSC dataset.

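data(DSC, package = "ILS")
# DSC has p = 6 laboratories (laboratory 1 is excluded), labelled "Lab 2" to "Lab 7";
# `delta` is the evaluation grid defined for the TG example
fqcdata.dsc <- ils.fqcdata(DSC, p = 6, index.laboratory = paste("Lab", 2:7),
                           argvals = delta)
mandel.dsc <- mandel.fqcs(fqcdata.dsc, nb = 10)
# The fifth curve (Lab 6) is drawn in a different colour from the others
plot(mandel.dsc, legend = FALSE, col = c(rep(3, 4), 1, 3))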

Figure 8: DSC dataset. The right panels show the functional statistics H(t) (top) and K(t) (bottom) for each laboratory, whereas the left panels show the d_H (top) and d_K (bottom) test statistics for each laboratory.

For the Interlaboratory Study defined by the DSC dataset, Figure 8 shows that the repeatability hypothesis was not rejected. In contrast, the reproducibility hypothesis was rejected for laboratory 6 (see Figure 8), which is correctly detected as an outlier.
