We have all learned a lot about clinical performance studies for IVD validation in the past couple of years: With thousands of Covid tests entering the EU market, their validation studies became one of the most sought-after services from CROs and biobanks. There were plenty of opportunities to optimise processes and tighten collaborations. One vital step of clinical performance study design, however, was taken off the curriculum: At an early stage of the pandemic, commonly accepted guidelines for Covid test validation were published, and these contained clear requirements regarding the number of samples to be included in the study. As a result, we didn’t have to bother with the infamous issue of sample size planning.
Today, MDCG guideline 2021-21, around which all our validation efforts revolved for over two years, is part of EU Regulation 2022/1107 (‘Common Specifications for certain class D in vitro diagnostic medical devices’). This document comprises twelve annexes containing tables with official (and quite challenging) sample size requirements for the state-of-the-art validation of devices intended for detection of blood group antigens, HIV, hepatitis, and other infectious diseases. These Common Specifications can greatly facilitate study design, as the number and specifications of samples to be procured are clear from the start. However, they place high demands on manufacturers and apply to only a dozen medical indications.
This means that for the vast majority of IVD devices, clinical performance study design still relies on statistical sample size calculation. We don’t want to delve into the depths of this mathematical field here (it fills countless books and publications), but rather give a couple of hints to make life easier for the non-statistician.
First of all, there is not one correct way to conduct a sample size calculation. There are multiple statistical tests that apply to different study designs. For IVD devices, however, it usually (but not always) comes down to two types of statistical tests that aim at achieving the following study goals:
- First approach: I want to estimate the performance of my product with a reasonable margin of uncertainty, OR
- Second approach: I want to show that the performance of my product does not fall below a certain value.
These goals sound similar, but the formulas they use and the claims that they provide evidence for are different. If the intended purpose of your product requires it to be non-inferior to a state-of-the-art device, an estimation of the performance with a two-sided margin of uncertainty (the first approach) may not be the best way to calculate the sample size for its clinical study. What you want to know, after all, is how many donors you need to recruit to show that your device is no less sensitive than a competitor’s device. This situation is much better accounted for by the second approach.
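To make the difference between the two approaches tangible, here is a minimal Python sketch. It is not taken from the publications listed further down; it uses common textbook normal-approximation formulas, and the function names, the default confidence level of 95 %, the one-sided alpha of 0.05, and the power of 80 % are all assumptions chosen for illustration. Exact binomial methods can give somewhat different numbers, especially for performance values close to 100 %.

```python
import math
from statistics import NormalDist


def n_for_estimation(expected, half_width, confidence=0.95):
    """First approach (illustrative): number of samples needed so that a
    two-sided confidence interval around the expected sensitivity or
    specificity has roughly the given half-width (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * expected * (1 - expected) / half_width ** 2)


def n_for_minimum_performance(expected, minimum, alpha=0.05, power=0.80):
    """Second approach (illustrative): number of samples needed for a
    one-sided test showing that performance lies above `minimum`,
    assuming the true performance equals `expected`."""
    if expected <= minimum:
        raise ValueError("expected performance must exceed the minimum")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided significance level
    z_beta = NormalDist().inv_cdf(power)       # desired power
    numerator = (z_alpha * math.sqrt(minimum * (1 - minimum))
                 + z_beta * math.sqrt(expected * (1 - expected)))
    return math.ceil((numerator / (expected - minimum)) ** 2)


# Expected sensitivity 95 %, estimated to within +/- 5 percentage points
print(n_for_estimation(0.95, 0.05))           # 73
# Expected sensitivity 95 %, minimum acceptable sensitivity 90 %
print(n_for_minimum_performance(0.95, 0.90))  # 184
```

Note that both figures refer to positive samples only (for a sensitivity claim), and that the same expected performance leads to quite different sample sizes depending on which goal you pursue.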
Knowing which statistical approach for sample size calculation is appropriate for my study is not much use to me if I don’t know the formulae or how to apply them (which is true for most of us). Fortunately, there are several statistical publications that provide tables with pre-calculated sample sizes for different expected sensitivity/specificity values, margins of uncertainty (= width of the confidence interval; applies to the first approach), levels of disease prevalence, or minimal accepted sensitivity/specificity values (applies to the second approach).
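Why does disease prevalence show up in such tables? If donors are recruited prospectively without knowing their disease status, only a fraction of them will turn out positive, so the number of positive (or negative) samples you need has to be scaled up accordingly. The following Python sketch illustrates this scaling; the function name and parameters are hypothetical, and the adjustment is not needed if you procure samples of known status, e.g. from a biobank.

```python
import math


def total_subjects(n_positive_needed, n_negative_needed, prevalence):
    """Illustrative prevalence adjustment: number of donors to recruit so
    that, on average, enough positive AND enough negative samples are
    obtained when disease status is unknown at enrolment."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    for_sensitivity = math.ceil(n_positive_needed / prevalence)
    for_specificity = math.ceil(n_negative_needed / (1 - prevalence))
    # The more demanding of the two requirements determines the study size
    return max(for_sensitivity, for_specificity)


# 73 positive and 73 negative samples needed, prevalence of 25 %
print(total_subjects(73, 73, 0.25))  # 292
```

The lower the prevalence, the more the requirement for positive samples dominates, which is one reason why prospective studies for rare conditions become large very quickly.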
[Infobox: Yes, this means that the outcome of the study (i.e., the performance of your product) must already be anticipated during sample size planning. Remember that the goal of sample size planning is to find a number of samples that is large enough to sufficiently support your performance claim, but not so large that the study becomes economically and ethically unreasonable. The expected outcome can be deduced from pre-studies or findings from your product development phase, but also from competitors’ performance studies. For instance, if the state of the art in medicine requires that your product has a certain minimum sensitivity, you know that your expected sensitivity must be at least this high, otherwise your product would not be usable (AND you know that you should choose the second approach to sample size calculation).]
Some exemplary publications are:
- K. Hajian-Tilaki / Journal of Biomedical Informatics 48 (2014) 193–204
- A. Flahault et al. / Journal of Clinical Epidemiology 58 (2005) 859–862
- F. Krummenauer, H.-U. Kauczor / Fortschr Röntgenstr 174 (2002) 1438–1444 (German)
Using sample sizes from these or similar publications will not compromise your conformity assessment as long as you provide justification for the sample size chosen. The current state-of-the-art of your product type or findings from pre-studies can be used as justification.
Of course, multiple factors not mentioned here have an influence on sample size estimation (e.g., statistical power and significance level). Also, this article does not apply to quantitative or semi-quantitative assays. For simple qualitative IVD assays, however, the statistical complexity is limited, which is what allows the generalisations made here in the first place. For anything more complex, consulting a statistician is the only safe way.