11.6 Sample size justification
An important part of any pre-registration, or indeed any paper, is your sample size justification. This is a statement that says why your sample is the size it is, and why that size is enough for your conclusions to be meaningful. The considerations that can figure in a sample size justification are quite varied. They can include pragmatic ones, such as time and resources, and precedent from previous related studies. Sometimes your study is exploratory and you really have no idea what kinds of effects or associations you might find. It is fine to say so, but you can still discuss your minimal detectable effects. Whatever sample size you end up targeting (or having), you should provide an honest account of why it is what it is and what you can infer from it. Some kind of power calculation is usually required.
Researchers in the past often found it an unreasonable demand to be held to the outputs of simple power calculations, when so many other things reasonably affect how many participants they could recruit. But remember, power calculation is not just sample size determination, the kind of analysis where you put in a SESOI and a power level and it tells you how many participants you need to run. You can also use it the other way round: you accept the number of participants that you have been or will be able to run, and then use the calculation to say something about your level of power for effects of different sizes, or about the size of your minimal detectable effect. So even if you did not use a power calculation to determine your sample size, you can still use one to describe how sensitive your study is.
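The three uses of a power calculation described above can be sketched in a few lines. This is a minimal illustration, not a recommended tool: it assumes a two-sided comparison of two equal-sized groups, a standardized effect size d, and a normal approximation to the test statistic (so the numbers come out slightly optimistic relative to an exact t-test calculation). The function names are hypothetical, chosen for this example only.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power to detect standardized effect d with n per group."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)  # expected test statistic under d
    # Probability of landing beyond either critical value
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)


def n_per_group_needed(d, power=0.8, alpha=0.05):
    """Sample size determination: n per group for a given effect and power."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / abs(d)) ** 2)


def minimal_detectable_effect(n_per_group, power=0.8, alpha=0.05):
    """Sensitivity: smallest d detectable with the given n and power."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return (z_crit + z_power) * sqrt(2 / n_per_group)


# 1. Planning: how many participants for a SESOI of d = 0.5 at 80% power?
print(n_per_group_needed(0.5))

# 2. Fixed sample: what power do 30 per group give for a range of effects?
for d in (0.2, 0.5, 0.8):
    print(d, round(power_two_groups(d, 30), 2))

# 3. Fixed sample: what is the minimal detectable effect with 50 per group?
print(round(minimal_detectable_effect(50), 2))
```

The first call is the familiar sample size determination. The second and third start from the sample you actually have and report what it can and cannot detect, which is the kind of statement a sample size justification can honestly make even when recruitment was constrained by other things.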
If you can’t justify your sample size, or if the power calculations tell you that you were very unlikely to detect important effects that really are there, then your study is probably not ready for the big time, even if the results are suggestive. Treat it as a pilot and pre-register a follow-up study with a more adequate sample size. This could mean either recruiting more people or using a different type of design. The lower your power, the greater the likelihood that any ‘significant’ effects you do find by null hypothesis significance testing are false-positive flukes (Button et al., 2013). There are too many false positives from under-powered studies out there. Try not to add to them.
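The link between low power and false positives can be made concrete with the positive predictive value (PPV) formula given by Button et al. (2013): PPV = (power × R) / (power × R + alpha), where R is the pre-study odds that a tested effect is real. The sketch below just evaluates that formula. The value R = 0.25 (one real effect for every four null ones) is an assumption made purely for illustration, and the function name is hypothetical.

```python
def positive_predictive_value(power, prior_odds, alpha=0.05):
    """Share of 'significant' results that reflect real effects.

    prior_odds is R, the pre-study odds that a tested effect is real.
    """
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


R = 0.25  # illustrative assumption: 1 real effect per 4 null effects

for power in (0.8, 0.5, 0.2):
    ppv = positive_predictive_value(power, R)
    print(f"power = {power:.1f}: PPV = {ppv:.2f}")
```

Under these assumed odds, dropping from 80% to 20% power takes you from four in five significant results being real to an even split between real effects and flukes. The exact figures depend entirely on R, which is rarely known, but the direction of the relationship does not.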