S4 conference


In March 2018, we organized the first edition of the Small Sample Size Solutions conference. The aim of the S4 Conference was to share information, learn about new developments, and discuss solutions for typical small sample size problems. We invited presenters to write a book chapter about their solutions in our book titled ‘Small Sample Size Solutions: A Guide for Applied Researchers and Practitioners’ edited by Rens van de Schoot & Milica Miočević.

The book is published open access, but if you would like a paper copy, you can get a 20% discount by entering the code FLR40 at checkout*. The book is available to order now, or you can download the open access version.

* Offer cannot be used in conjunction with any other offer or discount and only applies to books purchased directly via the website of Routledge.


The list of contributors includes both established authors providing an overview of available methods in a particular field, and early career researchers working on promising innovative solutions. The authors of the chapters reviewed at least one other chapter in this volume, and each chapter was written with the goal of being accessible for applied researchers and students with basic knowledge of statistics.

The current book provides guidelines and tools for implementing a variety of solutions to issues that arise in small sample research, along with references for further (technical) details. Our book combines several distinct, yet related, methods and moves away from the ‘simple’ statistical models that are typically discussed in textbooks. The book includes solutions for estimation of population means, regression analyses, meta-analyses, factor analyses, advanced structural equation models with latent variables, and models for nested observations. The types of solutions consist of Bayesian estimation with informative priors, various classical and Bayesian methods for synthesizing data with small samples, constrained statistical inference, two-step modeling, and data analysis methods for one participant at a time. All methods require a strong justification of the choice of analytic strategy and complete transparency about all steps in the analysis. The book is accompanied by state-of-the-art software solutions, some of which will only be released next year. All proposed solutions are described in steps researchers can implement with their own data and are accompanied by annotated syntax in R available on the Open Science Framework (osf.io/am7pr/).

The content of the substantive applications spans a variety of disciplines, and we expect the book to be of interest to researchers within and outside academia who are working with small sample sizes. The S4 conference is a recurring event, and the research on optimal solutions to small sample size issues is ongoing. This book represents a much-needed collection of currently available solutions, and we hope that it aids applied researchers in their endeavors and inspires methodological researchers to expand the field of small sample size solutions. We would like to thank all contributors for sharing their work, with special thanks to Evelien Schat and Gerbrich Ferdinands for their assistance with compiling this book. We hope to meet you, reader of this book, at our next conference.


The book has been split into three sections:

Section 1 contains several chapters that describe and make use of Bayesian statistics.


Chapter 1- Introduction to Bayesian Statistics by Milica Miočević, Roy Levy and Rens van de Schoot

In this brief introductory chapter, we sought to inform readers new to Bayesian statistics about the fundamental concepts in Bayesian analyses. The most important take-home messages to remember are that in Bayesian statistics, the analysis starts with an explicit formulation of prior beliefs that are updated with the observed data to obtain a posterior distribution. The posterior distribution is then used to make inferences about probable values of a given parameter (or set of parameters). Furthermore, Bayes Factors allow for comparison of non-nested models, and it is possible to compute the amount of support for the null hypothesis, which cannot be done in the frequentist framework. Subsequent chapters in this volume make use of Bayesian methods for obtaining posteriors of parameters of interest, as well as Bayes Factors.
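As a minimal sketch of the prior-to-posterior updating described above, the following Python snippet uses the conjugate beta-binomial model. It is not taken from the book (whose annotated syntax is in R); the data (7 successes in 10 trials), the two priors, and the helper names `update_beta` and `beta_mean` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta(a + s, b + f)."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, kept as an exact fraction."""
    return Fraction(a, a + b)


# Hypothetical small sample: 7 successes and 3 failures.
successes, failures = 7, 3

# Uniform prior Beta(1, 1): the posterior is driven almost entirely by the data.
flat_post = update_beta(1, 1, successes, failures)

# Informative prior Beta(10, 10), centred on 0.5 (an assumption for illustration).
informative_post = update_beta(10, 10, successes, failures)

print("flat prior        ->", flat_post, float(beta_mean(*flat_post)))
print("informative prior ->", informative_post, float(beta_mean(*informative_post)))
```

With so few observations, the informative prior pulls the posterior mean noticeably towards 0.5, which illustrates why the choice of prior needs to be justified in small sample research.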

Chapter 2- The Role of Exchangeability in Sequential Updating of Findings from Small Studies and the Challenges of Identifying Exchangeable Data Sets by Milica Miočević, Roy Levy and Andrea Savord

This chapter discusses the notion of exchangeability and how it relates to Bayesian analyses and sequential accumulation of scientific knowledge. We provide an example of two small exchangeable data sets that when synthesized yield a more precise interval for the effects of interest, and we demonstrate that establishing exchangeability is challenging, and that exchangeable data sets are rare in the social sciences. The chapter concludes with suggestions for calibrating nonexchangeable data sets in order to use them in the sequential updating of findings from related studies.
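The idea of sequential updating can be sketched with a simple normal model with known sampling variance. This is an illustrative Python example under stated assumptions, not the analysis from the chapter: the two data sets, the variance `sigma2`, the vague starting prior, and the function names are all hypothetical. If the two data sets are exchangeable, using the posterior from the first study as the prior for the second gives the same result as analysing the pooled data, and the interval narrows.

```python
import math


def update_normal(prior_mean, prior_var, data, sigma2):
    """Normal prior + normal data with known variance sigma2 -> normal posterior."""
    n = len(data)
    post_precision = 1.0 / prior_var + n / sigma2
    post_mean = (prior_mean / prior_var + sum(data) / sigma2) / post_precision
    return post_mean, 1.0 / post_precision


def interval_width(var):
    """Width of the central 95% posterior interval of a normal distribution."""
    return 2 * 1.96 * math.sqrt(var)


# Hypothetical data from two small studies, assumed to be exchangeable.
study_1 = [4.1, 5.3, 4.8, 5.9]
study_2 = [5.2, 4.4, 5.6, 5.0]
sigma2 = 1.0          # assumed known sampling variance
prior = (0.0, 100.0)  # vague prior: mean 0, variance 100

after_1 = update_normal(*prior, study_1, sigma2)
sequential = update_normal(*after_1, study_2, sigma2)   # study 1 posterior as prior
pooled = update_normal(*prior, study_1 + study_2, sigma2)

print("after study 1:", after_1, interval_width(after_1[1]))
print("sequential   :", sequential, interval_width(sequential[1]))
print("pooled       :", pooled, interval_width(pooled[1]))
```

The equivalence between the sequential and pooled results relies on the exchangeability assumption; when the data sets are not exchangeable, the first posterior is not an appropriate prior without some form of calibration.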

Chapter 3- A Tutorial on Using the WAMBS-Checklist to Avoid the Misuse of Bayesian Statistics by Rens van de Schoot, Duco Veen, Laurent Smeets, Sonja D. Winter and Sarah Depaoli

Chapter 3 guides readers through the steps of the When-to-Worry-and-How-to-Avoid-the-Misuse-of-Bayesian-Statistics checklist (the WAMBS-checklist) in order to provide background for other chapters in this book. This chapter supplements the original WAMBS-checklist with prior and posterior predictive model checking. We also compare the performance of two popular Bayesian R packages, RStan and rjags. We show why using the Hamiltonian Monte Carlo procedure, available in RStan, is more efficient in small sample sizes. All data and the annotated R code to reproduce the results are available on the Open Science Framework.

Chapter 4- The importance of collaboration in Bayesian analyses with small samples by Duco Veen and Marthe Egberts

This chapter addresses Bayesian estimation with (weakly) informative priors as a solution for small sample size issues. Special attention is paid to the problems that may arise in the analysis process, showing that Bayesian estimation should not be considered a quick solution for small sample size problems in complex models. The analysis steps are described and illustrated with an empirical example for which the planned analysis goes awry. Several solutions are presented for the problems that arise, and the chapter shows that different solutions can result in different posterior summaries and substantive conclusions. Therefore, statistical solutions should always be evaluated in the context of the substantive research question. This emphasizes the need for a constant interaction and collaboration between applied researchers and statisticians.

Chapter 5- A tutorial on Bayesian penalized regression with shrinkage priors for small sample sizes by Sara van Erp

Many of the methods provided in other chapters of this book offer solutions for samples that are small in an absolute sense, for example in single-case designs. In this chapter, the focus is instead on small samples relative to the complexity of the model. I illustrate how Bayesian penalization offers a solution to this problem by applying so-called ‘shrinkage priors’ that shrink small effects towards zero while leaving substantial effects large. A tutorial is provided on applying Bayesian penalization to a linear regression model using the R package bayesreg, which implements various shrinkage priors.


Section 2 is composed of chapters on methods for analyzing data from a single participant.


Chapter 6- One by One: The Design and Analysis of Replicated Randomized Single-Case Experiments by Patrick Onghena

One possible small sample size solution is obtained by conceptualizing a study with a small sample size as a set of replicated single-case experiments. Single-case experiments are experiments in which one unit is observed repeatedly during a certain period of time under different levels of at least one manipulated variable. These experiments can have a randomized design and can be replicated for a preplanned number of participants. The statistical analysis of the data collected with these experiments can be model-based or design-based, and can be unilevel or multilevel. This chapter provides an example of a unilevel design-based analysis.
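A design-based analysis of a single-case experiment is typically a randomization test. The sketch below, in Python, is an illustration under strong assumptions rather than the example from the chapter: it assumes a completely randomized design in which a fixed number of measurement occasions is assigned to the treatment, uses the difference in means as the test statistic, and runs on made-up scores. The function name `randomization_test` is hypothetical.

```python
from itertools import combinations
from statistics import mean


def randomization_test(scores, treated_idx):
    """One-sided randomization test for a single case.

    Assumes a completely randomized design: every way of picking
    len(treated_idx) treatment occasions was equally likely.
    Returns the observed mean difference and its p-value.
    """
    n = len(scores)
    k = len(treated_idx)

    def mean_difference(idx):
        chosen = set(idx)
        treated = [scores[i] for i in range(n) if i in chosen]
        control = [scores[i] for i in range(n) if i not in chosen]
        return mean(treated) - mean(control)

    observed = mean_difference(treated_idx)
    # Reference distribution: the statistic under every possible assignment.
    reference = [mean_difference(c) for c in combinations(range(n), k)]
    p_value = sum(d >= observed for d in reference) / len(reference)
    return observed, p_value


# Hypothetical scores for one participant on 8 occasions;
# occasions 3, 4, 5 and 7 (zero-based) were randomly assigned to treatment.
scores = [3, 4, 2, 8, 9, 7, 3, 9]
observed, p_value = randomization_test(scores, [3, 4, 5, 7])
print(observed, p_value)
```

The reference set must mirror the randomization scheme that was actually used. Designs with restrictions, such as randomized block or alternation designs, would need a different enumeration of admissible assignments.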

Chapter 7- Single-case experimental designs in clinical intervention research by Marija Maric and Vera van der Werff

Single-case experimental designs (SCEDs) have received increasing attention in the past decade in clinical intervention studies, mainly because of their great conceptual and practical value to provide answers to clinically-relevant questions. SCED-studies can serve as an alternative for, or addition to, Randomized Controlled Trials. This chapter provides descriptions of the main features of SCEDs, namely: research questions that can be answered using SCEDs, and issues related to design and data-analyses. In addition, using a fictitious example, we illustrate a single-case study which tests both effectiveness as well as change processes in an anxious adolescent treated with cognitive behavioral therapy. In the last section of the chapter, several recent challenges are highlighted which may, when resolved, lead to even greater implementation of this useful design in clinical (research) practice.

Chapter 8- How to improve the estimation of a specific examinee’s (n = 1) math ability when test data are limited by Kimberley Lek and Ingrid Arts

In practice, (test) data for a specific examinee can be limited. A teacher or psychologist might be interested, for instance, in estimating the math ability of a specific examinee, but the scope of the math test the examinee completed is limited. Or, the teacher or psychologist might be interested in the question of how reliable the examinee’s score is on the test (i.e., the error variance), but the examinee did not answer enough items or tests to accurately estimate this reliability. In such cases, it can be beneficial to supplement the information from the single examinee with other information sources. This chapter explores two such sources. The first is the rich, but often ignored source of teacher knowledge. The second is the possibility to use test results of more or less ‘similar’ examinees (e.g., examinees with the same or comparable test scores). This chapter discusses how teacher knowledge and test results of ‘similar’ examinees can be used to improve the estimation of an examinee’s math ability.

Chapter 9- Combining evidence over multiple individual analyses by Fayette Klaassen

Hypothesis testing is omnipresent in behavioral and biomedical research, and usually concerns evaluating whether effects exist in the population. For example, the research question might be: is there a difference between groups on average? This chapter presents a Bayesian method to evaluate hypotheses for each person in a sample and aggregate this result to answer the question whether a hypothesis holds for everyone in the sample, rather than on average. Using an empirical dataset, the methodology is illustrated step by step: from formulating the research question and hypotheses, to modelling the data and drawing conclusions.

Chapter 10- Going Multivariate in Clinical Trial Studies: A Bayesian Framework for Multiple Binary Outcomes by Xynthia Kavelaars

In an era where medicine is increasingly personalized, clinical trials often suffer from small samples. As a consequence, treatment comparison based on the data of these trials may result in inconclusive decisions. Efficient decision-making strategies are highly needed so decisions can be made with smaller samples without increasing the risk of errors. The current chapter centers around one such strategy: including information from multiple outcomes in the decision, thereby focusing on data from binary outcomes. Key elements of the approach are 1) criteria for treatment comparison that are suitable for two outcomes and 2) a multivariate Bayesian technique to analyze multiple binary outcomes simultaneously. The conceptual discussion of these elements is complemented with software to implement the approach. To further facilitate trials with small samples, the chapter also outlines how interim analyses may result in more efficient decisions compared to the traditional sample size estimation before data collection.


Section 3 deals with complex hypotheses and models fit to small sample data.


Chapter 11- An Introduction to Restriktor: Evaluating Informative Hypotheses for Linear Models by Leonard Vanbrabant and Yves Rosseel

Many researchers have specific expectations about the relation between the means of different groups or between (standardized) regression coefficients. For example, in an experimental setting, the comparison of two or more treatment groups may be subject to order constraints (e.g., H1: µ1 < µ2 < µ3 = µ4). In practice, hypothesis H1 is usually tested using a classical one-way ANOVA with additional pairwise comparisons if the corresponding F-test is significant. In this chapter, we introduce the freely available R package restriktor for evaluating order-constrained hypotheses directly. Testing specific expectations directly does not require multiple significance tests. This way, researchers avoid inflated type I error rates that might occur without any corrections that control the familywise type I error rate, and decreases in power that occur due to such corrections. The procedure is illustrated using four examples.

Chapter 12- Testing Replication with Small Samples: Applications to ANOVA by Mariëlle Zondervan-Zwijnenburg and Dominique Rijshouwer

Findings based on small samples can offer important insights, but original small sample findings should be replicated before strong conclusions can be drawn. In this chapter, we present four common replication research questions: 1) whether the new effect size is similar to the original effect size; 2) whether the new effect size differs from the original effect size; 3) whether the conclusions based on new results differ from the original conclusions; and 4) what the effect size is in the population. For each of these research questions, we discuss appropriate evaluation methods: replication Bayes factors, confidence intervals, prediction intervals, the prior predictive p-value, and bias-corrected meta-analysis methods. Each method is illustrated for the replication of an ANOVA and associated post-hoc t-test. Annotated R-code for all analyses is provided with the chapter.

Chapter 13- Small sample meta-analyses: Exploring heterogeneity using MetaForest by Caspar J. van Lissa

Meta-analyses often suffer from two related problems: A small sample of studies, and many between-studies differences that might influence the effect size. Power is typically too low to adequately account for these between-study differences using meta-regression. Researchers risk overfitting: capturing noise in the data, rather than true effects. This chapter introduces MetaForest: A machine-learning based approach for identifying relevant moderators in meta-analysis. MetaForest is robust to overfitting, handles many moderators, and captures non-linear effects and higher order interactions. This chapter discusses the problems with small samples and many moderators, introduces MetaForest as a small-sample solution, and provides a tutorial example analysis.

Chapter 14- Item Parcels as Indicators: Why, When, and How to Use Them in Small Sample Research by Charlie Rioux, Zachary L. Stickley, Omolola A. Odejimi and Todd D. Little

Chapter 15- Small Samples in Multilevel Modeling by Joop J. Hox and Daniel McNeish

Multilevel models have become a mainstream and flexible method by which to account for clustered data. These models have sample sizes at different levels of the hierarchical data structure where the sample size at the highest level is generally the most relevant for assessing whether the analysis may be at risk for small sample estimation bias. Unfortunately, the highest-level sample size is also the most difficult and costly to increase. This chapter reviews the approaches and remedies for small sample sizes in multilevel regression and multilevel structural equation models, from both frequentist and Bayesian perspectives.

Chapter 16- Small sample solutions for structural equation modeling by Yves Rosseel

Structural equation modeling (SEM) is a widely used statistical technique for studying relationships in multivariate data. Unfortunately, when the sample size is small, several problems may arise. Some problems relate to point estimation, whereas other problems relate to small sample inference. This chapter contains several potential solutions for point estimation, including penalized likelihood estimation, a method based on model-implied instrumental variables, two-step estimation, and factor score regression. This chapter also contains a brief discussion of inference, including several corrections for the chi-square test statistic, local fit statistics, and some suggestions to improve the quality of standard errors and confidence intervals.

Chapter 17- SEM with Small Samples: Two-step Modeling and Factor Score Regression versus Bayesian Estimation with Informative Priors by Sanne Smid and Yves Rosseel

Using a simulation study, we investigated – under varying sample sizes – the performance of two-step modeling, factor score regression, maximum likelihood estimation and Bayesian estimation with default and informative priors. We conclude that with small samples, all frequentist methods showed signs of breaking down (in terms of non-convergence, negative variances, extreme parameter estimates), as did the Bayesian condition with default priors (in terms of mode-switching behavior). When increasing the sample size is not an option, we recommend using Bayesian estimation with informative priors. However, results should be interpreted with caution, because of the large influence of the prior on the posterior with relatively small samples. When researchers prefer not to include prior information, two-step modeling or factor score regression are recommended, as those led to higher convergence rates without negative variances, more stable results across replications and less extreme parameter estimates than maximum likelihood estimation with small samples.

Chapter 18- Important Yet Unheeded: Some Small Sample Issues that are often Overlooked by Joop J. Hox

When small samples are analyzed, several issues typically arise. This chapter discusses five problem areas and potential solutions. First, multivariate analysis using Ordinary Least Squares estimation does not assume large samples, and therefore works well with small samples, although these still lead to tests with low power. Second, analysis of small samples is more vulnerable to violations of assumptions, and data characteristics and analysis choices in estimation become more important. As a result, data cleaning and examination of potential violations of assumptions are crucial. Finally, problems with small samples are ameliorated by using research designs that yield more information.