Boosting Small Probability Samples with Nonprobability Sample Information – Joseph Sakshaug
Scientific surveys based on random probability samples are widely used in the social sciences to study and describe large populations. They provide a critical source of quantifiable information used by governments and policy-makers to make informed decisions. However, probability-based surveys are increasingly expensive to carry out, and the declining response rates observed over recent decades have necessitated costly strategies to raise them. Consequently, many survey organizations have shifted away from probability sampling in favor of cheaper non-probability sampling based on volunteer web panels. This practice has provoked significant controversy and scepticism over the representativeness and usefulness of non-probability samples. While probability-based surveys have their own representativeness concerns, comparison studies generally show that they are more representative than non-probability surveys. Hence, the survey research industry is in a situation where probability sampling is the preferred choice from an error perspective, while non-probability sampling is preferred from a cost perspective. Given the advantages of both sampling schemes, it makes sense to devise a strategy to combine them in a way that is beneficial from both a cost and error perspective. We examine this notion by evaluating a method of integrating probability and non-probability samples under a Bayesian inferential framework. The method is designed to use information from a non-probability sample to inform estimates based on a parallel probability sample. The method is evaluated through a real-data application involving two probability and eight non-probability surveys that fielded the same questionnaire simultaneously. We show that the method reduces the variance and mean-squared error (MSE) of a variety of survey estimates, with only small increases in bias, relative to estimates derived under probability-only sampling.
The MSE/variance efficiency gains are most prominent when a small probability sample is supplemented by a larger non-probability sample.
Institute for Employment Research
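
The general idea of letting a non-probability sample inform a probability-sample estimate can be illustrated with a textbook normal-normal conjugate update. This is only a minimal sketch under stated assumptions, not the specific model evaluated in the talk: the function name informed_posterior, the prior_inflation parameter, and all numbers below are hypothetical and chosen purely for illustration. The non-probability estimate acts as the prior and the probability-sample estimate as the likelihood; the posterior variance is smaller than the probability-only variance, while the posterior mean is pulled toward the non-probability estimate, which is where a small bias can enter.

```python
from dataclasses import dataclass


@dataclass
class Posterior:
    mean: float
    variance: float


def informed_posterior(prob_mean, prob_var, nonprob_mean, nonprob_var,
                       prior_inflation=1.0):
    """Normal-normal conjugate update (illustrative assumption, not the talk's model).

    prob_mean, prob_var       : estimate and sampling variance from the probability sample
    nonprob_mean, nonprob_var : estimate and sampling variance from the non-probability sample
    prior_inflation           : hypothetical factor >= 1 that weakens the prior when the
                                non-probability sample is trusted less
    """
    prior_var = nonprob_var * prior_inflation
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / prob_var

    # Precisions add; the posterior mean is the precision-weighted average.
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * nonprob_mean
                            + data_precision * prob_mean)
    return Posterior(mean=post_mean, variance=post_var)


# Made-up numbers: a small probability sample and a non-probability sample
# whose estimate differs somewhat from it.
example = informed_posterior(prob_mean=0.50, prob_var=0.0025,
                             nonprob_mean=0.56, nonprob_var=0.0025)
print(example)
```

Raising prior_inflation above 1 weakens the influence of the non-probability sample, which moves the result back toward the probability-only estimate at the cost of a smaller variance reduction.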