The bootstrap is a resampling technique for estimating the sampling distribution of a statistic by repeatedly resampling from the observed data. It is particularly valuable when traditional parametric methods do not apply, or when we want to make inferences about a population parameter without making strong distributional assumptions. The procedure has four parts:
- Resampling: The bootstrap draws random samples, with replacement, from the observed data. These resampled datasets are called “bootstrap samples.” Each bootstrap sample typically has the same size as the original dataset, so some observations appear more than once and others not at all.
- Estimation: A statistic of interest, such as the mean, median, variance, or a parameter estimate, is calculated for each bootstrap sample. This provides a collection of values for the statistic of interest, which forms the basis for inference.
- Sampling Distribution: By repeating the resampling process a large number of times (often a thousand or more), we create a “bootstrap distribution” for the statistic. This distribution approximates the sampling distribution of the statistic, provided the observed data are reasonably representative of the population.
- Inference: With the bootstrap distribution in hand, we can perform various types of statistical inference. For example, we can calculate confidence intervals, estimate standard errors, perform hypothesis tests, and more, without relying on traditional parametric assumptions like normality.
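
The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function names `bootstrap_distribution` and `percentile_interval`, the data values, the seed, and the choice of 2,000 resamples are all assumptions made for the example. The interval shown is the simple percentile interval, which is only one of several bootstrap interval methods.

```python
import random
import statistics


def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Compute `statistic` on `n_resamples` bootstrap samples of `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    values = []
    for _ in range(n_resamples):
        # Resampling: draw n observations with replacement.
        sample = rng.choices(data, k=n)
        # Estimation: evaluate the statistic on this bootstrap sample.
        values.append(statistic(sample))
    # Sampling distribution: the collected values form the bootstrap distribution.
    return values


def percentile_interval(values, confidence=0.95):
    """Percentile confidence interval read off the sorted bootstrap values."""
    ordered = sorted(values)
    alpha = 1.0 - confidence
    lower_index = int((alpha / 2) * len(ordered))
    upper_index = int((1 - alpha / 2) * len(ordered)) - 1
    return ordered[lower_index], ordered[upper_index]


# Hypothetical observations, made up for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 6.3]

boot_means = bootstrap_distribution(data, statistics.mean)

# Inference: standard error and a 95% percentile interval for the mean.
std_error = statistics.stdev(boot_means)
ci_low, ci_high = percentile_interval(boot_means)

print(f"sample mean:        {statistics.mean(data):.3f}")
print(f"bootstrap std err:  {std_error:.3f}")
print(f"95% percentile CI:  ({ci_low:.3f}, {ci_high:.3f})")
```

Passing a different function, such as `statistics.median`, in place of `statistics.mean` reuses the same resampling loop for another statistic, which is the main practical appeal of the method.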