In regression analysis, bootstrapping is a method for statistical inference
that builds a sampling distribution by resampling the originally observed data
with replacement [1]. The
term bootstrapping, proposed by Bradley Efron in his 1979 paper “Bootstrap
methods: another look at the jackknife”, comes from the expression ‘pulling
oneself up by one’s bootstraps’ [2]. In that spirit, the sample data are
treated as a population, and repeated samples are drawn from them to generate
statistical inferences. The essential bootstrap analogy states that “the
population is to the sample as the sample is to the bootstrap samples” [2].
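
The resampling idea described above can be sketched in a few lines of Python.
This is a minimal illustration, not a reference implementation: the function
name `bootstrap_distribution`, the values in `observed`, the choice of the mean
as the statistic, the 1000 resamples, the fixed seed, and drawing each resample
at the same size as the original sample are all assumptions made for the
example.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Compute `statistic` on `n_resamples` resamples of `sample`.

    Each resample is drawn with replacement and has the same size as
    `sample` (an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    k = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # random.choices draws with replacement, so values may repeat
        resample = rng.choices(sample, k=k)
        estimates.append(statistic(resample))
    return estimates


# Hypothetical observed sample, treated as if it were the population.
observed = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

# The collected values approximate the sampling distribution of the mean.
boot_means = bootstrap_distribution(observed, statistics.mean)
```

The list `boot_means` plays the role of the bootstrap sampling distribution:
its spread indicates how much the sample mean would vary if `observed` really
were the population.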

The bootstrap falls into two types, parametric and nonparametric. Parametric
bootstrapping assumes that the original data set is drawn from some specific
distribution, e.g. a normal distribution [3]. The bootstrap samples are
generally drawn at the same size as the original data set. Nonparametric
bootstrapping is the one described at the beginning, which draws bootstrap
samples directly from the original data. Bootstrapping is quite useful in
non-linear regression and generalized linear models. For small sample sizes,
parametric bootstrapping is highly preferred. For large sample sizes,
nonparametric bootstrapping is preferred. For further
clarification of nonparametric bootstrapping, suppose a sample data set
$A = \{x_1, x_2, \ldots, x_k\}$ is randomly drawn from a population
$B = \{X_1, X_2, \ldots, X_K\}$, where $K$ is much larger than $k$. The
statistic $T = t(A)$ is considered an estimate of the corresponding population
parameter $P = t(B)$ [2]. Nonparametric bootstrapping generates an estimate of
the sampling distribution of a statistic empirically. No assumptions about the
form of the population are necessary. Next, a sample of size $k$
is drawn from the elements of $A$ with replacement, denoted
$A^*_1 = \{x^*_{11}, x^*_{12}, \ldots, x^*_{1k}\}$. In the resampling, a * is
added to distinguish resampled data from the original data. Replacement
is mandatory, since otherwise only the original sample $A$ would be reproduced,
and the resampling is typically repeated 1000 or 10000 times, a number that
keeps growing as computing power increases [1]. The statistic is computed on
each of these bootstrap samples, and the mean of the resulting bootstrap
estimates $T^*$ is calculated to estimate the expectation of the bootstrapped
statistics. This mean minus $T$ is the estimate of $T$’s bias. The variance of
the bootstrap estimates $T^*$, the bootstrap variance estimate, estimates the
sampling variance of $T$. Then bootstrap confidence
intervals can be constructed using either the bootstrap percentile interval
approach or the normal theory interval approach. The bootstrap percentile
method takes the empirical quantiles of the bootstrap estimates as the
confidence limits,
which is written as $T^*$(lower)