resample()

The resample() function takes a random sample of rows from a data frame, with replacement, of the specified sample size. It is useful when you want to bootstrap a sampling distribution of a statistic (e.g., b1, PRE, F). Bootstrapping means sampling with replacement from observed data (such as the sample data) to estimate the variability in a statistic of interest.

For random sampling without replacement, see the sample() function.

Example 1:

You can take a closer look at how the resample() function works by trying a couple of examples with a small sample size. The example code below shows how to take a random sample (with replacement) of the values of a single variable, or of whole rows from the data frame.

# You can sample rows (with replacement) for a specific variable
# Try resampling 5 Thumb lengths from Fingers a few times
# Notice that you get various combinations of the sample values (with replacement of values)
print("Resampled Thumbs, N=5")
resample(Fingers$Thumb, 5)
resample(Fingers$Thumb, 5)
resample(Fingers$Thumb, 5)
# Or you can sample (with replacement) the whole row
# Try resampling 3 whole rows from Fingers a couple of times
print("Resampled Fingers, N=3")
resample(Fingers, 3)
resample(Fingers, 3)

Example of output from running the code above (your values may vary due to sampling variation):
Output of resampled thumbs for N equals five
Output of 3 randomly sampled rows from the Fingers data frame
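
To make the idea of resampling whole rows concrete, here is a minimal sketch in base R. It is not how resample() is implemented; it only illustrates the same idea with base R's sample(). The hands data frame and its values are made up for illustration and are not the Fingers data.

```r
set.seed(1)  # only so the example is repeatable

# Hypothetical mini data frame (NOT the Fingers data)
hands <- data.frame(
  id = 1:6,
  Thumb = c(66, 64, 56, 58.4, 74, 60),
  Pinkie = c(58, 56, 55, 53, 66, 52)
)

# Draw 3 row numbers with replacement, then pull out those rows
picked <- sample(nrow(hands), size = 3, replace = TRUE)
resampled_rows <- hands[picked, ]
resampled_rows
```

Because whole rows are drawn, each Thumb value stays paired with the Pinkie value from the same row, and the same row can be drawn more than once.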

Example 2:

What happens if you take a sample greater than the number of rows in the data? Take a look with this small dataset.

# A small vector with 5 heights (in inches)
Height_in <- c(63, 66, 64, 70, 60)
print("Heights in inches")
Height_in
# Try resampling from the vector a few times
# Notice that you get various combinations of the sample values (with replacement of values)
print("Resampled Heights, N=5")
resample(Height_in, 5)
resample(Height_in, 5)
resample(Height_in, 5)
# What happens if you take a sample larger than the number of values in the vector?
print("Resampled Heights, N=10")
resample(Height_in, 10)
resample(Height_in, 10)
resample(Height_in, 10)

Example of output from running the code above (your values may vary due to sampling variation):
Output of randomly sampled heights for different values of N. When N is larger than the number of rows, resample() simply keeps sampling as if there were more rows, because each value is put back after it is drawn.
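
The role of replacement can be sketched with base R's sample(), which takes a replace argument. This is an illustration of the principle using the same five heights, not the internals of resample(): with replacement a sample of 10 is fine, and without replacement R stops with an error because there are only 5 values to draw.

```r
Height_in <- c(63, 66, 64, 70, 60)
set.seed(2)  # only so the example is repeatable

# With replacement: 10 draws from 5 values works, so some values must repeat
big_sample <- sample(Height_in, size = 10, replace = TRUE)
big_sample

# Without replacement: asking for 10 draws from 5 values is an error.
# tryCatch() captures the error message instead of stopping the script.
without <- tryCatch(
  sample(Height_in, size = 10, replace = FALSE),
  error = function(e) conditionMessage(e)
)
without
```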

Example 3:

Here is an example of how to construct a bootstrapped sampling distribution of b1s using resample().

# Use resample() to bootstrap a sampling distribution of 1000 b1s, centered on the sample b1
sdob1_boot <- do(1000) * b1(Tip ~ Condition, data = resample(TipExperiment))
# Check a few rows of the data frame
head(sdob1_boot)
# Optional: Get the actual sample b1
print("sample b1:")
b1(Tip ~ Condition, data = TipExperiment)
# Plot all of the bootstrapped b1s to estimate how much the sample b1 might vary
gf_histogram(~b1, data = sdob1_boot)

Example of output from running the code above (your values may vary due to sampling variation):
Output of first 6 rows of sdob1_boot
Output of sample b1. It equals 6.05.
Histogram of 1000 bootstrapped b1s. It is centered around 6 and spreads from about negative 5 to 20.
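
If it helps to see what the do() and resample() line is doing step by step, the following base R sketch spells out one way to build a bootstrapped distribution of b1s. It is an illustration under stated assumptions, not the code behind those functions: the tips data are randomly generated stand-ins (not TipExperiment), the 0/1 variable smiley and the helper fit_b1() are hypothetical names, and b1 is taken as the slope from lm() for a two-group model.

```r
set.seed(10)  # only so the example is repeatable

# Hypothetical data (NOT the TipExperiment data): 22 tips per condition
n_per_group <- 22
tips <- data.frame(
  smiley = rep(c(0, 1), each = n_per_group),  # 0 = control, 1 = smiley face
  Tip = c(rnorm(n_per_group, mean = 27, sd = 10),
          rnorm(n_per_group, mean = 33, sd = 10))
)

# Hypothetical helper: b1 is the slope of the two-group model,
# which equals the difference between the two group means
fit_b1 <- function(d) coef(lm(Tip ~ smiley, data = d))[["smiley"]]

sample_b1 <- fit_b1(tips)

# Bootstrap: resample whole rows with replacement, refit the model,
# and keep b1; repeat 1000 times
boot_b1 <- replicate(1000, {
  rows <- sample(nrow(tips), replace = TRUE)
  fit_b1(tips[rows, ])
})

sample_b1
sd(boot_b1)  # spread of the bootstrapped b1s
# hist(boot_b1)  # uncomment to plot the bootstrapped distribution
```

The bootstrapped b1s should pile up around the sample b1, and their spread gives a sense of how much b1 could vary from sample to sample.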
