The resample() function takes a random sample, with replacement, of rows from a data frame (or of values from a vector), of the specified sample size. It is useful for bootstrapping a sampling distribution of a statistic (e.g., b1, PRE, F). Bootstrapping means sampling with replacement from observed data (such as the sample data) to estimate the variability in a statistic of interest.
For random sampling without replacement, see the sample() function.
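To make the difference between sampling with and without replacement concrete, here is a minimal sketch in base R. It does not call resample() itself: it assumes that resample() behaves like base R's sample() with replace = TRUE, and the object names (heights, without_repl, with_repl) are only for illustration.

```r
# Five heights (in inches)
heights <- c(63, 66, 64, 70, 60)

set.seed(1)  # only so the run is reproducible

# Without replacement: each value can be drawn at most once,
# so drawing all 5 values just reorders the vector
without_repl <- sample(heights, size = 5, replace = FALSE)
without_repl

# With replacement: a value goes "back in the hat" after each draw,
# so the same value may appear more than once
with_repl <- sample(heights, size = 5, replace = TRUE)
with_repl
```

Rerunning the with-replacement line without the seed should give a different mix of values each time.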
Example 1:
You can take a closer look at how the resample() function works by looking at a couple of examples with a small sample size. The example code below shows how to take a random sample (with replacement) of the values of a single variable, or of whole rows.
# You can sample rows (with replacement) for a specific variable
# Try resampling 5 Thumb lengths from Fingers a few times
# Notice that you get a different combination of values each time, and that a value can repeat (sampling is with replacement)
print("Resampled Thumbs, N=5")
resample(Fingers$Thumb, 5)
resample(Fingers$Thumb, 5)
resample(Fingers$Thumb, 5)
# Or you can sample (with replacement) the whole row
# Try resampling 3 whole rows from Fingers a couple of times
print("Resampled Fingers, N=3")
resample(Fingers, 3)
resample(Fingers, 3)
Example of output from running the code above (your values may vary due to sampling variation):
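The two cases in Example 1 (one variable versus whole rows) can also be sketched in base R. This is an illustration rather than the internals of resample(): it assumes row resampling amounts to drawing row numbers with replacement, and it uses R's built-in mtcars data frame as a stand-in for Fingers.

```r
set.seed(2)  # only so the run is reproducible

# Resample 5 values of a single variable (here, the mpg column)
resampled_mpg <- sample(mtcars$mpg, size = 5, replace = TRUE)
resampled_mpg

# Resample 3 whole rows: draw row numbers with replacement,
# then pull out every column for those rows
row_ids <- sample(nrow(mtcars), size = 3, replace = TRUE)
resampled_rows <- mtcars[row_ids, ]
resampled_rows
```

Because whole rows are kept together, each resampled row still holds values that belong to the same case.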
Example 2:
What happens if you take a sample greater than the number of rows in the data? Take a look with this small dataset.
# A small vector with 5 heights (in inches)
Height_in <- c(63, 66, 64, 70, 60)
print("Heights in inches")
Height_in
# Try resampling from the vector a few times
# Notice that you get a different combination of values each time, and that a value can repeat (sampling is with replacement)
print("Resampled Heights, N=5")
resample(Height_in, 5)
resample(Height_in, 5)
resample(Height_in, 5)
# What happens if you take a sample larger than the number of values in the vector?
print("Resampled Heights, N=10")
resample(Height_in, 10)
resample(Height_in, 10)
resample(Height_in, 10)
Example of output from running the code above (your values may vary due to sampling variation):
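The question in Example 2 can be checked with a short base R sketch. It assumes, as above, that resample() corresponds to sample() with replace = TRUE; the names oversample and result are illustrative.

```r
heights <- c(63, 66, 64, 70, 60)

set.seed(3)  # only so the run is reproducible

# With replacement, asking for 10 values from a vector of 5 works:
# some values must repeat
oversample <- sample(heights, size = 10, replace = TRUE)
oversample

# Without replacement, base R refuses to draw more values than exist.
# tryCatch() captures the error so the script keeps running.
result <- tryCatch(
  sample(heights, size = 10, replace = FALSE),
  error = function(e) "error"
)
result
```

This is why a sample larger than the data is only possible when sampling with replacement.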
Example 3:
Here is an example of how to construct a bootstrapped sampling distribution of b1s using resample().
# Use resample() to bootstrap a sampling distribution of 1000 b1s, centered on the sample b1
sdob1_boot <- do(1000) * b1(Tip ~ Condition, data = resample(TipExperiment))
# Check a few rows of the data frame
head(sdob1_boot)
# Optional: Get the actual sample b1
print("sample b1:")
b1(Tip ~ Condition, data = TipExperiment)
# Plot all of the bootstrapped b1s to estimate how much the sample b1 might vary
gf_histogram(~b1, data = sdob1_boot)
Example of output from running the code above (your values may vary due to sampling variation):
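For readers who want to see the same bootstrap idea without do(), b1(), or resample(), here is a rough base R sketch. Everything in it is a stand-in: it uses the built-in mtcars data instead of TipExperiment, treats the slope from lm() as the b1 statistic, and the helper fit_b1() is a hypothetical name introduced only for this example.

```r
set.seed(4)  # only so the run is reproducible

# Hypothetical helper: fit a simple model and return its slope (b1)
fit_b1 <- function(d) unname(coef(lm(mpg ~ am, data = d))[2])

# b1 from the original sample
sample_b1 <- fit_b1(mtcars)
sample_b1

# Bootstrap: resample rows with replacement (same size as the data),
# refit the model, and keep the slope; repeat 1000 times
boot_b1 <- replicate(1000, {
  boot_rows <- sample(nrow(mtcars), replace = TRUE)
  fit_b1(mtcars[boot_rows, ])
})

# The spread of the bootstrapped b1s estimates how much b1 might vary
sd(boot_b1)
quantile(boot_b1, c(0.025, 0.975))
```

The standard deviation and the middle 95% of boot_b1 are two common ways to summarize the variability that the histogram in Example 3 shows visually.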