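Random sampling is a method of selecting observations in which chance alone determines who is included, so each member of a population has a known, nonzero chance of being in the sample. In the simplest version, simple random sampling, every member has the same chance. Random sampling helps produce samples that are representative of the population and reduces the risk of selection bias.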
Researchers usually cannot study an entire population, so they collect data from a sample instead.
Random sampling helps ensure that:
Different types of observations have a chance to be included
The sample is less likely to reflect the researcher's preferences
Conclusions from the sample are more likely to generalize to the population
Suppose a school has 2,000 students and a researcher wants to estimate the average number of hours students spend on homework each week.
Instead of surveying only students from one class, the researcher could:
Create a list of all 2,000 students.
Use a random process to select 200 students.
Survey those students.
Because the students were selected randomly from the whole roster, the sample is more likely to represent the school than a sample drawn from a single class.
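The three steps above can be sketched in a few lines of Python. This is only an illustration: the roster here is a hypothetical list of ID numbers standing in for real student records, and the seed is fixed purely so the draw can be reproduced.

```python
import random

# Hypothetical roster: ID numbers 1..2000 stand in for the real list of students.
roster = list(range(1, 2001))

# A fixed seed is an assumption made only so the example is reproducible.
rng = random.Random(42)

# Simple random sample without replacement: every student has the same
# chance of being chosen, and nobody can be chosen twice.
sample_ids = rng.sample(roster, k=200)

print(len(sample_ids))       # 200
print(len(set(sample_ids)))  # 200, so no student appears twice
```

The researcher would then survey the students whose IDs appear in sample_ids.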
A random sample may still differ from the population by chance.
For example:
One random sample might contain slightly more seniors than expected.
Another might contain slightly more athletes.
Random sampling reduces systematic bias, but it does not eliminate sampling variation.
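Sampling variation can be seen by drawing several random samples from the same population and comparing their means. The sketch below uses an invented population of weekly homework hours (the mean of 6 and spread of 2.5 are assumptions, not real data), so only the pattern matters: the sample means cluster around the population mean without matching it exactly.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed only for reproducibility

# Hypothetical population: weekly homework hours for 2,000 students.
# The numbers are invented; negative draws are clipped to 0 hours.
population = [max(0.0, rng.gauss(6.0, 2.5)) for _ in range(2000)]
population_mean = statistics.mean(population)

# Draw five separate random samples of 200 students and record each mean.
sample_means = [
    statistics.mean(rng.sample(population, k=200)) for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean:   {m:.2f}")
```

Each sample was drawn by the same unbiased procedure, so the differences between the printed means come from chance alone.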
In a modeling framework, random sampling helps researchers collect observations from a broader data-generating process (DGP).
The goal is for the sample to reflect the variation present in the larger population or process being studied. If the sample is collected randomly, patterns found in the sample are more likely to represent patterns in the population rather than quirks of the sampling procedure.
For example:
Surveying randomly selected students from a school roster → random sampling
Surveying only students in your first-period class → convenience sampling
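Random sampling generally produces more trustworthy results than convenience sampling, because the students in a single class may differ systematically from the rest of the school.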
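The contrast between the two approaches can be simulated. The sketch below assumes, purely for illustration, that the first-period class is an advanced course whose 30 students do more homework than the rest of the school; all numbers are invented.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed only for reproducibility

def hours(mean):
    """Return one invented weekly homework total, clipped at 0 hours."""
    return max(0.0, rng.gauss(mean, 2.0))

# Hypothetical school of 2,000 students. Assumption: the 30 students in the
# first-period class average about 10 hours, everyone else about 6.
first_period = [hours(10.0) for _ in range(30)]
everyone_else = [hours(6.0) for _ in range(1970)]
population = first_period + everyone_else

true_mean = statistics.mean(population)
convenience_mean = statistics.mean(first_period)              # one class only
random_mean = statistics.mean(rng.sample(population, k=30))   # same size, random

print(f"population mean:  {true_mean:.2f}")
print(f"convenience mean: {convenience_mean:.2f}")
print(f"random mean:      {random_mean:.2f}")
```

Under these assumptions the convenience sample overstates the school-wide average because it contains only one kind of student, while the random sample of the same size lands much closer.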
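Random sampling and independence are related but different ideas. Random sampling describes how observations are selected; independence describes whether knowing one observation tells you anything about another.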
A sample can be random without being independent, and observations can be independent without being randomly selected.
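One way a sample can be random without being strictly independent is sampling without replacement from a finite population: once a student is picked, the makeup of the remaining pool changes. The sketch below works this out with exact fractions, assuming a hypothetical school with 500 seniors among 2,000 students.

```python
from fractions import Fraction

# Hypothetical counts: 500 seniors among 2,000 students.
seniors, total = 500, 2000

# Chance that the first randomly chosen student is a senior.
p_first = Fraction(seniors, total)

# Chance that the second student is a senior, given the first one was:
# one senior has already been removed from the pool.
p_second_given_first = Fraction(seniors - 1, total - 1)

print(p_first)                          # 1/4
print(p_second_given_first)             # 499/1999
print(p_second_given_first == p_first)  # False
```

The two probabilities differ, so the draws are not independent, even though every student was chosen at random. With a population this large the difference is tiny, which is why such samples are often treated as approximately independent.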
Random sampling helps researchers:
Reduce bias
Obtain more representative samples
Make stronger inferences
Generalize findings beyond the observed data
Because of these benefits, random sampling is a cornerstone of statistical reasoning.