Identifying Random Sampling
Suppose you were handed a bag full of a large number of small, unknown items, and asked to try to identify the contents without looking in the bag. Naturally the easiest way to investigate would be to reach inside and grab a handful to see what turned up. If you reached in and then pulled out a handful of nothing but marbles of various colors off the top, you’d quite reasonably state that this must be a bag of marbles.
Is it possible that you might be wrong? How? What could you have done to improve your guess?
Random sampling is the best-known method of choosing members of a population to make up a statistical sample. Since the most likely error in developing an accurate statistical sample comes from some sort of personal bias, careful researchers try to let personal decisions influence the choice of sample members as little as possible. The easiest way to avoid conscious or subconscious bias is to use a random method of selection. However, true random sampling may be more difficult than it appears. A good definition of a random sample is: “A sample consisting of individuals each chosen entirely by chance, in such a way that, at every stage of the process, every potential member of the sample has the same probability of being chosen as every other member.”
In order to have a truly random sample, every step in choosing the representatives must be an entirely random process, outside the control of anyone or anything that might introduce bias. To draw a simple random sample, assign a number to each member of the population under study, and then use a random number generator to pick the members of the sample. In subsequent lessons, we will learn how to identify and implement other types of random samples, including stratified random samples, systematic random samples, and cluster random samples.
Identifying Random Samples
1. In order to randomly pick 15 students out of a class of 210, a researcher enters the names into a spreadsheet and numbers them 1-210. Using a random number generator, he picks 15 different numbers between 1 and 210, and uses the names associated with those 15 numbers.
Is this a random sample?
Yes, this is a random sample. In fact, this is probably the most common method of randomizing the choice of names from a list. By converting the names to numbers, the researcher minimizes the chance that he will be influenced by a subconscious preference for a particular name. Further, by using a random number generator to pick the 15 numbers, he keeps his own judgment out of the actual selection process.
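The researcher's spreadsheet method can be sketched in a few lines of Python. This is only an illustration: the roster of placeholder names, the function name `pick_students`, and the `seed` parameter are assumptions made for the example, not part of the original study.

```python
import random

# Hypothetical roster standing in for the spreadsheet of 210 names,
# numbered 1-210 as in the example.
roster = {number: f"Student {number}" for number in range(1, 211)}

def pick_students(roster, how_many, seed=None):
    """Return the names attached to `how_many` different random numbers."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no number is picked twice.
    chosen_numbers = rng.sample(sorted(roster), how_many)
    return [roster[n] for n in chosen_numbers]

chosen = pick_students(roster, 15, seed=1)
print(chosen)
```

Drawing without replacement matters here: if the same number could come up twice, the researcher would end up with fewer than 15 different students.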
2. A researcher is conducting a survey on how many students at a particular high school go off-campus for lunch. He decides to pick every 5th person in line at the school cafeteria and ask how often he/she goes out to get lunch instead of eating at the cafeteria.
Is this a random sample? Is it appropriate for the study?
The selection method is reasonably random, but it is applied to the wrong population. Picking every 5th person (a systematic sample) is a convenient and effective way to choose students from the line without personal preference, especially if the starting point is also chosen by chance. However, choosing students from the lunch line at school excludes the students who are already off campus for lunch!
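A small simulation can show why the cafeteria line is the wrong place to sample. The school below is entirely made up: the number of students, the `days_out` values, and the rule that a student is in line only on days spent on campus are all assumptions chosen to illustrate the problem.

```python
import random

rng = random.Random(7)

# Hypothetical school: each student goes off campus 0-5 days per week.
students = [{"id": i, "days_out": rng.randint(0, 5)} for i in range(1000)]

# Assumed rule: a student is in today's cafeteria line only on a day
# spent on campus, so frequent off-campus eaters are rarely in line.
line = [s for s in students if rng.random() < (5 - s["days_out"]) / 5]

# Systematic pick: a random start among the first 5, then every 5th person.
start = rng.randrange(5)
surveyed = line[start::5]

def mean_days_out(group):
    """Average number of off-campus lunch days per week in a group."""
    return sum(s["days_out"] for s in group) / len(group)

print("whole school:", round(mean_days_out(students), 2))
print("surveyed    :", round(mean_days_out(surveyed), 2))
```

Under these assumptions the surveyed group reports fewer off-campus days than the school as a whole, and students who go out every day never appear in the line at all.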
Choosing a Random Sample
Evan is conducting a study of the types of seashells on the 2-mile stretch of beach near his home.
How might he choose a random sample to represent the population of seashells on the beach? Would simply walking around without aim and picking up seashells be effective? Why or why not?
Just walking around without direction will not actually result in a random sample, since Evan may have conscious or unconscious preferences that would affect where he would choose to walk. Also, ‘just picking up’ seashells would further influence the sample with his own preferences since he would be picking up the shells he chooses.
If Evan wanted to randomize the sample, he could do so with a set of dice. He could roll the dice before leaving, and use the number to set a distance to travel before stopping. He could pick up any seashells within reach at the stop, and roll the dice again. If he turned right on even rolls, and left on odd rolls, the results would be much more random.
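Evan's dice procedure could be planned ahead of time with a short script. The sketch below assumes two six-sided dice and 10 paces per pip; both values, along with the function names, are illustrative choices and are not specified in the problem.

```python
import random

def roll_dice(rng, count=2):
    """Sum of `count` six-sided dice."""
    return sum(rng.randint(1, 6) for _ in range(count))

def plan_stops(n_stops, seed=None, paces_per_pip=10):
    """Return a list of (paces to walk, direction to turn) for each stop."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_stops):
        roll = roll_dice(rng)
        # Even rolls turn right, odd rolls turn left, as in the text.
        turn = "right" if roll % 2 == 0 else "left"
        plan.append((roll * paces_per_pip, turn))
    return plan

for paces, turn in plan_stops(5, seed=3):
    print(f"turn {turn}, walk {paces} paces, collect shells within reach")
```

With two dice, even and odd totals are equally likely, but the distances are not: a total of 7 comes up far more often than 2 or 12, so some walking distances are favored over others.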
Earlier Problem Revisited
If you reached in and then pulled out a handful of nothing but marbles of various colors off the top, you’d quite reasonably state that this must be a bag of marbles.
Is it possible that you might be wrong? How? What could you have done to improve your guess?
It is entirely possible that you are correct, but it is also pretty clear that the results would be suspect at best. By only grabbing a single handful of items off the top of the bag, you are limiting your results to items that worked their way to the top. There may well have been other items at the bottom of the bag that you did not get a sample of by just grabbing off the top. You could certainly improve the reliability of the sample by first shaking the bag well so that your sample is appropriately randomized.
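The effect of shaking the bag can be imitated with a list. The contents below (coins, buttons, and marbles, with the marbles settled on top) are a made-up example, and the function names are hypothetical.

```python
import random

# Hypothetical bag: other items settled at the bottom, marbles on top.
# The end of the list represents the top of the bag.
bag = ["coin"] * 30 + ["button"] * 30 + ["marble"] * 40

def handful_from_top(bag, size=10):
    """Grab the `size` items closest to the top without mixing."""
    return bag[-size:]

def handful_after_shaking(bag, size=10, seed=None):
    """Shake (shuffle) a copy of the bag, then grab from the top."""
    rng = random.Random(seed)
    shaken = bag[:]  # copy so the original order is left untouched
    rng.shuffle(shaken)
    return shaken[-size:]

print(sorted(set(handful_from_top(bag))))
print(sorted(set(handful_after_shaking(bag, seed=2))))
```

Grabbing from the unshaken bag can only ever return marbles, while a handful taken after shaking is very likely to reveal the other kinds of items as well.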
This is an example of convenience sampling, which is selecting a sample more because it is easily obtained than because it is random or representative.
For each example below, first decide whether it describes a random sample, then explain why you believe it is or isn’t random.
Several apples are selected from each bin of different types at the market.
This is probably not a random sample. The example does not specify how the apples are chosen from each bin. If a shopper simply picked apples from a bin by hand, an unconscious preference for shinier apples could influence the choice.
Each student is assigned a number and a die is rolled. Starting with the number rolled, every 4th student is chosen until the quota is met.
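This is a valid random sampling method, specifically a systematic random sample: a random starting point followed by a fixed interval. No personal preference enters the selection. Strictly speaking, every student has exactly the same chance of being chosen only if the starting number is drawn from 1 through 4, the size of the interval; with a six-sided die, some students can be reached from two different starting rolls.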
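Whether every student's chance is exactly equal in a systematic sample depends on how the starting number is drawn. The sketch below counts, for each student, how many equally likely starting numbers would lead to that student being picked. The class size of 32 and the function name are assumptions, and for simplicity the selection runs to the end of the list rather than stopping at a quota.

```python
from fractions import Fraction

def inclusion_chances(class_size, interval, possible_starts):
    """Chance that each student number is picked, given equally likely starts."""
    starts = list(possible_starts)
    counts = {student: 0 for student in range(1, class_size + 1)}
    for start in starts:
        # Pick the starting student, then every `interval`-th student after.
        for student in range(start, class_size + 1, interval):
            counts[student] += 1
    return {s: Fraction(c, len(starts)) for s, c in counts.items()}

die_start = inclusion_chances(32, 4, range(1, 7))      # start from a six-sided die
matched_start = inclusion_chances(32, 4, range(1, 5))  # start from 1-4 only

print(sorted(set(die_start.values())))
print(sorted(set(matched_start.values())))
```

Under these assumptions, a start from 1 through 4 gives every student a chance of 1/4, while a start from a six-sided die gives some students a chance of 1/6 and others 1/3. The method is still free of personal bias either way, but matching the range of starting numbers to the interval is what makes the chances equal.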
To build a sample of students in his state, Brian first made a list of school districts and assigned each a number. He then randomly chose 4 districts, assigned a number to each school in those districts, and randomly chose 3 schools from each district. Once he had the schools, he assigned a number to each student in each school, and used a random number generator to pick 15 students from each school.
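This is a valid random sampling method, specifically a multi-stage cluster sample. Every stage is decided by chance, so every student in the state has an opportunity to be chosen for the sample. Those chances are exactly equal only if the districts and schools are about the same size.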
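Brian's three stages can be sketched as nested random draws. The state built here is invented for the example: the number of districts, the number of schools per district, the school sizes, and the ID format are all assumptions.

```python
import random

def build_state(rng, n_districts=10, schools_per_district=5):
    """Hypothetical state: district number -> school number -> student IDs."""
    state = {}
    for d in range(1, n_districts + 1):
        state[d] = {}
        for s in range(1, schools_per_district + 1):
            size = rng.randint(100, 400)  # made-up enrollment
            state[d][s] = [f"D{d}-S{s}-{n}" for n in range(1, size + 1)]
    return state

def multistage_sample(state, rng, n_districts=4, n_schools=3, n_students=15):
    """Randomly pick districts, then schools in each, then students in each."""
    sample = []
    for d in rng.sample(sorted(state), n_districts):
        for s in rng.sample(sorted(state[d]), n_schools):
            sample.extend(rng.sample(state[d][s], n_students))
    return sample

rng = random.Random(11)
state = build_state(rng)
sample = multistage_sample(state, rng)
print(len(sample))  # 4 districts x 3 schools x 15 students = 180
```

Because the same number of students is drawn from every chosen school, a student in a small school is somewhat more likely to be picked than a student in a large one, once that school has been selected.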
Katrina closed her eyes and reached into a beverage stand, randomly selecting a soda.
This is not a random sample. Katrina would be more likely to pick one within reach than to select from the top or bottom, and would be unlikely to reach behind the front sodas to select one further back. Some sodas have a greater chance of being chosen than others.
State whether each example describes a random sample:
- Consumer Reports is conducting a poll to evaluate consumer opinions on the latest console gaming platform. The publisher includes a postcard in an issue of Consumer Reports, and asks readers to fill out the form and return it.
- Kelly closes her eyes, opens the dictionary and randomly points to a word on the page. She repeats the process 10 times.
- Shake a bag of marbles well, reach in and select a marble with eyes closed.
- Number the students in your class 1-32, and roll six standard dice to choose a random student, discarding any number greater than 32.
- Walk along a pebbled path and randomly pick up a stone.
- Roll 4 standard dice and choose a name starting with the letter associated with that number if the alphabet is numbered 1-26.
- Walk around a park and randomly pick up fallen leaves to see what kind of trees are in the park.
- Search for “dog” on Google to get a random sample of popular dogs.
- Ask shoppers in a hardware store if they know how to replace a light switch to learn the percentage of people that know how.
- Spin a spinner numbered 1-25 to decide how much to charge for a glass of lemonade.
|Term|Definition
|bias|Bias refers to a desire to achieve a specific result from a particular study, regardless of the data.
|cluster sampling|Cluster sampling involves choosing representatives which are close to other representatives based on a particular factor such as location, age, color, size, etc.
|random sampling|Random sampling involves choosing representatives entirely by chance, for instance by rolling a die.
|sample|A sample is a specified part of a population, intended to represent the population as a whole.
|simple random sample|A simple random sample is a sample chosen by assigning a number to each member of the population under study, and then using a random number generator to pick the members of the sample.
|stratified sampling|Stratified sampling involves choosing a proportional number of representatives from each of a number of subgroups of the initial population.
|systematic random sampling|Systematic random sampling is a type of probability sampling technique in which a starting point is chosen at random and then every kth unit of the population is selected for the sample.
|weight function|A weight function is a mathematical device used when performing a sum, integral, or average to give some elements more "weight" or influence on the result than other elements in the same set.