Unit 1: Sample designs
Definitions:
Sample Designs
Observational Study: observes individuals and measures variables of interest but doesn't attempt to influence the responses. ex. a survey
-cohort -case-control -cross-sectional
Population: the entire group of individuals that we want information about
-parameter: values that describe a population
Sample: part of the population that we examine to gather information
-statistics: values that describe a sample
-voluntary response -convenience -probability -simple random -systematic -stratified random -cluster -multistage
Voluntary response: the subjects choose themselves by responding to the survey
Convenience sampling: choosing the individuals who are easiest to reach
Probability sampling: a sample chosen by chance
Systematic sampling: choosing individuals at a fixed interval from an ordered list (every k-th individual), usually after a random starting point
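A minimal sketch of systematic sampling, assuming the population is already an ordered list. The function name systematic_sample, the interval k, and the roster data are hypothetical and used only for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k individuals, then take every k-th one."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting position: 0 .. k-1
    return population[start::k]   # count off every k-th individual from there

# Hypothetical roster of 100 ID numbers, sampled at an interval of 10.
roster = list(range(1, 101))
sample = systematic_sample(roster, 10, seed=0)
```

Only the starting point is random here; once it is fixed, the rest of the sample is determined by counting.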
Simple Random Sampling:
-every individual has an equally likely chance of being chosen
-every sample of size "n" has an equally likely chance of being chosen
1. label each individual 2. use random number table to select labels
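The two steps above can be sketched in Python, with the standard library's random module standing in for a random number table. The function name simple_random_sample and the list of students are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: every set of n individuals is equally likely."""
    rng = random.Random(seed)
    # Step 1: label each individual with an integer 0 .. N-1.
    labels = range(len(population))
    # Step 2: draw n distinct labels at random (in place of a random number table).
    chosen = rng.sample(labels, n)
    return [population[i] for i in chosen]

# Hypothetical population of eight students.
students = ["Ana", "Ben", "Cruz", "Dee", "Eli", "Fay", "Gus", "Hal"]
sample = simple_random_sample(students, 3, seed=1)
```

Sampling is done without replacement, so no individual can be chosen twice.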
Stratified:
-divide the population into strata: groups of individuals who are similar
-choose an SRS from each stratum
-combine the SRS's to form the full sample
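A rough sketch of stratified random sampling, assuming each stratum contributes the same number of individuals (other allocations are possible). The names stratified_sample, stratum_of, and the grade-level data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take an SRS of n_per_stratum individuals from each stratum and combine them."""
    rng = random.Random(seed)
    # Divide the population into strata of similar individuals.
    strata = defaultdict(list)
    for individual in population:
        strata[stratum_of(individual)].append(individual)
    # Choose an SRS within each stratum, then combine them into one sample.
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical school: 25 students in each of grades 9 to 12, stratified by grade.
students = [("s%d" % i, grade) for grade in (9, 10, 11, 12) for i in range(25)]
sample = stratified_sample(students, lambda s: s[1], 5, seed=2)
```

Every grade is guaranteed to appear in the sample, which a single SRS of the whole school would not guarantee.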
Cluster Sampling: divide the population into groups (clusters), randomly select some of the clusters, and include every individual in the chosen clusters
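A small sketch of cluster sampling, assuming the clusters are already known and stored in a dictionary. The function name cluster_sample and the homeroom data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every individual inside them."""
    rng = random.Random(seed)
    # The random selection is applied to the cluster names, not to individuals.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical homerooms acting as clusters of three students each.
homerooms = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"],
    "D": ["d1", "d2", "d3"],
}
sample = cluster_sample(homerooms, 2, seed=4)
```

Unlike stratified sampling, only some groups are represented, but each chosen group is included in full.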
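Multistage Sampling: choosing the sample in stages, using two or more sampling techniques in succession (ex. randomly select schools, then take an SRS of students within each selected school)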
Census: attempting to contact every individual in the population
Sampling: studying a part in order to gain information about the whole
Qualitative (categorical): variables that categorize the individuals
Quantitative: variables that are numerical, so that mathematical operations on them provide meaningful results
-discrete variables: quantitative variables that have a finite or countable number of possible values
-continuous variables: quantitative variables that can take any value in an interval, so the possible values are infinite and not countable
Lurking Variables: a variable that is not among the variables studied and yet may influence the response
Confounding: occurs when a variable influences both the dependent variable and the independent variable, so that their effects cannot be separated
Causation: a relationship between two events where a change in one directly produces a change in the other
Bias: the design of a study is biased if it systematically favors certain outcomes
-voluntary response sample -convenience sampling -under-coverage -non-response -response bias -wording bias
-under-coverage:some group of the population is left out of the process for choosing the sample
-non-response:an individual for the sample can't be contacted or does not cooperate
-response bias:behavior of the respondent or interviewer can cause bias
- wording bias:the wording of a question can cause bias
​
Designing an Experiment:
Experimental units: the individuals on which an experiment is done
Subjects: when the units are people
Treatment: a specific experimental condition applied to units
Factor: the explanatory variable (independent variable)
Level: a specific value of a factor; when there are multiple factors, each treatment is a combination of one level from each factor
Block: a group of experimental units or subjects that are similar in ways that are expected to affect the response to the treatments
Block design: the random assignment of units to treatments is carried out separately within each block
-matched pair design
1.only two treatments are compared
2.subjects can be paired up based on some blocking variable
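A minimal sketch of a matched pair design, assuming the subjects have already been paired on the blocking variable. The function name matched_pairs_assignment, the treatment labels, and the pairs are hypothetical.

```python
import random

def matched_pairs_assignment(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, randomly decide which subject gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)           # a coin flip carried out separately for each pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical subjects, paired beforehand by similar age.
pairs = [("Ann", "Bea"), ("Cal", "Dan"), ("Eve", "Flo")]
assignment = matched_pairs_assignment(pairs, seed=5)
```

Each pair acts as a block of size two, so the randomization happens inside the pair rather than across all subjects.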
​
Statistical design principles:
The control of the effects of lurking variables is the first principle of statistical design
Randomization is the second major principle of statistical design.
-what is randomized is the assignment of experimental units to treatments
-the purpose is to control for unknown variables
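Replication is the third major principle
-apply each treatment to many experimental units to reduce chance variation in the results
-repeating the experiment should also let someone else reproduce the results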
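The randomization principle can be sketched as a completely randomized assignment, assuming equal group sizes are wanted. The function name randomize_units, the plot labels, and the treatment names are hypothetical.

```python
import random

def randomize_units(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)            # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 12 garden plots split between two treatments.
plots = ["plot%d" % i for i in range(12)]
groups = randomize_units(plots, ["fertilizer", "control"], seed=3)
```

Because several units receive each treatment, the same sketch also shows the replication principle in its simplest form.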
​
​