Shared Flashcard Set


GH Data Analysis
Sampling (T Pierce)
Health Care





Two basic types of sampling
  • probability based
  • non-probability based
define population


¤The entirety.
¤All the members/elements you can think of within a specific category.
¤Size and characteristics depend on what one is thinking of studying
Char. of probability sampling
  • always for quantitative research
  • sampling units have known probability of being selected
  • involves statistical analysis
  • results can be generalized
  • sample size calculation is involved
Char. of non-probability sampling
  • usually for qualitative research methods
  • no such thing as known probability
  • doesn't involve statistics (except status quo sampling)
  • results are not intended to be generalizable
  • sample size calculation is not involved
define sampling


¤The act of taking a sample
¤a process of selecting what/who will participate in a study
sampling frame
The complete list of all elements from which you select what/who you want to study
define sample


¤Part of the whole
¤A methodical selection of the required number of sampling units from a defined population
target population
population about which information is required
study population
the population that is actually listed in your sampling frame; sometimes the target population and the study population are the same
Random sampling design


¤Simple random sampling
¤Systematic sampling
¤Two-stage sampling (cluster sampling)
Effect of random sampling
  • ensures equal chance of getting selected
    • those I selected can represent those who have not been selected
    • should give a representative sample of the population from which it was collected
simple random sampling


¨The most basic sampling design.
¨If you can make a complete list of your target population then you can use simple random sampling
¨The idea is to assign a number to each of the units in a population and then use a random number generator of some sort to choose the respondents for the analysis.
Process of simple random sampling


¨Obtain a sampling frame that contains all the units in the study population (walking door to door, use census, etc.).
¨Number each unit from 1 to N—the study population size.
¨Select sampling units using a sequence of random numbers between 1 and N, taken from a random number table, a calculator, or software (Excel, Stata).
The most practical way of doing this for anything more than very small numbers is to hold the list electronically (in Stata) and generate a column of random numbers beside it.  Then sort the list by the random numbers and take the first n units (n = desired sample size).
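The sort-by-random-number approach described above can be sketched in Python instead of Stata. This is a minimal illustration, not the course's own procedure: the function name `simple_random_sample`, the household labels, and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement by sorting the frame on random keys."""
    if n > len(frame):
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)
    # Attach a random number to every unit (the "column of random numbers").
    keyed = [(rng.random(), unit) for unit in frame]
    # Sort the list by the random numbers and keep the first n units.
    keyed.sort(key=lambda pair: pair[0])
    return [unit for _, unit in keyed[:n]]

# Hypothetical sampling frame: units numbered 1..N with N = 20.
frame = [f"household_{i}" for i in range(1, 21)]
sample = simple_random_sample(frame, n=5, seed=42)
print(sample)
```

Fixing the seed makes the draw reproducible, which is useful when the selection has to be documented or audited.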
random number tables: when to use
when you are in the field and cannot use a computer to draw a simple random sample
random number table: Process


¨Number each member of the study population.
¨Determine study population size (N).
¨Determine sample size (n).
¨Determine starting point in table by randomly picking a page and dropping your finger on the page with your eyes closed.
¨Choose a direction in which to read (up to down, left to right, or right to left).
¨Select the first n numbers read from the table whose last X digits are between 1 and N, where X is the number of digits in N. (If N is a two-digit number, then X would be 2; if it is a four-digit number, X would be 4; etc.).
¨Once a number is chosen, do not use it again.
¨If you reach the end of the table before obtaining your n numbers, pick another starting point, read in a different direction, use the first X digits, and continue until done.
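The reading rule in the steps above can be mimicked in code to check one's understanding. The sketch below assumes units are numbered 1 to N and uses a computer-generated list as a stand-in for a printed table; `select_from_table` and the table itself are hypothetical.

```python
import random

def select_from_table(table, N, n):
    """Read table entries in order, using the last X digits of each entry."""
    X = len(str(N))                      # number of digits in N
    chosen = []
    for entry in table:
        value = entry % (10 ** X)        # last X digits of the table entry
        # Keep the value only if it names a unit and was not already chosen.
        if 1 <= value <= N and value not in chosen:
            chosen.append(value)
        if len(chosen) == n:
            break
    return chosen

# Hypothetical stand-in for a printed table of 5-digit random numbers.
rng = random.Random(1)
table = [rng.randrange(100000) for _ in range(500)]
picked = select_from_table(table, N=85, n=10)
print(picked)
```

For example, reading the entries 10023, 55599, 40023, 71285, 30007 with N = 85 keeps 23, rejects 99 (larger than N), skips the repeated 23, and then keeps 85 and 7.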
Difficulties in practicing simple random sampling
  • too many households
  • widely distributed households
  • no lists of households readily available
  • high transport cost
  • difficult management
  • requires complete sampling frame
  • expensive with widely distributed population even with complete sampling frame
  • SRS usually practical only for small compact populations
systematic sampling


¨Every Kth unit in a list is selected for inclusion in the sample
¨To be effective, there needs to be no systematic order in the list
¤Be very careful if names are listed alphabetically.
¤If the interval chosen is 12 and you had monthly data, then you would always choose data from the same month.
¨Empirically identical to simple random sampling
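A minimal Python sketch of systematic sampling, assuming the interval is taken as K = N // n with a random start among the first K units (the cards do not specify how K or the start is chosen). The second example reproduces the monthly-data warning: with an interval of 12, every selected record falls in the same month.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every K-th unit after a random start, with K = N // n."""
    N = len(frame)
    k = N // n                           # sampling interval
    if k < 1:
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)
    start = rng.randrange(k)             # random start within the first K units
    return [frame[start + i * k] for i in range(n)]

# Hypothetical list of 100 households, sample of 10, so K = 10.
households = list(range(1, 101))
hh_sample = systematic_sample(households, n=10, seed=3)
print(hh_sample)

# Periodicity problem: 36 monthly records and an interval of 12.
monthly = [(year, month) for year in range(2001, 2004) for month in range(1, 13)]
monthly_sample = systematic_sample(monthly, n=3, seed=3)
print(monthly_sample)                    # all three records share one month
```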
When is it most useful to use systematic sampling?

no precise sampling frame

Advantages and disadvantages of systematic random sampling


¤You can start without having a precise sampling frame
¤Easier to keep track in the field
¤With an inherent trend in the sampling frame, representation is ensured from minimum to maximum values of the trend.
  • Listing according to population size: guarantees a good spread of people from urban and rural areas
  • Ordering by value of housing: the sample would include a good spread of rich and poor
¤Good approximation to SRS
¤Disadvantage: any cyclicality or periodicity in the list can bias the sample.
¤Disadvantage: large inequality among the units in the list can also cause problems.
definition of two stage cluster sampling
A two-stage cluster sample is obtained by first selecting a probability sample of clusters (primary sampling units, PSUs) and then selecting a probability sample of elements from each sampled cluster.
When is it most desirable to do two-stage cluster sampling?


¤Large survey covering large geographic area involving sampling of Housing Units (e.g. demographic health surveys)
¤A frame listing all elements in the population may be impossible or costly to obtain, whereas listing all clusters and knowing their properties is easier.
Difficulty with two stage sampling


¨Need to ensure each individual has the same chance of being selected.
¨Problem: If we select both first- and second-stage units (e.g. villages, then households) using SRS, and take the same number of households (hhs) per village, households in small villages have a higher chance of being sampled than households in large villages.
¨Solution: Sample PSUs with probability proportional to size (PPS), so larger villages are more likely to be selected. Then take the same number of hhs from each selected cluster.
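The problem and its solution can be written out as selection probabilities. The notation is introduced here for illustration and is not from the cards: K villages in total, k villages sampled, M_i households in village i, M households overall, and m households taken per sampled village. The PPS line assumes no village is so large that k M_i exceeds M, and that every sampled village has at least m households.

```latex
% SRS at both stages: the chance depends on village size M_i,
% so households in small villages are over-represented.
\[
P_{\mathrm{SRS}}(\text{household in village } i)
  = \frac{k}{K} \cdot \frac{m}{M_i}
\]

% PPS at the first stage, fixed m at the second stage:
% M_i cancels, so every household has the same chance.
\[
P_{\mathrm{PPS}}(\text{household in village } i)
  = \frac{k\,M_i}{M} \cdot \frac{m}{M_i}
  = \frac{k\,m}{M}
\]
```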
Steps to cluster sampling


¨Example: in Bangladesh, the census bureau uses enumeration areas (EAs), which are clusters of about 100 HHs
¨Sampling frame of EAs is accessible. EAs are the Primary sampling unit (PSU)
¨Take random sample of a desired number of EAs
¨In each selected EA, list the HHs; this list is the sampling frame for that EA
¨Randomly select HHs from each selected EA
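The steps above can be sketched in Python with made-up data. The EA names, the number of EAs, and the household counts below are hypothetical, and both stages use simple random sampling purely for illustration.

```python
import random

rng = random.Random(7)

# Hypothetical first-stage frame: 50 EAs, each holding roughly 100 households.
ea_frame = {
    f"EA{e:02d}": [f"EA{e:02d}-HH{h:03d}" for h in range(1, rng.randint(90, 110) + 1)]
    for e in range(1, 51)
}

def two_stage_sample(ea_frame, n_eas, hh_per_ea, rng):
    """Stage 1: sample EAs (the PSUs). Stage 2: sample households within each."""
    selected_eas = rng.sample(sorted(ea_frame), n_eas)
    sample = {}
    for ea in selected_eas:
        hh_list = ea_frame[ea]           # household listing = frame for this EA
        sample[ea] = rng.sample(hh_list, hh_per_ea)
    return sample

sample = two_stage_sample(ea_frame, n_eas=10, hh_per_ea=20, rng=rng)
for ea, hhs in sample.items():
    print(ea, len(hhs))
```

Only the 10 selected EAs need a household listing, which is the practical saving over building one frame for all households.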
Two-stage cluster sampling using the probability proportional to size (PPS) model


¨True PPS sampling produces a self-weighting sample: each household has the same chance of being selected, so no weights are required.
¨PPS sampling involves listing units (towns, villages etc.) with their population size.
¨In doing this we may select many persons from a large town and few (or even none) from a tiny village—but this is correct—it is proportional to size.
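One common way to carry out PPS selection from such a list is systematic selection along the cumulative population sizes. The sketch below assumes that method (the cards do not say which PPS procedure is used), and the village names and sizes are invented.

```python
import bisect
import itertools
import random

def pps_systematic(sizes, k, seed=None):
    """Pick k clusters with probability proportional to size.

    sizes maps each unit name to its population size. A unit larger than
    the sampling interval can be picked more than once.
    """
    names = list(sizes)
    cumulative = list(itertools.accumulate(sizes[name] for name in names))
    total = cumulative[-1]
    interval = total / k
    rng = random.Random(seed)
    start = rng.random() * interval      # random start in [0, interval)
    picks = []
    for j in range(k):
        point = start + j * interval
        # Find the unit whose cumulative range contains this point.
        index = min(bisect.bisect_right(cumulative, point), len(names) - 1)
        picks.append(names[index])
    return picks

# Hypothetical listing of units with their population sizes (total 10,000).
villages = {"A": 5000, "B": 300, "C": 1200, "D": 150, "E": 3350}
selected = pps_systematic(villages, k=4, seed=11)
print(selected)
```

With these sizes the interval is 2,500, so the large town A is always hit twice, while the tiny villages B and D are often missed entirely, which is the behaviour the card describes as correct.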
characteristics of stratified sampling


¨Break up the population into homogeneous, non-overlapping groups (strata) before sampling.
¤Why? We want to get a reasonable estimate of something (e.g. neonatal mortality) in different sub-groups in our population, say by region or by urban/rural residence
¨May be used in conjunction with simple random sampling, systematic, or cluster sampling
Primary reasons for doing stratified sample design


¤To potentially improve representativeness, in terms of the stratification variables, by gaining greater control over the composition of the sample.
¤To ensure that particular groups within a population are adequately represented in the sample
What determines how we stratify in sampling?


¨The choice of stratification variables depends on the variables that matter for responses.  In most public health research, natural candidates are race, income, education, gender, location, etc.
¨The sampling fraction generally varies across strata (i.e. some individuals will have more chance of being in the sample). Therefore we would need to adjust for this during data analysis using sampling weights.
Proportionate stratified sampling vs. disproportionate


¨Proportionate stratified sampling: 
¤when sampling units in the strata are selected proportional to their representation in the source population
¨Disproportionate sampling: 
¤deliberately increasing the number of sampling units selected from a particular stratum so that it makes up a disproportionate share of the sample compared to the source population.
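The difference between the two designs shows up in how the total sample size is allocated across strata. A small sketch, with hypothetical stratum sizes and a hypothetical helper `proportionate_allocation`:

```python
def proportionate_allocation(stratum_sizes, n):
    """Allocate n across strata in proportion to each stratum's population.

    Rounding can make the allocations sum to slightly more or less than n.
    """
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical source population: 6,000 urban and 4,000 rural residents.
strata = {"urban": 6000, "rural": 4000}

proportionate = proportionate_allocation(strata, n=500)
print(proportionate)                     # {'urban': 300, 'rural': 200}

# A disproportionate design might instead fix equal sample sizes per stratum,
# deliberately over-sampling the smaller (rural) stratum.
disproportionate = {"urban": 250, "rural": 250}
print(disproportionate)
```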
disproportionate sampling: how to compensate for it?


¨Researchers must adjust unequal probability sampling data to compensate for the differences in the likelihood of some population members appearing in the sample
¨You make this adjustment by weighting certain observations more than others in the estimations.
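A minimal worked example of the adjustment, assuming the usual design weight of stratum population size divided by stratum sample size (the cards do not give the weight formula). All numbers are invented; the outcome is a yes/no indicator coded 1/0.

```python
# Hypothetical strata: population size N and the sampled outcome values.
strata = {
    "urban": {"N": 8000, "values": [1, 1, 1, 0]},
    "rural": {"N": 2000, "values": [0, 0, 1, 0]},
}

weighted_sum = 0.0
weight_total = 0.0
all_values = []
for stratum in strata.values():
    n_h = len(stratum["values"])
    weight = stratum["N"] / n_h          # each sampled unit represents N/n units
    weighted_sum += weight * sum(stratum["values"])
    weight_total += weight * n_h
    all_values.extend(stratum["values"])

unweighted_mean = sum(all_values) / len(all_values)
weighted_mean = weighted_sum / weight_total

print(unweighted_mean)                   # 0.5
print(weighted_mean)                     # 0.65
```

The unweighted estimate treats the two strata as equally large; the weighted estimate restores the 80/20 urban/rural split of the source population.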
What is lot quality assurance sampling and its advantages
  • can be used locally at the level of the "supervision area" to identify priority areas or indicators that are not reaching average coverage or an established benchmark
  • can provide an accurate measure of coverage or health care system quality at a more aggregate level
  • can be used for quality assurance using a minimal sample, maximum security principle