Term
| Bivariate descriptive statistics |
|
Definition
| are used to describe relationships between two variables. Examples: height and weight; smoking status and lung cancer incidence |
|
|
Term
| Appropriate statistic depends on the |
|
Definition
| variables’ level of measurement |
|
|
Term
|
Definition
| Researchers crosstabulate the frequencies of all categories of two variables in a two-dimensional frequency distribution |
|
|
Term
| Crosstabulated variables should be |
|
Definition
| nominal level (or ordinal level with a small number of categories) |
|
|
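A crosstabulation can be sketched in a few lines of Python. The records below are invented purely for illustration (they are not data from the cards), and `crosstab` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical records: (smoking status, lung cancer status) for 8 people.
records = [
    ("smoker", "cancer"), ("smoker", "cancer"),
    ("smoker", "no cancer"), ("smoker", "no cancer"),
    ("nonsmoker", "cancer"), ("nonsmoker", "no cancer"),
    ("nonsmoker", "no cancer"), ("nonsmoker", "no cancer"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(pairs)

table = crosstab(records)
row_labels = sorted({row for row, _ in records})
col_labels = sorted({col for _, col in records})

# Print one line per row category, with one frequency per column category.
print("columns:", col_labels)
for row in row_labels:
    print(row, [table[(row, col)] for col in col_labels])
```

Each cell holds the frequency of one combination of categories, which is why the approach suits nominal variables or ordinal variables with few categories.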
Term
|
Definition
| have been developed to describe risk outcomes and facilitate clinical decision making |
|
|
Term
| Risk indexes capture two aspects of the effects of risk exposure |
|
Definition
| absolute and relative risk |
|
|
Term
|
Definition
| Indexes quantify the actual amount of risk related to different exposures |
|
|
Term
|
Definition
| Indexes compare risks in the two risk exposure groups |
|
|
Term
|
Definition
| is the proportion of people with a negative outcome |
|
|
Term
|
Definition
| is the absolute difference between the two risk groups |
|
|
Term
|
Definition
| is the ratio of absolute risks (adverse outcomes) in the two groups |
|
|
Term
|
Definition
| is the proportion of people in each risk group who have the adverse outcome, relative to the proportion who do not |
|
|
Term
| In many cases, the value of RR and OR |
|
Definition
|
|
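The risk indexes described in the preceding cards can be illustrated with a small worked example. This is a minimal sketch, assuming a 2x2 table of made-up counts; the function name `risk_indexes` and the cell labels `a`, `b`, `c`, `d` are hypothetical, and the labels "absolute risk" and "absolute risk difference" are conventional names that the cards themselves leave blank.

```python
def risk_indexes(a, b, c, d):
    """Risk indexes from a 2x2 table.

    a = exposed with adverse outcome,   b = exposed without it
    c = unexposed with adverse outcome, d = unexposed without it
    """
    ar_exposed = a / (a + b)        # absolute risk in the exposed group
    ar_unexposed = c / (c + d)      # absolute risk in the unexposed group
    return {
        "absolute_risk_exposed": ar_exposed,
        "absolute_risk_unexposed": ar_unexposed,
        # absolute difference between the two risk groups
        "absolute_risk_difference": abs(ar_exposed - ar_unexposed),
        # RR: ratio of the absolute risks in the two groups
        "relative_risk": ar_exposed / ar_unexposed,
        # OR: odds of the outcome (had it : did not) compared across groups
        "odds_ratio": (a / b) / (c / d),
    }

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed
# people have the adverse outcome.
result = risk_indexes(20, 80, 10, 90)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the relative risk is 2.0 and the odds ratio is 2.25, so the two indexes need not coincide.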
Term
|
Definition
| is a bond or connection between variables |
|
|
Term
| Correlations between two quantitative variables can be graphed in |
|
Definition
| a scatterplot |
|
Term
|
Definition
| graphs the values of one variable on the X axis and the values of the second one on the Y axis of a graph |
|
|
Term
| A scatterplot indicates whether the variables have a |
|
Definition
| linear relationship with each other |
|
|
Term
|
Definition
| direction and magnitude of the relationship |
|
|
Term
|
Definition
| Sometimes data points are not linearly related—they are positively or negatively correlated, but only up to a point, then the relationship changes |
|
|
Term
| A correlation coefficient |
|
Definition
| is a statistic that summarizes the magnitude and direction of relationships between two variables |
|
|
Term
| Most widely used correlation coefficient |
|
Definition
| Pearson’s r (the product-moment correlation coefficient) |
|
Term
|
Definition
| is computed with variables that are interval- or ratio-level measures |
|
|
Term
1.00 = Perfect positive relationship. E.g., a flat $1 tax for every $5 earned
.35 = Weak/moderate positive relationship. E.g., nurses’ degree of autonomy and job satisfaction (those with more autonomy are somewhat more satisfied)
.00 = No relationship. E.g., nurses’ degree of autonomy and height (tall and short nurses equally autonomous)
-.20 = Weak negative relationship. E.g., diabetic knowledge and a person’s age (older people are somewhat less knowledgeable)
-.70 = Strong negative relationship. E.g., levels of depression and life satisfaction (those with high levels of depression have lower life satisfaction) |
|
Definition
|
|
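A correlation coefficient for two interval- or ratio-level variables can be computed as sketched below. The cards do not name the formula, so this assumes the usual product-moment (Pearson) calculation; `pearson_r` is a hypothetical helper, and the earnings and tax figures simply restate the flat $1-per-$5 tax example.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

earnings = [5, 10, 15, 20]
tax = [1, 2, 3, 4]           # a flat $1 tax for every $5 earned
r_positive = pearson_r(earnings, tax)

# Reversing one variable turns the perfect positive relationship negative.
r_negative = pearson_r(earnings, list(reversed(tax)))

print(round(r_positive, 2), round(r_negative, 2))
```

The sign of the result gives the direction of the relationship and its absolute value gives the magnitude.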
Term
| A correlation between two variables never implies |
|
Definition
| that one variable caused the other |
|
|
Term
|
Definition
| for coming to conclusions about what is probably true in a population, based on sample values |
|
|
Term
|
Definition
| statistics—for describing samples |
|
|
Term
| Inferential statistics uses the __________ to provide guidance on what is probably true |
|
Definition
| laws of probability |
|
Term
|
Definition
| is that the deck is fair—not “rigged” |
|
|
Term
|
Definition
|
|
Term
| Probability distributions |
|
Definition
| are similar to frequency polygons (or histograms) |
|
|
Term
| They graph the probabilities of |
|
Definition
| all events that could occur |
|
|
Term
| Probability density function = |
|
Definition
| Probability distribution for continuous variables.
Example: a distribution of IQ scores for a population of 10,000 10-year-old children
Population mean = 100.0 = μ
Population SD = 15.0 = σ |
|
|
Term
| A ________________ is the distribution of an infinite number of sample means from the population, for samples of a given size |
|
Definition
| sampling distribution of the mean |
|
|
Term
| a mathematical formulation, shows that the mean of a sampling distribution of the mean always equals the population mean |
|
Definition
| The central limit theorem |
|
|
Term
| If population values are normally distributed |
|
Definition
| so is the sampling distribution of the mean |
|
|
Term
| The ________ is the standard deviation of a theoretical sampling distribution |
|
Definition
| standard error of the mean (SEM) |
|
|
Term
|
Definition
| the less likely it is that a sample mean is a good estimate of the population mean |
|
|
Term
|
Definition
| known, but can be estimated |
|
|
Term
|
Definition
| the samples’ standard deviation |
|
|
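The ideas of a sampling distribution and its standard error can be checked with a small simulation. This is a sketch under stated assumptions: the population is simulated from a normal distribution with the mean and SD of the IQ example (100 and 15), 2,000 samples stand in for the "infinite number" of samples, and the sample size of 25 and the seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated population of 10,000 scores (assumed normal, mean 100, SD 15).
population = [rng.gauss(100.0, 15.0) for _ in range(10_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

n = 25  # size of each sample (an arbitrary choice)
sample_means = [
    statistics.mean(rng.sample(population, n)) for _ in range(2_000)
]

# The mean of the sample means should sit close to the population mean.
mean_of_means = statistics.mean(sample_means)
# The SD of the sample means approximates the standard error of the mean.
sd_of_means = statistics.stdev(sample_means)
theoretical_sem = sigma / math.sqrt(n)

# In practice only one sample is available, so the SEM is estimated
# from that sample's standard deviation.
one_sample = rng.sample(population, n)
estimated_sem = statistics.stdev(one_sample) / math.sqrt(n)

print(round(mu, 2), round(mean_of_means, 2))
print(round(theoretical_sem, 2), round(sd_of_means, 2), round(estimated_sem, 2))
```

Increasing `n` shrinks the SEM, which is why larger samples give sample means that are better estimates of the population mean.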
Term
| Statistical inference approaches: two basic approaches |
|
Definition
| Parameter estimation; hypothesis testing |
|
|
Term
|
Definition
| is used to estimate a population value—e.g., a mean, percentage, or odds ratio |
|
|
Term
|
Definition
| A point estimate; an interval estimate |
|
|
Term
| A ________ involves the calculation of a single value as the estimate of the parameter |
|
Definition
| point estimate |
|
Term
| A point estimate is thus simply the |
|
Definition
| value of the descriptive statistic, like a mean |
|
|
Term
| An _______________ provides a range of values within which the population value has a specified probability of lying |
|
Definition
| interval estimate |
|
Term
| Interval estimation involves constructing ___________ around the point estimate |
|
Definition
| confidence intervals |
|
Term
| A 95% _________ designates the range of values within which the parameter has a 95% probability of lying |
|
Definition
| confidence interval (95% CI) |
|
|
Term
| Constructing a CI involves calculating |
|
Definition
| confidence limits (the upper and lower limit of what is probable, at the specified probability level) |
|
|
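Confidence limits around a mean can be sketched as the point estimate plus or minus a multiple of the estimated standard error. The example below is an illustration only: it assumes a large-sample normal approximation (a small sample like this one would normally call for a slightly wider multiplier), the scores are invented, and `mean_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci(sample, confidence=0.95):
    """Return (lower, upper) confidence limits around the sample mean.

    Assumption: uses the standard normal distribution for the multiplier,
    which is only an approximation when the sample is small.
    """
    point_estimate = mean(sample)
    sem = stdev(sample) / math.sqrt(len(sample))   # estimated standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return point_estimate - z * sem, point_estimate + z * sem

# Hypothetical scores from one sample.
scores = [98, 102, 95, 110, 105, 99, 101, 97, 104, 100, 93, 108]

lower95, upper95 = mean_ci(scores, 0.95)
lower99, upper99 = mean_ci(scores, 0.99)
print(round(lower95, 2), round(upper95, 2))
print(round(lower99, 2), round(upper99, 2))
```

Asking for more confidence (99% rather than 95%) widens the interval, while a larger sample narrows it through the smaller standard error.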
Term
| The most commonly reported CIs are |
|
Definition
|
|
Term
| The ___________ is similar to a normal distribution—bell shaped and symmetric |
|
Definition
|
|
Term
|
Definition
|
|
Term
|
Definition
| computed around proportions/percentages and risk indexes like Relative Risk and the Odds Ratio |
|
|
Term
| The theoretical distribution for constructing CIs in these scenarios is the |
|
Definition
|
|
Term
| CIs around proportions and risk indexes are rarely |
|
Definition
|
|
Term
| Like the CI around a mean, the larger the sample size |
|
Definition
|
|
Term
|
Definition
|
|
Term
| (second broad approach to statistical inference) uses laws of probability to help researchers make objective decisions about accepting or rejecting a null hypothesis |
|
Definition
| Hypothesis testing |
|
Term
| In most cases, ________________ states a prediction that variables in the study are NOT related, e.g., cigarette smoking is unrelated to lung cancer; turning patients is unrelated to the incidence of pressure ulcers |
|
Definition
| the null hypothesis |
|
Term
| The null hypothesis contrasts with researchers’ _____________, which typically states a prediction that variables in the study ARE related, e.g., cigarette smoking is related to lung cancer; turning patients is related to the incidence of pressure ulcers |
|
Definition
| actual research hypothesis |
|
|
Term
| is similar to the English-based criminal justice system: the accused is assumed to be innocent |
|
Definition
|
|
Term
| Error Risk in Hypothesis Tests |
|
Definition
| Without data from the population, researchers make decisions about accepting or rejecting the null hypothesis based on incomplete information |
|
|
Term
|
Definition
|
|
Term
| The null hypothesis is really true in the population, and the researcher accepts it as true |
|
Definition
|
|
Term
| The null hypothesis is really false in the population, and the researcher rejects it |
|
Definition
|
|
Term
| The null hypothesis is really true in the population, but the researcher rejects it (a false positive). E.g., an ineffective intervention is erroneously considered effective |
|
Definition
| Type I error |
|
Term
| The null hypothesis is really false in the population, but the researcher accepts it (a false negative). E.g., an effective intervention is erroneously considered ineffective |
|
Definition
| Type II error |
|
Term
| Type I errors are controlled through |
|
Definition
| the level of significance, the probability accepted as the risk of a false positive |
|
|
Term
| The ______________ is the area in the theoretical probability distribution corresponding to a rejection of the null hypothesis |
|
Definition
| level of significance or alpha (α) |
|
|
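The link between the level of significance and the risk of a false positive can be illustrated by simulation. This is a sketch under stated assumptions: the null hypothesis is made true by construction (samples come from a normal population with mean 0 and known SD 1), a two-tailed z test is used for simplicity, and the function name, sample size, number of trials, and seed are all arbitrary.

```python
import math
import random
from statistics import NormalDist, mean

def false_positive_rate(alpha=0.05, n=30, trials=2_000, seed=1):
    """Share of tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    rejections = 0
    for _ in range(trials):
        # Population mean really is 0, so every rejection is a Type I error.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / math.sqrt(n))
        if abs(z) >= critical_z:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(rate)
```

Across many repetitions the rejection rate hovers near the chosen alpha, which is the sense in which alpha is the accepted risk of a Type I error.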
Term
| The probability of committing a Type II error is called |
|
Definition
| beta (β) |
|
Term
| Researchers cannot control β like they can control α, but they |
|
Definition
| can take steps to reduce the risk of β (to increase power) |
|
|
Term
| The most straightforward way to increase power is to |
|
Definition
| increase the sample size |
|
Term
| Researchers calculate a _______ using their sample data |
|
Definition
| test statistic |
|
Term
| They reject the null hypothesis if the test statistic falls |
|
Definition
| at or beyond a critical region on the theoretical distribution for their test statistic; they accept the null hypothesis otherwise |
|
|
Term
| When the null hypothesis is rejected, the results are |
|
Definition
| statistically significant |
|
|
Term
| If the null hypothesis is retained (e.g., whenever p > .05 with α = .05), |
|
Definition
| the results are statistically nonsignificant |
|
|
Term
| A statistically significant result is one that has a high probability of being |
|
Definition
| “real” in the population, and probably does not merely reflect a chance fluctuation |
|
|
Term
| Statistical significance does not mean the result is |
|
Definition
| important, relevant, or clinically meaningful |
|
|
Term
| A _______ is one that uses both tails of a sampling distribution to determine the critical region (the region for rejecting the null hypothesis) |
|
Definition
| two-tailed test |
|
Term
| A ________________ is one that uses only one tail of a sampling distribution in determining the critical region |
|
Definition
| one-tailed test |
|
Term
| A one-tailed test may be appropriate if |
|
Definition
| the alternative hypothesis is directional |
|
|
Term
| Two-tailed tests are more conservative (have less statistical power) than |
|
Definition
| one-tailed tests, but researchers should have a strong justification for looking in only one tail |
|
|
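The difference between the two kinds of test shows up in the critical value that the test statistic must reach. The sketch below assumes a standard normal test statistic for simplicity; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_z(alpha=0.05, tails=2):
    """Critical value of a standard normal test statistic.

    A two-tailed test splits alpha across both tails; a one-tailed test
    places all of alpha in a single tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

two_tailed = critical_z(0.05, tails=2)
one_tailed = critical_z(0.05, tails=1)
print(round(two_tailed, 3), round(one_tailed, 3))
```

Because the one-tailed cutoff is lower, a given result in the predicted direction reaches significance more easily, which is why the two-tailed test is described as more conservative.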
Term
| An ____________________ is a condition relating to the population that is accepted as being true without proof |
|
Definition
| assumption for statistical tests |
|
|
Term
| Most tests assume random sampling _______________. This assumption is widely ignored. Ideally, though, samples are reasonably representative of the populations from which they are drawn |
|
Definition
|
|
Term
|
Definition
| Involves estimating a population parameter
Typically assumes the dependent variable is normally distributed in the population
Has a dependent variable that is measured on an interval (or approximately interval) or ratio scale |
|
|
Term
|
Definition
| Does not involve estimating a population parameter
Makes no assumptions about how the dependent variable is distributed in the population (so such tests are sometimes called distribution-free statistics)
Often involves a dependent variable that is measured on an ordinal or nominal scale |
|
|
Term
|
Definition
| Easier to compute, no need to worry about distributional assumptions |
|
|
Term
|
Definition
| More powerful (all else equal, they have lower probability of a Type II error) |
|
|
Term
|
Definition
| Used when groups being compared are different, unrelated people |
|
|
Term
|
Definition
| Used when groups being compared are the same people |
|
|
Term
| A concept widely used in statistical testing; refers to the number of components that are free to vary around a parameter |
|
Definition
| degrees of freedom (df) |
|
Term
|
Definition
| Select the test statistic (which depends on a number of factors, like the number of groups being compared)
Specify the level of significance (α)
Decide on a one-tailed versus two-tailed test
Calculate the test statistic using appropriate formulas
Calculate degrees of freedom
Compare the test statistic to the tabled value for the appropriate df and α
Decide whether to accept or reject the null hypothesis |
|
|
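The steps listed above can be traced through one small example. This is a sketch, not a prescribed procedure: it assumes the test chosen is an independent-groups t test with pooled variance, α = .05, and a two-tailed test; the two groups of scores are invented, and 2.306 is the tabled two-tailed critical value of t for 8 degrees of freedom at α = .05.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

# Hypothetical scores for two unrelated groups of five people each.
control = [12, 15, 14, 10, 13]
treatment = [18, 20, 17, 19, 21]

t_stat, df = independent_t(control, treatment)

# Tabled critical value for df = 8, alpha = .05, two-tailed.
CRITICAL_T = 2.306
reject_null = abs(t_stat) >= CRITICAL_T

print(round(t_stat, 3), df, reject_null)
```

Here the test statistic falls well beyond the critical value, so the null hypothesis of no group difference would be rejected for these invented data.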
Term
|
Definition
| Select the test statistic
Specify the level of significance (α)
Decide on a one-tailed versus two-tailed test
Instruct the computer accordingly
The computer will calculate the test statistic, df, and the actual probability level |
|
|