Term
| statistical inference (inferential statistics) |
|
Definition
| the statistician's process of generalizing results from a sample to a population |
|
|
Term
| What does estimate accuracy depend on? |
|
Definition
| 1) How representative the sample is of the general population 2) The degree of sampling error |
|
|
Term
| population parameter |
|
Definition
| The feature or characteristic of a population whose value you want to determine (e.g. the percentage of the population with chlamydia) |
|
|
Term
| sample statistic |
|
Definition
| a result (such as a percentage) calculated from the sample |
|
|
Term
| hypothesis testing |
|
Definition
| to hypothesize that a population parameter has a particular value, and then see if the value of the corresponding sample statistic is compatible with your hypothesis |
|
|
Term
| probability |
|
Definition
| a measure of the chance of getting some outcome of interest from some event |
|
|
Term
| On what scale do we measure probability? |
|
Definition
| Between 0 and 1: 0 is impossible, 1 is inevitable |
|
|
Term
| If the probability of an event happening is p, what is the probability of the event not happening? |
|
Definition
| 1 − p |
|
|
Term
| How can you calculate the probability of an event happening? |
|
Definition
| it is the number of outcomes that favor (aka "fulfill") that event, divided by the total number of possible outcomes. |
|
|
Term
| proportional frequency approach |
|
Definition
| a method of calculating probability that uses existing frequency data as the basis for the calculation (needed because, unlike the outcomes of a coin flip, the possible outcomes are not all equally likely) |
|
|
Term
| If data is Normally distributed, what percentage of values will lie no further than two standard deviations from the mean? |
|
Definition
| about 95% |
|
|
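The two-standard-deviation figure for the Normal distribution can be checked numerically. The sketch below uses the standard library's `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of values lying between -2 and +2 standard deviations of the mean.
within_two_sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

print(f"{within_two_sd:.4f}")  # 0.9545
```

The same proportion holds for any Normal distribution, whatever its mean and standard deviation.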
Term
| risk |
|
Definition
| same definition as probability; this is the term favored in the clinical arena |
|
|
Term
| absolute risk |
|
Definition
| the risk for a single group (terminology distinguishes it from relative risk) |
|
|
Term
| relative risk |
|
Definition
| the risk for one group compared to the risk for some other group |
|
|
Term
| How can you calculate the "odds" of an event happening? |
|
Definition
| odds are equal to the number of outcomes favorable to the event divided by the number of outcomes not favorable to the event. (55 blue, 40 green; 55/40= 1.375. The odds are 1.375 to 1 for blue) |
|
|
Term
| What are the main differences between risk/probability and odds? |
|
Definition
| -The range of risk is between 0 and 1; the range of odds is between 0 and infinity
-When the odds are <1, the odds are unfavorable to the outcome
-When the odds =1, the event is as likely to happen as it is not to happen
-When the odds are >1, the odds are favorable to the outcome |
|
|
Term
| reference value |
|
Definition
| odds in health statistics are expressed as ‘something’ to one. This value of one is called the reference value. |
|
|
Term
| The connection between probability and odds gives us the ability to do what? |
|
Definition
| Derive one from the other:
risk (or probability) = odds / (1 + odds)
odds = probability / (1 − probability) |
|
|
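The two conversion formulas can be sketched in a few lines of Python. The function names are made up for this example; the counts (55 blue, 40 green) are the ones used in the odds card.

```python
def odds_from_probability(p: float) -> float:
    """odds = probability / (1 - probability); undefined when p equals 1."""
    if not 0 <= p < 1:
        raise ValueError("probability must be at least 0 and below 1")
    return p / (1 - p)


def probability_from_odds(odds: float) -> float:
    """probability = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


blue, green = 55, 40
odds_blue = blue / green                # favorable / not favorable
prob_blue = blue / (blue + green)       # favorable / all outcomes

print(round(odds_blue, 3))                          # 1.375
print(round(probability_from_odds(odds_blue), 3))   # 0.579
print(round(odds_from_probability(prob_blue), 3))   # 1.375
```

Converting in either direction returns the value computed directly from the counts, which is a handy sanity check.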
Term
| How can you calculate relative risk (aka risk ratio)? |
|
Definition
| divide the risk for one group (usually the one exposed to the risk) by the risk for the second, non-exposed group |
|
|
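A minimal sketch of the relative risk calculation follows. The cohort counts are hypothetical and chosen only to make the arithmetic easy to follow; the function names are also illustrative.

```python
def risk(events: int, group_size: int) -> float:
    """Absolute risk for one group: events divided by group size."""
    if group_size <= 0:
        raise ValueError("group size must be positive")
    return events / group_size


def relative_risk(risk_exposed: float, risk_unexposed: float) -> float:
    """Risk in the exposed group divided by risk in the non-exposed group."""
    if risk_unexposed == 0:
        raise ValueError("risk in the non-exposed group must be above zero")
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 200 exposed and 10 of 200 non-exposed
# people develop the condition.
risk_exposed = risk(30, 200)      # 0.15
risk_unexposed = risk(10, 200)    # 0.05
rr = relative_risk(risk_exposed, risk_unexposed)

print(round(rr, 2))  # 3.0
```

A relative risk of 1 would mean the two groups have the same risk; values above 1 mean the exposed group has the higher risk.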
Term
| How can you calculate odds ratio? In what type of study would you find this? |
|
Definition
| Divide the odds that those with the disease were exposed to the risk factor by the odds that those without the disease were exposed. Found in a case-control study (e.g. the odds that those with a stroke had exercised are 0.78; the odds that those without a stroke had exercised are 1.97; the odds ratio is 0.78/1.97 = 0.40) |
|
|
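The odds ratio arithmetic from the stroke example can be reproduced as below. The function name is illustrative; the two odds values (0.78 and 1.97) are the ones quoted in the card.

```python
def odds_ratio(odds_in_cases: float, odds_in_controls: float) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if odds_in_controls <= 0:
        raise ValueError("odds among controls must be positive")
    return odds_in_cases / odds_in_controls


# Odds of having exercised: 0.78 among stroke cases, 1.97 among controls.
stroke_exercise_or = odds_ratio(0.78, 1.97)

print(round(stroke_exercise_or, 2))  # 0.4
```

An odds ratio below 1, as here, means the exposure (exercise) is less common among those with the condition than among those without it.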
Term
| One cannot calculate risk in what type of study? Why? What can you use instead? |
|
Definition
| A case-control study (however, the odds ratio is a reasonably good estimate of the relative risk).
In a case-control study you select participants not on the basis of whether they have been exposed to the risk, but on the basis of whether they have some condition (e.g. a stroke). BOTH groups will contain individuals who were and were not exposed to the risk (see pg 103) |
|
|
Term
| What does NNT stand for, and what is it? |
|
Definition
| Number needed to treat. NNT is the number of patients who would need to be treated with the active procedure, rather than a placebo (or alternative procedure), in order to reduce by one the number of patients experiencing the condition |
|
|
Term
| What is ARR and what does it stand for? |
|
Definition
| Absolute risk reduction. The difference between two absolute risks (e.g. the reduction in risk gained by weighing more than 18 lbs at one year rather than weighing 18 lbs or less) |
|
|
Term
| What is the relationship between NNT and ARR? |
|
Definition
| NNT = 1/ARR |
|
|
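The standard relationship is that NNT is the reciprocal of ARR. The sketch below illustrates it with hypothetical risks (20% in the control group, 15% in the treated group); the function names are made up for the example.

```python
def absolute_risk_reduction(risk_control: float, risk_treated: float) -> float:
    """Difference between the two absolute risks."""
    return risk_control - risk_treated


def number_needed_to_treat(arr: float) -> float:
    """NNT = 1 / ARR; only meaningful here when the treatment lowers risk."""
    if arr <= 0:
        raise ValueError("ARR must be positive to compute an NNT")
    return 1 / arr


# Hypothetical absolute risks, not taken from any study.
arr = absolute_risk_reduction(risk_control=0.20, risk_treated=0.15)
nnt = number_needed_to_treat(arr)

print(round(arr, 2), round(nnt, 1))  # 0.05 20.0
```

In practice the NNT is usually reported as a whole number of patients, rounded up.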
Term
| confidence interval estimator |
|
Definition
| a numeric expression that quantifies the likely size of the sampling error |
|
|
Term
| the mean of all possible sample means is the same as what? |
|
Definition
| the population mean |
|
|
Term
| standard deviation |
|
Definition
| a measure of the spread of the data in a SINGLE sample |
|
|
Term
| What is this equation used to do? s.e.(x̄) = s / √n |
|
Definition
| estimate the standard error |
|
|
Term
| standard error (of the sample mean) |
|
Definition
| a measure of the preciseness of the sample mean as an estimator of the population mean (smaller is better) |
|
|
Term
| confidence interval (equation) |
|
Definition
| (sample mean − 2 × s.e.(x̄)) to (sample mean + 2 × s.e.(x̄)). If you pick one out of all the possible sample means at random, there is a probability of 0.95 that it will lie within two standard errors of the population mean; this is the 95% confidence interval estimate. |
|
|
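The standard error and the "mean ± 2 standard errors" interval can be sketched as follows. The sample values are hypothetical, and the multiplier of 2 is the large-sample approximation used in these cards; for a sample this small a t-based multiplier would give a wider interval.

```python
import math
import statistics


def standard_error(sample: list[float]) -> float:
    """s.e.(mean) = s / sqrt(n), with s the sample standard deviation."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def approx_95_ci(sample: list[float]) -> tuple[float, float]:
    """Approximate 95% interval: sample mean plus or minus 2 standard errors."""
    mean = statistics.mean(sample)
    margin = 2 * standard_error(sample)
    return mean - margin, mean + margin


# Hypothetical measurements, for illustration only.
sample = [4.0, 5.0, 6.0, 7.0, 8.0]
se = standard_error(sample)
lower, upper = approx_95_ci(sample)

print(round(se, 3), round(lower, 2), round(upper, 2))  # 0.707 4.59 7.41
```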
Term
| A confidence interval is said to represent what? |
|
Definition
| a plausible range of values for the population parameter |
|
|
Term
| Make sure you know how to calculate confidence interval for a population proportion using the equation, pg 116-117 |
|
Definition
|
|
Term
| What is the most common application of confidence intervals? |
|
Definition
| the comparison of two population parameters, for example between the means of two populations, such as the mean age of a population of women and the mean age of a population of men |
|
|
Term
| Name the prerequisites for the two-sample t test |
|
Definition
| -Data for both groups must be metric
-The distribution of the relevant variable in each population must be reasonably Normal
-The population standard deviations of the two variables concerned should be approximately the same, but this requirement becomes less important as sample sizes get larger |
|
|
Term
| What can you do if you want to know if there is a statistically significant difference between two population means? |
|
Definition
| calculate the 95 per cent confidence interval for the difference and see if it contains zero. If it does NOT contain zero, you can be 95% confident that there is a statistically significant difference in the means. If it does contain zero, a true difference of zero remains plausible. |
|
|
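The zero-in-the-interval check can be sketched in Python. This is a rough approximation, not the exact two-sample t interval: it assumes an unpooled standard error and a multiplier of 2, and the two groups of values are hypothetical.

```python
import math
import statistics


def approx_ci_for_difference(sample_a, sample_b):
    """Approximate 95% interval for (mean of a) minus (mean of b)."""
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    # Unpooled standard error of the difference between two sample means.
    se_diff = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return diff - 2 * se_diff, diff + 2 * se_diff


def excludes_zero(interval) -> bool:
    """True when the whole interval lies above or below zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Hypothetical measurements for two independent groups.
group_a = [12.0, 14.0, 15.0, 13.0, 16.0, 14.0]
group_b = [9.0, 10.0, 11.0, 10.0, 12.0, 8.0]

interval = approx_ci_for_difference(group_a, group_b)
print(tuple(round(x, 2) for x in interval), excludes_zero(interval))
```

An interval that excludes zero points to a statistically significant difference between the two population means; an interval that includes zero does not.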
Term
| For what do we use the two-sample t test? |
|
Definition
| estimating the difference in the means of two independent populations |
|
|
Term
| Mann-Whitney test |
|
Definition
| -Used in place of the two-sample t test
-Compares population medians rather than means
-Only requires that the two population distributions have approximately the same shape; neither needs to be Normal
-It is the non-parametric equivalent of the two-sample t test |
|
|
Term
| parametric test |
|
Definition
| can be applied only to data that are metric and that follow some particular distribution, most commonly the Normal distribution (a non-parametric test makes no distributional requirements) |
|
|
Term
| Briefly describe the Mann-Whitney method |
|
Definition
| -Starts by combining the data from both groups, which are then ranked.
-The rank values for each group are then separated and summed.
-If the medians of the two groups are the same, then the sums of the ranks of the two groups should be similar. However, if the rank sums are different, you need to know whether this difference could simply be due to chance, or whether there really is a statistically significant difference in the population medians (decide using a confidence interval). |
|
|
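The ranking and summing steps of the Mann-Whitney method can be sketched as below. This covers only the rank sums, not the significance decision; tied values are given the average of the ranks they span, which is a common convention assumed here. The function name and data are hypothetical.

```python
def rank_sums(group_a, group_b):
    """Combine both groups, rank the values, and sum the ranks per group."""
    combined = [(value, "a") for value in group_a]
    combined += [(value, "b") for value in group_b]
    combined.sort(key=lambda pair: pair[0])

    sums = {"a": 0.0, "b": 0.0}
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; ties share the average rank.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            sums[combined[k][1]] += average_rank
        i = j
    return sums["a"], sums["b"]


# Hypothetical data for two small groups.
sum_a, sum_b = rank_sums([1, 3, 5], [2, 4, 6])
print(sum_a, sum_b)  # 9.0 12.0

tied_a, tied_b = rank_sums([1, 2], [2, 3])
print(tied_a, tied_b)  # 3.5 6.5
```

The two rank sums always add up to n(n + 1)/2 for n combined values, which is a quick check that the ranking was done correctly.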
Term
| When should one use the Wilcoxon test? |
|
Definition
| -When the two groups are matched
-With ordinal data or skewed metric data
-You can obtain confidence intervals for differences in population medians based on this test
-It is non-parametric |
|
|
Term
| ratio of two independent population means tells what? |
|
Definition
| It tells you how many times bigger one population mean is than the other. If the sample ratio differs from 1, you need to find out whether that is due to chance or whether the difference is statistically significant. |
|
|
Term
| If the confidence interval for the ratio of two population parameters does not contain the value 1, then... |
|
Definition
| you can be 95% confident that any difference in the size of the two measures is statistically significant. |
|
|