Shared Flashcard Set

Details

Intermediate Statistics
Test B
65
Mathematics
Graduate
11/08/2013

Cards

Term
What are the steps in hypothesis testing?
Definition
1) Formally state your null (H0) and research or alternative (H1) hypotheses
2) Select an appropriate test statistic and the sampling distribution of that test statistic
3) Select a level of significance (alpha level) and determine the critical value and rejection region of the test statistic based on the selected level of alpha
4) Conduct the test: Calculate the obtained value of the test statistic and compare it to the critical value
5) Make a decision about your null hypothesis and interpret this decision in a meaningful way based on the research question, sample, and population
Term
What is the standard error of the mean?
Definition
The standard deviation for the distribution of sample means
Term
How does one translate the sample mean into a z score when the population standard deviation is not known?
Definition
z = (X-bar - µ) ÷ (s ÷ √n)
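As a rough illustration of the formula above, the following Python sketch computes the z score and compares it with the two-tailed critical value for alpha = .05 (±1.96). The function name and all of the numbers are made up for illustration; they do not come from the card set.

```python
import math

def z_for_sample_mean(sample_mean, mu, s, n):
    """z = (X-bar - mu) / (s / sqrt(n)), using s in place of the unknown sigma."""
    standard_error = s / math.sqrt(n)  # estimated standard error of the mean
    return (sample_mean - mu) / standard_error

# Hypothetical sample values, chosen only to keep the arithmetic simple.
z = z_for_sample_mean(sample_mean=2300.0, mu=2222.0, s=390.0, n=100)

# Two-tailed test at alpha = .05: reject H0 when |z| is at least 1.96.
reject_null = abs(z) >= 1.96
print(z, reject_null)
```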
Term
What is the critical region?
Definition
The area of the sampling distribution that contains all unlikely or improbable sample outcomes and that would cause one to reject the null hypothesis
Term
Directional hypothesis tests are referred to as "_____-tailed" statistical tests, and nondirectional hypothesis tests as "_____-tailed"
Definition
One; two
Term
What is the formula used to conduct a z test for proportions?
Definition
z = (p-hat - p) ÷ (sigma sub p-hat)

Where:
sigma sub p-hat = √((p x q) ÷ n)
p = the population proportion assumed under the null hypothesis
p-hat = the sample proportion
q = 1- p
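A minimal Python sketch of this calculation, assuming the standard error is the square root of the whole quantity p x q ÷ n. The function name and the example values are hypothetical.

```python
import math

def z_for_proportion(p_hat, p, n):
    """z = (p-hat - p) / sqrt(p * q / n), where q = 1 - p."""
    q = 1.0 - p
    standard_error = math.sqrt(p * q / n)  # sigma sub p-hat
    return (p_hat - p) / standard_error

# Hypothetical example: null proportion .50, sample proportion .60, n = 100.
z = z_for_proportion(p_hat=0.60, p=0.50, n=100)
print(round(z, 2))
```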
Term
When is it appropriate to use a t test for hypothesis testing instead of a z test?
Definition
The z test and z distribution may be used for one-sample hypothesis tests involving a population mean under either of two conditions: if the population standard deviation is known, or if the sample size is large enough (≥100) that the sample standard deviation (s) can be used as a reasonable estimate of the population standard deviation. When neither condition holds (the population standard deviation is unknown and the sample is small), the t test and t distribution should be used instead
Term
We are interested in the average dollar amount lost by victims of burglary. The National Insurance Association has reported that the mean dollar amount lost by victims of burglary is $2,222. Assume that this is the population mean. We believe that the true population mean loss is different from this. Formally state the null and research hypotheses we would test to investigate this question. What if we believed the dollar amount to be higher?
Definition
H0: µ = $2,222
H1: µ ≠ $2,222

If we believed the amount was higher, the hypotheses would be
H0: µ = $2,222
H1: µ > $2,222
Term
What is a chi-square goodness of fit test?
Definition
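A one-variable test that indicates whether the observed frequencies across the categories of a categorical variable differ from the frequencies expected under the null hypothesis. (With two categorical variables, the related chi-square test of independence indicates whether there is a relationship between them.)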
Term
Can the chi-square test of independence indicate the strength of a relationship between two variables?
Definition
No
Term
What is the formula for the chi-square goodness of fit test?
Definition
χ^2 = ∑ over all k categories of [(ƒ-of-observed - ƒ-of-expected)^2 ÷ ƒ-of-expected]


In words, subtract the expected frequency from the observed frequency, square that difference, and then divide by the expected frequency. Perform this for all of the categories and then sum those calculations. This will be the obtained value of the chi-square statistic
Term
How does one find the degrees of freedom with the chi-square statistic?
Definition
k - 1

The number of groups minus one
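The goodness of fit calculation and its degrees of freedom can be sketched in Python as follows. This is an illustrative sketch only; the function name and the observed and expected counts are invented.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # For each category: (observed - expected)^2 / expected, then sum.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1
    return statistic, df

# Hypothetical frequencies for three categories.
statistic, df = chi_square_goodness_of_fit([30, 50, 20], [25, 50, 25])
print(statistic, df)
```

The obtained statistic would then be compared with the critical chi-square value for the chosen alpha level and these degrees of freedom.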
Term
In the chi-square test of independence, what is the observed frequency?
Definition
The number of instances actually measured as shown in the sample data
Term
How does one find the expected frequencies needed for the chi-square test?
Definition
By determining what we should see if the null hypothesis is true
Term
The chi-square test is appropriate for what levels of data?
Definition
Nominal and ordinal
Term
What is a joint frequency distribution?
Definition
The simultaneous occurrence of one event from the first variable and another event from the second variable (in other words, the intersection of the two events).
Term
What is a contingency table?
Definition
A table that shows the joint distribution of two categorical variables, where one variable designates the columns and the other designates the rows
Term
In describing the dimensions of a contingency table, a 3 x 2 table means that there are ___ columns and ___ rows
Definition
2; 3

(Think of it as an R x C table)
Term
Row marginals refer to what? What do column marginals refer to?
Definition
The number of cases in each row of the table; the frequency in each column of the table
Term
To what does relative risk in a contingency table refer?
Definition
The chances of landing in a particular cell in the table
Term
What is the difference in using the chi-square goodness of fit and test of independence?
Definition
The independence test looks at the cell frequencies in a contingency table. In other words, for a test of independence you would take the difference between the observed and expected cell frequency, square the difference, and divide that by the expected cell frequency
Term
How do you find the expected cell frequency for a chi-square test of independence?
Definition
Multiply the row marginal frequency for the row of interest by the column marginal frequency for the column of interest, then divide the product by the total number of cases

ƒexpected = (RM x CM) ÷ n
Term
How does one determine the number of degrees of freedom for a chi-square test of independence?
Definition
Degrees of freedom = (# of rows -1) x (# of columns -1)
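Putting the expected cell frequency and degrees of freedom formulas together, a test of independence might be sketched like this in Python. The table is stored as a list of rows; the function name and the 2 x 2 counts are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an R x C table."""
    row_marginals = [sum(row) for row in table]
    column_marginals = [sum(column) for column in zip(*table)]
    n = sum(row_marginals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency = (row marginal x column marginal) / n
            expected = row_marginals[i] * column_marginals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1) x (columns - 1)
    return statistic, df

# Hypothetical 2 x 2 contingency table.
statistic, df = chi_square_independence([[30, 20], [20, 30]])
print(statistic, df)
```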
Term
What are measures of association?
Definition
Statistics that inform us about the strength or magnitude as well as the direction of the relationship between two variables
Term
Define the formula for the phi-coefficient and what level of data for which it is appropriate. What is the range of the phi-coefficient and what do those numbers indicate?
Definition
phi = √(chi-square ÷ n)

Nominal level data

0 to 1; 0 means no relationship and 1 means perfect relationship
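A short Python sketch of the phi formula, assuming the chi-square statistic has already been obtained. The input values below are hypothetical.

```python
import math

def phi_coefficient(chi_square, n):
    """phi = sqrt(chi-square / n)."""
    return math.sqrt(chi_square / n)

# Hypothetical values: chi-square of 4.0 from a table with 100 cases.
phi = phi_coefficient(chi_square=4.0, n=100)
print(round(phi, 2))
```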
Term
Lambda is known as a proportionate reduction in error (PRE) measure of association. What does this mean?
Definition
It allows one to tell exactly how much better one will be able to predict one variable from knowledge of another. It requires that the independent variable be distinguished from the dependent variable
Term
What is the computational formula for lambda?
Definition
lambda = ((∑ƒi) - ƒd) ÷ (n - ƒd)

Where
ƒi = largest cell frequency in EACH category of the independent variable
ƒd = largest marginal frequency of the dependent variable
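The lambda formula can be sketched in Python as below. The layout is an assumption made for this example: rows hold the categories of the dependent variable and columns hold the categories of the independent variable. The counts are invented.

```python
def lambda_coefficient(table):
    """lambda = (sum of f_i - f_d) / (n - f_d).

    Assumed layout: rows = dependent variable, columns = independent variable.
    """
    n = sum(sum(row) for row in table)
    # f_i: largest cell frequency within each category (column) of the
    # independent variable, summed over all columns.
    sum_f_i = sum(max(column) for column in zip(*table))
    # f_d: largest marginal frequency (row total) of the dependent variable.
    f_d = max(sum(row) for row in table)
    return (sum_f_i - f_d) / (n - f_d)

# Hypothetical 2 x 2 table.
lam = lambda_coefficient([[40, 10], [20, 30]])
print(lam)
```

With these made-up counts, knowing the independent variable reduces prediction errors from 50 to 30, which is a 40 percent reduction.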
Term
The phi and lambda coefficients are both only appropriate for nominal level data. What is appropriate for ordinal level?
Definition
Goodman and Kruskal's Gamma
Term
What is the general formula for gamma?
Definition
gamma = (CP - DP) ÷ (CP + DP)

Where
CP = number of concordant pairs of observations
DP = number of discordant pairs of observations
Term
How does one determine if a pair is concordant?
Definition
When one observation in the pair scores higher than the other on both variables (or, equivalently, lower on both), so that the two observations are ranked in the same order on each variable
Term
To determine the number of discordant pairs in a table, you...
Definition
Start in the lower leftmost cell that is low on the column variable but high on the row variable. Multiply this cell frequency by the sum of the cell frequencies for all cells that are both above and to the right of that cell. Repeat for every other cell that has cells above and to the right of it, then sum the products
Term
How do you calculate the number of concordant pairs in a contingency table?
Definition
Start in the top leftmost cell and multiply this cell frequency by the sum of all cell frequencies that are both below and to the right of this cell. Repeat for every other cell that has cells below and to the right of it, then sum the products
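The counting rules for concordant and discordant pairs, applied to every cell in turn and summed, can be sketched in Python together with gamma. The sketch assumes both variables are ordered from low to high (rows top to bottom, columns left to right); the function names and counts are hypothetical.

```python
def concordant_and_discordant_pairs(table):
    """Return (CP, DP) for a table whose rows and columns run low to high."""
    n_rows, n_cols = len(table), len(table[0])
    cp = dp = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Cells both below and to the right form concordant pairs.
            below_right = sum(table[r][c]
                              for r in range(i + 1, n_rows)
                              for c in range(j + 1, n_cols))
            # Cells both above and to the right form discordant pairs.
            above_right = sum(table[r][c]
                              for r in range(i)
                              for c in range(j + 1, n_cols))
            cp += table[i][j] * below_right
            dp += table[i][j] * above_right
    return cp, dp

def gamma(table):
    """gamma = (CP - DP) / (CP + DP)."""
    cp, dp = concordant_and_discordant_pairs(table)
    return (cp - dp) / (cp + dp)

# Hypothetical 2 x 2 table of ordinal data.
example = [[20, 10], [5, 15]]
cp, dp = concordant_and_discordant_pairs(example)
print(cp, dp, round(gamma(example), 3))
```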
Term
What are the two explanations for a difference between sample means for two populations?
Definition
1- There really is a difference between the groups
2- The difference is due to sampling error
Term
What is the sampling distribution of sample mean differences?
Definition
The theoretical distribution of the difference between an infinite number of sample means
Term
What is the standard error of the difference between two means?
Definition
The standard deviation of the sampling distribution of the difference between two means
Term
What is the equation for the standard error of the difference between two means?
Definition
sigma sub x-bar1 - x-bar2 = √((sigma1^2 ÷ n1) + (sigma2^2 ÷ n2))

Where
sigma 1 = standard deviation of the first population
sigma 2 = standard deviation of the second population
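A small Python sketch of this standard error, assuming the square root covers the sum of both variance terms and that n1 and n2 are the two sample sizes. The values are hypothetical.

```python
import math

def standard_error_of_difference(sigma1, n1, sigma2, n2):
    """sqrt(sigma1^2 / n1 + sigma2^2 / n2)."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

# Hypothetical population standard deviations and sample sizes.
se = standard_error_of_difference(sigma1=6.0, n1=36, sigma2=8.0, n2=64)
print(round(se, 4))
```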
Term
What is an independent random sample?
Definition
When samples are drawn whose elements are randomly and independently selected
Term
What is a pooled variance estimate?
Definition
An estimate of the standard error of the difference between two means, used when the population standard deviations are unknown but assumed to be equal
Term
What does the matched-groups t test test?
Definition
Whether the mean of the differences between the two scores in each matched pair differs from zero
Term
Explain the difference between independent and dependent variables. If you think that low self-control affects crime, which is the independent and which is the dependent variable?
Definition
An independent variable is the variable whose effect or influence on the dependent variable is what you want to measure. In causal terms, the independent variable is the cause, and the dependent variable is the effect. Low self-control is taken to affect one's involvement in crime, so self-control is the independent variable and involvement in crime is the dependent variable.
Term
When is it appropriate to use an independent-samples t test and when is it appropriate for a t test for dependent samples or matched groups?
Definition
An independent-samples t test should be used whenever the two samples have been selected independently of one another. In an independent samples t test, the sample elements are not related to one another. In a dependent-samples or matched-groups t test, by contrast, the sample elements are not independent but are instead related to one another. An example of dependent samples occurs when the same sample elements or persons are measured at two different points in time, as in a "before and after" experiment. A second common type of dependent sample is a matched-groups design.
Term
What is an analysis of variance (aka ANOVA)?
Definition
A technique that tests the equality of several population means in a single test, which keeps the alpha level at its stated value instead of inflating it through repeated two-group tests.
Term
The sums of squares will follow a __________ distribution with k - 1 degrees of freedom.
Definition
Chi squared
Term
The expected frequency in the chi square test is a so-called "_________" factor: It turns the above frequency into a proportion. That way, the whole thing behaves like a __ score.
Definition
Normalizing; z
Term
What is the general form of a t statistic?
Definition
t = (statistic - mean of sampling distribution) / standard error
Term
What is the formula for variance?
Definition
S^2 = (∑(x - xbar)^2) ÷ (n - 1)
Term
The F statistic is the _____ of two variances. What is its formula?
Definition
Ratio; F = (sigma1^2) ÷ (sigma2^2)
Term
When calculating the different kinds of variability, the following scores and means apply:
-Total variability: the difference between an _______ score and the _____ mean
-Within-group: the difference between an _______ score and the _____ mean
-Between-group: the difference between the _____ mean and the ____ mean
Definition
-Total: individual; grand
-Within-group: individual; group
-Between-group: group; grand
Term
What is the formula for the total sum of squares?
Definition
SStotal = ∑i ∑k (x-indiv - x-bar-grand)^2
Term
What is the formula for the within group sum of squares?
Definition
SSwithin = ∑i ∑k (x-indiv - x-bar-group)^2
Term
What is the formula for the between group sum of squares?
Definition
SSbetween = ∑i ∑k (x-bar-group - x-bar-grand)^2
Term
When calculating the degrees of freedom in the sums of squares, what are the formulas for the three types?
Definition
Total: n - 1
Within: n - k
Between: k - 1
Term
To find variance with the sum of squares and degrees of freedom, we divide what by what?
Definition
The sum of squares for whichever type (total, within, between) by the degrees of freedom for that type
Term
What is the formula for the F test?
Definition
F = (SS-between ÷ df-between) ÷ (SS-within ÷ df-within)
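The three sums of squares, their degrees of freedom, and the F ratio can be sketched together in Python. This is an illustrative sketch, not a full ANOVA routine; the function name, the dictionary keys, and the scores are invented.

```python
def one_way_anova(groups):
    """Compute sums of squares, degrees of freedom, and F for k groups."""
    all_scores = [x for group in groups for x in group]
    n, k = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n

    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_within = 0.0
    ss_between = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((x - group_mean) ** 2 for x in group)
        # Summing the squared (group mean - grand mean) once per case equals
        # group size times the squared deviation.
        ss_between += len(group) * (group_mean - grand_mean) ** 2

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_within": ss_within,
            "ss_between": ss_between, "df_between": df_between,
            "df_within": df_within, "f": f_ratio}

# Hypothetical scores for three groups of three cases each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(result)
```

In this sketch the between and within sums of squares add up to the total, which is a useful check on hand calculations.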
Term
Tukey's Honest Significant Difference test requires calculating a critical difference score. What is the formula to do this?
Definition
CD = q√(within-group variance ÷ n-sub-k)

Where
n-sub-k = number of cases in each of the k groups
q = studentized range statistic
Term
For Tukey's HSD test, you need to find q. What three things do you need to do this?
Definition
1) Alpha level
2) Degrees of freedom within groups
3) Number of groups
Term
Tukey's HSD test doesn't look at one hypothesis, but tests ____ ____ of sample means.
Definition
Each pair
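Tukey's HSD procedure from the cards above can be sketched in Python as follows. The value of q is treated as an input that would normally be looked up in a studentized range table; the q used here, the within-group variance, and the group means are all hypothetical placeholders.

```python
import math
from itertools import combinations

def critical_difference(q, within_group_variance, n_per_group):
    """CD = q * sqrt(within-group variance / n-sub-k)."""
    return q * math.sqrt(within_group_variance / n_per_group)

def significant_pairs(group_means, cd):
    """Return the pairs of groups whose mean difference is at least CD."""
    return [
        (a, b)
        for (a, mean_a), (b, mean_b) in combinations(group_means.items(), 2)
        if abs(mean_a - mean_b) >= cd
    ]

# q = 3.5 is a placeholder, not a value taken from a real table.
cd = critical_difference(q=3.5, within_group_variance=4.0, n_per_group=16)
pairs = significant_pairs({"A": 10.0, "B": 11.0, "C": 13.0}, cd)
print(cd, pairs)
```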
Term
What is the formula for eta squared (aka correlation ratio)?
Definition
eta^2 = SS-between ÷ SS-total
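As a brief Python illustration of eta squared, using made-up sums of squares:

```python
def eta_squared(ss_between, ss_total):
    """eta^2 = SS-between / SS-total."""
    return ss_between / ss_total

# Hypothetical sums of squares.
eta_sq = eta_squared(ss_between=42.0, ss_total=48.0)
print(eta_sq)
```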
Term
What is the formula for the q in Tukey's HSD?
Definition
q = range ÷ standard deviation of the sample
Term
The t test does not easily generalize to more than ___ groups
Definition
Two
Term
F distributions converge to a ____ ________ distribution as the denominator df go to positive infinity.
Definition
Chi-square
Term
When is it appropriate to perform an analysis of variance with our data? What type of variables do we need?
Definition
An analysis of variance can be performed whenever we have a continuous (interval or ratio level) dependent variable and a categorical independent variable with three or more levels or categories, and we are interested in testing a hypothesis about the equality of our population means
Term
What statistical technique should we use if we have a continuous dependent variable and a categorical independent variable with only two categories?
Definition
If we have a continuous dependent variable and a categorical independent variable with only two categories or levels, the correct statistical test is a two-sample t test, assuming that the hypothesis test involves the equality of two population means
Term
Why do we call this statistical technique an analysis of variance when we are really interested in the difference among population means?
Definition
It is called the analysis of variance because we make inferences about the differences among population means based on a comparison of the variance that exists within each sample, relative to the variance that exists between the samples. More specifically, we examine the ratio of variance between the samples to the variance within the samples. The greater this ratio, the more between-samples variance there is relative to within-sample variance. Therefore, as this ratio becomes greater than 1, we are more inclined to believe that the samples were drawn from different populations with different population means.
Term
What two types of variance do we use to calculate the F ratio?
Definition
Between-group variance divided by within-group variance