Term
| What are the two big categories of metrics of data? |
|
Definition
|
|
Term
| What are the sub-categories of discrete? |
|
Definition
|
|
Term
|
Definition
| cannot be ranked. e.g., religion, political party affiliation, etc. |
|
|
Term
|
Definition
| conveys which cases share similar traits, and also allows ranked judgments |
|
|
Term
|
Definition
| can be ranked and have equal unit distance across their entire range (e.g., income in dollars: each one-dollar increase means the same thing at every point along the range) |
|
|
Term
| What are two common graphs for categorical variables? |
|
Definition
|
|
Term
| What is the only measure of central tendency for categorical variables? |
|
Definition
| The mode: the most frequently occurring category |
|
|
Term
| What are the two major descriptive goals with continuous variables? |
|
Definition
| central tendency, dispersion (spread) |
|
|
Term
| What are the two broad classes of descriptive statistics? |
|
Definition
| rank statistics, moment-based statistics |
|
|
Term
| What is the measure of central tendency for rank-based statistics? |
|
Definition
| The median |
|
|
Term
| What is the measure of dispersion for rank-based statistics? |
|
Definition
| The IQR: interquartile range |
|
|
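A minimal Python sketch of the two rank-based statistics, the median and the IQR, using only the standard library. The data values are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption; other conventions give slightly different quartiles.

```python
import statistics

# Hypothetical sample of a continuous variable (illustrative values only).
values = [2, 4, 4, 5, 7, 9, 10, 12]

# Rank-based central tendency: the middle value of the sorted data.
median = statistics.median(values)

# quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
# method="inclusive" is an assumed convention, not one stated in the cards.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Rank-based dispersion: the width of the middle half of the data.
iqr = q3 - q1

print("median:", median, "Q1:", q1, "Q3:", q3, "IQR:", iqr)
```

The second quartile is the median, so `q2` and `median` agree here.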
Term
| What is the measure of central tendency for moment-based statistics? |
|
Definition
| The mean |
|
|
Term
| Which type of statistic, rank-based or moment-based, is more sensitive to outliers? |
|
Definition
| Moment-based: a single extreme outlier can pull the mean substantially, while the median is barely affected by outliers |
|
|
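To make the outlier point concrete, here is a small sketch that adds one extreme case to a hypothetical dataset and compares how far the mean and the median move. The numbers are invented for illustration.

```python
import statistics

# Hypothetical incomes in thousands of dollars (illustrative values only).
incomes = [30, 32, 35, 38, 40]

# The same data with a single extreme outlier added.
with_outlier = incomes + [1000]

# How much each measure of central tendency moves when the outlier is added.
mean_shift = statistics.mean(with_outlier) - statistics.mean(incomes)
median_shift = statistics.median(with_outlier) - statistics.median(incomes)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```

In this sketch the mean moves by well over 100 while the median moves by 1.5.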
Term
| What are the measures of spread for moment-based statistics? |
|
Definition
| The variance and the standard deviation |
|
|
Term
| What would happen to the variance if we had no variation in y at all? |
|
Definition
| Variance would equal zero: no spread around the mean. |
|
|
Term
| What happens to the variance as data are spread further from the mean? |
|
Definition
| The variance increases |
|
|
Term
| What is the standard deviation? |
|
Definition
| roughly the average difference between values of y and the mean of y (the square root of the variance) |
|
|
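The behavior of the variance and standard deviation can be sketched in a few lines of Python. This assumes the sample form of the variance (dividing by n - 1), which the cards do not specify; `sample_variance` is a hypothetical helper name and the three datasets are invented.

```python
import statistics

def sample_variance(y):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed form)."""
    mean_y = sum(y) / len(y)
    return sum((value - mean_y) ** 2 for value in y) / (len(y) - 1)

# Three hypothetical datasets with the same mean (5) but different spread.
constant = [5, 5, 5, 5]   # no variation at all
tight = [4, 5, 5, 6]      # values close to the mean
wide = [1, 5, 5, 9]       # values far from the mean

var_constant = sample_variance(constant)
var_tight = sample_variance(tight)
var_wide = sample_variance(wide)

# The standard deviation is the square root of the variance.
sd_wide = var_wide ** 0.5

print(var_constant, var_tight, var_wide, sd_wide)
```

With no variation the variance is zero, and it grows as the values move further from the mean.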
Term
| What are the two key dimensions of dependent variables? |
|
Definition
|
|
Term
| What are two types of research designs (hint: think spatial and temporal) |
|
Definition
| Cross-sectional, time series |
|
|
Term
| What is cross-sectional data design? |
|
Definition
| Looks at multiple units at one time |
|
|
Term
| What is a temporal (time-series) data design? |
|
Definition
| Looks at one unit over time |
|
|
Term
| Why might causality be clearer in a time series? |
|
Definition
| we can examine a phenomenon before and after some independent variable changes |
|
|
Term
| What kind of causal theories are common with the physical sciences? |
|
Definition
| Deterministic theories: an increase in X by a certain amount will *always* cause an increase in Y of a certain amount |
|
|
Term
| What kind of causal theories are more common with humans? |
|
Definition
| Probabilistic: Increases in X cause increases in Y on average |
|
|
Term
| What is an ecological fallacy? |
|
Definition
| Inferring individual behavior from population averages |
|
|
Term
| What is the first hurdle? |
|
Definition
| Is there something connecting x and y- does it make sense that they might cause one another in a traceable way? |
|
|
Term
| What is the second hurdle? |
|
Definition
|
|
Term
| What is the third hurdle? |
|
Definition
|
|
Term
| What is the fourth hurdle? |
|
Definition
| Can we eliminate any Z's that might be related to both X and Y and cause Y? |
|
|
Term
| Finish this phrase: "While correlation does not mean causation," |
|
Definition
| Correlation is necessary for causation to exist. |
|
|
Term
|
Definition
| a research design in which the researcher both controls and randomly assigns values of the treatment (key independent variable) to participants |
|
|
Term
| What are the two key components of experiments? |
|
Definition
| control, random assignment |
|
|
Term
|
Definition
| The value(s) of the treatment (key independent variable) X are determined by the researcher and not by the participants or nature. |
|
|
Term
| What two groups comprise the experimental group? |
|
Definition
| Treatment group, control group |
|
|
Term
| What does randomization control for? |
|
Definition
| Every possible Z, regardless of whether we can even list the possible Z's |
|
|
Term
| Experiments have good ____________ validity but bad ____________ validity. |
|
Definition
| internal, external |
|
|
Term
| What is internal validity? |
|
Definition
| The extent to which we can accurately state that the observed independent variable produced the observed effect |
|
|
Term
|
Definition
| relates to the generalizability of your findings |
|
|
Term
| What can help with the idea that experiments are low on external validity? |
|
Definition
|
|
Term
| What is an observational study? |
|
Definition
| one in which the researcher does not have control over the quantities of the independent variable (or any variable) |
|
|
Term
| What is the population of interest in an experiment? |
|
Definition
| the set of units (people, countries, etc.) that the researcher's theory relates to |
|
|
Term
| What are the two ways to get observational data? |
|
Definition
| Measure the entire population of interest, measure a sample of the population of interest |
|
|
Term
| What are the two types of data in an experiment? |
|
Definition
| Population data, sample data |
|
|
Term
| What is population data in an experiment? |
|
Definition
| data about every possible relevant case |
|
|
Term
| What is sample data in an experiment? |
|
Definition
| a dataset drawn from a subset of cases of some underlying population |
|
|
Term
|
Definition
| Defined by the characteristic that every member of the population of interest has an equal probability of being selected for participation in the study. |
|
|
Term
| What is a convenience sample? |
|
Definition
| Use participants that are readily at hand |
|
|
Term
| What is the difference between random assignment and random sampling? |
|
Definition
| Random assignment refers to the decision about whom to give a potential treatment. We randomly assign people to the treatment group from the larger experimental group.
Random sampling refers to drawing a sample to study at random. It is usually done (or attempted) in survey research. |
|
|
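The distinction between random sampling and random assignment can be sketched in Python. The population, the sample size, and the even treatment/control split are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of interest.
population = [f"person_{i}" for i in range(1000)]

# Random sampling: chance decides who is selected into the study,
# and every member of the population has an equal probability of selection.
sample = rng.sample(population, k=100)

# Random assignment: within the experimental group, chance (not the
# participants or the researcher's judgment) decides who gets the treatment.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:50]
control_group = shuffled[50:]

print(len(sample), len(treatment_group), len(control_group))
```

A study can use either step without the other: a survey may sample randomly with no treatment at all, and an experiment may randomly assign a convenience sample.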
Term
| Without experimental control and random assignment, crossing hurdles ___ and ___ of causal evaluation is difficult. |
|
Definition
|
|
Term
| What kind of test do you perform if both variables are categorical? |
|
Definition
| Pearson's chi-squared test |
|
|
Term
| What kind of test do you perform if both variables are continuous? |
|
Definition
| Correlation coefficient (Pearson's r) |
|
|
Term
| What kind of test do you perform if the dependent variable is continuous and the independent variable is categorical? |
|
Definition
| t-test (difference of means) |
|
|
Term
| What kind of test do you perform if the dependent variable is categorical and the independent variable is continuous? |
|
Definition
|
|
Term
| What do p values range between? |
|
Definition
| 0 and 1 |
|
|
Term
| What does the p value show? |
|
Definition
| The probability of randomly finding a relationship in the sample that does not exist in the population |
|
|
Term
| As the p value approaches ____ we get more confidence that there is a real relationship between the two variables in the population |
|
Definition
| 0 |
|
|
Term
| The more data we have, the _____ our p values will be. |
|
Definition
|
|
Term
| If a p value is less than ____, the relationship is said to be _______ ________. |
|
Definition
| 0.05, statistically significant |
|
|
Term
| What are the three steps to trying to interpret cross tabs? |
|
Definition
| 1. Figure out what defines the rows and columns
2. Figure out what each cell tells you
3. Look for general patterns |
|
|
Term
| When is Pearson's chi-squared (χ²) statistic used? Write it down. |
|
Definition
| Used to find the relationship between two categorical variables. |
|
|
Term
| How do you calculate the degrees of freedom for the chi-squared statistic? |
|
Definition
| df = (r - 1)(c - 1) where r is the number of rows in your table and c is the number of columns |
|
|
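A sketch of the chi-squared calculation on a cross-tab, assuming the standard textbook form (the sum over cells of (observed - expected)² / expected, with expected counts built from the row and column totals). The cards do not write the formula out, so the notation here may differ from the course's. `chi_squared` is a hypothetical helper name and the table is invented.

```python
def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # df = (r - 1)(c - 1), as on the degrees-of-freedom card.
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical 2x2 cross-tab of observed counts.
observed_counts = [[30, 20],
                   [20, 30]]
stat, df = chi_squared(observed_counts)
print(stat, df)
```

For this table every expected count is 25, so the statistic is 4 × (5² / 25) = 4 with 1 degree of freedom.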
Term
| When do you use a t-test? Write down the formula. |
|
Definition
| When the dependent variable is continuous and the independent variable is discrete. |
|
|
Term
| What does the numerator of the t-test formula tell you? |
|
Definition
| the greater the difference between the means, the higher the value of t will be |
|
|
Term
| The denominator of the t-test requires the _____ ______ of the difference of the two means. Write down how this is calculated. |
|
Definition
|
|
Term
| How do you calculate degrees of freedom for the t-test? |
|
Definition
| Subtract one from the smaller of the two n's |
|
|
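A sketch of a difference-of-means t statistic in Python. The cards leave the formula blank, so the denominator used here, the unpooled standard error sqrt(s₁²/n₁ + s₂²/n₂), is an assumption chosen because it pairs with the "smaller n minus one" degrees-of-freedom rule; the course may use a pooled version instead. `t_statistic` is a hypothetical helper name and the data are invented.

```python
import math
import statistics

def t_statistic(group_a, group_b):
    """Return (t, df) for the difference between two group means."""
    n_a, n_b = len(group_a), len(group_b)

    # Numerator: the difference between the two group means.
    diff = statistics.mean(group_a) - statistics.mean(group_b)

    # Denominator: standard error of the difference (unpooled form, assumed).
    se = math.sqrt(statistics.variance(group_a) / n_a
                   + statistics.variance(group_b) / n_b)

    # Degrees of freedom: one less than the smaller group size.
    df = min(n_a, n_b) - 1
    return diff / se, df

# Hypothetical values of a continuous dependent variable in two groups.
treated = [12, 14, 15, 17, 17]
control = [10, 11, 12, 13, 14, 12]
t, df = t_statistic(treated, control)
print(t, df)
```

A larger gap between the means raises t, while more spread within the groups or fewer cases raises the standard error and lowers it.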
Term
| When do you use Pearson's r (correlation coefficient)? |
|
Definition
| When both the independent and dependent variables are continuous |
|
|
Term
| What does covariance mean? |
|
Definition
| That the variables change together |
|
|
Term
| How do you calculate the covariance between two variables? Write it down. |
|
Definition
|
|
Term
| If ___ is systematically higher than its mean for the same observations in which ___ is higher than its mean, we'll get a ________ contribution to the covariance. |
|
Definition
|
|
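A sketch of the covariance calculation, assuming the usual sample form: the sum of (x - mean of x)(y - mean of y) across observations, divided by n - 1. The cards leave the formula blank, so this is the standard definition rather than a quotation of the course's notation. `covariance` is a hypothetical helper name and the data are invented.

```python
def covariance(x, y):
    """Return (sample covariance, list of per-observation contributions)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Each observation contributes the product of its two deviations.
    # Both above (or both below) their means gives a positive product;
    # one above and one below gives a negative product.
    contributions = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(contributions) / (n - 1), contributions

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy, contributions = covariance(x, y)
print(cov_xy, contributions)
```

Here no observation pairs a high x with a low y, so no contribution is negative and the covariance comes out positive.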
Term
| How is Pearson's r calculated? Write it down. |
|
Definition
|
|
Term
| What t-statistic is used to determine whether two continuous variables have a higher correlation than we would expect at random? Write it down. How are degrees of freedom calculated? |
|
Definition
|
|
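A sketch of Pearson's r and its t statistic, assuming the standard forms: r is the covariance of x and y divided by the product of their standard deviations, t = r·sqrt(n - 2) / sqrt(1 - r²), and the degrees of freedom are n - 2. The cards leave these formulas blank, so treat the code as the common textbook version. `pearson_r` and `t_for_r` are hypothetical helper names and the data are invented.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # The (n - 1) terms in the covariance and variances cancel out.
    return sxy / math.sqrt(sxx * syy)

def t_for_r(r, n):
    """Return (t, df) for testing whether r differs from zero (requires |r| < 1)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2), n - 2

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
t, df = t_for_r(r, len(x))
print(r, t, df)
```

Dividing by the standard deviations rescales the covariance so that r always falls between -1 and 1, regardless of the units of x and y.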