Shared Flashcard Set

Details

Basic Practice of Statistics
key terms, concepts
78
Mathematics
Undergraduate 1
10/23/2008

Cards

Term
Resistant Measure
Definition
a measure that can resist the influence of extreme observations

e.g. the median
Term
Median
Definition
midpoint of a distribution, i.e. the number such that half the observations are smaller and the other half are larger; in the ordered list of n observations it sits at position (n+1)/2
Term
Quartiles (Traditional)
Definition
1st Quartile (Q1) is larger than 25% of the observations
2nd Quartile = median
3rd Quartile (Q3) is larger than 75% of the observations
Term
Quartiles (Freund/Perles)
Definition
the lower quartile (Q1) is the ¼(n+3)th observation

the second quartile (median) is the ½(n+1)th observation

the upper quartile (Q3) is the ¼(3n+1)th observation
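The positions on this card can be turned into a small calculation. The sketch below is illustrative only: the card does not say what to do when a position is fractional, so linear interpolation between the two neighbouring observations is an assumption here, and the function names are made up for the example.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in a sorted list.

    Assumption: fractional positions are resolved by linear interpolation
    between the two neighbouring observations.
    """
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    value = sorted_values[lower - 1]
    if fraction > 0:
        value += fraction * (sorted_values[lower] - sorted_values[lower - 1])
    return value


def freund_perles_quartiles(values):
    """Return (Q1, median, Q3) using the positions given on the card."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_position(data, (n + 3) / 4)
    q2 = value_at_position(data, (n + 1) / 2)
    q3 = value_at_position(data, (3 * n + 1) / 4)
    return q1, q2, q3


quartiles = freund_perles_quartiles([1, 2, 3, 4, 5, 6, 7, 8])
print(quartiles)  # (2.75, 4.5, 6.25)
```

With n = 8 the positions are 2.75, 4.5 and 6.25, so each quartile falls between two observations.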
Term
Choosing a Summary (center/spread)
Definition
Five number summary is usually better than mean and standard deviation for a skewed distribution or one with strong outliers
Term
Density Curve
Definition
A curve that has area exactly 1 underneath it. The area under the curve and above any range of values is the proportion of values that fall in that range
Term
Mean of skewed distribution
Definition
The mean of a skewed distribution is pulled toward the long tail
Term
Normal Curve/Distribution
Definition
Symmetric, single-peaked, and bell-shaped
Term
68-95-99.7
Definition
In a normal distribution, about 68% of values fall within 1 std dev of the mean
about 95% fall within 2 std dev of the mean
about 99.7% fall within 3 std dev of the mean
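The three percentages are rounded values that can be checked numerically. A minimal sketch, assuming only the standard relationship between the normal distribution and the error function in Python's math module; the function name is invented for the example.

```python
import math


def normal_proportion_within(k):
    """Proportion of a normal distribution within k std devs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints the percentage within k standard deviations, to one decimal.
    print(k, round(100 * normal_proportion_within(k), 1))
```

The printed values are close to 68, 95 and 99.7, which is why the rule is stated with those round numbers.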
Term
Standardization/Z-score
Definition
subtract the mean of the distribution from the value and divide by the standard deviation: z = (value - mean) / std dev
Term
Z-score
Definition
tells us how many standard deviations the original value falls away from the mean and in what direction
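A short sketch of standardization, using made-up numbers (a mean of 100 and a standard deviation of 15 are assumptions chosen only to make the arithmetic easy):

```python
def z_score(value, mean, std_dev):
    """Standardize a value: how many std devs it lies from the mean."""
    return (value - mean) / std_dev


z_above = z_score(130, 100, 15)  # value above the mean gives a positive z
z_below = z_score(85, 100, 15)   # value below the mean gives a negative z
print(z_above, z_below)  # 2.0 -1.0
```

The sign gives the direction (above or below the mean) and the size gives the distance in standard deviations.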
Term
Standard Normal Distribution
Definition
The normal distribution with mean 0 and standard deviation 1
Term
Behavior of Mean of Skewed Distribution
Definition
Mean moves farther toward long tail for a skewed curve
Term
Five Number Summary
Definition
minimum, Q1, Q2 (median), Q3, maximum
Term
Behavior of Std Dev
Definition
s is zero when there is no spread and gets larger as spread increases
Term
Standard Deviation
Definition
sq root of the variance
Term
Variance
Definition
sum of the squared deviations of the observations from their mean, divided by the degrees of freedom (i.e. n-1)
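The variance and standard deviation cards can be sketched together. The data values below are made up for illustration, and the function names are not from the source.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, squared deviations sum to 32
print(sample_variance(data))       # 32 / 7
print(sample_std_dev([3, 3, 3]))   # 0.0, because there is no spread
```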
Term
Interquartile Range
Definition
Q3-Q1 (an observation is flagged as an outlier if it lies more than 1.5 x IQR above Q3 or below Q1)
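A minimal sketch of the 1.5 x IQR rule. Because quartile conventions differ, Q1 and Q3 are passed in rather than computed; the data and the quartile values (10.5 and 14.5, the medians of the lower and upper halves) are made up for the example.

```python
def iqr_outliers(values, q1, q3):
    """Return the values lying more than 1.5 x IQR below Q1 or above Q3."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


flagged = iqr_outliers([1, 10, 11, 12, 13, 14, 15, 40], q1=10.5, q3=14.5)
print(flagged)  # [1, 40]
```

Here IQR = 4, so anything below 4.5 or above 20.5 is flagged.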
Term
Response Variable
Definition
Measures outcome of a study
Term
Explanatory Variable
Definition
explains or influences changes in a response variable
Term
Scatterplot
Definition
Plot explanatory variable on x-axis and response variable on the y-axis
Term
Positively Associated
Definition
when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together
Term
Negatively Associated
Definition
when above-average values of one variable tend to accompany below-average values of the other, and vice versa
Term
Linear Relationship
Definition
when the points in a scatterplot show a straight-line pattern
Term
Correlation
Definition
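r = [1/(n-1)] x sum of [(x - mean of x)/sx] x [(y - mean of y)/sy], i.e. for each individual multiply the standardized x deviation by the standardized y deviation, add up the products, and multiply by 1/(n-1)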
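The correlation is the sum of the products of standardized x and y values, divided by n - 1. A sketch of that calculation, with made-up data and an invented function name:

```python
import math


def correlation(xs, ys):
    """r = 1/(n-1) * sum of (standardized x) * (standardized y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
    )
    return sum(products) / (n - 1)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
r_swapped = correlation(ys, xs)                     # x and y exchanged
r_rescaled = correlation([100 * x for x in xs], ys)  # x in different units
print(r, r_swapped, r_rescaled)
```

Swapping x and y, or changing the units of x, leaves r unchanged apart from floating-point rounding.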
Term
Correlation - fact 1
Definition
Correlation makes no distinction between x and y
Term
Correlation - fact 2
Definition
Because r uses standardized values, r doesn't change when we change the units of measurement of x, y, or both
Term
Correlation - fact 3
Definition
Positive r indicates positive association and negative r indicates negative association
Term
Correlation - fact 4
Definition
r is always between -1 and 1, and the strength of the relationship increases as r moves away from 0 in either direction (when r = +1 or -1 the points lie exactly on a straight line)
Term
Correlation - fact 5
Definition
correlation measures the strength of linear relationships only, not curved ones
Term
Correlation - fact 6
Definition
correlation is not resistant, i.e. it is strongly affected by outliers
Term
Regression Line
Definition
a straight line that describes how a response variable changes as an explanatory variable changes
Term
Least-squares regression line
Definition
the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
slope b = r*(sy/sx)
intercept a = (mean of y) - b*(mean of x)
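The slope and intercept formulas can be sketched directly: the slope is r times sy/sx, and the intercept is the mean of y minus the slope times the mean of x. The data and the function name are made up for the example.

```python
import math


def least_squares_line(xs, ys):
    """Return (slope, intercept) from r, the std devs, and the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    r = sum(
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
    ) / (n - 1)
    slope = r * (s_y / s_x)
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(slope, intercept)  # approximately 0.6 and 2.2
```

Because the intercept is built from the two means, the fitted line always passes through the point (mean of x, mean of y).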
Term
Slope and Correlation
Definition
along the regression line, a change of one std dev in x corresponds to a change of r std dev in y; in other words, as the correlation grows less strong, the prediction moves less in response to changes in x
Term
r^2
Definition
is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
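One way to see r^2 as a fraction of variation is to compare the variation left over after fitting the line with the total variation in y. The sketch below assumes the usual decomposition r^2 = 1 - (sum of squared residuals) / (sum of squared deviations of y from its mean); the data and names are made up.

```python
def r_squared(xs, ys):
    """Fraction of the variation in y explained by the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # algebraically equal to r*(sy/sx)
    intercept = mean_y - slope * mean_x
    unexplained = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)
    )
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total


print(r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # approximately 0.6
```

For these data the total variation in y is 6 and the leftover variation is 2.4, so the line explains 60% of it.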
Term
Residuals
Definition
The difference between an observed value of the response variable and the value predicted by the regression line
residual = obs y - predicted y
Term
Mean of least-squares residuals
Definition
is always zero
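A quick numerical check of the residual definition and of the fact that least-squares residuals average to zero. The data are made up, and the slope is computed in an algebraically equivalent form to r*(sy/sx).

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """residual = observed y - predicted y, for each data point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


example_residuals = residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
mean_residual = sum(example_residuals) / len(example_residuals)
print(example_residuals)
print(mean_residual)  # zero up to floating-point rounding
```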
Term
Residual Plot
Definition
a scatterplot of the regression residuals against the explanatory variable
Term
Influential Points
Definition
points that are extreme in the x direction and have a strong influence on the position of the regression line
Term
Outlier
Definition
observation that lies outside the overall pattern of the other observations
Term
Extrapolation
Definition
the use of a regression line for prediction far outside the range of values of the explanatory variable
Term
Averaged Data
Definition
correlations based on averages are usually too high when applied to individuals
Term
Lurking Variable
Definition
a variable that has an important effect on the relationship among the variables in a study but is not included among the variables studied
Term
Nonsense Correlations
Definition
correlations that are real but for which the conclusion that changing one of the variables causes changes in the other is nonsense; usually explained by a lurking variable
Term
Association <> Causation
Definition
an association between an explanatory variable and a response variable is not by itself good evidence that changes in x cause changes in y even if that association is strong
Term
Establishing Causation
Definition
Association is strong
Association is consistent
Higher doses are associated with stronger responses
Cause precedes effect in time
Cause is plausible
Term
Two-way Table
Definition
table of counts that describes two categorical variables, one in the rows and one in the columns
Term
Marginal Distributions
Definition
row and column totals that appear at right and bottom margins of a two way table
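A small sketch of how the margins of a two-way table are obtained. The table, its labels and its counts are hypothetical.

```python
# Hypothetical two-way table of counts: rows are one categorical variable,
# columns are the other.
table = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}

# Right margin: one total per row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Bottom margin: one total per column.
column_totals = {}
for cells in table.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count

grand_total = sum(row_totals.values())
print(row_totals)     # {'group A': 50, 'group B': 50}
print(column_totals)  # {'yes': 40, 'no': 60}
print(grand_total)    # 100
```

Dividing each total by the grand total expresses the marginal distribution as proportions.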
Term
Simpson's Paradox
Definition
an association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group
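The reversal is easiest to see with numbers. The counts below are invented purely to illustrate the paradox: option A has the higher success rate within each group, yet option B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, trials) for two options in two groups.
counts = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    return successes / trials


# Within every group, A has the higher success rate.
a_wins_in_every_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in counts.values()
)

# Combine the groups by adding up successes and trials for each option.
combined = {}
for option in ("A", "B"):
    successes = sum(group[option][0] for group in counts.values())
    trials = sum(group[option][1] for group in counts.values())
    combined[option] = rate(successes, trials)

b_wins_combined = combined["B"] > combined["A"]
print(a_wins_in_every_group, b_wins_combined)  # True True
```

The reversal happens because most of A's trials are in the group where success is rare, while most of B's are in the group where success is common.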
Term
Observational Study
Definition
observes individuals and measures variables of interest but does not attempt to influence the responses, e.g. a sample survey
Term
Experiment
Definition
study that deliberately imposes some treatment on individuals in order to observe their responses
Term
Confounding
Definition
when the effects of two variables (explanatory or lurking) on a response variable cannot be distinguished from each other
Term
Population
Definition
entire group of individuals we want info about
Term
Sample
Definition
subset of population that we actually examine in order to gather information
Term
Sample Design
Definition
method used to choose sample from population
Term
Voluntary Response Sample
Definition
sample where people choose themselves by responding to a general appeal. Biased b/c people with strong opinions, especially negative ones, are most likely to respond
Term
Convenience Sampling
Definition
sample design that chooses the individuals easiest to reach
Term
Bias
Definition
systematic error; i.e. sample design that favors certain outcomes
Term
Simple Random Sampling
Definition
consists of n individuals from a population chosen such that every set of n individuals has an equal chance to be selected
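A minimal sketch of drawing a simple random sample with Python's random module. The population of labels 1 to 50 and the seed are arbitrary choices for the example; random.sample draws without replacement, so every set of n individuals is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)   # seed is optional, for repeatable draws
    return rng.sample(list(population), n)


chosen = simple_random_sample(range(1, 51), 5, seed=42)
print(chosen)
```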
Term
Probability Sampling
Definition
sample technique that gives each member of the population a known chance of being selected
Term
Stratified Random Sample
Definition
divides the population into groups of similar individuals called strata, then chooses a separate SRS from each stratum and combines the SRSs to form the sample
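A sketch of stratified random sampling: one SRS per stratum, then the pieces are combined. The strata, their members and the per-stratum sample sizes are hypothetical.

```python
import random


def stratified_random_sample(strata, sizes, seed=None):
    """Take an SRS of sizes[name] from each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, sizes[name]))
    return sample


# Hypothetical population already divided into strata.
strata = {
    "first-year": ["f1", "f2", "f3", "f4", "f5", "f6"],
    "senior": ["s1", "s2", "s3", "s4"],
}
stratified = stratified_random_sample(
    strata, {"first-year": 3, "senior": 2}, seed=1
)
print(stratified)
```

Unlike a single SRS of size 5 from the whole population, this design guarantees that each stratum is represented with the chosen number of individuals.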
Term
Strata
Definition
groups of similar individuals within a population used in stratified random sampling
Term
Multi-stage Sampling
Definition
Stage 1: Divide the population into groups and select a sample of the groups
Stage 2: Divide the groups selected in Stage 1 into smaller areas called blocks and take a stratified sample of the blocks
Stage 3: Sort the individuals in the selected blocks into clusters and take a random sample of the clusters
Term
Undercoverage
Definition
when some groups in the population are left out of the process of choosing the sample, e.g. a phone survey misses the 6% w/o phones
Term
Nonresponse
Definition
when an individual chosen for the sample can't be contacted or refuses to cooperate
Term
Response Bias
Definition
bias caused by behavior of respondent or interviewer e.g. respondent lying, race or sex of interviewer
Term
Telescoping
Definition
bringing events in the past forward in memory to more recent time periods e.g. saw dentist 8 months ago and say yes to seeing dentist in the last 6 mos.
Term
Wording of Questions
Definition
the wording of questions in sample surveys can introduce bias
Term
Sampling Frame
Definition
list of individuals from which a sample is selected
Term
Experimental Units
Definition
The individuals on which an experiment is done
Term
Subjects
Definition
the experimental units when dealing with human beings
Term
Treatment
Definition
experimental condition applied to the units
Term
Factors
Definition
the explanatory variable(s) in an experiment
Term
Level
Definition
values of the factors in an experimental treatment
Term
Randomization
Definition
use of chance to divide experimental units into groups in an experiment
Term
Randomized Comparative Experiment
Definition
An experiment that uses both comparison and randomization
Term
Completely Randomized
Definition
experimental design where all experimental units are allocated at random among all treatments
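A sketch of a completely randomized allocation: shuffle all the units, then deal them out to the treatments. Making the groups as equal in size as possible is an assumption of this example, not a requirement stated on the card; the unit labels and treatment names are hypothetical.

```python
import random


def completely_randomized(units, treatments, seed=None):
    """Randomly allocate every unit to exactly one treatment group."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                      # chance decides the order
    groups = {treatment: [] for treatment in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out in turn so group sizes stay balanced.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


allocation = completely_randomized(
    range(1, 11), ["treatment", "control"], seed=7
)
print(allocation)
```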
Term
Statistically Significant
Definition
An observed effect so large that it would rarely occur by chance