The whiskers of a boxplot tell you... 

The top of the box in a boxplot tells you... 

The bottom of the box in a boxplot tells you... 

The bar/line in a boxplot tells you... 

The formula to determine whether a point will be an outlier is... 

Parameter: A summary measure that is a property of the population. It is what a researcher is trying to estimate or test in a study.
For example, the mean height of all 20-year-old women.


Statistic: A summary number that is a property of the sample. It is the estimate of the parameter, i.e. the result actually computed from the data.
For example, the mean height in a sample of 20-year-old women is 5 feet 4 inches.


Independent variable: Not affected by any other variable in the study.


Explanatory variable: A type of independent variable that may not be truly independent, since other variables can influence it. Conventionally plotted on the x-axis.


Outcome variable: Basically the same thing as the dependent variable. It is important to choose a good outcome variable for the study in question.
E.g. the Post Office's outcome variables for the "bigness" of a package are weight in lbs and girth in inches.


Standard Deviation (Definition) 

The typical deviation from the mean: the "give or take", or roughly the average distance to the average.
A smaller standard deviation means the values sit closer to the mean, i.e. the data is more stable.


Standard Deviation (Formula) and steps to find 

It is the SQUARE ROOT of the variance: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
Step 1: Find the mean, n, and n - 1. Step 2: Subtract the mean from each value. Step 3: Square every difference. Step 4: Add up the squared values. Step 5: Divide the sum by n - 1. Step 6: Take the square root of the quotient.
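The six steps above can be sketched in Python as follows. This is a minimal illustration, not a library routine; the function name `sample_sd` and the example values are made up for the sketch.

```python
import math

def sample_sd(values):
    """Sample standard deviation, following the six steps one by one."""
    n = len(values)
    mean = sum(values) / n                    # Step 1: mean, n (n - 1 is used below)
    deviations = [x - mean for x in values]   # Step 2: subtract the mean from each value
    squared = [d ** 2 for d in deviations]    # Step 3: square every difference
    total = sum(squared)                      # Step 4: add up the squared values
    variance = total / (n - 1)                # Step 5: divide by n - 1 (this is the variance)
    return math.sqrt(variance)                # Step 6: square root of the quotient

# Hypothetical example data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(data), 3))
```

Step 5 on its own gives the variance, which is why the standard deviation is described as its square root.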


Variance: Measures spread just like the standard deviation (it is the standard deviation squared). Its units are squared, so it is harder to interpret.


Covariance: The average amount that x and y deviate from their means together (the average product of their deviations). Used to calculate correlation.
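As a rough sketch, covariance can be computed by multiplying each x deviation by the matching y deviation and averaging. The sketch below assumes the sample version (dividing by n - 1, to match the standard deviation steps); the function name and the data are hypothetical.

```python
def sample_covariance(xs, ys):
    """Average product of paired deviations from the means (n - 1 divisor assumed)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data: hours studied and quiz scores.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(sample_covariance(hours, scores))  # positive: x and y tend to rise together
```

The result is in "x units times y units", which is one reason correlation is usually reported instead.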


Correlation: The bounded (standardized) version of covariance: the covariance divided by the product of the two standard deviations. Always between -1 and 1. A value of exactly 1 or -1 means a perfect linear relationship.
No units.
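A small sketch of how correlation follows from covariance, assuming the usual Pearson definition (covariance divided by the product of the two sample standard deviations). The function name `pearson_r` and the data are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # points on a rising straight line
falling = [10 - 3 * x for x in xs]  # points on a falling straight line

print(pearson_r(xs, rising))   # very close to 1 (up to floating-point rounding)
print(pearson_r(xs, falling))  # very close to -1
```

Because the units of x and y cancel in the division, the result has no units.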


Limitations of correlation: Variables must be continuous. Only captures linear relationships. Does not imply causation. Sensitive to outliers.


Advantages of histograms: Show the shape of a distribution (e.g. bimodality). Show patterns in the data. Organize data into bins.


Disadvantages of histograms: Hard to eyeball summary statistics like the mean and SD. Easily manipulated, since the picture changes with the choice of bins. Difficult to interpret when N is small.
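To illustrate how bin choice can change what a histogram shows, here is a small sketch that counts values per bin for two different bin widths. The helper `histogram_counts` and the data are hypothetical, and bins are assumed to start at 0.

```python
def histogram_counts(values, bin_width, start=0):
    """Count how many values fall in each bin; keys are the bins' left edges."""
    counts = {}
    for v in values:
        left_edge = start + ((v - start) // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical data with two clusters (around 2 and around 9).
data = [1, 2, 2, 3, 8, 9, 9, 10]

print(histogram_counts(data, 2))   # narrow bins: two separate clusters are visible
print(histogram_counts(data, 12))  # one wide bin: the two clusters disappear
```

With narrow bins the gap between the clusters shows up as missing bins; with a single wide bin all eight values land together and the bimodality is hidden.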


Advantages of boxplots: Provide a quick five-number summary of the data (minimum, Q1, median, Q3, maximum). Make outliers easy to identify. Easy to place side by side to compare groups.
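A minimal sketch of the five-number summary using Python's standard `statistics` module. Quartile conventions differ between textbooks and software, so Q1 and Q3 may not match a hand calculation exactly. The outlier check uses the common 1.5 x IQR fence convention, which is an assumption here and is not taken from these notes; the function names and data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return min(values), q1, median, q3, max(values)

def outliers_by_iqr(values, k=1.5):
    """Flag points beyond k * IQR from the box (assumed convention, k = 1.5)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Hypothetical data with one unusually large value.
data = [3, 5, 7, 8, 9, 11, 13, 15, 40]
print(five_number_summary(data))
print(outliers_by_iqr(data))
```

Running the same two functions on each group gives the numbers needed for a side-by-side comparison.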


Disadvantages of boxplots: Don't show the shape of the distribution. Hide patterns like bimodality and clusters. Can be uninformative for skewed distributions.


Normal distribution: Same thing as a bell curve. Describes lots of real-world phenomena. The normal density function requires only the mean and the standard deviation. Features: Symmetric around the mean. Mean, median, and mode are equal. Area under the curve is equal to 1. About 68% of the area is within 1 SD of the mean. About 95% of the area is within 2 SDs of the mean.
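The features listed above can be checked numerically. The sketch below writes out the normal density from the mean and SD, and uses the error function `math.erf` to get the area within k SDs of the mean; the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Normal density: needs only the mean and the standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def area_within(k):
    """Area under the normal curve within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

# Symmetry: equal heights at equal distances on either side of the mean.
print(normal_pdf(25, mean=30, sd=5), normal_pdf(35, mean=30, sd=5))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))  # roughly 0.68, 0.95, 0.997
```

The exact areas are slightly different from the rounded 68% and 95% figures, which is why those numbers are best read as approximations.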


Empirical rule: The 68-95-99.7 rule. How to use the empirical rule: Use it with a normal distribution. Say the given mean is 30 and the SD is 5. One SD either side of the mean runs from 25 to 35, so about 68% of the x values lie between 25 and 35. Two SDs either side of the mean runs from 20 to 40, so about 95% of the observations lie between 20 and 40.
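The worked example above (mean 30, SD 5) can be reproduced with a few lines of Python. This is only a sketch; the function name `empirical_rule_intervals` is made up, and the 3 SD row is the standard third part of the rule rather than something worked out in these notes.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals covering roughly 68%, 95% and 99.7% of a normal distribution."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}

for label, (low, high) in empirical_rule_intervals(30, 5).items():
    print(f"about {label} of values lie between {low} and {high}")
```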


Right skewed distribution: mean > median. World income is an example. The histogram trails off to the right.


Left skewed distribution: mean < median. The histogram trails off to the left. 
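A quick way to see the mean/median relationship for both kinds of skew is to compare the two on small made-up data sets. The values below are hypothetical and chosen so that one extreme value pulls the mean toward the tail.

```python
import statistics

# Hypothetical data: one very large value creates a long right tail.
right_skewed = [20, 22, 25, 27, 30, 32, 35, 200]
# Hypothetical data: one very small value creates a long left tail.
left_skewed = [1, 80, 85, 88, 90, 92, 95, 97]

for name, values in [("right", right_skewed), ("left", left_skewed)]:
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{name} skewed: mean = {mean}, median = {median}")
```

The mean gets dragged toward the tail while the median barely moves, which gives mean > median on the right-skewed set and mean < median on the left-skewed set.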


