Shared Flashcard Set

Details

STAT 100 - Exam One
First STAT 100 exam @ UIUC
62
Mathematics
Undergraduate 1
09/26/2011

Cards

Term
BASIC VOCABULARY
Definition
Term
Mean
Definition
AKA average. The sum of the values divided by the number of values. Most common descriptor of the center of data.
Term
Median
Definition
Midpoint of the data set. Half the values lie above it and half lie below.
Term
Mode
Definition
The most frequent value in a data set.
Term
Range
Definition
Difference between highest and lowest value.
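A minimal sketch of the four summary values defined so far, using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up values, not from the course

mean = statistics.mean(data)         # sum of the values / number of values
median = statistics.median(data)     # midpoint of the sorted values
mode = statistics.mode(data)         # most frequent value
value_range = max(data) - min(data)  # highest value minus lowest value

print(mean, median, mode, value_range)
```

With an even number of values, `statistics.median` averages the two middle values.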
Term
Gap
Definition
A marked split in the values. Each bundle of values on either side of a gap is called a cluster.
Term
Frequency
Definition
Number of times a value appears in the data.
Term
Quartiles
Definition
The three values that divide the sorted data into four groups, each holding roughly a quarter of the values. Used in boxplots.
Term
Interquartile Range
Definition
The difference between the first and third quartiles in a data set.
Term
Spread
Definition
How far the data tend to lie from the central value. Most often measured by standard deviation.
Term
Influence
Definition
How strongly a value will impact the mean. For example, a very large or very small value will affect the mean more than a central value.
Term
Standard Deviation
Definition
A measure of how distant values are from the center of the data.
Term
Empirical Rule of Standard Deviation
Definition
The rule states that in roughly bell-shaped data sets, about 68% of values fall within one SD of the mean, 95% within two SDs, and 99.7% within three. Also known as the "68-95-99.7 Rule".
Term
Z-Score
Definition
AKA standard score. The number of standard deviations a value lies from the mean: (value minus mean) divided by SD. Positive for values above the mean, negative for values below.
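A short sketch of the z-score calculation. The function name `z_score` and the exam scores are made up for illustration, and using the sample SD (dividing by n - 1) is an assumption; the course may define SD with n instead.

```python
import statistics

def z_score(value, data):
    """Number of SDs that `value` lies from the mean of `data`."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1): an assumption
    return (value - mean) / sd

scores = [60, 70, 80, 90, 100]  # made-up exam scores
print(z_score(100, scores))     # positive: above the mean
print(z_score(60, scores))      # negative: below the mean
```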
Term
Percentile Rank
Definition
The percent of values that are lower than an examined value.
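A possible sketch of percentile rank as defined on this card, counting only values strictly lower than the examined one. The function name and data are hypothetical.

```python
def percentile_rank(value, data):
    """Percent of values in `data` that are strictly lower than `value`."""
    below = sum(1 for v in data if v < value)
    return 100 * below / len(data)

scores = [60, 70, 80, 90, 100]      # made-up exam scores
print(percentile_rank(90, scores))  # three of the five scores are lower
```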
Term
Area Fallacy
Definition
Misuse of statistics in which the heights in a graph are drawn correctly but the areas are not proportional to the values, so the visual impression is distorted.
Term
Simpson's Paradox
Definition
This occurs when a conclusion based on individual groups of data is contradicted when the groups are combined.
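A small numeric illustration of the paradox. The counts are illustrative (successes, total) pairs for two hypothetical treatments, A and B, split into two groups; none of this comes from the course material.

```python
# Illustrative (successes, total) counts for two treatments, split by group.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def success_rate(successes, total):
    return successes / total

def combined_rate(treatment):
    """Success rate for one treatment after pooling both groups."""
    successes = sum(counts[g][treatment][0] for g in counts)
    total = sum(counts[g][treatment][1] for g in counts)
    return success_rate(successes, total)

# Within each group, treatment A has the higher success rate.
for group, by_treatment in counts.items():
    a = success_rate(*by_treatment["A"])
    b = success_rate(*by_treatment["B"])
    print(f"{group}: A={a:.2f}, B={b:.2f}")

# After combining the groups, treatment B has the higher success rate.
print(f"combined: A={combined_rate('A'):.2f}, B={combined_rate('B'):.2f}")
```

The reversal happens because the two treatments were applied to the groups in very different proportions.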
Term
PAIRED DATA VOCABULARY
Definition
Term
Regression
Definition
Fitting a mathematical expression to explain a paired data set.
Term
Positive Correlation
Definition
When the explanatory and response variables tend to increase and decrease together.
Term
Negative Correlation
Definition
When one variable tends to decrease as the other increases.
Term
Linear Correlation Coefficient
Definition
AKA "r". A measure of the strength and direction of the linear relationship in paired data, ranging from -1 (perfect negative) to 1 (perfect positive).
Term
Steps of Calculating "r"
Definition
1) Standardize both variables. (value minus mean, divided by SD)
2) Multiply each standardized x value by its corresponding standardized y value.
3) Divide the sum of those products by the number of pairs minus one.
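The three steps above can be sketched as follows. The function name `correlation_r` and the sample data are hypothetical, and the sketch assumes the SD is the sample SD (dividing by n - 1), which is what makes the n - 1 divisor in step 3 give a value between -1 and 1.

```python
import statistics

def correlation_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)  # sample SDs (assumption)

    # Step 1: standardize both variables.
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]

    # Step 2: multiply each standardized x by its paired standardized y.
    products = [a * b for a, b in zip(zx, zy)]

    # Step 3: divide the sum of the products by n - 1.
    return sum(products) / (n - 1)

xs = [1, 2, 3, 4, 5]  # made-up paired data
ys = [2, 4, 5, 4, 5]
print(round(correlation_r(xs, ys), 4))
```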
Term
Residual
Definition
The difference between an observed value and the value predicted by the regression (observed minus predicted).
Term
Residual Sum of Squares
Definition
AKA "SSE". Sum of the squared residuals in a set. A measure to compare how well a regression fits the data.
Term
Least Squares Regression Line
Definition
The line with the lowest SSE. This means it is the best possible linear fit for the data set.
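A rough sketch of fitting a least squares line and computing its SSE. The closed-form slope and intercept used here are the standard textbook formulas rather than something stated on these cards, and the function names and data are made up.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line that minimizes SSE."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the point of means
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared residuals (observed minus predicted) for a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # made-up paired data
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept, sse(xs, ys, slope, intercept))

# Any other line, such as this arbitrary one, has a larger SSE.
print(sse(xs, ys, 0.5, 2.5))
```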
Term
Causation
Definition
When the explanatory variable is shown to affect the response variable. Be sure there is no lurking variable that better explains the correlation.
Term
Deviation
Definition
The difference between a value and the center of the data set.
Term
Explained Deviation
Definition
The part of a response value's deviation from the average response that can be attributed to the explanatory variable (predicted value minus average response).
Term
Unexplained Deviation
Definition
The difference between total deviation and explained deviation (observed value minus predicted value).
Term
Residual Standard Deviation
Definition
Standard deviation calculated from the differences between observed and predicted values (the residuals).
Term
TYPES OF VARIABLES
Definition
Term
Explanatory/Independent Variable
Definition
Variable used to explain or predict the response, usually plotted on the x-axis.
Term
Response/Dependent Variable
Definition
Variable whose value is explained or predicted by the explanatory variable, usually plotted on the y-axis.
Term
Numerical/Quantitative Variable
Definition
Variable that a number defines. (IQ, height, time, etc)
Term
Categorical/Qualitative Variable
Definition
Variable that a word or category describes. (eye color, major, name, etc)
Term
Continuous Variable
Definition
Variable that can be anything within a range of values. (GPA, weight, etc)
Term
Discrete Variable
Definition
Variable that is one of some number of set values. (siblings, shoe size, etc)
Term
Ranked Variable
Definition
Categorical variable that has an inherent hierarchy of value. (Grades, business rating, military rank, etc)
Term
Lurking Variable
Definition
Variable that is not immediately obvious or not included in the analysis, yet affects the variables studied and may lead to incorrect conclusions.
Term
Outliers
Definition
A value that is far removed from the rest of the data. Should only be removed if it is a mistake.
Term
Predicted Value
Definition
The value expected based on a regression. Written with a hat on top of the variable (for example, ŷ).
Term
Leverage Point
Definition
Extreme outlier that significantly changes the line of regression.
Term
Bell-Shaped Distribution
Definition
Frequency of values is greatest near the median and least at the extremes of the range.
Term
Uniform Distribution
Definition
The frequency of values is consistent across the entire range.
Term
U-Shaped Distribution
Definition
Frequency of values is least near the median and greatest at the extremes of the range.
Term
Symmetric Distribution
Definition
Data is almost mirrored on each side of the central value.
Term
Right Skewed Distribution
Definition
Data is more frequent in lower values. (Long tail to the right)
Term
Left Skewed Distribution
Definition
Data is more frequent in higher values. (Long tail to the left)
Term
TYPES OF GRAPH
Definition
Term
Dotplot
Definition
Graph that uses stacked dots to show frequency. (Most useful for small ranges with repeated values)
Term
Stem-and-Leaf Plot
Definition
Table that groups data by leading digit(s), such as the tens place (the stem), and lists each value's final digit (the leaf) beside it.
Term
Pie Chart
Definition
Useful when one wants to call attention to the relative frequency of each category as a share of the whole.
Term
Bar Chart
Definition
Common bar graph. Each bar represents a value, and its height represents that value's frequency. All bars are the same width.
Term
Pareto Chart
Definition
Special bar graph in which the categories of an unranked categorical variable are listed from left to right in order of decreasing frequency.
Term
Histogram
Definition
A graph in which each bar's width represents a range of values (widths may be uneven) and the bar's area represents the frequency of the values in that range.
Term
Steps of Drawing a Histogram
Definition
1) Calculate the percentage of values in each group.
2) Find the height of each bar by dividing its percentage by its width.
3) Draw the histogram.
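The first two steps above can be sketched as follows, so that each bar's area (height times width) equals the percentage of values in its group. The function name, the data, and the group edges are made up, and the choice to put a value that sits on an edge into the group to its right (except at the far right end) is an assumption.

```python
def histogram_heights(values, edges):
    """Height of each bar, in percent per unit of width."""
    n = len(values)
    heights = []
    for i, (left, right) in enumerate(zip(edges, edges[1:])):
        is_last = i == len(edges) - 2
        # Count values in [left, right); the last group also includes its right edge.
        count = sum(
            1 for v in values
            if left <= v < right or (is_last and v == right)
        )
        percent = 100 * count / n                 # step 1: percentage in the group
        heights.append(percent / (right - left))  # step 2: height = percentage / width
    return heights

values = [1, 2, 2, 3, 5, 8, 9, 10, 15, 20]  # made-up data
edges = [0, 5, 10, 20]                      # groups of uneven width
print(histogram_heights(values, edges))
```

The last group is twice as wide as the others, so it gets a shorter bar even though it holds the same percentage of values as the middle group.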
Term
Boxplots
Definition
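A graph that summarizes data with a box spanning the first to third quartiles, a line at the median, and whiskers extending toward the extremes, so that each of the four sections holds about a quarter of the values.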
Term
Boxplot Outliers
Definition
If a value lies more than three times the interquartile range (IQR) below the first quartile or above the third quartile, it's an outlier. If it lies between 1.5 and three times the IQR beyond those quartiles, it's a potential outlier.
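A minimal sketch of this rule. The quartiles are passed in directly because textbooks differ on how to compute them; the function name is hypothetical, and treating a value exactly at 1.5 or three IQRs as the milder category is an assumption.

```python
def classify_boxplot_value(value, q1, q3):
    """Classify a value using the 1.5 x IQR and 3 x IQR rule."""
    iqr = q3 - q1
    # Distance below the first quartile or above the third; 0 if inside the box.
    distance = max(q1 - value, value - q3, 0)
    if distance > 3 * iqr:
        return "outlier"
    if distance > 1.5 * iqr:
        return "potential outlier"
    return "not an outlier"

# Made-up quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
for v in (15, 36, 51, -6):
    print(v, classify_boxplot_value(v, 10, 20))
```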
Term
Combination Graph
Definition
A graph in which the data for two dependent variables are represented together.
Term
Scatterplot
Definition
A graph of paired data represented by points.
Term
Residual Plot
Definition
A plot of the residuals against the explanatory variable, so that the regression line becomes a horizontal line at zero. Helps determine whether points are evenly distributed above and below it.