Shared Flashcard Set

Details

Title

STAT 100 - Exam One

Description

First STAT 100 exam @ UIUC

Total Cards

Subject

Mathematics

Level

Undergraduate 1

Created

09/26/2011

Term

BASIC VOCABULARY

Definition

Term

Mean

Definition

AKA average. The sum of the values divided by the number of values. Most common descriptor of the center of the data.

Term

Median

Definition

Midpoint of the data set. Half the values lie above it and half lie below.

Term

Mode

Definition

The most frequent value in a data set.
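The three measures of center defined so far (mean, median, mode) can be checked on a small data set. This is only a sketch using Python's standard statistics module; the data values are made up for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of the values / number of values
median = statistics.median(data)  # midpoint of the sorted values
mode = statistics.mode(data)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

With an even number of values, the median is taken as the average of the two middle values.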

Term

Range

Definition

Difference between highest and lowest value.

Term

Gap

Definition

A marked split in the values. Each bundle of values on either side of a gap is called a cluster.

Term

Frequency

Definition

Number of times a value appears in the data.

Term

Quartiles

Definition

The three values (Q1, Q2, Q3) that divide the sorted data into four groups, each holding about a quarter of the values. Used in boxplots.

Term

Interquartile Range

Definition

The difference between the third and first quartiles in a data set (Q3 minus Q1).

Term

Spread

Definition

How far the data lie from the central value. Commonly measured by the standard deviation.

Term

Influence

Definition

How strongly a value will impact the mean. For example, a very large or very small value will affect the mean more than a central value.
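A rough illustration of influence, assuming a made-up data set: adding one very large value moves the mean a long way, while the median barely changes.

```python
import statistics

# Hypothetical values, chosen only for illustration.
data = [4, 5, 6, 7, 8]
with_extreme = data + [100]  # same data plus one very large value

# How far each measure of center moves when the extreme value is added.
mean_shift = statistics.mean(with_extreme) - statistics.mean(data)
median_shift = statistics.median(with_extreme) - statistics.median(data)

print("mean moved by:", mean_shift)
print("median moved by:", median_shift)
```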

Term

Standard Deviation

Definition

A measure of how distant values typically are from the center (mean) of the data.

Term

Empirical Rule of Standard Deviation

Definition

The rule states that in roughly bell-shaped data sets, about 68% of values will fall within one SD of the mean, about 95% will fall within two SDs, and about 99.7% will fall within three. This is also known as the "68-95-99.7 Rule".
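The percentages in the rule can be reproduced from an ideal bell curve. The sketch below assumes a perfectly normal distribution (real data are only roughly bell-shaped) and uses NormalDist from Python's standard statistics module; the helper name share_within is made up for this example.

```python
from statistics import NormalDist

# An idealized bell curve with mean 0 and SD 1 (an assumption for illustration).
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of values lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD: {share:.1%}")
```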

Term

Z-Score

Definition

AKA standard score. A measure of the number of standard deviations a value is above or below the mean.
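A minimal sketch of the z-score calculation, assuming made-up exam scores and the sample SD (dividing by the number of values minus one):

```python
import statistics

# Hypothetical exam scores, chosen only for illustration.
scores = [60, 70, 80, 90, 100]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (divides by n - 1)

# z-score: (value - mean) / SD
z_scores = [(s - mean) / sd for s in scores]

for s, z in zip(scores, z_scores):
    print(s, round(z, 2))
```

A value equal to the mean has a z-score of 0, values above the mean have positive z-scores, and values below it have negative ones.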

Term

Percentile Rank

Definition

The percent of values that are lower than an examined value.

Term

Area Fallacy

Definition

Misuse of statistics in which the height of a histogram is correctly represented but the area is not.

Term

Simpson's Paradox

Definition

This occurs when a conclusion based on individual groups of data is contradicted when the groups are combined.
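A small worked example may make the paradox easier to see. The counts below are entirely made up: option A has the higher success rate inside each group, yet option B has the higher rate once the groups are combined, because the two options are spread very unevenly across the groups.

```python
# Hypothetical (successes, trials) counts, invented for illustration.
results = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (20, 100), "B": (1, 10)},
}

def rate(successes, trials):
    return successes / trials

# Within every group, A has the higher success rate.
a_wins_each_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in results.values()
)

def combined_rate(option):
    """Success rate for one option after pooling all groups."""
    successes = sum(group[option][0] for group in results.values())
    trials = sum(group[option][1] for group in results.values())
    return successes / trials

# After combining the groups, the comparison reverses.
a_wins_combined = combined_rate("A") > combined_rate("B")

print(a_wins_each_group, a_wins_combined)
```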

Term

PAIRED DATA VOCABULARY

Definition

Term

Regression

Definition

Fitting a mathematical expression to explain a paired data set.

Term

Positive Correlation

Definition

When the explanatory and response variables increase and decrease together.

Term

Negative Correlation

Definition

When the explanatory and response variables move in opposite directions: as one increases, the other decreases.

Term

Linear Correlation Coefficient

Definition

AKA "r". Measure of the strength and direction of the linear relationship in paired data, that is, how closely the points follow a straight line. Ranges from -1 to 1.

Term

Steps of Calculating "r"

Definition

1) Standardize both variables. (value minus mean, divided by SD)
2) Multiply each standardized x value by its corresponding standardized y value.
3) Divide the sum of those products by the number of terms minus one.
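The steps above can be sketched as follows. The paired data are made up, the helper name standardize is hypothetical, and the sample SD (dividing by n - 1) is assumed so that it matches the divisor in step 3.

```python
import statistics

# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def standardize(values):
    """Step 1: (value - mean) / SD for every value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    return [(v - mean) / sd for v in values]

zx = standardize(x)
zy = standardize(y)

# Step 2: multiply each standardized x by its standardized y.
products = [a * b for a, b in zip(zx, zy)]

# Step 3: divide the sum of the products by (number of pairs - 1).
r = sum(products) / (len(x) - 1)

print(round(r, 4))
```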

Term

Residual

Definition

The difference between an observed value and the value expected (predicted) from the regression.

Term

Residual Sum of Squares

Definition

AKA "SSE". Sum of the squared residuals in a set. A measure to compare how well a regression fits the data.

Term

Least Squares Regression Line

Definition

The line with the lowest SSE. This means it is the best possible linear fit for the data set.
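As an illustration, the sketch below fits a line to made-up paired data using the standard least squares formulas (slope = sum of products of deviations divided by sum of squared x deviations; the line passes through the point of means), then checks that nudging the line raises the SSE. The function name sse is hypothetical.

```python
import statistics

# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)

# Least squares slope and intercept.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def sse(m, b):
    """Sum of squared residuals (observed y minus predicted y) for y = m*x + b."""
    return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))

sse_best = sse(slope, intercept)

print("slope:", slope, "intercept:", intercept, "SSE:", sse_best)
# Any other line, such as one with a slightly steeper slope, gives a larger SSE.
print("SSE with steeper slope:", sse(slope + 0.1, intercept))
```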

Term

Causation

Definition

When the explanatory variable is shown to affect the response variable. Be sure there is no lurking variable that better explains the correlation.

Term

Deviation

Definition

The difference between a value and the center (mean) of the data set.

Term

Explained Deviation

Definition

The difference between the average response variable and the examined one that can be attributed to the explanatory variable.

Term

Unexplained Deviation

Definition

The difference between total deviation and explained deviation, that is, the part of the deviation the explanatory variable does not account for.

Term

Residual Standard Deviation

Definition

Standard deviation calculated from the residuals, that is, the differences between observed and expected values.

Term

TYPES OF VARIABLES

Definition

Term

Explanatory/Independent Variable

Definition

Variable used to explain or predict changes in the response variable. Usually plotted on the x-axis.

Term

Response/Dependent Variable

Definition

Variable whose value is explained or predicted by the explanatory variable. Usually plotted on the y-axis.

Term

Numerical/Quantitative Variable

Definition

Variable that a number defines. (IQ, height, time, etc.)

Term

Categorical/Qualitative Variable

Definition

Variable that a word or category describes. (eye color, major, name, etc.)

Term

Continuous Variable

Definition

Variable that can take any value within a range. (GPA, weight, etc.)

Term

Discrete Variable

Definition

Variable that can take only certain separate values. (number of siblings, shoe size, etc.)

Term

Ranked Variable

Definition

Categorical variable that has an inherent hierarchy of value. (Grades, business rating, military rank, etc.)

Term

Lurking Variable

Definition

Variable that is not immediately obvious (and often not measured) but affects the variables being studied, which may lead to incorrect conclusions.

Term

Outliers

Definition

Values that are far removed from the rest of the data. An outlier should only be removed if it is a mistake.

Term

Predicted Value

Definition

The value expected based on a regression. Written with a hat on top of the variable (for example, y-hat).

Term

Leverage Point

Definition

Extreme outlier, typically far from the other points along the x-axis, that significantly changes the regression line.

Term

Bell-Shaped Distribution

Definition

Frequency of values is greatest near the median and least at the extremes of the range.

Term

Uniform Distribution

Definition

The frequency of values is consistent across the entire range.

Term

U-Shaped Distribution

Definition

Frequency of values is least near the median and greatest at the extremes of the range.

Term

Symmetric Distribution

Definition

Data is almost mirrored on each side of the central value.

Term

Right Skewed Distribution

Definition

Data is more frequent in lower values. (Long tail to the right)

Term

Left Skewed Distribution

Definition

Data is more frequent in higher values. (Long tail to the left)

Term

TYPES OF GRAPH

Definition

Term

Dotplot

Definition

Graph that uses stacked dots to show frequency. (Most useful for small ranges with repeated values)

Term

Stem-and-Leaf Plot

Definition

Table that sorts data into rows by leading digit (the stem, such as the tens place) and lists each value's final digit (the leaf) beside it.

Term

Pie Chart

Definition

Useful when one wants to call attention to the relative frequency of categories (each slice is a share of the whole).

Term

Bar Chart

Definition

Common bar graph. Each bar represents a value and its height represents that value's frequency. All bars are the same width.

Term

Pareto Chart

Definition

Special bar graph in which unranked categorical variables are listed from left to right in decreasing order of frequency.

Term

Histogram

Definition

A graph that uses the width of each bar (widths may be uneven) to represent a range of values and the area of the bar to represent those values' frequency.

Term

Steps of Drawing a Histogram

Definition

1) Calculate the percentage of values in each group.
2) Find the height of each bar by dividing its percentage by its width, so that the bar's area equals the percentage.
3) Draw the histogram.
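Steps 1 and 2 can be sketched as below, assuming made-up groups of unequal width. Because area represents frequency, each height is the group's percentage divided by its width, and the bar areas add up to 100%.

```python
# Hypothetical groups: ((low, high), count), invented for illustration.
groups = [((0, 10), 5), ((10, 20), 10), ((20, 50), 5)]

total = sum(count for _, count in groups)

bars = []
for (low, high), count in groups:
    percent = 100 * count / total  # step 1: percentage of values in the group
    width = high - low
    height = percent / width       # step 2: height so that area = percentage
    bars.append((low, high, percent, height))

for low, high, percent, height in bars:
    print(f"{low}-{high}: {percent:.0f}% of values, height {height:.2f}% per unit")
```

Step 3, the drawing itself, is left out here since it only needs the widths and heights computed above.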

Term

Boxplots

Definition

A graph that divides the data into four ranges, each holding a quarter of the values, with the quartiles as the dividing points.

Term

Boxplot Outliers

Definition

If a value lies more than three times the interquartile range (IQR) below the first quartile or above the third quartile, it's an outlier. If it lies between 1.5 and three times the IQR beyond those quartiles, it's a potential outlier.
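The rule on this card can be written as a short check. This is only a sketch: the function name classify and the quartile values in the usage lines are made up, and the quartiles are passed in directly because different tools compute them slightly differently.

```python
def classify(value, q1, q3):
    """Label a value using the 1.5 x IQR and 3 x IQR cutoffs from the card."""
    iqr = q3 - q1
    if value < q1:
        distance = q1 - value   # how far below the first quartile
    elif value > q3:
        distance = value - q3   # how far above the third quartile
    else:
        return "not an outlier"  # inside the box
    if distance > 3 * iqr:
        return "outlier"
    if distance > 1.5 * iqr:
        return "potential outlier"
    return "not an outlier"

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
for value in (25, 36, 51, -6):
    print(value, classify(value, q1=10, q3=20))
```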

Term

Combination Graph

Definition

A graph in which the data for two dependent variables are represented together.

Term

Scatterplot

Definition

A graph of paired data represented by points.

Term

Residual Plot

Definition

A plot of the residuals against the explanatory variable. This turns the regression line into a flat line (slope of zero) at a residual of zero. Helps determine if points are evenly distributed above and below the line.

Flashcard Machine - create, study and share online flash cards
