Shared Flashcard Set

Details

Math 146 Chapter 3
Intro to Statistics
38
Mathematics
Undergraduate 1
10/01/2017

Cards

Term
Arithmetic mean
Definition
This is computed by adding all of the values of the variable in the data set and dividing by the number of observations. Also known as the mean, or the average.
Term
Population arithmetic mean
(μ)
Definition
This is computed using all of the individuals in a population. It is a parameter. The average of a population.

μ = (x1+x2+...+xN)/N = (Σxi)/N
Term
Sample Arithmetic mean
Definition
This is computed using sample data. The sample mean is a statistic. The average of a sample, represented by x̄.

x̄ = (x1+x2+...+xn)/n = (Σxi)/n
Term
Median
Definition
The value that lies in the middle of the data when arranged in ascending order - represented by M
Term
Resistant
Definition
A numerical summary is said to be ________ if extreme values (very large or very small) relative to the data do not affect its value substantially.
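A minimal sketch of resistance, using a made-up data set (the values are hypothetical, not from the course): adding one extreme value moves the mean a great deal while the median barely changes.

```python
from statistics import mean, median

# Hypothetical data: five typical values, then the same set plus one extreme value.
data = [10, 12, 13, 15, 20]
data_with_extreme = data + [200]

mean_before, mean_after = mean(data), mean(data_with_extreme)
median_before, median_after = median(data), median(data_with_extreme)

print("mean:  ", mean_before, "->", mean_after)      # the mean is pulled toward 200
print("median:", median_before, "->", median_after)  # the median moves only slightly
```

In this sketch the median behaves as a resistant summary and the mean does not.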
Term
Mode
Definition
The most frequent observation of a variable that occurs in a data set. There can be multiple.
Term
Bimodal
Definition
When the data set has two modes
Term
Multimodal
Definition
When the data set has 3 or more modes
Term
No mode
Definition
When no observation in a data set occurs more than once.
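The mode cards above can be sketched as one small function. The name `describe_modes` and the label "one mode" are illustrative choices, and the sketch assumes a non-empty list.

```python
from collections import Counter

def describe_modes(values):
    """Return (modes, label) for a non-empty list of observations."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # No observation occurs more than once.
        return [], "no mode"
    modes = sorted(v for v, c in counts.items() if c == highest)
    labels = {1: "one mode", 2: "bimodal"}
    # Three or more modes fall through to "multimodal".
    return modes, labels.get(len(modes), "multimodal")

print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3, 4]))
print(describe_modes([1, 2, 3]))
```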
Term
Dispersion
Definition
The degree to which the data are spread out. Includes: the range, standard deviation, variance, and the interquartile range
Term
Range
Definition
The difference between the largest and the smallest data value. Represented by R.

R = largest data value - smallest data value
Term
Deviation about the mean
Definition
Population: For the ith observation, it is xi - μ
Sample: For the ith observation, it is xi - x̄
Term
Population Standard Deviation (σ)
Definition
The ___ of a variable is the square root of the sum of squared deviations about the population mean, divided by the number of observations in the population N. That is, the square root of the mean of the squared deviations about the population mean.

σ = √((Σ(xi - μ)^2) / N)
Term
Conceptual formula
Definition
Using this formula:
1. Create a table with four columns: enter pop. data in column 1, in column 2 enter the pop. mean.
2. Compute the deviation about the mean for each data value and enter the result in column 3.
3. In column 4, enter the squares of the values in Column 3.
4. Sum the entries in Column 4 and divide this result by the size of the population.
5. Determine the square root of the value found in step 4.

[image]
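The five steps above can be sketched in Python. The function name and the data are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
from math import sqrt

def population_sd_conceptual(population):
    """Build the four-column table and return (table, sigma)."""
    N = len(population)
    mu = sum(population) / N
    # Columns: data value, population mean, deviation, squared deviation.
    table = [(x, mu, x - mu, (x - mu) ** 2) for x in population]
    # Step 4: sum column 4 and divide by the population size.
    mean_squared_deviation = sum(row[3] for row in table) / N
    # Step 5: take the square root.
    return table, sqrt(mean_squared_deviation)

table, sigma = population_sd_conceptual([2, 4, 4, 4, 5, 5, 7, 9])
for row in table:
    print(row)
print("sigma =", sigma)
```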
Term
Computational Formula
Definition
A formula that is equal to the population standard deviation formula:

σ = √((Σxi^2 - ((Σxi)^2)/N) / N)

Using this formula:
1. Create a table with two columns: enter the population data in column 1. Square each value in column 1 and enter the result in column 2.
2. Sum the entries in column 1 and sum the entries in column 2.
3. Substitute these values into the computational formula and simplify.

[image]
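A minimal sketch of these steps, assuming the usual computational form σ = √((Σxi^2 - (Σxi)^2/N) / N). The function name and data are hypothetical.

```python
from math import sqrt

def population_sd_computational(population):
    """Population standard deviation from the two column sums."""
    N = len(population)
    sum_x = sum(population)                   # sum of column 1
    sum_x2 = sum(x * x for x in population)   # sum of column 2 (squared values)
    return sqrt((sum_x2 - sum_x ** 2 / N) / N)

sigma = population_sd_computational([2, 4, 4, 4, 5, 5, 7, 9])
print("sigma =", sigma)
```

For this data the column sums are 40 and 232, so the result matches what the conceptual formula gives for the same values.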
Term
Sample standard deviation (s)
Definition
The ____ of a variable is the square root of the sum of squared deviations about the sample mean divided by n-1, where n is the sample size.

s = √((Σ(xi - x̄)^2) / (n - 1))
Term
Degrees of freedom
Definition
(n-1) because the first n-1 observations have the freedom to be whatever value they wish, but the nth value has no freedom. It must be whatever value forces the sum of the deviations about the mean to equal zero.

In other words, we have n-1 degrees of freedom in the computation of s because an unknown parameter, μ, is estimated with x̄. For each parameter estimated we lose 1 degree of freedom.
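A short illustration with a made-up sample: the deviations about the sample mean sum to zero, so the last deviation is forced by the first n-1, and s divides by n-1.

```python
from math import sqrt

sample = [4, 8, 6, 2]                      # hypothetical sample
n = len(sample)
x_bar = sum(sample) / n
deviations = [x - x_bar for x in sample]

# The first n-1 deviations determine the last one, because all n sum to zero.
forced_last = -sum(deviations[:-1])

# Sample standard deviation: divide the sum of squared deviations by n-1.
s = sqrt(sum(d ** 2 for d in deviations) / (n - 1))

print(deviations, "sum =", sum(deviations))
print("last deviation must be", forced_last)
print("s =", s)
```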
Term
The larger the standard deviation, the more dispersion that distribution has
Definition
When comparing two populations, __________, provided that the populations use the same units of measure. You want to compare apples with apples.
Term
Variance
Definition
The ___ of a variable is the square of the standard deviation.
Term
Population variance
Definition
σ^2
Term
Sample variance
Definition
s^2
Term
Biased statistic
Definition
This is used to describe a statistic when it consistently underestimates or overestimates a parameter.
Term
Empirical rule
Definition
If the data have a distribution that is bell shaped, then this rule can be used to determine the percentage of data that will lie within k standard deviations of the mean.

If the distribution is roughly bell shaped, then:
- Appx. 68% of data will lie within 1 standard deviation of the mean. Meaning appx. 68% of data will lie between μ-1σ and μ+1σ
- Appx. 95% of the data will lie within 2 standard deviations of the mean, between μ-2σ and μ+2σ
- Appx. 99.7% of the data will lie within 3 standard deviations of the mean, between μ-3σ and μ+3σ

For bell-shaped distributions, this rule gives more precise results than Chebyshev's Inequality.
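A minimal sketch of applying the rule: given a mean and standard deviation (the values 100 and 15 below are hypothetical), compute the three intervals and their approximate percentages. The function name is illustrative.

```python
def empirical_rule_intervals(mu, sigma):
    """Map k to (lower bound, upper bound, approximate percent) for k = 1, 2, 3."""
    percents = {1: 68.0, 2: 95.0, 3: 99.7}
    return {k: (mu - k * sigma, mu + k * sigma, pct) for k, pct in percents.items()}

intervals = empirical_rule_intervals(100, 15)
for k, (low, high, pct) in intervals.items():
    print(f"within {k} sd: about {pct}% of data between {low} and {high}")
```

The percentages are only meaningful when the distribution is roughly bell shaped.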
Term
Chebyshev’s Inequality
Definition
An inequality that determines a minimum percentage of observations that lie within k standard deviations of the mean, where k>1 regardless of the basic shape of the distribution (skewed left, skewed right, or symmetric).

- For any data set or distribution, at least (1 - 1/k^2) x 100% of the observations lie within k standard deviations of the mean, where k is any number greater than 1. That is, at least that percentage of the observations lies between μ-kσ and μ+kσ for k>1.

- Can also be used based on sample data
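A small sketch of the bound, using the standard form (1 - 1/k^2) x 100%. The function name is hypothetical.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Inequality requires k > 1")
    return (1 - 1 / k ** 2) * 100

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_min_percent(k):.1f}%")
```

For k = 2 the guarantee is at least 75%, noticeably weaker than the roughly 95% the Empirical Rule gives for bell-shaped data, which is the price of making no assumption about shape.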
Term
Grouped data
Definition
Data that has been summarized in frequency distributions.
Term
Weighted mean
Definition
This is found by multiplying each value of the variable by its corresponding weight, adding these products, and dividing this sum by the sum of its weights. It can be expressed using the formula:

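x̄w = (Σwixi) / (Σwi) = (w1x1 + w2x2 + ... + wnxn) / (w1 + w2 + ... + wn)

Where wi is the weight of the ith observation and xi is the value of the ith observation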
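A minimal sketch of the weighted mean, using a hypothetical example of grade points weighted by credit hours. The function name and numbers are illustrative.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical: grade points 4.0, 3.0, 2.0 earned in courses of 3, 4, and 1 credits.
gpa = weighted_mean([4.0, 3.0, 2.0], [3, 4, 1])
print(gpa)
```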
Term
Approximate Standard Deviation of a Variable from a Frequency Distribution
Definition
Population standard deviation - σ = √ ((Σ((xi - μ)^2)fi) / (Σfi))
Sample standard deviation - s = √ ((Σ((xi - x̄)^2)fi) / ((Σfi) - 1))

Where xi is the midpoint or value of the ith class, fi is the frequency of the ith class
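A sketch of both approximations from a frequency distribution. The class midpoints and frequencies are made up, and the mean is itself approximated from the grouped data as Σ(xi·fi)/Σfi.

```python
from math import sqrt

def grouped_sd(midpoints, freqs, sample=False):
    """Approximate standard deviation from class midpoints and frequencies."""
    total = sum(freqs)
    # Approximate mean from the grouped data.
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / total
    sum_sq = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    # Sample version divides by (sum of frequencies) - 1.
    denom = total - 1 if sample else total
    return sqrt(sum_sq / denom)

midpoints = [5, 15, 25]   # hypothetical class midpoints
freqs = [2, 6, 2]         # hypothetical class frequencies
sigma = grouped_sd(midpoints, freqs)
s = grouped_sd(midpoints, freqs, sample=True)
print(sigma, s)
```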
Term
Z-score
Definition
Represents the distance that a data value is from the mean in terms of the number of standard deviations. We find it by subtracting the mean from the data value and dividing this result by the standard deviation. There is both population ___ and a sample ___:

Population z-score: z = (x - μ) / σ
Sample z-score: z = (x - x̄) / s
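A minimal sketch of the computation. The same function serves for the population version (pass μ and σ) and the sample version (pass x̄ and s); the values below are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(z_score(130, 100, 15))  # a value above the mean gives a positive z-score
print(z_score(85, 100, 15))   # a value below the mean gives a negative z-score
```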
Term
kth Percentile
Definition
Denoted Pk, of a set of data is a value such that k percent of the observations are less than or equal to the value

- Percentiles divide a set of data written in ascending order into 100 parts, so 99 percentiles can be determined
- Used to give the relative standing of an observation
Term
Quartiles
Definition
Divide data sets into fourths, or four equal parts, using three values: Q1, Q2, and Q3

Q1 - the first quartile, divides the bottom 25% from the top 75%; this is equivalent to the 25th percentile

Q2 - the second quartile, divides the bottom 50% of the data from the top 50%; equivalent to the 50th percentile or the median

Q3 - the third quartile, divides the bottom 75% of the data from the top 25%; equivalent to the 75th percentile
Term
Interquartile Range (IQR)
Definition
The range of the middle 50% of the observations in a data set. The IQR is the difference between the third and first quartiles and is found using the formula: IQR = Q3 - Q1
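A sketch of finding the quartiles and the IQR. It assumes one common textbook convention: Q1 and Q3 are the medians of the bottom and top halves of the ordered data, with the overall median left out of both halves when the number of observations is odd. Software packages often use other conventions and can give slightly different quartiles. The function name and data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13, 15, 17])
iqr = q3 - q1
print(q1, q2, q3, iqr)
```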
Term
Describe the distribution
Definition
___ means to describe a distribution's shape (skewed left, skewed right, or symmetric), its center (mean or median), and its spread (standard deviation or interquartile range).
Term
Outliers
Definition
Extreme observations in the data set; can occur by chance, or error.

Checking for ____:
1. Determine the first and third quartiles of the data
2. Compute the interquartile range
3. Determine the fences.
4. If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.
Term
Fences
Definition
Serve as cutoff points for determining outliers
Lower ___ = Q1 - 1.5(IQR)
Upper ___ = Q3 + 1.5(IQR)
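A sketch of the outlier check using the fences. It assumes at least two observations and the median-of-halves convention for Q1 and Q3 (other quartile conventions can shift the fences slightly). The function name and data are hypothetical.

```python
from statistics import median

def find_outliers(data):
    """Return (lower_fence, upper_fence, outliers) for a data set."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the bottom half
    q3 = median(ordered[-half:])   # median of the top half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

lower_fence, upper_fence, outliers = find_outliers([2, 4, 5, 6, 7, 8, 9, 10, 30])
print(lower_fence, upper_fence, outliers)
```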
Term
Exploratory Data Analysis
Definition
Exploring the data to see if they contain interesting information that may be useful in our research; goal is to collect and present evidence NOT to make conclusions.
Term
Five number summary
Definition
This consists of the smallest data value of a set, Q1, the median, Q3, and the largest data value of the set. Organized as so:
MINIMUM Q1 M Q3 MAXIMUM
Term
Boxplot
Definition
A graph that is made using the five-number summary.

1. Determine the lower and upper fences
2. Draw a number line long enough to include the maximum and minimum values. Insert vertical lines at Q1, M, and Q3. Enclose these vertical lines in a box.
3. Label the lower and upper fences.
4. Draw a line from Q1 to the smallest data value that is larger than the lower fence. Draw a line from Q3 to the largest data value that is smaller than the upper fence. These lines are called whiskers.
5. Any data values less than the lower fence or greater than the upper fence are outliers and are marked with an asterisk.
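The numbers needed for these steps can be sketched without drawing anything. The sketch assumes at least two observations and the median-of-halves convention for Q1 and Q3; it treats a value exactly on a fence as inside it, since only values beyond a fence count as outliers. The function name and data are hypothetical.

```python
from statistics import median

def boxplot_stats(data):
    """Five-number summary, fences, whisker ends, and outliers."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1, m, q3 = median(ordered[:half]), median(ordered), median(ordered[-half:])
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme values that are not outliers.
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "five_number_summary": (ordered[0], q1, m, q3, ordered[-1]),
        "fences": (lower_fence, upper_fence),
        "whiskers": (inside[0], inside[-1]),
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }

stats = boxplot_stats([2, 4, 5, 6, 7, 8, 9, 10, 30])
for name, value in stats.items():
    print(name, value)
```

Here the upper whisker ends at 10 rather than at the maximum, because 30 lies beyond the upper fence and would be marked with an asterisk.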
Term
Whiskers
Definition
Lines on the outside of the box of a boxplot that display the distance from the outer quartiles (Q1 and Q3) to the most extreme data values that lie inside the fences.