Shared Flashcard Set

Details

Statistics I: Term Test 1
University of Guelph STAT*2040
118
Mathematics
Undergraduate 2
02/05/2015

Cards

Term
Absolute value of deviation
Definition
A measure of variability. The magnitude of deviation. The distance from the mean.
Term
Addition rule
Definition
P(A∪B) = P(A) + P(B) - P(A∩B)
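A minimal Python sketch of the addition rule, P(A∪B) = P(A) + P(B) - P(A∩B). The fair six-sided die and the two events are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

lhs = prob(A | B)                       # P(A∪B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # addition rule
print(lhs, rhs)
```

The intersection is subtracted because outcomes in both A and B would otherwise be counted twice.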
Term
Bar graph
Definition

aka Bar chart

Illustrates the distribution of categorical variables, which are usually placed on the x-axis, and percent relative frequency on the y-axis. Includes Pareto diagrams.

Term
Bayes' theorem
Definition
P(A|B) = [P(B|A)*P(A)] / [P(B)], which follows from the conditional probability formula P(A|B) = [P(A∩B)] / [P(B)]
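A small Python sketch of Bayes' theorem in the form P(A|B) = [P(B|A)*P(A)] / [P(B)]. All the probabilities below are hypothetical numbers picked for illustration, and the helper name bayes is an assumption, not something from the course.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A), and P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical numbers, chosen only for illustration.
p_a = Fraction(1, 100)               # P(A)
p_b_given_a = Fraction(95, 100)      # P(B|A)
p_b_given_not_a = Fraction(5, 100)   # P(B|A^C)

# Total probability of B, split over A and its complement.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes(p_b_given_a, p_a, p_b)
print(posterior, float(posterior))
```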
Term
Bimodal distribution
Definition
A distribution with two peaks.
Term
Binomial distribution
Definition

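aka Bernoulli distribution when there is only one trial (n = 1).

Models the number of successes in n independent trials of a probability experiment with two possible outcomes: success and failure. The probability of success, p, is the same on every trial.

You have to use the nCr function on your calculator, symbolized by C.

P(X = x) = (n C x) * p^x * (1 - p)^(n - x)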
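
A short Python sketch of the binomial formula P(X = x) = (n C x) * p^x * (1 - p)^(n - x), using math.comb in place of the calculator's nCr key. The function name binomial_pmf and the example values are assumptions for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))

# The probabilities over x = 0..n should add up to 1.
print(sum(binomial_pmf(x, 4, 0.5) for x in range(5)))
```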

Term
Boxplot
Definition
An illustration of the five-number summary. The top of the box is the third quartile and the bottom the first quartile. A line through it is the median. Whiskers extend to the maximum and minimum values. Outliers are plotted separately. Useful for comparing groups.
Term
Categorical variable
Definition

aka Qualitative variable

A variable that falls into one of two or more distinct categories. May be displayed using a bar graph, Pareto diagram, or pie chart.

Term
Chebyshev's Inequality
Definition

aka Chebyshev's theorem

The proportion of observations that lie within k standard deviations of the mean must be at least 1 - (1/k^2). Only informative for k > 1.
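
A rough Python check of Chebyshev's inequality, 1 - (1/k^2), on a small made-up data set. The data values and function names are assumptions for illustration; the population standard deviation (pstdev) is used here.

```python
from statistics import mean, pstdev

def chebyshev_bound(k):
    """Minimum proportion within k standard deviations of the mean."""
    return 1 - 1 / k**2

def proportion_within(data, k):
    """Observed proportion of data within k standard deviations of the mean."""
    m = mean(data)
    s = pstdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)

# Made-up, strongly skewed data: the bound still holds.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
for k in (2, 3):
    print(k, proportion_within(data, k), chebyshev_bound(k))
```

Unlike the empirical rule, the bound makes no assumption about the shape of the distribution, so it is usually far from tight.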

Term
Chi-square test
Definition
A test that quantifies how strong evidence is.
Term
Classes
Definition

aka Bins

Ranges of quantitative variables that data is sorted into when making a frequency table. Appropriate number and range of bins must be selected, with boundaries the same for each class.

Term
Cluster sampling
Definition
Random selection of groups within a population, such as towns or households. Within every cluster, every individual is surveyed. Cuts down costs of sampling.
Term
Combination
Definition

A permutation where the order of selection doesn't matter.

x is the number of items in the combination

n is the number of items x is selected from

nCx = [n!] / [x!(n - x)!]
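
A brief Python sketch of the combination formula [n!] / [x!(n - x)!], compared against the built-in math.comb. The function name combinations_count is an assumption for illustration.

```python
from math import comb, factorial

def combinations_count(n, x):
    """Number of ways to choose x items from n when order does not matter."""
    return factorial(n) // (factorial(x) * factorial(n - x))

# Choosing 2 items from 5: the formula and math.comb should agree.
print(combinations_count(5, 2), comb(5, 2))
```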

Term
Complement
Definition

C

An event has not occurred.

P(A^C) = 1 - P(A)

Term
Conceptually infinite population
Definition
A population that is too large or too nebulous and it is practically impossible to list every member. 
Example: the mosquitoes of Southern Ontario.
Term
Conditional probability
Definition

|

The probability that an event occurs, given that another event has already occurred.

P(A|B) is the probability of A, given that B has already occurred.

Term
Confounded variables
Definition
Variables that are impossible to separate. Cannot study either one without the other being a lurking variable.
Term
Continuous
Definition

A sliding continuum of values. There are an infinite number of fractions it can be divided into. May be bound to a certain range.

Example: time, weight, distance.

Term
Control group
Definition
A group in an experiment that is exposed to all the same environmental factors except one: the variable being studied.
Term
Covariance
Definition
A measure of the linear relationship between x and y.
Term
Cumulative frequency
Definition
The number of data points in a class, plus all the data points in lower classes.
Term
Degrees of freedom
Definition
The number of independent pieces of information used to estimate a quantity.
Term
Descriptive statistics
Definition
Plots and numerical summaries used to describe a data set.
Term
Deviation
Definition
A measure of variability. The value minus the mean. The sum of all deviations will always equal zero.
Term
Discrete
Definition
Having a countable number of possible values. May be infinite or bound to a certain range. Example: money. Can go up to infinity, but the smallest fraction it can be divided into is cents.
Term
Distribution
Definition
How often variables take on certain values. Includes symmetric, skewed, unimodal, bimodal, and multimodal.
Term
Dot plots
Definition
A method of illustrating data points. Every data point is individually plotted.
Term
Empirical rule
Definition
About 68% of observations lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and almost all within 3 standard deviations. Does not apply to extremely skewed data.
Term
Event
Definition
Represented by a capital letter. A group of outcomes in the sample space.
Term
Expected value (μ)
Definition

The theoretical mean of a random variable. Not to be confused with the most likely value. The long-run average if an experiment were repeated an infinite number of times.

μ = E(X) = Σ x * p(x)
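
A minimal Python sketch of the expected value formula, E(X) = Σ x * p(x). The fair six-sided die and the name expected_value are assumptions for illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """Sum of x * p(x) over a dict mapping each value x to its probability."""
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expected_value(die))
```

The result for the die is 7/2, a value the die can never actually show, which is why the expected value should not be confused with the most likely value.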

Term
Experiment
Definition
Researchers impose conditions on the explanatory variable, rather than relying on groups that are pre-existing. Well-designed, randomized experiments with a control group can show causal relationships if differences are significant.
Term
Explanatory variable
Definition
The variable which we can control for, used to explain changes in the response variable. In an experiment or observational study, individuals are categorized into groups according to it.
Term
Exponential distribution
Definition
Distribution skewed strongly to the right.
Term
Finite population
Definition

A population which is small enough for every member to be listed.

Example: U of G students.

Term
First quartile
Definition

aka 25th percentile

The bottom edge of the box in a boxplot. Included in the five-number summary.

Term
Five-number summary
Definition
The minimum, the first quartile, the median, the third quartile, and the maximum. Illustrated with a boxplot.
Term
Frequency
Definition
The number of observations occurring in a category.
Term
Frequency table
Definition
A table showing the frequency of categories in data. Used for making bar graphs and histograms. With histograms, data is sorted into classes.
Term
Geometric distribution
Definition
The number of trials needed to get the first success in a binomial trial. Must be independent binomial trials with constant probability of success. Modelled by the probability mass function.
Term
Geometric mean
Definition

A measure of central tendency. The nth root of the product of observations.

(Π x_i)^(1/n)

Term
Harmonic mean
Definition

A measure of central tendency. The reciprocal of the mean of the reciprocals of all observations.

n / [Σ(1 / x_i)]
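
A short Python sketch comparing the geometric mean, (Π x_i)^(1/n), and the harmonic mean, n / [Σ(1 / x_i)], with the ordinary mean. The data values and function names are assumptions for illustration, and both functions assume strictly positive observations.

```python
from math import prod

def geometric_mean(data):
    """nth root of the product of n positive observations."""
    return prod(data) ** (1 / len(data))

def harmonic_mean(data):
    """n divided by the sum of the reciprocals of the observations."""
    return len(data) / sum(1 / x for x in data)

data = [1, 2, 4]
arithmetic = sum(data) / len(data)
print(harmonic_mean(data), geometric_mean(data), arithmetic)
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.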

Term
Histogram
Definition
An illustration of the distribution of a quantitative variable. Made using a frequency table.
Term
Hypergeometric distribution
Definition

Like the binomial distribution, but the trials are not independent (sampling without replacement); the probability of outcomes is dependent on the results of previous trials.

You need to use the nCr function on a calculator, symbolized by C.

X is the number of successes in the sample

a is the number of successes in the population

n is the sample size

N is the population size

P(X = x) = [(a C x)*((N - a) C (n - x))] / [N C n]
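
A minimal Python sketch of the hypergeometric formula P(X = x) = [(a C x)*((N - a) C (n - x))] / [N C n], where a is the number of successes in the population. The function name and the example numbers are assumptions for illustration.

```python
from math import comb

def hypergeom_pmf(x, n, a, N):
    """P(X = x): x successes in a sample of n, drawn without replacement
    from a population of N that contains a successes."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

# Example: 10 items, 4 of them successes, sample 3 without replacement.
print(hypergeom_pmf(2, 3, 4, 10))

# The probabilities over all possible x should add up to 1.
print(sum(hypergeom_pmf(x, 3, 4, 10) for x in range(4)))
```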

Term
Independent
Definition

The occurrence of one event has no effect on the probability of another event, and vice versa.

All three must be true or all three false:

1. P(A∩B) = P(A)*P(B)

2. P(A|B) = P(A)

3. P(B|A) = P(B)
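
A small Python sketch that checks the three independence conditions on one example. The fair die and the events A and B are assumptions chosen so that the events happen to be independent.

```python
from fractions import Fraction

# One roll of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}       # the roll is even
B = {1, 2, 3, 4}    # the roll is at most 4

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(event & given) / prob(given)

checks = [
    prob(A & B) == prob(A) * prob(B),   # 1. P(A∩B) = P(A)*P(B)
    cond(A, B) == prob(A),              # 2. P(A|B) = P(A)
    cond(B, A) == prob(B),              # 3. P(B|A) = P(B)
]
print(checks)
```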

Term
Individual
Definition

aka Unit

aka Case

Objects on which measurements are taken.

Term
Inferential statistics
Definition
Investigating the relationship between variables.
Term
Interquartile range (IQR)
Definition

A descriptive measure of variability. The difference between the third and first quartile. Not sensitive to extreme values.

IQR = Q3 - Q1
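
A rough Python sketch of IQR = Q3 - Q1 using the standard library. Quartile conventions differ between textbooks and software, so the "inclusive" method used here is an assumption and may not match the convention used in the course; the data are made up.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns [Q1, median, Q3].
q1, median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary, iqr)
```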

Term
Intersection
Definition

One event and another event have occurred together in the same sample point.

Term
Law of Large Numbers
Definition
As the number of observations sampled grows very large, the sample mean approaches the expected value and the sample variance approaches the theoretical variance.
Term
Linear transformations
Definition
Conversions that are linear, such as the conversion between Celsius and Fahrenheit.
Term
Lurking variables
Definition
Variables that contribute to correlations, but are not included in the study. Researchers may be completely unaware of them. More likely in observational studies than in experiments.
Term
Maximum value
Definition
The largest value in a dataset. The top line of a boxplot.
Term
Mean (x bar)
Definition

aka Average

The most popular measure of central tendency. Uses more information, but is more sensitive to extreme values in the data. This sensitivity can make the mean misleading.

x bar = [Σ x_i] / n

Term
Mean absolute deviation (MAD)
Definition

The average absolute value of deviation. A reasonable measure of variability, but hard to work with.

MAD = [Σ|x_i - x bar|] / n
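
A minimal Python sketch of the mean absolute deviation, [Σ|x_i - x bar|] / n, on made-up data. It also shows why absolute values are needed: the raw deviations always sum to zero.

```python
def mean_absolute_deviation(data):
    """Average distance of the observations from their mean."""
    x_bar = sum(data) / len(data)
    return sum(abs(x - x_bar) for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]
x_bar = sum(data) / len(data)

deviations = [x - x_bar for x in data]
print(sum(deviations))                 # raw deviations cancel out
print(mean_absolute_deviation(data))   # absolute deviations do not
```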

Term
Median
Definition

aka Second quartile

aka 50th percentile

A measure of central tendency. The line in a boxplot separating the box. The middle point, if all data points were ordered in ascending order. If n is even, the median is the average of the two middle values. Not as sensitive to extreme values as the mean. Good for data that is right-skewed, such as property value or salary.

Term
Midrange
Definition
A measure of central tendency. The midpoint between the minimum and maximum values.
Term
Minimum value
Definition
The smallest value in a dataset. The bottom line of a boxplot.
Term
Mode
Definition
A measure of central tendency. The most frequently occurring observation.
Term
Multimodal distribution
Definition
Distribution with multiple peaks.
Term
Multiplication rule
Definition
P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B)
Term
Multivariate hypergeometric distribution
Definition
Hypergeometric distribution where there are more than two classifications of outcomes.
Term
Mutually exclusive
Definition

Events where there is no outcome in the sample space that satisfies both.

P(A∩B) = 0

Term
Negatively skewed
Definition
Skewed to the left. Higher on the right.
Term
Normal distribution
Definition
Perfectly symmetrical, bell-shaped distribution. Rare for real data to follow it exactly.
Term
Observational study
Definition

Researchers observe and measure variables, but do not impose any conditions on the subjects. The groups of explanatory variables are pre-existing. 

Done if the experiments are impossible (time, money, ethical reasons). Doesn't provide strong evidence for causal relationships; there may be lurking variables.

Term
Outliers
Definition
Extreme values that fall outside the overall pattern of distribution. Fall outside the range of boxplot whiskers. Plotted individually in a boxplot.
Term
P value
Definition
A measure of the strength of evidence.
The probability of getting a result at least as extreme as the one observed if there were really no effect. If the p-value is less than 0.05, the result is conventionally considered significant.
Term
Parameter
Definition
A numerical characteristic of a population.
Term
Pareto diagram
Definition
A bar graph where the categories are sorted by percent frequency from largest to smallest.
Term
Percent relative cumulative frequency
Definition
The cumulative frequency expressed as a percent of all data points. The last class should have a percent relative cumulative frequency of 100%.
Term
Percent relative frequency
Definition
The relative frequency expressed as a percent.
Term
Percentile
Definition
The value of the variable that has p% of the ordered data values at or below this value.
Term
Permutation
Definition

An ordering of a set of items. 

x is the number of things being ordered

n is the number of things x is selected from 

nPx = [n!] / [(n - x)!]
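
A brief Python sketch of the permutation formula [n!] / [(n - x)!], compared against the built-in math.perm. The function name permutations_count is an assumption for illustration.

```python
from math import factorial, perm

def permutations_count(n, x):
    """Number of ordered arrangements of x items selected from n."""
    return factorial(n) // factorial(n - x)

# Ordering 2 items selected from 5: the formula and math.perm should agree.
print(permutations_count(5, 2), perm(5, 2))
```

Dividing this count by x! removes the orderings and gives the number of combinations.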

Term
Pie chart
Definition
Illustrates the percent relative frequencies of categorical variables as slice-shaped areas on a circle.
Term
Poisson distribution
Definition

When events occur independently over a range. The probability of an event within any given range of a certain size does not change. 

X is the number of events in a fixed range

x is a non-negative integer (0, 1, 2, ...)

λ is the theoretical mean of events in a fixed range

P(X = x) = [λ^x * e^(-λ)] / [x!]
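
A minimal Python sketch of the Poisson formula P(X = x) = [λ^x * e^(-λ)] / [x!]. The function name poisson_pmf and the value λ = 2 are assumptions for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) when events occur at a theoretical mean of lam per range."""
    return lam**x * exp(-lam) / factorial(x)

lam = 2.0
print([round(poisson_pmf(x, lam), 4) for x in range(5)])

# Summing over enough values of x should get very close to 1.
print(sum(poisson_pmf(x, lam) for x in range(50)))
```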

Term
Population
Definition
The set of individuals or objects of interest to an investigator.
Term
Population mean
Definition
A parameter. The average of all individuals in a population.
Term
Positively skewed
Definition
Skewed to the right. Skewed distribution that is higher on the left.
Term
Probability
Definition
The proportion of times that the outcome would occur in an infinite number of trials.
Term
Probability experiment
Definition
We don't know what is going to happen in any one individual trial, but we can keep track of the long-run distribution of outcomes.
Term
Probability Mass Function (PMF)
Definition

Used to calculate the probability that the first success will occur on a certain trial. The formulas here are for the geometric distribution.

P(X = x) = p*(1 - p)^(x - 1)

P(X ≤ x) = 1 - (1 - p)^x
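
A short Python sketch of the two geometric formulas, P(X = x) = p*(1 - p)^(x - 1) and P(X ≤ x) = 1 - (1 - p)^x. The function names and p = 0.25 are assumptions for illustration.

```python
def geometric_pmf(x, p):
    """P(first success happens on trial x): x - 1 failures, then a success."""
    return p * (1 - p) ** (x - 1)

def geometric_cdf(x, p):
    """P(first success happens on or before trial x)."""
    return 1 - (1 - p) ** x

p = 0.25
# Adding the PMF over trials 1..3 should match the cumulative formula.
print(sum(geometric_pmf(k, p) for k in range(1, 4)), geometric_cdf(3, p))
```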

Term
Quantitative variable
Definition
A variable that takes numerical values on a scale. May be discrete or continuous.
Term
Quartile
Definition
Specific percentiles. Useful descriptive measures of the distribution of data. Used in the construction of boxplots. Includes the first, second, and third quartiles.
Term
R
Definition
A software program that is used for statistics.
Term
Random sampling
Definition
Ensures that we avoid systematic bias in the samples.
Term
Range
Definition
A measure of variability. The maximum value minus the minimum value. Does not provide much information.
Term
Relative frequency
Definition
Frequency divided by n. The proportions of observations in a category.
Term
Response variable
Definition
The variable of interest in an experiment; what we look for changes in.
Term
Sample
Definition
A subset of individuals selected from a population.
Term
Sample mean
Definition
A statistic. The average of all observations in a sample.
Term
Sample points
Definition
Individual outcomes of probability experiments. Exclusive; no two points can occur on the same trial.
Term
Sample space (S)
Definition
A list of all possible outcomes of a probability experiment. Exhaustive; there are no possible outcomes not included in the sample space.
Term
Sample variance (s^2)
Definition

A measure of variability. The average squared deviation, dividing by n - 1 rather than n. Will give an answer in units squared.

s^2 = [Σ(x_i - x bar)^2] / (n - 1)
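
A minimal Python sketch of the sample variance, [Σ(x_i - x bar)^2] / (n - 1), and the standard deviation as its square root, checked against the standard library. The data and the function name sample_variance are assumptions for illustration.

```python
from math import sqrt
from statistics import stdev, variance

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(data) / len(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(data)
s = sqrt(s2)

# statistics.variance and statistics.stdev also divide by n - 1.
print(s2, variance(data))
print(s, stdev(data))
```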

Term
Side-by-side bar chart
Definition
A bar chart where the data for each category are represented by bars placed side by side.
Term
Simple Random Sampling (SRS)
Definition
One of the simplest and most important types of random sampling. Each individual in the population has the same likelihood of being selected for the sample.
Term
Skewed distribution
Definition
When the distribution is stretched off to one side. Includes positive and negative skewness.
Term
Squared deviation
Definition
A measure of variability. The square of deviation.
Term
Stacked bar chart
Definition
A bar chart where categories are represented by stacking bars on top of each other.
Term
Standard deviation (s)
Definition

The square root of variance. Cannot be negative.

s = √(s^2)

Term
Statistic
Definition
A numerical characteristic of a sample.
Term
Statistical inference
Definition
Making statements about population parameters based on sample statistics.
Term
Stem
Definition
In a stemplot, groups of data based on the second to last digit in the data points (each data point is written to the same number of decimal points). The stems are listed in ascending order in a column, and the leaves going off to the right.
Term
Stemplot
Definition

aka Stem-and-leaf display

A way of illustrating quantified variable data. The data is sorted into stems and leaves based on the last two digits. The leaves are listed as single digits (the last digit in the data point) to the right of their stem. Must include a legend for the stems. Includes split-stem and back-to-back stemplots.

Term
Strata
Definition
Groups from which samples are taken in stratified random sampling.
Term
Stratified random sampling
Definition
The population is divided into strata and random samples are taken from each stratum.
Term
Symmetric distribution
Definition

aka Bell-shaped distribution

Distribution that is roughly the same on either side of the median. Includes normal distribution.

Term
T-test
Definition
Determines if there is a significant difference between variables. If there is, there is a large likelihood that there is a correlation between variables.
Term
Third quartile
Definition

aka 75th percentile

The top edge of the box in a boxplot.

Term
Trimmed mean
Definition
A measure of central tendency. A certain percentage of the largest and smallest observations are omitted from calculations, resulting in a mean less sensitive to extreme values.
Term
Uniform distribution
Definition
Distribution that is constant over the entire range.
Term
Unimodal distribution
Definition
Distribution with one peak.
Term
Union
Definition

U

One event or another event has occurred in one sample point.

Term
Variability
Definition

The dispersion of a variable.

Var(X) = E[(X - μ)^2] = Σ (x - μ)^2 * p(x)
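
A minimal Python sketch of the theoretical variance, Var(X) = Σ (x - μ)^2 * p(x), for a fair six-sided die. The die and the function name theoretical_variance are assumptions for illustration.

```python
from fractions import Fraction

def theoretical_variance(pmf):
    """Probability-weighted average squared distance from the mean μ."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(theoretical_variance(die))
```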

Term
Voluntary response
Definition
When individuals volunteer themselves to be included in a sample. Results tend to be biased; measuring statistics of people who would volunteer.
Term
Whisker
Definition
An extension up and down from the box in a boxplot, reaching to the minimum and maximum values if they lie within 1.5 times the length of the box (the IQR) of the box edges; values outside this range are outliers.
Term
Weibull distribution
Definition
Distribution with a peak near the left and skewed towards the right.
Term
Weighted mean
Definition
A measure of central tendency. A mean where some observations are given more weight in calculations.
Term
Z-score
Definition

A unitless measure of how many standard deviations a point is away from the mean. Positive means above the mean, negative means below.

z_i = [x_i - x bar] / s
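
A short Python sketch of the z-score, z_i = [x_i - x bar] / s, using the sample standard deviation from the standard library. The data and the function name z_scores are assumptions for illustration.

```python
from statistics import mean, stdev

def z_scores(data):
    """Distance of each observation from the mean, in standard deviations."""
    x_bar = mean(data)
    s = stdev(data)   # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in data]

data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)
print([round(value, 2) for value in z])
```

Observations equal to the mean get a z-score of 0, and the z-scores of a data set always sum to zero (up to rounding).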
