gen bus 303-stats Mullins
stats exam 1
50
09/28/2011

Term
 population definition & what is the name of the numerical measure that describes a characteristic of a population?
Definition
 collection of all members of a group; the measure is called a parameter
Term
 sample definition and what is the numerical measure that describes a characteristic of a sample?
Definition
 a portion of the population selected for analysis; the measure is called a statistic
Term
 inferential statistics
Definition
 drawing conclusions about a population based only on sample data.
Term
 descriptive statistics
Definition
 collecting, summarizing, and presenting data.
Term
 discrete vs continuous: 1. # people in the room 2. time of commute 3. height 4. td's scored by pack 5. weight
Definition
 both are characteristics of numerical (quantitative) data: 1. discrete 2. continuous 3. continuous 4. discrete 5. continuous
Term
 categorical (qualitative) vs numerical (quantitative): 1. marital status 2. defects per hour 3. voltage 4. eye color
Definition
 1. categorical 2. numerical & discrete 3. numerical & continuous 4. categorical
Term
 nominal vs ordinal vs interval vs ratio data: 1. 1st, 2nd places in a race 2. temperature (F/C) 3. money 4. height 5. age 6. type of car owned 7. student's letter grades 8. service quality rating 9. standardized exam score
Definition
 qualitative (nominal, ordinal) vs. quantitative (interval, ratio)
 nominal: categories (no ordering or direction)
 ordinal: ordered categories (rankings, ratings, order or scaling)
 interval: differences between measurements are meaningful, but there is no true zero
 ratio: differences between measurements are meaningful, and a true zero exists
 1. ordinal 2. interval 3. ratio (you can have absolutely no money) 4. ratio 5. ratio 6. nominal 7. ordinal 8. ordinal 9. interval
Term
 how are nominal/ordinal/interval/ratio graphed? qualitative aka categorical (nominal/ordinal) vs quantitative aka numerical (interval/ratio)
Definition
 categorical: bar chart, pie chart, Pareto chart (graphing data); summary table (tabulating data)
 numerical: stem and leaf display (ordered array), histogram, polygon, ogive (all frequency distributions and cumulative distributions)
Term
 I measure 2 students and use their resulting scores to make a statement comparing them. Identify the scale of measurement used: 1. I can only say that the two students are different. 2. I can say that one student scored 6 points higher than the other. 3. I can say that one student scored higher than the other, but I can't specify how much higher. 4. I can say that the score for one student is 2x the score of the other.
Definition
 1. nominal 2. interval 3. ordinal 4. ratio
Term
 which is an example of qualitative data? 1. social security number 2. score on multiple choice exam 3. height, in meters 4. number of square feet of carpet laid
Definition
 1. social security number (the digits are an identifying label, not a measured quantity)
Term
 which of the following is an example of quantitative data? 1. number on a baseball uniform 2. serial number on a one dollar bill 3. number of dependents you claim on your income tax form
Definition
 3. number of dependents you claim on your income tax form
Term
 which one is not an example of descriptive statistics? 1. histogram 2. estimate of number of Alaska residents who have visited Canada 3. table summarizing data collected in a sample 4. proportion of mailed out surveys completed and returned
Definition
 2. estimate of the number of Alaska residents who have visited Canada; this is inferential statistics: drawing conclusions about a population based on sample results
Term
 ordered array: is it useful for large or small sets of data? does it help identify outliers?
Definition
 a sequence of data ranked in order; shows the range; provides some signals about variability; may help identify outliers; if the data set is large, the ordered array is less useful
Term
 stem and leaf diagram
Definition
 a simple way to see distribution details in a data set
Term
 frequency distribution
Definition
 a tabulation of the number of occurrences of each score value or measurement. why use it: it is a way to summarize numerical data, it condenses the raw data into a more useful form, and it allows for a quick visual interpretation of the data
Term
 the histogram
Definition
 a graph of the data in a frequency distribution is called a histogram. the class boundaries are shown on the horizontal axis; the vertical axis is either the frequency, relative frequency, or percentage; bars of the appropriate heights represent the number of observations within each class; the width of each bar represents the width of the class interval
Term
 scatter diagrams
Definition
 used to examine possible relationships between two numerical variables
Term
 time series plot
Definition
 used to study patterns in the values of a variable over time; time is usually measured on the horizontal axis
Term
 measures of central tendency: arithmetic mean
Definition
 arithmetic mean: most common measure; advantage = uses the actual numerical values; disadvantage = affected by extreme values (outliers)
Term
 point estimate
Definition
 a one-number estimate of the value of a population parameter, such as a sample mean used to estimate the population mean
Term
 median
Definition
 advantage: less sensitive to extreme values, can be used for ordinal data
 disadvantage: based on less information than the mean
 median position = (n+1)/2 in the ordered data; this is the position of the median in the ranked data, not the value of the median
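A minimal Python sketch of the (n+1)/2 position rule. The function name and the `scores` data are hypothetical, chosen only to show the median staying put while the mean is pulled by one extreme value.

```python
def median(values):
    """Median located by the (n + 1) / 2 position rule on ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2  # 1-based position, not the median's value
    if position.is_integer():
        return ranked[int(position) - 1]
    # Position falls halfway between two ranks: average the two neighbors.
    lower = ranked[int(position) - 1]
    upper = ranked[int(position)]
    return (lower + upper) / 2

scores = [3, 1, 2, 100, 4]  # hypothetical data with one extreme value
mean_score = sum(scores) / len(scores)  # pulled up by the outlier
median_score = median(scores)  # stays at the middle ranked value
```

With an even number of values the position ends in .5, so the sketch averages the two middle ranked values.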
Term
 mode
Definition
 value that occurs most often
 advantage: not affected by extreme values, can be used for either numerical or categorical data
 disadvantage: ignores much of the information in the data
 there may be no mode; there may be several modes
Term
 which is the best measure of location of "center": 1. if outliers exist 2. when using categorical data 3. if outliers don't exist
Definition
 1. median 2. mode 3. mean
Term
 box & whisker plot: how to find the positions of the 1st, 2nd and 3rd quartiles in ranked data
Definition
 Q1 position = (n+1)/4; Q2 position = (n+1)/2; Q3 position = 3(n+1)/4. advantage: you can use it when you have extreme values
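The position formulas can be sketched in Python as follows. The data set is hypothetical and has n = 11 so that all three positions come out as whole numbers; rounding rules for fractional positions are not covered on this card and are left out.

```python
def quartile_positions(n):
    """Return the (Q1, Q2, Q3) positions in ranked data of size n."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4

# Hypothetical sample of 11 values, ranked from smallest to largest.
data = sorted([12, 7, 3, 15, 9, 21, 5, 18, 11, 14, 8])
q1_pos, q2_pos, q3_pos = quartile_positions(len(data))

# Positions are 1-based, so subtract 1 to index the Python list.
q1, q2, q3 = (data[int(p) - 1] for p in (q1_pos, q2_pos, q3_pos))
```

Here the positions are 3, 6 and 9, which pick out the 3rd, 6th and 9th ranked values.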
Term
 geometric mean & geometric rate of return
Definition
 geometric mean: used to measure the rate of change of a variable over time. geometric rate of return (ROR): measures the status of an investment over time
 geo mean = (X1 x X2 x ... x Xn)^(1/n)
 ROR = [(1+R1) x (1+R2) x ... x (1+Rn)]^(1/n) - 1
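A short Python sketch of both formulas, assuming Python 3.8+ for `math.prod`. The two-period returns (+50% then -50%) are a hypothetical example, not from the course.

```python
import math

def geometric_mean(values):
    """(X1 x X2 x ... x Xn)^(1/n)"""
    return math.prod(values) ** (1 / len(values))

def geometric_rate_of_return(rates):
    """[(1+R1) x (1+R2) x ... x (1+Rn)]^(1/n) - 1"""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

rates = [0.5, -0.5]  # hypothetical: +50% in year 1, -50% in year 2
arithmetic = sum(rates) / len(rates)  # averages to zero
geometric = geometric_rate_of_return(rates)  # negative: money was lost
```

An investment that gains 50% and then loses 50% ends at 75% of its starting value, which the geometric rate of return reflects and the arithmetic average does not.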
Term
 geometric vs arithmetic returns which is better?
Definition
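 geometric: it reflects the compounded growth of an investment over multiple periods, while the arithmetic mean overstates it when returns vary. it does not eliminate risk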
Term
 measures of variation: range
Definition
 the simplest measure of variation: the difference between the largest and the smallest values in a set of data. disadvantage: ignores the way in which the data are distributed
Term
 measures of variation: interquartile range
Definition
 some outlier problems can be eliminated by using the interquartile range: some high and low valued observations are eliminated, and the range is calculated from the remaining values (middle 50%). IQR = Q3 - Q1
Term
 the variance
Definition
 average of squared deviations of values from the mean.
 for pop: σ^2 = Σ(Xi - μ)^2 / N
 for sample: s^2 = Σ(xi - x̄)^2 / (n - 1)
Term
 standard deviation
Definition
 the square root of the variance; the most commonly used measure of variation; shows variation about the mean; has the same units as the original data
 pop: σ = sqrt[ Σ(Xi - μ)^2 / N ]
 sample: s = sqrt[ Σ(xi - x̄)^2 / (n - 1) ]
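The population and sample formulas differ only in the divisor, which a small Python sketch makes explicit. The function names, the `sample` flag and the data are hypothetical.

```python
import math

def variance(values, sample=True):
    """Average squared deviation from the mean (n - 1 divisor for a sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = sum((x - mean) ** 2 for x in values)
    return squared_devs / (n - 1 if sample else n)

def std_dev(values, sample=True):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values, sample))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5
pop_var = variance(data, sample=False)   # 32 / 8
samp_var = variance(data, sample=True)   # 32 / 7
pop_sd = std_dev(data, sample=False)
```

The sum of squared deviations here is 32, so the population variance is 4 and the sample variance is slightly larger at 32/7.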
Term
 measures of variation: summary characteristics
Definition
 the more the data are spread out, the greater the range, variance, and standard deviation. if the values are all the same (no variation), all these measures will be zero. none of these measures is ever negative
Term
 advantages of variance and standard deviation
Definition
 each value in the data set is used in the calculation; values far from the mean are given extra weight (because the deviations from the mean are squared)
Term
 coefficient of variation
Definition
 measures variation relative to the mean; always expressed in %; can be used to compare two or more sets of data measured in different units; shows risk in stocks. CV = (standard deviation / mean) x 100%
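A minimal Python sketch of the coefficient of variation, assuming the sample standard deviation (`statistics.stdev`) is the one intended. The two data sets are hypothetical: the same measurements recorded in two different units.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical: the same three lengths in meters and in centimeters.
in_meters = [1, 2, 3]
in_centimeters = [100, 200, 300]

cv_m = coefficient_of_variation(in_meters)
cv_cm = coefficient_of_variation(in_centimeters)
```

Both calls give the same percentage, which is why the CV can compare data sets measured in different units.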
Term
 z scores
Definition
 we use the standard deviation to standardize scores. a z score is a measure of distance from the mean in standard deviation units: the difference between a value and the mean, divided by the standard deviation. a z score above 3.0 or below -3.0 is considered an outlier
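The z score rule can be sketched in Python as follows, assuming the sample standard deviation is used. The function names and the data (nineteen values of 10 and one value of 100) are hypothetical.

```python
import statistics

def z_scores(values):
    """(value - mean) / standard deviation for every value."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (assumption)
    return [(x - mean) / s for x in values]

def outliers(values, cutoff=3.0):
    """Values whose z score is above +cutoff or below -cutoff."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > cutoff]

data = [10] * 19 + [100]  # hypothetical data with one extreme value
flagged = outliers(data)
```

In this data set only the value 100 is more than 3 standard deviations from the mean, so it is the only value flagged.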
Term
 left skewed: median > mean or median < mean?
Definition
 median > mean (the long left tail pulls the mean below the median)
Term
 the empirical rule
Definition
 if the data distribution is approximately bell-shaped, then the interval mean ± 1 standard deviation contains about 68% of the values in the population or the sample; mean ± 2 standard deviations contains about 95%; mean ± 3 standard deviations contains about 99.7%
Term
 chebyshev rule
Definition
 regardless of how the data are distributed, at least (1 - 1/k^2) x 100% of the values will fall within k standard deviations of the mean (for k > 1)
 at least 55.6% of the data within 1.5 S of the mean
 at least 75% of the data within 2 S of the mean
 at least 88.9% of the data within 3 S of the mean
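The Chebyshev percentages come straight from the formula, as this small Python sketch shows. The function name is hypothetical.

```python
def chebyshev_bound(k):
    """Minimum % of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the rule only applies for k > 1")
    return (1 - 1 / k**2) * 100

# Lower bounds for the three cases listed on the card.
bounds = {k: round(chebyshev_bound(k), 1) for k in (1.5, 2, 3)}
```

These are lower bounds that hold for any distribution; bell-shaped data will usually have far more values inside each interval, as the empirical rule describes.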
Term
 in general, which of the following descriptive summary measures cannot be easily approximated from a box and whisker plot? a. variance b. the range c. the interquartile range d. the median
Definition
 A. the variance
Term
 sample covariance
Definition
 measures the strength of the linear relationship between two variables (called bivariate data). it is a non-standardized measure of the joint variance of the two variables; only concerned with the strength of the relationship; no causal effect is implied
 cov(x,y) = Σ(xi - x̄)(yi - ȳ) / (n - 1)
Term
 cov(x,y) > 0: x and y move in ___ direction; cov(x,y) < 0: x and y move in ___ direction; cov(x,y) = 0: x & y are ____
Definition
 same; opposite; independent (more precisely, no linear relationship). covariance depends on the units of measurement of x and y, so it cannot be used to compare the relative strength of the relationship between variables
Term
 coefficient of correlation
Definition
 measures the relative strength of the linear relationship between two variables. sample coefficient of correlation: r = cov(x,y) / (Sx Sy)
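A Python sketch of the sample covariance and the coefficient of correlation built from it. The function names and the data are hypothetical; `ys_scaled` is the same y data in units 100 times smaller, included to show how the two measures react to a change of units.

```python
import math

def sample_covariance(xs, ys):
    """Sum of (x - x_mean)(y - y_mean), divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """r = cov(x, y) / (Sx * Sy), using sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # cov(x, x) is the variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]            # hypothetical: y = 2x exactly
ys_scaled = [y * 100 for y in ys]  # same data in different units

cov_original = sample_covariance(xs, ys)
cov_scaled = sample_covariance(xs, ys_scaled)
r_original = correlation(xs, ys)
r_scaled = correlation(xs, ys_scaled)
```

The covariance grows by a factor of 100 with the change of units while r stays at 1, which is why r is the measure used to compare relative strength.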
Term
 features of correlation coefficient
Definition
 population = ρ, sample = r
 unit free, standardized measure
 ranges between -1 and 1
 the closer to -1, the stronger the negative linear relationship
 the closer to 1, the stronger the positive linear relationship
 the closer to 0, the weaker the linear relationship
Term
 a correlation of -.32 is stronger than .30
Definition
 true: strength depends on the absolute value, and |-.32| > |.30|
Term
 True or false: Descriptive statistics are used to draw conclusions about a population based on sample data
Definition
 false; inferential statistics are used to draw conclusions about a population based on sample data
Term
 Which of the following is false? A Pareto diagram: 1. is a bar chart where categories are shown in descending order of frequency 2. is used to portray numerical data on an interval scale 3. is often shown with a cumulative polygon 4. is used to separate the "vital few" from the "trivial many"
Definition
 2. it is false that a Pareto diagram is used to portray numerical data on an interval scale; Pareto diagrams are used to portray categorical data
Term
 You would like to represent the distribution of students in a class based on class. which is the best for presenting data? 1. pie chart 2. stem and leaf 3. scatter plot 4. time series plot
Definition
 pie chart
Term
 t/f unlike a grouped frequency distribution, a stem and leaf plot usually preserves original data values
Definition
 true
Term
 t/f scatter diagrams are used to examine possible relationships between numerical and categorical data
Definition
 false; scatter diagrams are used only for relationships between two numerical variables
Term
 a priori classical probability vs empirical classical probability vs subjective probability
Definition
 a priori: each outcome is equally likely, p(y) = p(x)
 empirical: like relative frequency
 subjective: an individual judgment or opinion about the probability of occurrence
Term
 the probability of at least one head in two flips is: 1. .33 2. .5 3. .75 4. 1
Definition
 3. .75: P(at least one head) = 1 - P(no heads) = 1 - .25 = .75
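The answer can be checked two ways in Python: by listing the four equally likely outcomes of two fair flips, and by the complement rule. The variable names are hypothetical.

```python
from itertools import product

# All equally likely outcomes of two fair coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Direct count: outcomes containing at least one head.
with_head = [o for o in outcomes if "H" in o]
p_direct = len(with_head) / len(outcomes)

# Complement rule: 1 - P(no heads) = 1 - P(tails) * P(tails).
p_complement = 1 - (1 / 2) ** 2
```

Both approaches agree; the complement rule is usually quicker because "no heads" is a single outcome.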
