Term
population definition & what is the name of the numerical measure that describes a characteristic of a population? 

Definition
collection of all members of a group; the numerical measure that describes a characteristic of a population is a parameter


Term
sample definition and what is the numerical measure that describes a characteristic of a sample? 

Definition
a portion of the population selected for analysis; the numerical measure that describes a characteristic of a sample is a statistic


Term
inferential statistics
Definition
drawing conclusions about a population based only on sample data. 


Term
descriptive statistics
Definition
collecting, summarizing, and presenting data. 


Term
discrete vs continuous 1. # people in the room 2. time of commute 3. height 4. td's scored by pack 5. weight 

Definition
both are types of numerical (quantitative) data: 1. discrete 2. continuous 3. continuous 4. discrete 5. continuous


Term
categorical (qualitative) vs numerical (quantitative) 1. marital status 2. defects per hour 3. voltage 4. eye color

Definition
1. categorical 2. numerical & discrete 3. numerical & continuous 4. categorical 


Term
nominal vs ordinal vs interval vs ratio data 1. 1st, 2nd places in a race 2. temperature (°F/°C) 3. money 4. height 5. age 6. type of car owned 7. student's letter grades 8. service quality rating 9. standardized exam score

Definition
qualitative (nominal, ordinal) vs. quantitative (interval, ratio)
nominal: categories (no ordering or direction)
ordinal: ordered categories (rankings, ratings, order or scaling)
interval: differences between measurements are meaningful, but there is no true zero
ratio: differences between measurements are meaningful, and a true zero exists
1. ordinal 2. interval 3. ratio (you can have absolutely no money) 4. ratio 5. ratio 6. nominal 7. ordinal 8. ordinal 9. interval


Term
how are nominal/ordinal/interval/ratio graphed? qualitative aka categorical (nominal/ordinal) vs quantitative aka numerical (interval/ratio) 

Definition
categorical: summary table (tabulating data); bar chart, pie chart, Pareto chart (graphing data)
numerical: stem and leaf display (ordered array); histogram, polygon, ogive (all built from frequency distributions and cumulative distributions)


Term
I measure 2 students and use their resulting scores to make a statement comparing them. Identify the scale of measurement used: 1. I can only say that the two students are different 2. I can say that one student scored 6 points higher than the other 3. I can say that one student scored higher than the other, but I can't specify how much higher 4. I can say that the score for one student is 2x the score of the other

Definition
1. nominal 2. interval 3. ordinal 4. ratio 


Term
which is an example of qualitative data? 1. social security number 2. score on multiple choice exam 3. height, in meters 4. number of square feet of carpet laid 

Definition
1. social security number is qualitative (it is an identifying label, not a measured quantity)


Term
which of the following is an example of quantitative data? 1. number on a baseball uniform 2. serial number on a one dollar bill 3. number of dependents you claim on your income tax form

Definition
3. number of dependents you claim on your income tax form


Term
which one is not an example of descriptive statistics? 1. histogram 2. estimate of number of alaska residents who have visited canada 3. table summarizing data collected in a sample 4. proportion of mailed out surveys completed and returned 

Definition
2. estimate of the number of alaska residents who have visited canada; this is inferential statistics: drawing conclusions about a population based on sample results


Term
ordered array is it useful for large or small sets of data? Does it help identify outliers? 

Definition
a sequence of data ranked in order. shows the range, provides some signals about variability, and may help identify outliers. if the data set is large, the ordered array is less useful


Term
stem and leaf display
Definition
a simple way to see distribution details in a data set 


Term
frequency distribution
Definition
a tabulation of the number of occurrences of each score value or measurement. why use it: it is a way to summarize numerical data, it condenses the raw data into a more useful form, and it allows for a quick visual interpretation of the data
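
A minimal sketch of tabulating a frequency distribution, assuming equal-width classes. The function name `frequency_distribution`, the class width of 10, and the sample values are made up for illustration; classes with no observations are simply left out.

```python
from collections import Counter


def frequency_distribution(data, lower, width):
    """Count how many values fall in each class [start, start + width)."""
    # Class index 0 starts at `lower`, index 1 at `lower + width`, and so on.
    counts = Counter((x - lower) // width for x in data)
    return {
        (lower + i * width, lower + (i + 1) * width): counts[i]
        for i in sorted(counts)
    }


# Made-up sample data, for illustration only.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

table = frequency_distribution(data, lower=10, width=10)
for (start, end), freq in table.items():
    print(f"{start} but less than {end}: {freq}")
```

The 20 raw values condense into five class counts, which is the "more useful form" the card describes.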


Term
histogram
Definition
a graph of the data in a frequency distribution is called a histogram. the class boundaries are shown on the horizontal axis. the vertical axis is either the frequency, relative frequency, or percentage. bars of the appropriate heights represent the number of observations within each class. the width of each bar represents the width of the class interval


Term
scatter plot (scatter diagram)
Definition
used to examine possible relationships between two numerical variables 


Term
time series plot
Definition
used to study patterns in the values of a variable over time time is usually measured on the horizontal axis 


Term
measures of central tendency: arithmetic mean

Definition
1. arithmetic mean: most common, advantage=uses actual numerical values, disadvantage= affected by extreme values (outliers) 


Term
point estimate
Definition
like a sample mean, is a one-number estimate of the value of a population parameter


Term
measures of central tendency: median
Definition
advantage: less sensitive to extreme values, can be used for ordinal data. disadvantage: based on less information than the mean. median position = (n+1)/2 position in the ordered data. (n+1)/2 is not the value of the median, it is only the position of the median in the ranked data


Term
measures of central tendency: mode
Definition
value that occurs most often. advantage: not affected by extreme values, can be used for either numerical or categorical data. disadvantage: ignores much information in the data. there may be no mode, and there may be several modes


Term
which is the best measure of location of "center" 1. if outliers exist 2. when using categorical data 3. if outliers don't exist

Definition
1. median 2. mode 3. mean

Term
box & whisker plot: how to find the positions of the 1st, 2nd and 3rd quartiles in ranked data

Definition
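positions in the ranked data (not the quartile values themselves): Q1 = (n+1)/4, Q2 = (n+1)/2, Q3 = 3(n+1)/4. advantage: a box and whisker plot can be used when the data have extreme values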
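
A small sketch of turning the quartile positions (n+1)/4, (n+1)/2 and 3(n+1)/4 into values. The helper names and sample data are hypothetical, and linear interpolation for fractional positions is an assumption; some textbooks round the position instead. It assumes at least 3 data values.

```python
def quartile_positions(n):
    """Return the 1-based positions of Q1, Q2, Q3 in ranked data of size n."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4


def value_at_position(ranked, pos):
    """Value at a 1-based position; interpolate between neighbors if fractional."""
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return ranked[lower - 1]
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])


# Made-up sample data; it must be ranked (sorted) first.
ranked = sorted([16, 11, 22, 13, 16, 21, 12, 17, 18])
q1_pos, q2_pos, q3_pos = quartile_positions(len(ranked))
q1, q2, q3 = (value_at_position(ranked, p) for p in (q1_pos, q2_pos, q3_pos))
print(q1_pos, q2_pos, q3_pos)
print(q1, q2, q3)
```

For these 9 values the positions are 2.5, 5 and 7.5, so Q1 and Q3 fall halfway between two ranked values while Q2 is the 5th value.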


Term
geometric mean & geometric rate of return 

Definition
geometric mean: used to measure the rate of change of a variable over time. geometric mean = (X1 x X2 x ... x Xn)^(1/n)
geometric rate of return (ROR): measures the status of an investment over time. ROR = [(1+R1) x (1+R2) x ... x (1+Rn)]^(1/n) - 1, where Ri is the rate of return in period i
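
A minimal sketch of both formulas, with the geometric rate of return taken as [(1+R1) x ... x (1+Rn)]^(1/n) - 1. The function names and the two-period returns (+50%, then -50%) are made up for illustration.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def geometric_rate_of_return(returns):
    """Per-period rate that compounds to the same overall growth."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Made-up example: +50% in year 1, then -50% in year 2.
returns = [0.5, -0.5]
arithmetic = sum(returns) / len(returns)
geometric = geometric_rate_of_return(returns)
print(f"arithmetic mean return: {arithmetic:.4f}")
print(f"geometric rate of return: {geometric:.4f}")
```

The arithmetic mean return is 0, yet $100 ends at $75 after the two years; the negative geometric rate of return reflects that loss.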


Term
geometric vs arithmetic returns which is better? 

Definition
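geometric: it accounts for compounding, so it reflects the actual per-period growth of an investment over time. it does not eliminate risk; the arithmetic mean return overstates growth when returns vary from period to period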


Term
measure of variation: Range disadvantages? 

Definition
the simplest measure of variation: the difference between the largest and the smallest values in a set of data. disadvantages: ignores the way in which the data are distributed, and is sensitive to outliers


Term
measures of variation: interquartile range 

Definition
some outlier problems can be eliminated by using the interquartile range. the highest and lowest valued observations are dropped and the range is calculated from the remaining values (the middle 50%). interquartile range = Q3 - Q1


Term
measures of variation: variance
Definition
average (approximately) of the squared deviations of the values from the mean.
for a population: σ^2 = Σ (Xi - μ)^2 / N
for a sample: s^2 = Σ (xi - x̄)^2 / (n - 1)


Term
measures of variation: standard deviation
Definition
is the square root of the variance. the most commonly used measure of variation. shows variation about the mean. has the same units as the original data.
population: σ = sqrt[ Σ (Xi - μ)^2 / N ]
sample: s = sqrt[ Σ (xi - x̄)^2 / (n - 1) ]
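
A short sketch of the population and sample versions, written out by hand so that the N versus n - 1 divisor is visible. The function names and data values are made up for illustration.

```python
import math


def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


# Made-up data with mean 5 and squared deviations summing to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
pop_var = population_variance(data)
samp_var = sample_variance(data)
print(pop_var, math.sqrt(pop_var))    # variance and standard deviation (population)
print(samp_var, math.sqrt(samp_var))  # variance and standard deviation (sample)
```

Python's standard `statistics` module offers `pvariance`, `variance`, `pstdev` and `stdev` for the same four quantities.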


Term
measures of variation: summary characteristics 

Definition
the more the data are spread out, the greater the range, variance, and standard deviation. if the values are all the same (no variation), all these measures will be zero. none of these measures are ever negative


Term
advantages of variance and standard deviation 

Definition
each value in the data set is used in the calculation. values far from the mean are given extra weight (because the deviations from the mean are squared)


Term
coefficient of variation (CV)
Definition
measures variation relative to the mean. always expressed as a percentage. can be used to compare two or more sets of data measured in different units. shows risk in stocks. CV = (standard deviation / mean) x 100%
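
A minimal sketch of the coefficient of variation as (standard deviation / mean) x 100%, using the sample standard deviation. The function name and both data sets are made up; the second set is the first multiplied by 10, standing in for a different unit of measurement.

```python
import statistics


def coefficient_of_variation(xs):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(xs) / statistics.mean(xs) * 100


# Made-up data: the same measurements in two different units.
set_a = [10, 12, 14]
set_b = [100, 120, 140]
cv_a = coefficient_of_variation(set_a)
cv_b = coefficient_of_variation(set_b)
print(f"CV of A: {cv_a:.2f}%")
print(f"CV of B: {cv_b:.2f}%")
```

The standard deviations differ by a factor of 10, but the two CVs agree, which is why the CV works for comparing data in different units.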


Term
z score
Definition
we use the standard deviation to standardize scores. a z score is a measure of distance from the mean in terms of standard deviation units: it is the difference between a value and the mean, divided by the standard deviation. a value with a z score above 3.0 or below -3.0 is considered an outlier
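
A small sketch of the z score, z = (value - mean) / standard deviation, with the outlier cutoff of 3.0 in either direction. The function names and the mean of 100 and standard deviation of 15 are assumptions for illustration.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


def is_outlier(x, mean, sd, cutoff=3.0):
    """Flag values whose z score is above +cutoff or below -cutoff."""
    return abs(z_score(x, mean, sd)) > cutoff


# Made-up scores on a scale assumed to have mean 100 and standard deviation 15.
for score in (115, 70, 150):
    print(score, round(z_score(score, 100, 15), 2), is_outlier(score, 100, 15))
```

A score of 150 sits more than 3 standard deviations above the mean, so it is the only one of the three flagged as an outlier.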


Term
left skewed: median > mean or median < mean?

Definition
median > mean (in a left-skewed distribution the long left tail pulls the mean below the median)

Term
empirical rule
Definition
if the data distribution is approximately bell-shaped, then the interval mean ± 1 standard deviation contains about 68% of the values in the population or the sample, mean ± 2 S contains about 95%, and mean ± 3 S contains about 99.7%


Term
Chebyshev rule
Definition
regardless of how the data are distributed, at least (1 - 1/k^2) x 100% of the values will fall within k standard deviations of the mean (for k > 1). at least 56% of the data within 1.5 S of the mean, at least 75% within 2 S of the mean, at least 89% within 3 S of the mean
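
A quick sketch of the bound (1 - 1/k^2) x 100% for the three values of k on this card. The function name `chebyshev_bound` is made up for illustration.

```python
def chebyshev_bound(k):
    """Minimum percentage of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound only applies for k > 1")
    return (1 - 1 / k ** 2) * 100


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_bound(k):.1f}%")
```

The values 55.6%, 75.0% and 88.9% match the card's figures of 56%, 75% and 89% after rounding.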


Term
in general, which of the following descriptive summary measures cannot be easily approximated from a box and whisker plot? a. variance b. the range c. the interquartile range d. the median

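Definition
a. variance (the plot shows the smallest value, Q1, median, Q3, and largest value, so the range, interquartile range, and median can be read from it)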

Term
covariance
Definition
measures the strength of the linear relationship between two variables (called bivariate data). it is a nonstandardized measure of the joint variance of the two variables. only concerned with the strength of the relationship; no causal effect is implied.
sample covariance: cov(X,Y) = Σ (xi - x̄)(yi - ȳ) / (n - 1)


Term
cov (x,y) >0 = move in ___ direction cov (x,y)<0 = move in ___ direction cov (x,y)=0 = x& Y are ____ 

Definition
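1. same 2. opposite 3. independent (more precisely, zero covariance means no linear relationship)
covariance depends on the units of measurement of x and y, so it cannot be used to compare the relative strength of the relationship between variables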


Term
coefficient of correlation 

Definition
measures the relative strength of the linear relationship between two variables. sample coefficient of correlation: r = cov(X,Y) / (Sx Sy)
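
A minimal sketch of the sample covariance, Σ (xi - x̄)(yi - ȳ) / (n - 1), and of r = cov(X,Y) / (Sx Sy). The function names and data are made up; the second call multiplies y by 100 to mimic a change of units.

```python
import math


def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    sx = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the sample variance
    sy = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sx * sy)


# Made-up, perfectly linear data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_rescaled = [v * 100 for v in y]

print(sample_covariance(x, y), correlation(x, y))
print(sample_covariance(x, y_rescaled), correlation(x, y_rescaled))
```

Rescaling y multiplies the covariance by 100 but leaves r unchanged, which is why r, not the covariance, is used to compare the strength of relationships.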


Term
features of correlation coefficient 

Definition
population = ρ, sample = r. unit free, standardized measure. ranges between -1 and +1. the closer to -1, the stronger the negative linear relationship. the closer to +1, the stronger the positive linear relationship. the closer to 0, the weaker the linear relationship


Term
a correlation of .32 is stronger than .30 

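Definition
true: strength is judged by the absolute value of r (how far it is from 0); the sign only gives the direction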

Term
True or false: Descriptive statistics are used to draw conclusions about a population based on sample data 

Definition
false, Inferential statistics are used to draw conclusions about a population based on sample data 


Term
Which of the following is false? A pareto diagram: 1. is a bar chart where categories are shown in descending order of frequency 2. is used to portray numerical data on an interval scale 3. is often shown with a cumulative polygon 4. is used to separate the "vital few" from the "trivial many"

Definition
it is false that it is used to portray numerical data on an interval scale
paretos are used to portray categorical data 


Term
You would like to represent the distribution of students in a class based on class. which is the best for presenting data? 1. pie chart 2. stem and leaf 3. scatter plot 4. time series plot 

Definition
1. pie chart (class is categorical data)

Term
t/f unlike a grouped frequency distribution, a stem and leaf plot usually preserves original data values 

Definition
true

Term
t/f scatter diagrams are used to examine possible relationships between numerical and categorical data 

Definition
false: scatter diagrams are used to examine possible relationships between two numerical variables


Term
a priori classical vs empirical classical vs subjective probability

Definition
a priori: each outcome is equally likely, p(y) = p(x)
empirical: based on observed data, like relative frequency
subjective: an individual judgment or opinion about the probability of occurrence


Term
the probability of at least one head in two flips is: 1. .33 2. .5 3. .75 4. 1

Definition
3. .75. P(at least one head) = 1 - P(no heads) = 1 - .25 = .75
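
A small check of this answer by listing the four equally likely outcomes of two flips of a fair coin (an a priori probability) and comparing the direct count with the complement rule. The variable names are made up for illustration.

```python
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Direct count: outcomes that contain at least one head.
p_at_least_one_head = sum("H" in o for o in outcomes) / len(outcomes)

# Complement rule: 1 - P(no heads).
p_no_heads = sum("H" not in o for o in outcomes) / len(outcomes)
p_by_complement = 1 - p_no_heads

print(p_at_least_one_head, p_by_complement)
```

Both routes give .75, since only TT has no head.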

