Art and science of collecting, analyzing, presenting, and interpreting data 


Facts and figures collected, analyzed, and summarized for presentation and interpretation


Entities on which data are collected 


Characteristic of interest for the elements 


Scale of measurement used when the data for a variable consist of labels or names that identify an attribute of the element


Scale of measurement used if the data have the properties of nominal data and the order or rank of the data is meaningful
Example: excellent, poor, good on a survey 


Scale of measurement used if the data have all the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure. Always numeric.
Example: College Admission SAT scores 


Scale used if the data have all the properties of interval data and the ratio of two values is meaningful
Example: distance, height, time 


Data grouped into specific categories


Data that uses numeric values 


Variable that uses categorical data 


Variable that uses quantitative data 


Data collected at the same or approximately the same point in time 


Data collected over several time periods 


Summaries of data, which may be tabular, graphical, or numerical 


The larger group of elements in a particular study 


Subset of the population; often selected randomly


Process of conducting a survey to collect data for the entire population 


Statistical process that uses data from a sample to make estimates and test hypotheses about the characteristics of a population 


Deals with methods for developing useful decision-making information from large databases


A tabular summary of data showing the number (frequency) of items in each of several nonoverlapping classes 


Term
Relative Frequency Distribution 

Tabular summary showing relative frequency
Relative frequency of a class = frequency of the class / n, where n is the total number of observations


Term
Percent Frequency Distribution 

Summarizes the percent frequency of the data for each class.
Percent frequency = relative frequency × 100
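
The frequency, relative frequency, and percent frequency distributions can be sketched together in a few lines. This is a minimal illustration, not a prescribed method; the function name `frequency_tables` and the sample ratings are made up for the example.

```python
from collections import Counter

def frequency_tables(values):
    """Return frequency, relative frequency, and percent frequency per class."""
    n = len(values)
    freq = Counter(values)                           # count of items in each class
    rel = {k: f / n for k, f in freq.items()}        # relative frequency = frequency / n
    pct = {k: 100 * r for k, r in rel.items()}       # percent frequency = relative * 100
    return dict(freq), rel, pct

ratings = ["good", "excellent", "good", "poor", "good"]  # made-up survey answers
freq, rel, pct = frequency_tables(ratings)
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.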


Graphical device depicting categorical (qualitative) data 


Another graphical device presenting relative frequency and percent frequency for categorical data 


Value halfway between the lower and upper class limits 


One of the simplest graphical summaries. The horizontal axis shows the range of the data, and each value is represented by a dot placed above the axis.


Similar to a bar chart, but represents quantitative data rather than categorical (qualitative) data
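
Assuming this card describes a histogram, the counts behind one can be sketched by sorting quantitative values into equal-width classes. The function name `histogram_counts`, the half-open class convention, and the data are assumptions made for this example only.

```python
def histogram_counts(values, lower, width, n_classes):
    """Count values in equal-width classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - lower) // width)   # index of the class the value falls into
        if 0 <= idx < n_classes:          # values outside all classes are ignored
            counts[idx] += 1
    return counts

def class_midpoints(lower, width, n_classes):
    """Value halfway between the lower and upper limit of each class."""
    return [lower + width * (i + 0.5) for i in range(n_classes)]

values = [12, 15, 22, 27, 31, 18, 25]     # made-up quantitative data
counts = histogram_counts(values, lower=10, width=10, n_classes=3)
mids = class_midpoints(lower=10, width=10, n_classes=3)
```

Each count is the height of one bar, and the midpoints are one reasonable place to label the bars.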


Term
Cumulative Frequency Distribution 

Shows the number of data items with values less than or equal to the upper class limit of each class 


Term
Cumulative Relative Frequency Distribution 

Shows the proportion of data items with values less than or equal to the upper class limit of each class


Term
Cumulative Percent Frequency Distribution 

Shows the percentage of data items with values less than or equal to the upper limit of each class 
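
The three cumulative distributions differ only in scale, which a short sketch makes visible. The name `cumulative_tables` and the class frequencies are hypothetical; the classes are assumed to be listed in increasing order.

```python
from itertools import accumulate

def cumulative_tables(class_freqs):
    """Return cumulative frequency, relative frequency, and percent frequency."""
    n = sum(class_freqs)
    cum = list(accumulate(class_freqs))    # running total up to each class
    cum_rel = [c / n for c in cum]         # proportion at or below each upper class limit
    cum_pct = [100 * r for r in cum_rel]   # same proportion as a percentage
    return cum, cum_rel, cum_pct

freqs = [4, 8, 5, 3]                       # made-up class frequencies
cum, cum_rel, cum_pct = cumulative_tables(freqs)
```

The last entry of each list is always n, 1, and 100 respectively.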


A graph of a cumulative distribution. Data values are shown on the horizontal axis; cumulative frequencies, cumulative relative frequencies, or cumulative percent frequencies are shown on the vertical axis.


Term
Exploratory Data Analysis

Consists of simple arithmetic and easy-to-draw graphs that can be used to summarize data quickly


Can be used to show both the rank order and shape of a data set simultaneously 
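
Assuming this card describes a stem-and-leaf display, a minimal sketch for two-digit values might look like the following. The function name `stem_and_leaf` and the scores are made up; the tens digit is used as the stem and the ones digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by stem (tens digit) with leaves (ones digit) in order."""
    stems = defaultdict(list)
    for v in sorted(values):               # sorting preserves rank order in each row
        stems[v // 10].append(v % 10)
    return dict(stems)

scores = [68, 72, 75, 81, 69, 73, 92, 85]  # made-up test scores
display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Row lengths show the shape of the data, while the ordered leaves keep the rank order.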


Tabular summary of data for two variables 
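
A tabular summary for two variables can be sketched by counting pairs of values. The function name `crosstab` and the paired observations are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

def crosstab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Counter returns 0 for pairs that never occur, so every cell is filled
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

quality = ["good", "good", "excellent", "poor", "good", "excellent"]  # made-up data
price = ["low", "high", "high", "low", "low", "high"]
table = crosstab(quality, price)
```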


The reversal of conclusions based on aggregated and unaggregated data
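
The reversal is easiest to see with numbers. In the sketch below, the counts are illustrative and the labels "A", "B", "small", and "large" are hypothetical: option A has the higher success rate within each group, yet option B has the higher rate once the groups are combined.

```python
# (successes, trials) for two options, split into two groups; illustrative counts
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def aggregate_rate(groups):
    """Success rate after pooling all groups together."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(t for _, t in groups.values())
    return successes / trials

# Unaggregated view: one rate per option per group
by_group = {opt: {g: s / t for g, (s, t) in groups.items()}
            for opt, groups in data.items()}
# Aggregated view: one rate per option
overall = {opt: aggregate_rate(groups) for opt, groups in data.items()}
```

The reversal arises because the two options are applied to the groups in very different proportions, so the pooled rates weight the groups differently.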


Graphical presentation of the relationship between two quantitative variables 


Line that provides an approximation of the relationship between two quantitative variables
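
The card does not say how the line is chosen. One common choice, assumed here, is the least-squares line; the function name `fit_trendline` and the data are made up for this sketch.

```python
def fit_trendline(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                       # undefined if all x values are equal
    intercept = mean_y - slope * mean_x     # the line passes through (mean_x, mean_y)
    return slope, intercept

x = [1, 2, 3, 4]
y = [3, 5, 7, 9]                            # made-up points lying exactly on y = 2x + 1
slope, intercept = fit_trendline(x, y)
```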


Measures that are computed for data from a sample 


Measures that are computed for data from a population


A sample statistic is referred to as the point estimator of the corresponding population parameter 


Measure of central location: the value in the middle when the data are arranged in ascending order


Provides information about how the data are spread over the interval from the smallest value to the largest value


Division points that split the data into four parts, each containing approximately 25% of the observations


Largest value minus smallest value 


Term
Interquartile range (IQR) 

Difference between the third quartile (Q3) and the first quartile (Q1) 
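
The range and the interquartile range can be compared side by side in a short sketch. Quartile conventions differ between textbooks and software; this example assumes the default method of Python's `statistics.quantiles`, and the function name `range_and_iqr` and the data are made up.

```python
import statistics

def range_and_iqr(values):
    """Return (range, IQR) where IQR = Q3 - Q1."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    return max(values) - min(values), q3 - q1

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 14, 15]             # made-up observations
data_range, iqr = range_and_iqr(data)
```

Unlike the range, the IQR ignores the smallest and largest quarter of the data, so a single extreme value affects it far less.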


Measure of variability that utilizes all the data. Based on the difference between the value of each observation and the mean 


The positive square root of the variance 


(Standard deviation / mean) × 100; expresses the standard deviation as a percentage of the mean
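
Variance, standard deviation, and coefficient of variation build on one another, as the sketch below shows. It assumes sample (not population) formulas, so the variance divides by n - 1; the function name `variability` and the data are made up.

```python
import statistics

def variability(values):
    """Return sample variance, sample standard deviation, and coefficient of variation (%)."""
    var = statistics.variance(values)       # sum of squared deviations / (n - 1)
    sd = statistics.stdev(values)           # positive square root of the variance
    cv = sd / statistics.mean(values) * 100
    return var, sd, cv

data = [46, 54, 42, 46, 32]                 # made-up sample
var, sd, cv = variability(data)
```

For population data, `statistics.pvariance` and `statistics.pstdev` divide by n instead.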


Important numerical measure of the shape of a distribution 


At least (1 - 1/z^2) of the data values must be within z standard deviations of the mean, where z is any value greater than 1
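
The bound 1 - 1/z^2 is easy to tabulate for a few values of z. The function name `chebyshev_bound` is made up for this sketch.

```python
def chebyshev_bound(z):
    """Minimum proportion of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("the bound applies only for z greater than 1")
    return 1 - 1 / z ** 2

for z in (2, 3, 4):
    print(z, chebyshev_bound(z))            # proportions for z = 2, 3, 4
```

For z = 2 the bound is 0.75, so at least 75% of the values lie within two standard deviations of the mean, whatever the shape of the distribution.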


For data having a bell-shaped distribution:
Approx. 68% of values will be within one standard deviation of the mean
Approx. 95% will be within two standard deviations
Almost all values will be within three standard deviations of the mean 
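
These percentages can be checked against a normal distribution, which is assumed here as the model for "bell-shaped". The function name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()          # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
```

The results are roughly 0.683, 0.954, and 0.997, matching the rule's 68%, 95%, and "almost all".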


The following five numbers are used to summarize the data:
1. Smallest value
2. First Quartile (Q1)
3. Median (Q2)
4. Third quartile (Q3)
5. Largest value
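
The five numbers above can be collected in one call. As a sketch, this assumes the default quartile method of Python's `statistics.quantiles` (other conventions give slightly different quartiles); the function name `five_number_summary` and the data are made up.

```python
import statistics

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 14, 15]   # made-up observations
summary = five_number_summary(data)
```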



Graphical summary of data that is based on the five-number summary


Term
Correlation Coefficient

Measure of the relationship between two variables that is not affected by the units of measurement for x and y 
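
Assuming this card refers to the Pearson correlation coefficient, the sketch below computes it and shows that rescaling x leaves it unchanged. The function name `pearson_r` and the data are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)       # the units of x and y cancel in this ratio

x = [1, 2, 3, 4, 5]                         # made-up data, e.g. in metres
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_rescaled = pearson_r([100 * a for a in x], y)  # same x expressed in centimetres
```

Both calls give the same value, about 0.77, because changing the unit scales the numerator and the denominator by the same factor.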


Term
Weighted Mean

$x_i$ = value of observation $i$
$w_i$ = weight of observation $i$
$\bar{x} = \dfrac{\sum w_i x_i}{\sum w_i}$
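
Assuming the formula on this card is the weighted mean, it can be sketched as the sum of weight times value divided by the sum of the weights. The function name `weighted_mean` and the scores and weights are made up.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

scores = [90, 80, 70]                       # made-up observations x_i
weights = [3, 2, 1]                         # made-up weights w_i
result = weighted_mean(scores, weights)     # (270 + 160 + 70) / 6
```

With equal weights the result reduces to the ordinary mean.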

