Term
Descriptive and Multivariate Statistics (Chapter 9)
Purpose: Familiarize readers with the principles of descriptive and multivariate statistics so that they are better producers as well as consumers of research and police data.


Definition
Goals:

Summarize large and small data sets

Examine the integrity of large and small data sets

Determine which statistics best portray the data

Compare more than one variable to others

Apply descriptive statistics to problem solving and data-driven decision making



Term
Statistics

Definition
Science of collecting and organizing data and then drawing conclusions based on data.



Term
Question:
What are the three types of statistics? 

Definition
Answer:
Descriptive, multivariate, and inferential 


Term
Descriptive Statistics

Definition
Descriptive Statistics summarize large amounts of information in an efficient and easily understood manner. 


Term
Multivariate Statistics

Definition
Multivariate Statistics allow comparisons among factors by isolating the effects of one factor or variable from others that may distort conclusions.



Term
Inferential Statistics

Definition
Inferential Statistics suggest statements about a population based on a sample drawn from that population. 


Term
Measurement

Definition
Measurement is a process of assigning numbers or labels to units of analysis or items under study. 


Term
Concept of Levels of Measurement


Definition
Four Levels of measurement:

Nominal

Ordinal

Interval

Ratio
(Each conveys a different amount of information.) 


Term
Nominal Level of Measurement 

Definition
All categories must be exhaustive (covering all observations that may exist).
Categories must be mutually exclusive (each observation can only be classified one way).
Provides names or labels for distinguishing observations.
Lowest level of measurement.
(EX – Race/Gender – assigned numbers cannot be used in calculations)


Term
Ordinal Level of Measurement 

Definition
Categories must be exhaustive and mutually exclusive.
Categories must exhibit a degree of difference that indicates order or ranking.
Categories are ordered in some way, but the actual distance between these orderings has no meaning.
(EX – Opinion-based: good, better, best…)


Term
Interval Level of Measurement 

Definition
Categories must be exhaustive, mutually exclusive and exhibit a degree of difference.
Assumes that all items on a scale have equal intervals between them.
Logical distances between categories expressed in meaningful intervals.
(EX – Temperature & IQ)



Term
Ratio Level of Measurement 

Definition
All Characteristics of Interval
Contains a true ZERO point. (A true zero point allows for measuring the total absence of the concept under measure.)
(Ex. Income, weight, time and age)



Term
Implications regarding Levels of Measurement 

Definition

Ratio is the highest level of measurement (Includes all characteristics.)

Researchers should strive for the highest level of measurement possible. Lower levels of measurement cannot be converted to higher levels, but higher levels can be converted to lower levels.

(Most Important) The statistical technique to be applied will determine the level of measurement needed.
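
The one-way conversion between levels can be illustrated with a small sketch. The ages and the cut-off points below are hypothetical and chosen only for illustration; they do not come from the source.

```python
# Hypothetical ages in years (ratio level); not data from the source.
ages = [19, 34, 52, 67]


def age_group(age):
    """Recode a ratio-level age into an ordered category (ordinal level)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


groups = [age_group(a) for a in ages]
print(groups)
# The exact ages cannot be recovered from the group labels alone,
# which is why a lower level cannot be converted back to a higher one.
```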



Term
Distribution of Data Sets 

Definition
Process by which large and cumbersome data sets are described in a manner that is easily understood. 


Term
Frequency Distribution

Definition
Allows for a basic description of the data set and for graphical representation. Allows for more efficient management and analysis of large data sets.
x (category) | f (frequency) | fx (f times x)
Highest Numerical Value | |
Lowest Numerical Value | |
Totals | N = ∑ f | ∑ fx
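
A minimal sketch of how the x, f, and fx columns and their totals could be built. The scores are hypothetical, and the use of `collections.Counter` is just one convenient way to tally them.

```python
from collections import Counter

# Hypothetical scores; not taken from the source.
scores = [3, 1, 2, 3, 3, 2, 1, 3]

freq = Counter(scores)  # f for each category x

# Rows ordered from the highest numerical value to the lowest.
rows = [(x, freq[x], freq[x] * x) for x in sorted(freq, reverse=True)]

n = sum(f for _, f, _ in rows)          # N = sum of f
sum_fx = sum(fx for _, _, fx in rows)   # sum of fx

print("x | f | fx")
for x, f, fx in rows:
    print(x, "|", f, "|", fx)
print("N =", n, "| sum of fx =", sum_fx)
```

The sum of fx equals the sum of the raw scores, so dividing it by N gives the mean of the distribution.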




Term
Range

Definition
A summary statistic that provides limited information but allows for condensing a frequency distribution. The RANGE is obtained by subtracting the Lowest Numerical Value from the Highest Numerical Value. The RANGE is used to obtain a class interval. It is also a simple measure of variation.
Class Interval (i) = Range / N of Desired Intervals
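
As a rough worked example, the range and class interval can be computed as follows. The values and the number of desired intervals are hypothetical, and rounding the interval up to a whole number is an assumption made here for convenience, not something the source prescribes.

```python
import math

# Hypothetical values; not taken from the source.
values = [12, 47, 23, 35, 8, 41, 19, 30]
desired_intervals = 4  # assumed number of desired intervals

# Range = highest numerical value minus lowest numerical value.
value_range = max(values) - min(values)

# Class interval (i) = range / number of desired intervals,
# rounded up here (an assumption) so that interval widths are whole numbers.
class_interval = math.ceil(value_range / desired_intervals)

print("Range:", value_range)
print("Class interval:", class_interval)
```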




Term
Grouped Frequency Distribution

Definition
Process by which the class interval is used to group a Frequency Distribution.
Note: Although the data have been condensed and the distribution has changed, the nature of the data remains the same. 
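
The grouping step might look like the sketch below. The values and the class interval of 10 are hypothetical, and starting the first interval at the lowest value is an assumed convention.

```python
from collections import Counter

# Hypothetical values and class interval; not taken from the source.
values = [12, 47, 23, 35, 8, 41, 19, 30]
class_interval = 10
lowest = min(values)  # assumed starting point of the first interval


def interval_for(value):
    """Return the (lower, upper) bounds of the interval containing value."""
    index = (value - lowest) // class_interval
    lower = lowest + index * class_interval
    return (lower, lower + class_interval - 1)


grouped = Counter(interval_for(v) for v in values)

for lower, upper in sorted(grouped):
    print(f"{lower}-{upper}: {grouped[(lower, upper)]}")
```

The total of the grouped frequencies still equals the number of cases, which reflects the note that condensing the data does not change its nature.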


Term
Question:
What is the purpose of Charts and Graphs? 

Definition
Answer:
To portray the distribution of data for a quick and meaningful understanding. 


Term
Percentages

Definition
• Percentages are the relation between two or more numbers for which the whole is accorded a value of 100.
• Calculated by dividing the frequency of each interval by the total number of cases and multiplying by 100.
• Useful for managerial reports and policy evaluations.



Term
Cumulative Percentages

Definition
Calculated by adding the percentage for each class interval to the sum of the percentages for all preceding intervals.
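
A short sketch of percentages and cumulative percentages for a grouped distribution. The interval labels and frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped frequencies; not taken from the source.
frequencies = {"0-9": 5, "10-19": 10, "20-29": 5}

total = sum(frequencies.values())

# Percentage = frequency of the interval / total number of cases * 100.
percentages = {label: 100 * f / total for label, f in frequencies.items()}

# Cumulative percentage = running sum of the percentage column.
cumulative = dict(zip(percentages, accumulate(percentages.values())))

for label in frequencies:
    print(label, percentages[label], cumulative[label])
```

The last cumulative percentage should come to 100, which is a quick check on the arithmetic.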


Term
Question:
What should you do with missing data? 

Definition
Answer: In cases containing missing data, you cannot determine whether the missing cases would have fallen into a particular segment or class interval.
Option 1: Include a segment labeled “Missing Cases” and include in the total. (Deflates %)
Option 2: Omit the missing cases altogether. (Inflates %)
In either case document whether missing data has been included or omitted. 
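
The effect of the two options can be seen in a small sketch. The category labels and counts are hypothetical.

```python
# Hypothetical counts; not taken from the source.
counts = {"Yes": 30, "No": 50}
missing = 20

# Option 1: keep a "Missing Cases" segment and include it in the total.
with_missing = dict(counts, **{"Missing Cases": missing})
total_1 = sum(with_missing.values())
option_1 = {k: 100 * v / total_1 for k, v in with_missing.items()}

# Option 2: omit the missing cases altogether.
total_2 = sum(counts.values())
option_2 = {k: 100 * v / total_2 for k, v in counts.items()}

print("Option 1:", option_1)
print("Option 2:", option_2)
```

With the same raw counts, "Yes" is 30 percent under Option 1 and 37.5 percent under Option 2, which is why the choice needs to be documented.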


Term
Measures of Central Tendency 

Definition
Most common forms of descriptive statistics. They describe the average value from a distribution of values. The primary measures are MEAN, MEDIAN and MODE. 


Term
Mean

Definition
The mean is the arithmetic average and is calculated by dividing the sum of scores by the number of cases. A distribution of data can have only one mean. The one weakness of the mean is that it is affected by extreme score(s) in a distribution. 


Term
Outliers

Definition
Outliers are extreme scores. Single or small numbers of exceptional cases that deviate from the general pattern of scores. 


Term
Median

Definition
The midpoint or middle score of a distribution. The median is not significantly affected by outliers. In the event that there is an even number of scores in the distribution, rank order the distribution and calculate the average value of the two middle scores. 


Term
Mode

Definition
The mode indicates the most frequently occurring score(s) or label in a distribution. It is possible to have more than one score or interval tie for most occurring. The mode is primarily used for nominal measurements as it provides limited information and is not subject to further statistical analysis.
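
The three measures of central tendency, and the effect of a single outlier on the mean, can be sketched with Python's standard `statistics` module. The scores are hypothetical.

```python
import statistics

# Hypothetical scores; not taken from the source.
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]  # one extreme score added

mean_scores = statistics.mean(scores)
median_scores = statistics.median(scores)
modes_scores = statistics.multimode(scores)  # list, since ties are possible

mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)  # average of two middle scores

print("Without outlier:", mean_scores, median_scores, modes_scores)
print("With outlier:   ", mean_outlier, median_outlier)
```

The outlier pulls the mean from 3.4 to 9.5, while the median moves only from 3 to 3.5.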


Term
Measures of Variation

Definition
Range, Variance and Standard Deviation 


Term
Variance

Definition
Variance is a statistical measure that tells us how measured data vary from the average value of the set of data. Variance is the sum of the squared deviations of each score from the mean, divided by the total number of cases. 


Term
Standard Deviation

Definition
Standard Deviation measures the average distance that each data item is away from the mean of all data items in a distribution. It provides insight on how scores in a distribution compare with each other and allows for comparisons between two distributions.

Any standard deviation value has no real intuitive meaning;

Most useful in a comparative sense;

Comparing the relative values for the standard deviation and the mean indicates how much variation there is in a group of cases, relative to the average.
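
A minimal sketch of the variance and standard deviation as defined above, dividing by the total number of cases (the population form). The scores are hypothetical. The standard deviation is taken as the square root of the variance, and its ratio to the mean is one way to compare the two values.

```python
import math
import statistics

# Hypothetical scores; not taken from the source.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = sum(scores) / n

# Variance: sum of squared deviations from the mean, divided by N.
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# Standard deviation relative to the mean.
relative_variation = std_dev / mean

# Cross-check against the standard library's population formulas.
assert math.isclose(variance, statistics.pvariance(scores))
assert math.isclose(std_dev, statistics.pstdev(scores))

print(mean, variance, std_dev, relative_variation)
```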



Term
Skewness

Definition
Skewness illustrates the spread of scores weighted to one side of the mean. There are three distinct patterns that may emerge from unimodal distributions: normal, positive, and negative.


Term
Normal Distribution

Definition
There is no skew. Scores are distributed symmetrically around the center of the distribution and statistical assumptions regarding the data can be made. Normal distributions will produce equal measures of central tendency.


Term
Positive Skew

Definition
Unimodal distribution of scores weighted to the left, generally with the mode being the smallest value, followed by the median and then the mean (mode < median < mean). Hump to the left with an extended tail to the right.


Term
Negative Skew

Definition
Unimodal distribution of scores weighted to the right, generally with the mean being the smallest value, followed by the median and then the mode (mean < median < mode). Hump to the right with an extended tail to the left.
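
A rough way to see the three patterns is to compare the mean with the median. The sketch below uses hypothetical scores. The `skew_direction` helper is a rule of thumb introduced here for illustration and is not a formal skewness coefficient.

```python
import statistics


def skew_direction(scores):
    """Rule of thumb: the mean is pulled toward the extended tail."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"  # tail extends to the right
    if mean < median:
        return "negative"  # tail extends to the left
    return "none"


# Hypothetical distributions; not taken from the source.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 9]   # hump to the left, tail to the right
left_tailed = [1, 6, 7, 7, 8, 8, 8, 9]    # hump to the right, tail to the left
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("symmetric", symmetric)]:
    print(name,
          statistics.multimode(data),
          statistics.median(data),
          statistics.mean(data),
          skew_direction(data))
```

In the right-tailed example the mode (2) is below the median (2.5), which is below the mean (3.25). The left-tailed example reverses that order.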


Term
Question:
What should you use to represent a skewed distribution? 

Definition
Answer:
Depends on the nature of the data and the purpose of the research or project.



Term
Rates

Definition
Used to standardize some measure for comparative purposes. Calculated by dividing raw numbers by a comparable denominator.
Rate = (Raw Number of Occurrences / Point of Comparison) X Unit of Measure
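
As an illustration, a rate per 100,000 residents could be computed as below. The incident counts, populations, and the choice of 100,000 as the unit of measure are hypothetical, and the function name `rate` is introduced here only for the example.

```python
def rate(occurrences, point_of_comparison, unit_of_measure):
    """Raw number of occurrences divided by the point of comparison,
    multiplied by the unit of measure."""
    return occurrences * unit_of_measure / point_of_comparison


# Hypothetical figures for two jurisdictions; not taken from the source.
rate_a = rate(250, 50_000, 100_000)  # 250 incidents, population 50,000
rate_b = rate(120, 30_000, 100_000)  # 120 incidents, population 30,000

print("Rate A per 100,000:", rate_a)
print("Rate B per 100,000:", rate_b)
```

Although jurisdiction A has more incidents in raw numbers, standardizing by population is what makes the two figures comparable.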




Term
Dichotomous Variable

Definition
A variable with only two categories 


Term
Proportions

Definition
Best description of central tendency with dichotomous data. Proportions are the relationships between two or more categories or values. Obtained by dividing the value of the part by the value of the whole. Proportions can be considered the mean of a dichotomous variable and will have a value range between 0 and 1. 
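
A brief sketch showing that the proportion of a dichotomous variable coded 0/1 equals its mean. The coded responses are hypothetical.

```python
import statistics

# Hypothetical dichotomous variable coded 1 = yes, 0 = no.
responses = [1, 0, 1, 1, 0, 0, 1, 1]

# Proportion = value of the part / value of the whole.
proportion_yes = sum(responses) / len(responses)

# The mean of the 0/1 codes gives the same value.
mean_of_codes = statistics.mean(responses)

print(proportion_yes, mean_of_codes)
```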


Term
Percent Change

Definition
A percentage change is a way to express a change in a variable. It represents the relative change between the old and the new value.
Note: Care should be exercised when comparing only two time periods, especially with long periods of time in between.
Percent Change = ((After Value - Before Value) / Before Value) X 100
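
A minimal sketch of the percent change calculation. The before and after values are hypothetical, and raising an error for a zero before value is an assumption added here because the formula is undefined in that case.

```python
def percent_change(before, after):
    """(after - before) / before, expressed as a percentage."""
    if before == 0:
        # Assumption: treat a zero baseline as an error, since the
        # formula divides by the before value.
        raise ValueError("percent change is undefined when before is 0")
    return (after - before) * 100 / before


# Hypothetical counts; not taken from the source.
increase = percent_change(80, 100)
decrease = percent_change(100, 80)

print(increase, decrease)
```

The same 20-unit difference is a 25 percent increase from 80 but a 20 percent decrease from 100, because the before value is the base of the comparison.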




Term
Multivariate Statistics

Definition
Statistics that encompass the simultaneous observation and analysis of more than one statistical variable with a focus on assessing the strength and identifying patterns of association between or among the variables. 


Term
Regression Analysis

Definition
Regression analysis is a procedure for pattern recognition. Regression attempts to plot a line (trendline) through a given set of data points on a graph. Given a strong enough pattern of association, a regression line suggests possibilities for predicting future values. 


Term
Correlation

Definition
A group of statistical techniques used to measure the strength of the relationship between variables. Correlation measures the relative “fit” or degree of association between two or more variables. This technique provides a measure of the quality of the regression line, and therefore of the reliability of any predictions based on it.
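
A trendline and its fit can be sketched for two variables as follows. The source does not name a specific technique, so ordinary least squares and Pearson's correlation coefficient are assumptions made here, and the data points are hypothetical.

```python
import math

# Hypothetical paired observations; not taken from the source.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

# Least-squares trendline: y = intercept + slope * x (assumed technique).
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Pearson's r as the measure of fit (assumed technique).
r = s_xy / math.sqrt(s_xx * s_yy)

# Prediction for a value of x beyond the observed data.
predicted = intercept + slope * 6

print("slope:", round(slope, 3))
print("intercept:", round(intercept, 3))
print("r:", round(r, 3))
print("predicted y at x = 6:", round(predicted, 3))
```

An r close to 1 or -1 suggests the line fits well, and a value near 0 suggests predictions from the line should be treated with caution.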

