Shared Flashcard Set

Details

Stat Ch. 4
Summarizing Data: Calculating Statistical Values & Detecting Outliers
19
Other
Undergraduate 2
02/12/2012


Cards

Term
Descriptive Statistics
Definition
involves using basic statistical calculations and graphical representations to summarize data
Term
Measures of central tendency
Definition
Mean, Median, Trimmed Mean
Term

Five number summary is a ___.

It is used to measure both ____ ____ and ____.

Definition
- set of five statistical values (minimum, Q1, median, Q3, maximum) used to describe data that has an asymmetric shape (as seen in a box plot or dot plot)

- central tendency and variability
Term

Central tendency is _____.

Variability is _____.

Definition

The center of the data.

The amount of spread or error in the data.

Term

Q1 is called the ____ or _____.

Q3 is called the ____ or _____.

Definition

- first quartile or the twenty-fifth percentile

- third quartile or the seventy-fifth percentile.

Term

The Interquartile Range is ____.

The Range is ____.

Definition

- is simply the difference between the third and first quartile values of the dataset. (IQR=Q3-Q1)

- is simply the difference between the maximum and the minimum values of the dataset. (Range=Max-Min)

Term

The Sample Mean is _____.

 

Definition

- is a statistical value that measures central tendency


 

 

Term
The Trimmed Sample Mean is ____.
Definition

- is an adjusted sample mean that removes possible extreme observation values from the calculation.

- the most common trimming percentage (tr) is 10%

- k = 0.10n; round this value up to the nearest integer and remove that many observations from both the highest and the lowest ends of the sorted data
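
The trimming rule above can be sketched in Python. This is a minimal illustration, not the course's own code: the function name `trimmed_mean`, the `trim` parameter, and the data values are all made up for the example.

```python
import math

def trimmed_mean(data, trim=0.10):
    """Trimmed sample mean: drop k = ceil(trim * n) values from each end."""
    n = len(data)
    # Round k up to the nearest integer; the inner round() only guards
    # against floating-point noise such as 0.1 * 12 = 1.2000000000000002.
    k = math.ceil(round(trim * n, 9))
    if 2 * k >= n:
        raise ValueError("trim fraction removes every observation")
    kept = sorted(data)[k:n - k]   # remove the k lowest and k highest values
    return sum(kept) / len(kept)

sample = [2, 4, 5, 6, 7, 8, 9, 10, 11, 95]   # made-up data with one extreme value
print(sum(sample) / len(sample))   # ordinary sample mean, pulled up by 95
print(trimmed_mean(sample))        # 10% trimmed mean, k = 1 value cut per end
```

With n = 10 and a 10% trim, k = 1, so only the values 2 and 95 are removed before averaging.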

Term

The Sample Standard Deviation measures ____.

It represents the ____ in the data set and an _____.

Definition

- measures the spread in the dataset (how far apart the data values are)

- represents the diversity in the data set and an estimate of the population standard deviation (σ).

Term
The Sample Variance is ____
Definition
- is simply the sample standard deviation squared
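
A short Python sketch of the relationship between the sample variance and the sample standard deviation. The cards do not give the formula, so the usual n - 1 divisor is an assumption here, and the data values are made up.

```python
import math
import statistics

sample = [4, 8, 6, 5, 7]   # made-up data

mean = sum(sample) / len(sample)
# Sample variance: squared deviations from the mean, divided by n - 1
# (the usual sample convention, assumed here).
variance = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
std_dev = math.sqrt(variance)   # the standard deviation is the square root

# The standard library's sample statistics use the same n - 1 convention.
assert math.isclose(variance, statistics.variance(sample))
assert math.isclose(std_dev, statistics.stdev(sample))
print(variance, std_dev)
```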
Term
In statistics you can compare the ____ ____ directly, but not the ____ ____.
Definition
- sample means directly, not the standard deviations.
Term

The Coefficient of Variation is a _______.


 

It is _____, meaning that it has no units.


It allows you to compare datasets 

CV = s/mean

 

Definition

- is a statistical value that measures the spread of the data relative to the size of the numbers.


 

- dimensionless 
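
One way to see that CV = s/mean has no units is to compute it for the same measurements in two different units. The sketch below assumes made-up height data and a hypothetical helper named `coefficient_of_variation`.

```python
import statistics

def coefficient_of_variation(data):
    """CV = s / mean, the spread relative to the size of the numbers."""
    return statistics.stdev(data) / statistics.mean(data)

heights_cm = [150, 160, 170, 180, 190]      # made-up data in centimeters
heights_m = [x / 100 for x in heights_cm]   # the same data in meters

# Changing the units rescales s and the mean by the same factor,
# so the two CV values agree (up to floating-point rounding).
print(coefficient_of_variation(heights_cm))
print(coefficient_of_variation(heights_m))
```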

Term
The Five Number Summary pinpoints ____ ____ of the sample dataset.
Definition
- Percentile cutoffs
Term

The kth percentile cutoff, Pk, is a cutoff value where _____.

- Sort the dataset in ascending order.

- Calculate the value of Q. Q=(nk)/100

- If the value of Q is a ____, round up to the nearest integer and go to the Qth observation.

- If the value of Q is an ____, find the average of the Qth and (Q+1)th observations.

Definition

- where approximately k% of the data is below this value.

- decimal

- integer
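
The percentile steps on this card can be sketched in Python as follows. The function name `percentile_cutoff` and the data are made up for illustration, and the sketch assumes 0 < k < 100. Other textbooks and libraries use slightly different percentile rules, so results may not match them exactly.

```python
import math

def percentile_cutoff(data, k):
    """k-th percentile cutoff P_k using the card's rule, with Q = n*k/100."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    values = sorted(data)                 # step 1: ascending order
    n = len(values)
    q = n * k / 100                       # step 2: Q = (nk)/100
    if not q.is_integer():                # Q is a decimal: round up, take the Q-th value
        return values[math.ceil(q) - 1]   # -1 because Python lists are 0-indexed
    q = int(q)                            # Q is an integer: average the Q-th and (Q+1)-th
    return (values[q - 1] + values[q]) / 2

sample = [12, 15, 11, 20, 18, 14, 16, 13, 19, 17]   # made-up data, n = 10
q1 = percentile_cutoff(sample, 25)
median = percentile_cutoff(sample, 50)
q3 = percentile_cutoff(sample, 75)
print(min(sample), q1, median, q3, max(sample))   # five number summary
print(q3 - q1)                                    # IQR = Q3 - Q1
```

For this sample, Q = 2.5 for the 25th percentile, so Q1 is the 3rd sorted value; Q = 5 for the 50th percentile, so the median is the average of the 5th and 6th sorted values.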

Term

An Outlier is _____.


 

An Intentional Outlier is ____.


 

An Unintentional Outlier is ____.

Definition

- is a recorded abnormal data value for an observation.

- is due to recording the value of an observation when he/she/it is not in its natural state.

- is due to data entry error or measurement error while conducting the experiment.

 

Term

The Z-Score Method should be used if you _____.

 

z = (x-mean)/s

Definition
- if you assume or can show that the data has an approximately symmetric shape.
Term

To detect any outliers of a dataset for Z-score:

- calculate the ___ ___ and ___ ___ ____.

- calculate the Z-score value for each observation in the dataset (z=(x-mean)/s)


If the Z-score is more than ___ or less than ___, then the observation is a possible outlier

 

Definition

- calculate the sample mean and sample standard deviation.

 


- +3, -3
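
The Z-score steps can be sketched in Python. The function name `z_score_outliers`, the `cutoff` parameter, and the data are made up for the example; the method itself still assumes roughly symmetric data, which the code does not check.

```python
import statistics

def z_score_outliers(data, cutoff=3):
    """Return (value, z) pairs whose Z-score is above +cutoff or below -cutoff."""
    mean = statistics.mean(data)    # sample mean
    s = statistics.stdev(data)      # sample standard deviation
    flagged = []
    for x in data:
        z = (x - mean) / s          # z = (x - mean) / s
        if z > cutoff or z < -cutoff:
            flagged.append((x, z))
    return flagged

# Made-up data: 19 values near 10 plus one extreme value.
data = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
        10, 11, 9, 10, 12, 8, 10, 11, 9, 40]
for value, z in z_score_outliers(data):
    print(value, round(z, 2))       # only the value 40 is flagged
```

Note that with a very small sample no Z-score can exceed 3 in absolute value, so this sketch uses 20 observations.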

 

Term
The IQR Method should be used if you ____.
Definition
- if you assume or can show that the data has a non-symmetric shape.
Term

To detect any outliers of a data set using IQR:

- calculate the interquartile range (IQR=Q3-Q1)

- subtract 1.5*IQR from Q1 to create a lower boundary (Q1-1.5*IQR)

- add 1.5*IQR to Q3 to create an upper boundary (Q3+1.5*IQR)

Definition
- If any observation values fall outside of this interval, they are potential outliers.
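
The IQR steps can be sketched in Python. To keep the example short, the quartiles are passed in rather than computed; here they are assumed to come from the percentile rule Q=(nk)/100 applied by hand to the made-up data. The names `iqr_bounds` and `iqr_outliers` are hypothetical.

```python
def iqr_bounds(q1, q3):
    """Lower and upper boundaries: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def iqr_outliers(data, q1, q3):
    """Return the observations that fall outside the boundary interval."""
    lower, upper = iqr_bounds(q1, q3)
    return [x for x in data if x < lower or x > upper]

data = [11, 12, 13, 14, 15, 16, 17, 18, 19, 45]   # made-up data, n = 10
# Quartiles by hand: Q = 2.5 gives the 3rd value (13), Q = 7.5 gives the 8th (18).
q1, q3 = 13, 18
print(iqr_bounds(q1, q3))          # boundaries of the interval
print(iqr_outliers(data, q1, q3))  # potential outliers
```

Here IQR = 5, so the interval runs from 5.5 to 25.5 and only the value 45 falls outside it.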