Shared Flashcard Set

Details

Statistics: Exam 1
Vocabulary
65
Other
Undergraduate 1
01/31/2011

Cards

Term
bar graph
Definition
a graphical representation of categorical data; names of each category are listed on the x axis and a bar is placed over each category name having height equal to the frequency (or percentage) in that category.
Term
bias
Definition
A condition that occurs when the design of a study systematically favors certain outcomes.
Term
blocking
Definition
The grouping of individuals according to some characteristic like rats in the same litter or plots of land at the same location. The random allocation is carried out separately within each group.
Term
boxplot
Definition
A plot of data based on the five number summary. A line is drawn from the minimum observation to Q1; a box is drawn from Q1 to Q3 with a vertical line at the median and a line is drawn from Q3 to the maximum observation.
Term
categorical variable
Definition
A variable that can be classified into groups or categories such as gender, religion, zip-code, etc. Typically, words are used to describe an individual.
Term
comparative study
Definition
A study where the explanatory variable has two active treatments rather than an active treatment versus a control. The purpose of the study is to determine which treatment works best rather than whether a treatment works. Randomization together with comparison enables the researcher to control lurking variables and apply the laws of probability for inference.
Term
completely randomized design
Definition
An experimental design where all individuals participating in the experiment are assigned at random to the treatments
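A minimal Python sketch of a completely randomized design. The subject labels, treatment names, and the helper `completely_randomized` are made up for illustration and are not part of the original cards. The sketch shuffles the individuals and deals them out to the treatments in turn, so group sizes differ by at most one.

```python
import random

def completely_randomized(individuals, treatments, seed=None):
    """Assign every individual to a treatment at random (hypothetical helper)."""
    rng = random.Random(seed)          # seed only makes the illustration repeatable
    shuffled = list(individuals)
    rng.shuffle(shuffled)              # the "random device"
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal shuffled units to treatments in rotation.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

subjects = [f"subject_{k}" for k in range(1, 13)]   # made-up labels
groups = completely_randomized(subjects, ["drug A", "drug B", "placebo"], seed=1)
for treatment, members in groups.items():
    print(treatment, members)
```

With 12 subjects and 3 treatments, each group receives 4 individuals, which also satisfies the replication requirement.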
Term
confounded variable
Definition
A variable whose effect on the response variable cannot be separated from the effect of the explanatory variable on the response variable. (Note: Usually confounded variables are lurking variables but only a few lurking variables are also confounded)
Term
confounding
Definition
a situation where the effect of one variable on the response variable cannot be separated from the effect of another variable on the response variable
Term
control
Definition
An "inactive" treatment where no experimental condition is applied to the individuals in order to determine whether the active treatment works. Randomizing together with a control enables the researcher to manage lurking variables when there is not a comparison group. Note: a control is not necessary for a valid experiment as long as two or more comparison treatments are used.
Term
convenience sample
Definition
A sample where the researcher contacts those subjects who are readily available and does not use any random selection. The results are almost surely biased.
Term
distribution
Definition
A list or a graph that shows the possible values of a variable together with the frequency of each value
Term
dotplot
Definition
A one-dimensional plot of a quantitative data set where each value in the data set is represented by a dot above its corresponding location on the x axis
Term
double blind
Definition
neither the subject nor the doctor, nurse, or whoever is diagnosing the results knows which treatment the subject received
Term
experiment
Definition
A study where a treatment is deliberately imposed on each individual in the study before responses are measured in order to observe responses to the treatment. A valid experiment must have 1) control or comparison, 2) randomization and 3) replication
Term
explanatory variable
Definition
a variable that may or may not explain the outcomes (responses) of a study. It is described using a phrase that describes all possible treatments. Note: an observational study can have an explanatory variable, but a valid experiment always has an explanatory variable
Term
factor
Definition
another term for explanatory variable
Term
first rule of data analysis
Definition
plot the data
Term
five number summary
Definition
minimum, Q1, median, Q3, maximum; preferred when data are very skewed or have outliers
Term
histogram
Definition
a graphical display of a quantitative data set; data are separated into intervals of equal width and a bar is drawn over the interval having height equal to the frequency (or percentage) of values in the interval. Values of the variable are given on the x axis and frequencies (or percentages) are given on the y axis. (Hence, a histogram gives a distribution.) Histograms are described by shape, center, and spread
Term
individual
Definition
the basic unit (or subject) of the experiment upon which a treatment is applied
Term
interquartile range (IQR)
Definition
a measure of variability recommended for skewed data or data with outliers; computed as IQR= Q3-Q1
Term
lack of realism
Definition
a weakness in experiments where the setting of the experiment does not realistically duplicate the conditions we really want to study
Term
left skewed
Definition
a density curve where the left side of the distribution extends in a long tail. (Mean < median)
Term
lurking variable
Definition
a variable that has an important effect on the relationship among the variables in a study but is not taken into account (Technically, a true lurking variable "interacts" with the explanatory variable in its effect on the response variable.)
Term
mean
Definition
a measure for the center of the data; it's the point that "balances" the data
Term
median
Definition
a measure of the center of data; it's the point such that half the numbers are smaller and the other half are larger (the midpoint of the ordered data set)
Term
multi-stage sample
Definition
sampling is conducted in stages; for a two-stage sample, the individuals are grouped according to some characteristic: groups are first randomly selected and then individuals are randomly selected from those selected groups. (In a stratified sample, individuals are randomly selected from every group.) For example, states could be randomly selected, then school districts within the selected states, then schools within the selected districts, and finally students within the selected schools. That would be a four-stage sample
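A rough Python sketch of a two-stage sample, assuming made-up school and student labels and a hypothetical helper `two_stage_sample`. Stage 1 randomly selects groups; stage 2 randomly selects individuals only within the groups chosen in stage 1.

```python
import random

def two_stage_sample(groups, n_groups, n_per_group, seed=None):
    """Randomly pick groups, then individuals within each picked group."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), n_groups)                    # stage 1
    return {g: rng.sample(groups[g], n_per_group) for g in chosen}   # stage 2

# Made-up population: 10 schools with 30 students each.
schools = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(30)]
    for s in range(10)
}
sample = two_stage_sample(schools, n_groups=3, n_per_group=5, seed=0)
print(sample)
```

Unselected schools contribute no students at all, which is the difference from a stratified sample.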
Term
non-response bias
Definition
bias resulting when individuals selected to be in a survey either cannot be contacted or refuse to answer survey questions
Term
Normal Distribution
Definition
a bell-shaped symmetric density curve used to model data sets that have a symmetric mound or bell shape
Term
observational study
Definition
a study that merely observes conditions of individuals in a population and records information; the population is disturbed as little as possible. Note: treatments are not imposed on units.
Term
outlier
Definition
an observation that falls outside the overall pattern of the data set. Can be detected by checking whether an observation is less than Q1 - 1.5 IQR or greater than Q3 + 1.5 IQR
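The usual 1.5 × IQR rule flags an observation as a suspected outlier when it lies below Q1 - 1.5·IQR or above Q3 + 1.5·IQR. The Python sketch below assumes the quartile convention used in this set (median of each half, dropping the middle value when n is odd); the function name `iqr_outliers` and the data are made up.

```python
from statistics import median

def iqr_outliers(data):
    """Return values outside the 1.5 * IQR fences (hypothetical helper)."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]              # lower half
    upper = ordered[n // 2 + n % 2 :]      # upper half, skipping the middle if n is odd
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in ordered if x < low_fence or x > high_fence]

print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 50]))   # [50]
```

Here Q1 = 2.5, Q3 = 6.5, and IQR = 4, so the fences are -3.5 and 12.5 and only 50 is flagged.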
Term
pie chart
Definition
a graphical display of categorical data using a "pie"; each category is represented as a slice where the size of the slice is proportional to the percentage of data in that category. Not recommended by statisticians.
Term
placebo effect
Definition
the response of patients to any treatment even though it has no physical effect
Term
population
Definition
the entire group of individuals about whom we desire to collect information
Term
probability sample
Definition
a sample selected using a random device where each individual in the population has a chance (doesn't have to be equal) of being selected. Probability samples are necessary for making inferences. Examples include: SRS, stratified and multistage
Term
Q1
Definition
A location measure such that one fourth (25%) of the data is smaller than it. Found by dividing the ordered data set in half (excluding the middle observation if n is odd) and finding the median of the lower half of the data.
Term
Q3
Definition
A location measure such that three-fourths (75%) of the data is smaller than it. Found by dividing the ordered data set in half (excluding the middle observation if n is odd) and finding the median of the upper half of the data.
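The quartile recipe on these cards (split the ordered data in half, drop the middle observation when n is odd, take the median of each half) can be sketched in Python as follows. The function name `five_number_summary` and the data are illustrative only, and the sketch assumes at least two observations.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]              # lower half
    upper = ordered[n // 2 + n % 2 :]      # upper half, skipping the middle if n is odd
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

print(five_number_summary([7, 1, 3, 5, 2, 6, 4]))   # (1, 2, 4, 6, 7)
```

Statistical software often uses other quartile conventions, so its Q1 and Q3 can differ slightly from this hand method.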
Term
quantitative variable
Definition
A variable with numerical values such as height or weight. This type of data is required for both variables in regression analysis.
Term
random number table
Definition
A table of the digits 0 through 9 in which the order cannot be predicted but, in the long run, each digit occurs 10% of the time.
Term
randomization
Definition
A method of assigning individuals in an experiment to treatment groups using some random device that eliminates bias and gives each unit the same probability of being assigned to any treatment group. Randomization “balances” the treatment groups, thus averaging out lurking and extraneous variables. Allows us to use the laws of probability to make inferences.
Term
range
Definition
The maximum observation minus the minimum observation. Given as one number in statistics (e.g., if max = 98 and min = 12, then range = 98 – 12 = 86)
Term
replication
Definition
Having more than one individual in each treatment group. Replication is necessary for measuring variability. Also, the greater the replication, the more precise the results.
Term
response bias
Definition
bias resulting from individuals in a sample lying or giving incorrect responses because they do not have knowledge about the question or can’t recall; response bias could also result from the wording of the question or from interviewers influencing the responses, either intentionally or unintentionally.
Term
response variable
Definition
A variable that gives the result (may not be a number) of the outcome of a study.
Term
right skewed distribution
Definition
A density curve where the right side of the distribution extends in a long tail; (mean > median)
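A small Python illustration of the mean > median pattern, using a made-up data set with one large value in the right tail. It is one example, not a proof that the relation holds for every skewed data set.

```python
from statistics import mean, median

# Made-up right-skewed data: most values are small, one is far to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

print(mean(right_skewed))     # 4.75
print(median(right_skewed))   # 3.0
```

The single value 20 pulls the mean toward the long tail, while the median stays with the bulk of the data.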
Term
sample
Definition
a subset of individuals in the population; the group of individuals about which we actually collect information.
Term
simple random sample
Definition
A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected.
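A minimal Python sketch of drawing a simple random sample, assuming a made-up population of 100 numbered individuals. `random.sample` draws without replacement, so no individual can appear twice.

```python
import random

population = list(range(1, 101))   # made-up population: individuals numbered 1..100
rng = random.Random(42)            # seed only makes the illustration repeatable
srs = rng.sample(population, k=10) # every subset of size 10 is equally likely
print(srs)
```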
Term
standard deviation
Definition
A measure of the “average” or typical deviation of the observations about the mean; measures variability of data about the mean.
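A short Python sketch of the sample standard deviation s, assuming the conventional n - 1 divisor (the cards do not state the formula). The data and the helper name `sample_sd` are made up; the result is compared against the standard library's `statistics.stdev`.

```python
import math
from statistics import stdev

def sample_sd(data):
    """Square root of the sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n                               # sample mean
    squared_deviations = sum((x - xbar) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up observations, mean = 5
print(sample_sd(data), stdev(data))  # the two values should agree
```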
Term
Standard Normal Curve
Definition
A normal distribution with mean of zero and standard deviation of one. Probabilities are given in Table A for values of the standard Normal variable.
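As an alternative to a printed table lookup, areas under the standard Normal curve can be computed in Python with `statistics.NormalDist` (available in Python 3.8 and later). This is only a sketch; the value z = 1.0 is an arbitrary example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area to the left of z = 1.0, i.e. the proportion of values below z.
p_below = standard_normal.cdf(1.0)
print(round(p_below, 4))   # 0.8413
```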
Term
statistically significant
Definition
Results of a study that differ too much from what we would expect from randomization alone to be attributed to chance.
Term
stemplot
Definition
A graphical representation of a quantitative data set. Leading values of each data point are presented as stems and second digits are given as leaves.
Term
stratified sampling
Definition
A sampling scheme where the population has been divided into strata according to some characteristic and a simple random sample is selected from within each stratum.
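A minimal Python sketch of stratified sampling, assuming made-up strata (class years) and a hypothetical helper `stratified_sample`. A separate simple random sample is drawn inside every stratum.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Made-up strata: students grouped by class year.
strata = {
    "freshman":  [f"fr_{i}" for i in range(40)],
    "sophomore": [f"so_{i}" for i in range(35)],
    "junior":    [f"jr_{i}" for i in range(30)],
}
sample = stratified_sample(strata, n_per_stratum=5, seed=0)
print(sample)
```

Every stratum is represented in the result, unlike a multi-stage sample where only some groups are selected.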
Term
symmetric distribution
Definition
a density curve where the right half is a mirror image of the left half of the distribution. (Mean = median)
Term
undercoverage bias
Definition
Bias that occurs because the list of the population from which the sample is drawn is incomplete—meaning that some people in the population are not listed for selection.
Term
voluntary response sample
Definition
A method of sample selection that consists of people choosing themselves by responding to a general appeal.
Term
z-score
Definition
A measure of the number of standard deviations a value or observation is from the mean, a standardized value.
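In symbols, z = (x - mean) / (standard deviation). A tiny Python sketch with made-up numbers (an exam score of 86 when the mean is 70 and the standard deviation is 8):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

print(z_score(86, 70, 8))   # 2.0
```

A z-score of 2.0 means the observation is two standard deviations above the mean.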
Term
x̄
Definition
sample mean (x-bar)
Term
μ
Definition
population or distribution mean
Term
s
Definition
sample standard deviation
Term
σ
Definition
population or distribution standard deviation
Term
Q1
Definition
first quartile
Term
Q3
Definition
third quartile
Term
IQR
Definition
interquartile range= Q3 minus Q1