Term
|
Definition
| a goodness-of-fit measure in multiple regression analysis that penalizes additional explanatory variables by using a degrees of freedom adjustment in estimating the error variance |
|
|
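A minimal numpy sketch of the degrees-of-freedom adjustment described above, assuming fitted values y_hat from a regression with k explanatory variables (the function and variable names are hypothetical):

import numpy as np

def adjusted_r_squared(y, y_hat, k):
    """Adjusted R-squared: 1 - [SSR/(n - k - 1)] / [SST/(n - 1)]."""
    n = len(y)
    ssr = np.sum((y - y_hat) ** 2)        # sum of squared residuals
    sst = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return 1 - (ssr / (n - k - 1)) / (sst / (n - 1))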
Term
|
Definition
| the hypothesis against which the null hypothesis is tested |
|
|
Term
|
Definition
| the sum of numbers divided by n |
|
|
Term
|
Definition
| the group represented by the overall intercept in a multiple regression model that includes dummy explanatory variables |
|
|
Term
|
Definition
| a ceteris paribus change in one variable has an effect on another variable |
|
|
Term
|
Definition
| the difference between the expected value of an estimator and the population value it is supposed to be estimating |
|
|
Term
|
Definition
| an estimator whose expectation, or mean of its sampling distribution, differs from the population value it is supposed to be estimating |
|
|
Term
|
Definition
| a test for heteroskedasticity where the squared OLS residuals are regressed on the explanatory variables in the model |
|
|
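A sketch of the auxiliary regression this test uses, assuming numpy arrays y and X where X does not yet contain a constant column (statsmodels also ships het_breuschpagan for the same purpose):

import statsmodels.api as sm

def bp_test(y, X):
    """Regress squared OLS residuals on the explanatory variables; return the auxiliary F statistic and p-value."""
    X = sm.add_constant(X)
    u2 = sm.OLS(y, X).fit().resid ** 2   # squared OLS residuals
    aux = sm.OLS(u2, X).fit()            # auxiliary regression on the regressors
    return aux.fvalue, aux.f_pvalue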
Term
|
Definition
| all other relevant factors are held fixed |
|
|
Term
|
Definition
| a probability distribution obtained by adding the squares of independent standard normal random variables. The number of terms in the sum equals the degrees of freedom in the distribution |
|
|
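An illustrative simulation (not from the deck) of this construction: summing the squares of k independent standard normals yields a chi-square draw with k degrees of freedom.

import numpy as np

rng = np.random.default_rng(0)
k, draws = 5, 100_000
z = rng.standard_normal((draws, k))
chi2_samples = (z ** 2).sum(axis=1)   # each row becomes one chi-square(k) draw
print(chi2_samples.mean())            # close to k = 5, the mean of a chi-square(k)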
Term
|
Definition
| the multiple linear regression model under the full set of classical linear model assumptions |
|
|
Term
|
Definition
| a sample of natural clusters or groups that usually consist of people |
|
|
Term
|
Definition
| a rule used to construct a random interval so that a certain percentage of all data sets, determined by the confidence level, yields an interval that contains the population value |
|
|
Term
|
Definition
| the percentage of samples in which we want our confidence interval to contain the population value; 95% is the most common confidence level, but 90% and 99% are also used |
|
|
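A hedged sketch of how such an interval is typically built for a regression coefficient, assuming an estimate beta_hat, its standard error se, and df degrees of freedom (all names hypothetical):

from scipy import stats

def conf_interval(beta_hat, se, df, level=0.95):
    c = stats.t.ppf(1 - (1 - level) / 2, df)   # critical value from the t distribution
    return beta_hat - c * se, beta_hat + c * se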
Term
|
Definition
| an estimator converges in probability to the correct population value as the sample size grows |
|
|
Term
|
Definition
| a measure of linear dependence between two random variables |
|
|
Term
|
Definition
| in hypothesis testing, the value against which the test statistic is compared to determine whether or not the null hypothesis is rejected |
|
|
Term
|
Definition
| the interval at which time series data are collected. Yearly, quarterly, and monthly are the most common data frequencies |
|
|
Term
|
Definition
| in multiple regression analysis, the number of observations minus the number of estimated parameters |
|
|
Term
|
Definition
| the variable to be explained in the multiple regression model |
|
|
Term
|
Definition
| a variable that takes on the value of zero or one |
|
|
Term
|
Definition
| the mistake of including too many dummy variables among the independent variables; it occurs when an overall intercept is in the model and a dummy variable is included for each group |
|
|
Term
|
Definition
| an equation relating the dependent variable to a set of explanatory variables and unobserved disturbances, where unknown population parameters determine the ceteris paribus effect of each explanatory variable |
|
|
Term
|
Definition
| a relationship derived from economic theory or less formal economic reasoning |
|
|
Term
|
Definition
| the percentage change in one variable given a 1% ceteris paribus increase in another variable |
|
|
Term
|
Definition
| a term used to describe the presence of an endogenous explanatory variable |
|
|
Term
| endogenous explanatory variable |
|
Definition
| an explanatory variable in a multiple regression model that is correlated with the error term, either because of an omitted variable, measurement error, or simultaneity |
|
|
Term
|
Definition
| in simultaneous equation models, variables that are determined by the equations in the system |
|
|
Term
|
Definition
| the variable in a simple or multiple regression equation that contains unobserved factors that affect the dependent variable. The error term may also include measurement errors in the observed dependent or independent variables |
|
|
Term
|
Definition
| the variance of the error term in a multiple regression model |
|
|
Term
|
Definition
| the numerical value taken on by an estimator for a particular sample of data |
|
|
Term
|
Definition
| a rule for combining data to produce a numerical value for a population parameter; the form of the rule does not depend on the particular sample obtained |
|
|
Term
|
Definition
| any variable that is uncorrelated with the error term in the model of interest |
|
|
Term
|
Definition
| a measure of central tendency in the distribution of a random variable, including an estimator |
|
|
Term
|
Definition
| in probability, a general term used to denote an event whose outcome is uncertain. In econometric analysis, it denotes a situation where data are collected by randomly assigning individuals to control and treatment groups |
|
|
Term
| explained sum of squares (SSE) |
|
Definition
| the total sample variation of the fitted values in the multiple regression model |
|
|
Term
|
Definition
| in regression analysis, a variable that is used to explain variation in the dependent variable |
|
|
Term
|
Definition
| a mathematical function defined for all values of its argument that has an increasing slope but a constant proportionate change |
|
|
Term
|
Definition
| the probability distribution obtained by forming the ratio of two independent chi-square random variables, where each has been divided by its degrees of freedom |
|
|
Term
|
Definition
| a statistic used to test multiple hypotheses about the parameters in a multiple regression model |
|
|
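One common form of this statistic, sketched under the assumption that ssr_r and ssr_ur are the restricted and unrestricted sums of squared residuals, q is the number of restrictions, n the number of observations, and k the number of regressors in the unrestricted model:

from scipy import stats

def f_statistic(ssr_r, ssr_ur, q, n, k):
    f = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
    p_value = 1 - stats.f.cdf(f, q, n - k - 1)   # compare against an F(q, n - k - 1) distribution
    return f, p_value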
Term
|
Definition
| the estimated values of the dependent variable when the values of the independent variables for each observation are plugged into the OLS regression line |
|
|
Term
|
Definition
| the set of assumptions under which OLS is BLUE (best linear unbiased estimator): 1) linear in parameters 2) random sampling 3) sample variation in the explanatory variable 4) zero conditional mean 5) homoskedasticity |
|
|
Term
|
Definition
| the theorem that states that, under the five Gauss-Markov assumptions, the OLS estimator is BLUE (conditional on the sample values of the explanatory variables) |
|
|
Term
| best linear unbiased estimator (BLUE) |
|
Definition
| among all linear unbiased estimators, the estimator with the smallest variance. OLS is BLUE, conditional on the sample values of the explanatory variables, under the Gauss-Markov assumptions |
|
|
Term
|
Definition
| a statistic that summarizes how well a set of explanatory variables explains a dependent or response variable |
|
|
Term
|
Definition
| the bias in OLS due to omitted heterogeneity (or omitted variables) |
|
|
Term
|
Definition
| the variance of the error term, given the explanatory variables, is not constant |
|
|
Term
| heteroskedasticity of unknown form |
|
Definition
| heteroskedasticity that may depend on the explanatory variables in an unknown, arbitrary fashion |
|
|
Term
heteroskedasticity-robust F statistic
|
Definition
| an F-type statistic that is (asymptotically) robust to heteroskedasticity of unknown form |
|
|
Term
heteroskedasticity-robust LM statistic
|
Definition
| an LM statistic that is robust to heteroskedasticity of unknown form |
|
|
Term
| Heteroskedasticity-Robust Standard Error |
|
Definition
| a standard error that is (asymptotically) robust to heteroskedasticity of unknown form |
|
|
Term
| Heteroskedasticity-Robust t statistic |
|
Definition
| a t statistic that is (asymptotically) robust to heteroskedasticity of unknown form |
|
|
Term
|
Definition
| properties of estimators and test statistics that apply when the sample size grows without bound |
|
|
Term
|
Definition
| the errors in a regression model have constant variance conditional on the explanatory variables |
|
|
Term
| Assumption SLR.1 (Linear in Parameters) |
|
Definition
| In the population model, the dependent variable, y, is related to the independent variable, x, and the error (or disturbance), u, as y = β0 + β1x + u, where β0 and β1 are the population intercept and slope parameters, respectively. |
|
|
Term
| Assumption SLR. 2 (Random Sampling) |
|
Definition
| We have a random sample of size n, {(xi, yi): i = 1, 2, …, n}, following the population model y = β0 + β1x + u |
|
|
Term
| Assumption SLR.3 (Sample Variation in the Explanatory Variable) |
|
Definition
| The sample outcomes on x, namely, {xi, i = 1, …, n}, are not all the same value |
|
|
Term
| Assumption SLR.4 (Zero Conditional Mean) |
|
Definition
| The error u has an expected value of zero given any value of the explanatory variable. In other words, E(u|x) = 0 |
|
|
Term
| Assumption SLR.5 (Homoskedasticity) |
|
Definition
| The error u has the same variance given any value of the explanatory variable. In other words, Var(u|x) = σ² |
|
|
Term
|
Definition
| a statistical test of the null, or maintained, hypothesis against an alternative hypothesis |
|
|
Term
|
Definition
| the difference between the probability limit of an estimator and the parameter value |
|
|
Term
|
Definition
| an estimator does not converge (in probability) to the correct population parameter as the sample size grows |
|
|
Term
|
Definition
| in multiple regression, the partial effect of one explanatory variable depends on the value of a different explanatory variable |
|
|
Term
|
Definition
| an independent variable in a regression model that is the product of two explanatory variables |
|
|
Term
|
Definition
| in the equation of a line, the value of the y variable when the x variable is 0 |
|
|
Term
|
Definition
| the intercept in a regression model differs by group or time period |
|
|
Term
|
Definition
| the probability distribution determining the probabilities of outcomes involving two or more random variables |
|
|
Term
|
Definition
| a test involving more than one restriction on the parameters in a model |
|
|
Term
|
Definition
| failure to reject, using an F test at a specified significance level, that all coefficients for a group of explanatory variables are zero |
|
|
Term
| jointly statistically significant |
|
Definition
| the null hypothesis that two or more explanatory variables have zero population coefficients is rejected at the chosen significance level |
|
|
Term
| Lagrange Multiplier (LM) Statistic |
|
Definition
| a test statistic with a large-sample justification that can be used to test for omitted variables, heteroskedasticity, and serial correlation, among other model specification problems |
|
|
Term
|
Definition
| an estimator that minimizes a sum of squared residuals |
|
|
Term
|
Definition
| a regression model where the dependent variable and the independent variables are in level (or original) form |
|
|
Term
|
Definition
| a regression model where the dependent variable is in level form and (at least one of) the independent variables are in logarithmic form |
|
|
Term
|
Definition
| a function where the change in the dependent variable, given a one-unit change in an independent variable, is constant |
|
|
Term
|
Definition
| a mathematical function, defined only for positive arguments with a positive but decreasing slope |
|
|
Term
|
Definition
| a mathematical function, defined only for strictly positive arguments, with a positive but decreasing slope |
|
|
Term
|
Definition
| a regression model where the dependent variable is in logarithmic form and the independent variables are in level (or original) form |
|
|
Term
|
Definition
| a regression model where the dependent variable (and at least some of) the independent variables are in logarithmic form |
|
|
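A summary of how the slope is read under the four functional forms defined above (standard textbook interpretations, stated here for reference rather than quoted from the deck):
level-level: y = β0 + β1x, so a one-unit increase in x changes y by β1 units
log-level: log(y) = β0 + β1x, so a one-unit increase in x changes y by roughly 100·β1 percent
level-log: y = β0 + β1·log(x), so a 1% increase in x changes y by roughly β1/100 units
log-log: log(y) = β0 + β1·log(x), so a 1% increase in x changes y by roughly β1 percent (an elasticity)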
Term
|
Definition
|
|
Term
|
Definition
| the expected squared distance that an estimator is from the population value; it equals the variance plus the square of any bias |
|
|
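In symbols (a standard decomposition, not quoted from the deck): MSE(θ̂) = E[(θ̂ - θ)²] = Var(θ̂) + [Bias(θ̂)]², where θ is the population value being estimated.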
Term
|
Definition
| the difference between an observed variable and a variable that belongs in a multiple regression equation |
|
|
Term
|
Definition
| in a probability distribution, it is the value where there is 50% chance of being below the value and a 50% chance of being above it. In a sample of numbers, it is the middle value after the numbers have been ordered |
|
|
Term
|
Definition
| a data problem that occurs when we do not observe values of some variables for certain observations (individuals, cities, time periods, and so on) |
|
|
Term
|
Definition
| a term that refers to the correlation among the independent variables in a multiple regression model; it is usually invoked when some correlations are "large," but an actual magnitude is not well-defined |
|
|
Term
|
Definition
| a test of a null hypothesis involving more than one restriction on the parameters |
|
|
Term
| Multiple Linear Regression (MLR) Model |
|
Definition
| a model linear in its parameters, where the dependent variable is a function of independent variables plus an error term |
|
|
Term
| multiple regression analysis |
|
Definition
| a type of analysis that is used to describe estimation of and inference in the multiple linear regression model |
|
|
Term
|
Definition
| a function whose slope is not constant |
|
|
Term
|
Definition
| two or more models where no model can be written as a special case of the other by imposing restrictions on the parameters |
|
|
Term
|
Definition
| a sample obtained other than by sampling randomly from the population of interest |
|
|
Term
|
Definition
| a probability distribution commonly used in statistics and econometrics for modeling a population. Its probability distribution has a bell shape. |
|
|
Term
|
Definition
| the classical linear model assumption that states that the error (or dependent variable) has a normal distribution, conditional on the explanatory variables |
|
|
Term
|
Definition
| in classical hypothesis testing, we take this hypothesis as true and require the data to provide substantial evidence against it |
|
|
Term
|
Definition
| the bias that arises in the OLS estimators when a relevant variable is omitted from the regression |
|
|
Term
|
Definition
| an alternative hypothesis that states that the parameter is greater than (or less than) the value hypothesized under the null |
|
|
Term
|
Definition
| a hypothesis test against a one-sided alternative |
|
|
Term
| ordinary least squares (OLS) |
|
Definition
| a method for estimating the parameters of a multiple linear regression model. The OLS estimates are obtained by minimizing the sum of squared residuals |
|
|
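A minimal numpy sketch of this estimator, assuming the design matrix X already includes a column of ones for the intercept; minimizing the sum of squared residuals leads to the usual least-squares solution:

import numpy as np

def ols(X, y):
    """Return the coefficient vector that minimizes the sum of squared residuals."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves the least-squares problem
    return beta_hat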
Term
|
Definition
| observations in a data set that are substantially different from the bulk of the data, perhaps because of error or because some data are generated by a different model than most of the other data |
|
|
Term
| overall significance of a regression |
|
Definition
| a test of the joint significance of all explanatory variables appearing in a multiple regression equation |
|
|
Term
|
Definition
| the smallest significance level at which the null hypothesis can be rejected |
|
|
Term
|
Definition
| an unknown value that describes a population relationship |
|
|
Term
|
Definition
| the effect of an explanatory variable on the dependent variable, holding other factors in the regression model fixed |
|
|
Term
|
Definition
| the proportionate change in a variable, multiplied by 100 |
|
|
Term
|
Definition
| in multiple regression, one independent variable is an exact linear function of one or more other independent variables |
|
|
Term
|
Definition
| a probability distribution for count variables |
|
|
Term
|
Definition
| a well-defined group (of people, firms, cities, and so on) that is the focus of a statistical tool or econometric analysis |
|
|
Term
|
Definition
| the practical or economic importance of an estimate, which is measured by its sign and magnitude, as opposed to its statistical significance |
|
|
Term
|
Definition
| the estimate of an outcome obtained by plugging specific values of the explanatory variables into an estimated model, usually a multiple regression model |
|
|
Term
|
Definition
| a mathematical function where the vector argument both pre- and post-multiplies a square, symmetric matrix |
|
|
Term
|
Definition
| functions that contain squares of one or more explanatory variables; they capture diminishing or increasing effects on the dependent variable |
|
|
Term
|
Definition
| in a multiple regression model, the proportion of the total sample variation in the dependent variable that is explained by the independent variables |
|
|
Term
|
Definition
| a sample obtained by sampling randomly from the specified population |
|
|
Term
| Regression Specification Error Test (RESET) |
|
Definition
| a general test for functional form in a multiple regression model; it is an F test of the joint significance of the squares, cubes, and perhaps higher powers of the fitted values from the initial estimation |
|
|
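A sketch of the procedure described above, assuming numpy arrays y and X where X already includes a constant column; the squares and cubes of the fitted values are added and then tested for joint significance:

import numpy as np
import statsmodels.api as sm

def reset_test(y, X):
    fitted = sm.OLS(y, X).fit().fittedvalues
    X_aug = np.column_stack([X, fitted ** 2, fitted ** 3])
    res = sm.OLS(y, X_aug).fit()
    k = X.shape[1]
    R = np.zeros((2, X_aug.shape[1]))
    R[0, k] = 1.0        # coefficient on fitted**2
    R[1, k + 1] = 1.0    # coefficient on fitted**3
    ftest = res.f_test(R)                 # joint F test of the two added terms
    return float(np.squeeze(ftest.fvalue)), float(ftest.pvalue)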
Term
| Misspecification Analysis |
|
Definition
| the process of determining likely biases that can arise from omitted variables, measurement error, simultaneity, and other kinds of model misspecification |
|
|
Term
|
Definition
| the difference between the actual value and the fitted (or predicted) value; there is a residual for each observation in the sample used to obtain the OLS regression line |
|
|
Term
|
Definition
| in hypothesis testing, the model obtained after imposing all of the restrictions required under the null |
|
|
Term
|
Definition
| the percentage change in the dependent variable given a one-unit increase in an independent variable |
|
|
Term
|
Definition
| the probability of type I error in hypothesis testing |
|
|
Term
|
Definition
| in the equation of a line, the change in the y variable when the x variable increases by 1 |
|
|
Term
|
Definition
| the coefficient of an independent variable in a multiple regression model |
|
|
Term
|
Definition
| a correlation between two variables that is not due to causality, but perhaps to the dependence of the two variables on another unobserved factor |
|
|
Term
|
Definition
| a common measure of spread in the distribution of a random variable |
|
|
Term
|
Definition
| generically, an estimate of the standard deviation of an estimator |
|
|
Term
|
Definition
| the act of testing hypotheses about population parameters |
|
|
Term
|
Definition
| the importance of an estimate as measured by the size of a test statistic, usually a t statistic |
|
|
Term
| sum of squared residuals (SSR) |
|
Definition
| in multiple regression analysis, the sum of the squared OLS residuals across all observations |
|
|
Term
|
Definition
| the distribution of the ratio of a standard normal random variable and the square root of an independent chi-square random variable, where the chi-square random variable is first divided by its degrees of freedom |
|
|
Term
|
Definition
| the statistic used to test a single hypothesis about the parameters in an econometric model |
|
|
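A sketch of how this statistic and its two-sided p-value are commonly computed, assuming an estimate beta_hat, its standard error se, df degrees of freedom, and a hypothesized value under the null (names are hypothetical):

from scipy import stats

def t_test(beta_hat, se, df, hypothesized=0.0):
    t = (beta_hat - hypothesized) / se
    p_two_sided = 2 * (1 - stats.t.cdf(abs(t), df))
    return t, p_two_sided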
Term
|
Definition
| data collected over time on one or more variables |
|
|
Term
| Total Sum of Squares (SST) |
|
Definition
| the total sample variation in a dependent variable about its sample average |
|
|
Term
|
Definition
| the actual population model relating the dependent variable to the relevant independent variables, plus a disturbance, where the zero conditional mean assumption holds |
|
|
Term
|
Definition
| an alternative where the population parameter can be either less than or greater than the value stated under the null hypothesis |
|
|
Term
|
Definition
| a test against a two-sided alternative |
|
|
Term
|
Definition
| a rejection of the null hypothesis when it is true |
|
|
Term
|
Definition
| the failure to reject the null hypothesis when it is false |
|
|
Term
|
Definition
| when a null hypothesis is rejected in favor of a one-tailed alternative hypothesis but the test statistic has the opposite sign of the one claimed by the alternative hypothesis. |
|
|
Term
|
Definition
| an estimator whose expected value (or mean of its sampling distribution) equals the population value (regardless of the population value) |
|
|
Term
|
Definition
| a measure of spread in the distribution of a random variable |
|
|
Term
| Weighted Least Squares (WLS) Estimator |
|
Definition
| an estimator used to adjust for a known form of heteroskedasticity, where each squared residual is weighted by the inverse of the (estimated) variance of the error |
|
|
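A minimal sketch under a known variance form Var(u|x) = σ²·h(x), assuming h holds the (estimated) variance function evaluated at each observation; statsmodels' weights argument takes the inverse of that variance:

import statsmodels.api as sm

def wls_fit(y, X, h):
    """Weighted least squares with each observation weighted by 1/h."""
    return sm.WLS(y, sm.add_constant(X), weights=1.0 / h).fit()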
Term
|
Definition
| A test for heteroskedasticity that involves regressing the squared OLS residuals on the OLS fitted values and on the squares of the fitted values; in its most general form, the squared OLS residuals are regressed on the explanatory variables, the squares of the explanatory variables, and all the nonredundant interactions of the explanatory variables |
|
|
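A sketch using the het_white helper from statsmodels, assuming numpy arrays y and X without a constant column; it implements the squared-residual regression on the regressors, their squares, and cross products described above:

import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

def white_test(y, X):
    X_const = sm.add_constant(X)
    resid = sm.OLS(y, X_const).fit().resid
    return het_white(resid, X_const)   # (LM stat, LM p-value, F stat, F p-value)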
Term
|
Definition
| of two estimators, one is more efficient than the other if it has a smaller variance |
|
|
Term
|
Definition
| y is related to x and the error term in a linear function |
|
|
Term
|
Definition
| allows us to predict y from x |
|
|
Term
|
Definition
|
|
Term
|
Definition
| a random sample of size n (each member of the population has an equal chance of selection) |
|
|
Term
|
Definition
| leads to sample-based statistics that approximate the values of the population parameters |
|
|
Term
|
Definition
|
|
Term
|
Definition
| 1) none of the independent variables is constant 2) no exact linear relationships among the independent variables |
|
|
Term
|
Definition
| lets us determine which explanatory variable has which effect |
|
|
Term
|
Definition
| the slope and intercept estimates are not defined |
|
|
Term
|
Definition
| u has an expected value of 0 with any value of the independent variable |
|
|
Term
|
Definition
| allows us to derive statistical properties conditional on the values of x in the sample |
|
|
Term
|
Definition
| likely omitted an important variable and so the explanatory power suffers. This is an example of bias due to misspecification |
|
|
Term
|
Definition
| u has the same (constant) variance given any value of the independent variable |
|
|
Term
|
Definition
| needed to justify t tests, F tests, and confidence intervals |
|
|
Term
|
Definition
| heteroskedasticity (non constant variance) affects efficiency |
|
|
Term
|
Definition
| u is independent of the explanatory variables and is normally distributed with a mean of zero |
|
|
Term
|
Definition
| that makes statistical inference possible |
|
|
Term
|
Definition
| creates problems with confidence intervals and significance tests because they are based on assumptions of normally distributed errors |
|
|
Term
| difference between efficiency, consistency, and unbiased |
|
Definition
| 1) efficiency: of two estimators, one is more efficient than the other if it has a smaller variance 2) consistency: as the sample size increases, the estimator converges (in probability) to the true population value 3) bias: the difference between the expected value of an estimator and the true population value |
|
|
Term
|
Definition
| Type III error occurs when a null hypothesis is rejected in favor of a one-tailed alternative hypothesis but the test statistic has the opposite sign of the one claimed by the alternative hypothesis. |
|
|
Term
| What is the difference between an F test and R squared? |
|
Definition
| An F statistic is the only real measure of goodness of fit, as the R-squared only accounts for the proportion of variance explained |
|
|
Term
| Compare errors (measurement error, individual error, random error, population variance, sample variance, standard deviation, standard error, residual error, type I, type II, type III). |
|
Definition
| 1) Individual error: the difference between the expected value and an individual observed value. 2) Random error: error due to random variability in an individual observation (part of individual error). 3) Population variance (estimate): the sum of the squared deviations of the observed values from the mean, divided by the number in the population. 4) Sample variance: the sum of the squared deviations of the observed values from the mean (the individual errors), divided by the number of comparisons (sample size minus 1); squaring the individual errors removes the negative signs.
5) Standard deviation: a standardized unit of error that accounts for both the magnitude of the observed values and the number of observations in the sample; it is the square root of the sample variance. In a normally distributed sample, roughly 68% of the area under the distribution curve lies within one standard deviation of the mean, so there is about a 68% chance of finding a value within one standard deviation of the mean. 6) Standard error: a narrower measure of spread about the mean than the standard deviation, calculated as the standard deviation divided by the square root of the number of observations. As the number of observations increases, the estimate of the sample mean gets closer to the "true" (population) mean; this is the error reported with mean values, and it is the error term used in the t statistic for hypothesis testing and the 95% confidence interval. 7) Residual error: in regression analysis, the variation in the dependent variable not explained by the variation in the independent variables; it is found by computing the fitted values of y from the regression model and the observed values of x, and taking the difference between the observed and fitted values.
8) Errors associated with hypothesis testing: a Type I error is rejecting a null hypothesis when it is really true; a Type II error is accepting (failing to reject) the null hypothesis when it is really false; a Type III error is when the rejection of the null hypothesis is correct but the one-tailed alternative that is accepted is incorrect. |
|
|
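A small numpy illustration of the sample variance, standard deviation, and standard error of the mean described above, using hypothetical data:

import numpy as np

x = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])          # hypothetical observations
n = len(x)
sample_var = np.sum((x - x.mean()) ** 2) / (n - 1)     # divide by n - 1 (number of comparisons)
sample_sd = np.sqrt(sample_var)                        # standard deviation
std_error = sample_sd / np.sqrt(n)                     # standard error of the sample mean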
Term
| What are dummy (binary), proxy, quadratic, interaction, and natural logarithms and when are they used? |
|
Definition
| 1) dummy variables are coded as 0 or 1 and used to include qualitative data 2) proxy variables are used when the needed variable can't be measured, so you find something similar to replace it 3) quadratic terms are used when the data have a turning point 4) interaction terms are used when the effect of one variable depends on the partial effect of another variable 5) logarithms are used when we need to bring numbers down and they reduce variability (white noise); they only work for positive numbers. |
|
|
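A hedged statsmodels sketch showing a dummy, a quadratic term, an interaction, and a natural-log transformation in one formula; the variables wage, educ, exper, and female and the tiny synthetic data are purely illustrative:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Tiny synthetic data, purely for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "educ": rng.integers(8, 18, 200),
    "exper": rng.integers(0, 30, 200),
    "female": rng.integers(0, 2, 200),
})
df["wage"] = np.exp(0.5 + 0.08 * df["educ"] + 0.02 * df["exper"]
                    - 0.0003 * df["exper"] ** 2 - 0.2 * df["female"]
                    + rng.normal(0, 0.3, 200))

# female is a 0/1 dummy, I(exper**2) a quadratic term, female:educ an interaction,
# and np.log(wage) puts the dependent variable in logarithmic form.
model = smf.ols("np.log(wage) ~ educ + exper + I(exper**2) + female + female:educ",
                data=df).fit()
print(model.summary())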
Term
| How do we read a p-value in the absence of critical values? |
|
Definition
| "the probability of committing a Type I error if we reject the null hypothesis that _____ is 4.8% (p value: 0.048)." |
|
|
Term
| Five steps for checking a model |
|
Definition
| 1) F stat 2) P value associated with F stat 3) R squared 4) signs on coefficients 5) individual significance |
|
|
Term
| What is the difference between reliability and validity? |
|
Definition
| 1) reliability (consistency): the degree to which measures yield the same result when applied under the same circumstances 2) validity: the effectiveness of measuring instruments, to the extent that the instrument measures the phenomenon one wants to study |
|
|
Term
|
Definition
| the dummy variable with a value of zero |
|
|
Term
|
Definition
| test for functional form misspecification |
|
|
Term
|
Definition
|
|
Term
|
Definition
| null hypothesis is true, but we reject it |
|
|
Term
|
Definition
| null hypothesis is false, but we don't reject it |
|
|
Term
|
Definition
| statistically significant effect, but does not follow the hypothesis |
|
|
Term
|
Definition
| p value associated with F used to test the null hypothesis that all the model coefficients are zero |
|
|
Term
|
Definition
| the variance of the unobservable error, u, conditional on the explanatory variable, is not constant |
|
|
Term
|
Definition
| the value of y when x equals 0 |
|
|
Term
|
Definition
| a variable that is the product of two variables |
|
|
Term
| What are the five steps for checking a model? |
|
Definition
| 1. look at f-statistic 2. look at p-value associated with f statistic 3. see how much variation is explained by the model (r-squared) 4. look at coefficients for correct signs 5. are variables significant |
|
|
Term
| Explain confidence interval. |
|
Definition
| we are 95% sure that the true value of the coefficient in the model that generated these data falls within this interval |
|
|
Term
| How do we read P>|t| = 0.000 or P>|t| = 0.048? |
|
Definition
| the probability of committing a Type I error if we reject the null hypothesis (that the slope coefficient is zero) is essentially zero (less than 1 in 1,000); for 0.048, it is 4.8% |
|
|
Term
|
Definition
| the % of our samples in which we want our confidence interval to contain the population value |
|
|
Term
|
Definition
| as the sample size increases, the estimate (for example, the slope) converges in probability to the true population value |
|
|
Term
|
Definition
| the degree to which measures yield the same results when applied under the same circumstances |
|
|
Term
|
Definition
| the effectiveness of the measuring instrument, to the extent that the instrument measures the phenomenon one wants to study |
|
|