Shared Flashcard Set

Details

Title

PSY 602 - Tests/Measure. TEST 2

Description

TEST 2 - Reliability, Validity, Item Analysis, Testing in Educational Settings

Subject

Psychology

Level

Graduate

Created

11/18/2009

Term

What does X=T+E stand for?

Definition

observed score (X) = true score (T) + error (E)
FOR INDIVIDUALS

Term

Face validity

Definition

Does the item "appear" to be measuring the construct?
"face value"
Is it obvious that the test is measuring what it is supposed to be measuring?
Can be manipulated

Term

Content validity

Definition

Is there a proportionate sampling of the universe of possible items?
Variety
ALL aspects taken into account

Term

Criterion-related validity

Definition

Is there a correlation between the test (x) and some criterion (y)?
-predictive (future) vs concurrent (current)
must have a direct measure (e.g. a rating by someone else), not an indirect measure such as self-report

Term

Construct validity

Definition

Do the patterns of correlations with other measures make theoretical sense?
"theoretical validity"
-congruent, convergent, discriminant

Term

Under construct validity, what is congruent validity?

Definition

correlations with other measures of the same construct
-high correlation

Term

Under construct validity, what is convergent validity?

Definition

correlations with measures of similar constructs
-intermediate correlation

Term

Under construct validity, what is discriminant validity?

Definition

correlations with measures of unrelated constructs
-low correlation

Term

Given y=5x+12 and SEM=3, what is the 95% confidence interval when x=2?

Definition

Interpretation/Translation:
22 ± 6 OR 16 ≤ y ≤ 28

Work shown:
y = 5(2) + 12 = 22
For 95% confidence the margin is 2 × SEM = 2(3) = 6
So the interval is 22 ± 6

Term

What are the sources of error?

Definition

-true change
-item sampling
-statistical factors
-random factors

Term

Which source of error most affects test/retest and split half reliability assessments?

Definition

test/retest - true change
split half - statistical factors

Term

What is the goal of test/retest?

Definition

stability: scores stay stable over time, with an acceptable reliability of r ≥ .80

Term

What are the types of reliability methods with the goal of internal consistency?

Definition

-alternate forms (items measuring the same construct)
-split half (correlating one half of the test with the other half, e.g. odd vs. even items, with the Spearman-Brown prophecy formula to correct)
-Cronbach's Alpha (a way of estimating the average correlation among all possible pairs of items, e.g. Likert-scale items)
-KR20 & KR21 (same as Cronbach's Alpha but for dichotomous items)
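
As an illustration of the internal-consistency idea, here is a minimal sketch of Cronbach's Alpha using the standard variance-based formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical and are not part of the original card.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Estimate Cronbach's Alpha.

    rows: one list of item scores per respondent (all the same length).
    """
    k = len(rows[0])  # number of items
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents answering 3 Likert-type items.
responses = [[4, 5, 4], [2, 3, 2], [3, 3, 4], [5, 5, 5]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

For dichotomous (0/1) items the same computation corresponds to the KR20 case.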

Term

What is the Spearman-Brown prophecy formula?

Definition

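to correct the split-half estimate of reliability, which is based on only half of the test's items (a restricted number of items), up to the full test length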
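
A short sketch of the correction, assuming the usual form of the Spearman-Brown formula, r_full = (n × r) / (1 + (n − 1) × r), where n is the factor by which the test is lengthened (n = 2 for split half). The function name and the example correlation of .60 are illustrative assumptions.

```python
def spearman_brown(r_half, factor=2):
    """Project reliability for a test `factor` times as long.

    With factor=2 this corrects a split-half correlation to full length.
    """
    return (factor * r_half) / (1 + (factor - 1) * r_half)


# Hypothetical split-half correlation of .60 between odd and even items.
r_full = spearman_brown(0.60)
print(round(r_full, 2))  # roughly .75
```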

Term

What are the confidence intervals of SEM?

Definition

68% confidence = score ± 1 SEM
95% confidence = score ± 2 SEM
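
The two rules above can be sketched as a small helper. The function name is hypothetical; the multipliers (1 SEM for 68%, 2 SEM for 95%) are the rounded values given on this card.

```python
def sem_interval(score, sem, confidence=95):
    """Return (lower, upper) bounds around a score using the SEM rule of thumb."""
    multipliers = {68: 1, 95: 2}  # number of SEMs per confidence level
    if confidence not in multipliers:
        raise ValueError("confidence must be 68 or 95")
    margin = multipliers[confidence] * sem
    return score - margin, score + margin


# Example: a score of 22 with SEM = 3.
print(sem_interval(22, 3, 95))
print(sem_interval(22, 3, 68))
```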

Term

What is the purpose of item analysis in general?

Definition

to detect and remove items that fail to discriminate to enhance reliability and validity of the test

Term

Explain item analysis for maximal performance measures.

Definition

Defines "top group" and "bottom group" and computes difficulty index (p) and Discrimination index (D)

Term

What is the difficulty index?

Definition

indicated by "p"
equation: (p top + p bottom)/2
Optimum p = .50, higher = easy, lower = difficult

Term

What is the discrimination index?

Definition

indicated by "D"
Equation: p top - p bottom
Acceptable discrimination is D ≥ .30, unacceptable is D < .30

Term

What is intrinsic ambiguity?

Definition

bad kind of ambiguity
unacceptable discrimination index
top group does not have knowledge

Term

What is extrinsic ambiguity?

Definition

good kind of ambiguity
acceptable discrimination index
bottom group does not have knowledge

Term

Item Analysis Maximal Performance Sample:
Group   #correct   #incorrect   p   D
T       7          3
B       4          6
Give p, give D, give 3 part interpretation

Definition

p=.55 (more than 1/2 got it right)
D=.30
Interpretation: slightly easy (b/c p above .50), acceptable discrimination (b/c D is .30) and extrinsic ambiguity (b/c D is acceptable)
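
The arithmetic behind this answer can be checked with a minimal sketch. The function name is hypothetical; exact fractions are used so that the D ≥ .30 cutoff is not affected by floating-point rounding.

```python
from fractions import Fraction


def item_analysis(top_correct, top_n, bottom_correct, bottom_n):
    """Return (p, D, acceptable) for one maximal-performance item."""
    p_top = Fraction(top_correct, top_n)           # proportion correct, top group
    p_bottom = Fraction(bottom_correct, bottom_n)  # proportion correct, bottom group
    p = (p_top + p_bottom) / 2                     # difficulty index
    d = p_top - p_bottom                           # discrimination index
    return p, d, d >= Fraction(3, 10)              # acceptable if D >= .30


# Sample item: top group 7 correct of 10, bottom group 4 correct of 10.
p, d, acceptable = item_analysis(7, 10, 4, 10)
print(float(p), float(d), acceptable)
```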

Term

Briefly explain item analysis of typical performance measures.

Definition

Since there are no right and wrong answers, the item difficulty index (p) is not used (difficulty level is irrelevant)

Term

What are the 2 measures of discrimination in typical performance item analysis?

Definition

--D = Discrimination index (% who said yes in the top group minus % who said yes in the bottom group); if D ≥ .30 then acceptable
--Rpb = item-total correlation (correlation between item score and total score; the closer to 1, the better the discrimination)

Term

What does item-total correlation rely on?

Definition

statistical significance as the criterion, which takes sample size into account (D does not).
item total is more sensitive measure of discrimination
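
A minimal sketch of an item-total correlation, computed as a Pearson correlation between a yes/no item (scored 1/0) and the total score, which is what the point-biserial correlation amounts to. The function name and data are hypothetical, and the significance test that the card mentions is not included here.

```python
from math import sqrt


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between one item's scores and the total scores."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    # Sums of cross-products and squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_scores)
    syy = sum((y - mean_y) ** 2 for y in total_scores)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: 4 respondents, item answered yes (1) or no (0).
item = [1, 1, 0, 0]
total = [10, 8, 4, 2]
print(round(item_total_correlation(item, total), 3))
```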

Term

What type of validity is emphasized by achievement tests?

Definition

content validity

Term

What type of validity is emphasized by aptitude tests?

Definition

criterion-related and construct validities

Term

Explain 4 of the subtests of the Stanford-Binet.

Definition

-verbal reasoning, crystallized
-quantitative reasoning, crystallized
-abstract/visual reasoning, fluid
-short term memory, memory specific

Term

How is content validity addressed on achievement tests in schools?

Definition

a matter of finding the best fit between the school/district curriculum and the emphasis of a given test

Term

How is basal age computed in the Stanford-Binet?

Definition

highest level at which all items are passed

Term

How is ceiling age computed on the Stanford-Binet?

Definition

level at which all items are answered incorrectly

Term

How is the Stanford-Binet scored?

Definition

basal age + months of credit for items gotten right

Term

What is the current deviation IQ equation?
Why is deviation IQ preferred to the old ratio IQ?

Definition

IQ = 16z + 100
Deviation IQ means the same across all age levels
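
A brief sketch of the deviation IQ computation. It assumes z is the person's standard score relative to a same-age norm group; the function name, parameter names, and the norm-group numbers are hypothetical. The SD of 16 is the value given on this card.

```python
def deviation_iq(raw_score, age_group_mean, age_group_sd, iq_sd=16):
    """Convert a raw score to a deviation IQ with mean 100 and SD iq_sd."""
    # z: how many SDs the raw score lies from the same-age group's mean.
    z = (raw_score - age_group_mean) / age_group_sd
    return iq_sd * z + 100


# Hypothetical norm group: mean 50, SD 10. A raw score of 60 gives z = 1.
print(deviation_iq(60, 50, 10))
```

Because z is always computed against the person's own age group, the same IQ value marks the same relative standing at every age.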

Term

What is g-factor & what is s-factor?

Definition

G-Factor: general factor, core of intelligence
S-Factor: specific factors, unique understanding

Term

What are the verbal subtests of the WAIS-R (Wechsler Adult Intelligence Scale)?

Definition

1 information
2 comprehension
3 arithmetic
4 similarities
5 vocabulary
6 digit span
7 letter-number sequencing

Term

What are the performance subtests for the WAIS-R?

Definition

8 picture completion
9 block design
10 picture arrangement
11 object assembly
12 digit symbol
13 matrix reasoning
14 symbol search

Term

Explain Cattell's fluid and crystallized IQ concept.

Definition

Fluid IQ is innate and is shown and tested through matrices tests

Crystallized IQ is gained with experience with language and is shown and tested through vocabulary and verbal testing

Term

What is the most common IQ test?

Definition

WAIS-R (Wechsler Adult Intelligence Scale)

Term

What does the public law IDEA (1990) provide for?

Definition

For special needs students
1 least restrictive environment
2 individual education plans (IEPs)
3 clearly defined disorders

Term

What does the public law PL95-561 provide for?

Definition

for gifted and talented students
-incentives to schools offering gifted programs (gifted is 1 1/2 SD above avg)

Term

What is giftedness?

Definition

1 1/2 SD above avg

Term

What is mental retardation?

Definition

IQ at least 2 SD below mean, some states also require assessment of psychosocial maturity

Term

What are learning disorders?

Definition

LD - not any other disorder, discrepancy of 1 1/2 SD in aptitude vs achievement

Term

What are behavioral disorders?

Definition

presence or absence of key behaviors
requires behavioral assessment
Example is ADD, ADHD

Term

What are the 3 issues of test bias?

Definition

measurement bias
prediction bias
content bias

Term

What is measurement bias?

Definition

differential difficulty across groups
example: item difficulty by group

Term

What is prediction bias?

Definition

differential prediction of scholastic success across groups
predicting better for one group than another
example: IQ predicting school GPA

Term

What is content bias?

Definition

content of items biased towards the dominant group
example: regional, cultural

Term

What are the 3 approaches to cross-cultural testing?

Definition

-mainstream approach
-pluralistic approach
-"culture-fair" tests

Term

What is the mainstream approach to cross-cultural testing?

Definition

one set of norms that are not differentiated by a group (race/culture)
Tests should reflect dominant culture since schools are embedded in the dominant culture

Term

What is the pluralistic approach to cross-cultural testing?

Definition

compare "like with like"
Same test with different norms for different groups

Term

What is the "Culture Fair" test approach to cross-cultural testing?

Definition

use of non-verbal items
change assessments by removing all words so that anyone can take the test

Term

What are the problems with the "culture fair" test approach to cross-cultural testing?

Definition

-assumptions of universal understanding (there is no such thing)
-restricted range of abilities tested
-lack of comparability with mainstream IQ tests

Term

If you were constructing a depression scale, how would you deal with face validity and content validity?

Definition

face validity: manipulate face validity so that it is not apparent that the scale is measuring depression; that way the test taker does not skew the answers one way or the other

content validity: make sure that there are questions to check for all possible parts, types and situations of depression.

Term

If you were constructing a depression scale, how would you deal with criterion-related validity and construct validity?

Definition

criterion-related validity: look for correlations with scores on such tests as the social desirability scale, body attitude, and the stress quiz; see if another test predicts depression in the client; see if the depression scale matches counselor observations

construct validity: make sure the test correlates well with other depression scales like Beck's depression scale. Make sure it correlates low with unrelated scales like the masculine/feminine scale
