the way individuals adapt themselves to the requirements of their physical and social environment.
behaviors that allow an individual to live, keep safe, and thrive in both good and bad times.


arithmetic average of scores 


average squared distance between each score and the mean.
s², or σ²
measure of the dispersion of scores around the mean of the distribution 


Most frequent score in a distribution. 


The middle score
above 50% of test takers and below 50% of test takers


s or σ
positive square root of the variance
Bell curve: about 68% of scores fall within 1 standard deviation of the mean; about another 27% fall between 1 and 2 standard deviations above or below it.
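The measures defined on the cards above (mean, median, mode, variance, standard deviation) can be sketched with Python's standard `statistics` module. The score list is hypothetical, made up for illustration, and the population formulas (σ², σ) are assumed rather than the sample formulas.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [60, 70, 80, 80, 90, 100]

mean = statistics.mean(scores)           # arithmetic average of scores
median = statistics.median(scores)       # middle score (average of the two middle values here)
mode = statistics.mode(scores)           # most frequent score
variance = statistics.pvariance(scores)  # average squared distance from the mean (sigma squared)
std_dev = statistics.pstdev(scores)      # positive square root of the variance (sigma)

print(mean, median, mode, variance, std_dev)
```

Swapping in `statistics.variance` and `statistics.stdev` would give the sample versions (s², s) instead.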



for translating test scores into a statement about the behavior to be expected of a person with that score or their relationship to a specified subject matter. Most tests and quizzes written by school teachers are criterion-referenced tests. The objective is simply to see whether the student has learned the material.
Predictive in nature. 


basic standard score
mean always = 0
standard deviation always= 1
the number of standard deviations a score falls above or below the mean.
when the distribution is normal (bell curve), each z score corresponds to a known percentile.
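A minimal sketch of the z-score calculation, assuming the mean and standard deviation of the distribution are already known. The function names and the example values (mean 100, standard deviation 15) are hypothetical; the percentile step only holds if the scores really are normally distributed.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

def percentile_from_z(z):
    """Percentage of scores falling below z, assuming a normal distribution."""
    return 100 * NormalDist().cdf(z)

# Hypothetical example: raw score 115 on a test with mean 100 and SD 15.
z = z_score(115, 100, 15)
print(z, round(percentile_from_z(z), 1))
```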


average of 50
range of 40-60 (68%), i.e. a standard deviation of 10
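Because this score has a mean of 50 and a 68% range of 40-60, it can be sketched as a rescaled z score. The function name is hypothetical, and the conversion assumes a standard deviation of 10, as the 40-60 range implies.

```python
def t_score(z):
    """Convert a z score to a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z

# One standard deviation below, at, and above the mean.
print(t_score(-1), t_score(0), t_score(1))
```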


Relative absence of random error in measurement
item reliability
inter observer agreement
stability
test retest
having students take the same test twice to see if they get similar scores.



Term
When would you use coefficient alpha?

It is the average of all possible split halves. You would use it when you want to test reliability through internal consistency.
It is computed from the item variances and the total test variance.
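A minimal sketch of how coefficient alpha can be computed from item variances and total test variance, using the usual formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical, and sample variances are assumed throughout.

```python
import statistics

def coefficient_alpha(item_scores):
    """item_scores: one row per test taker, one column per item."""
    k = len(item_scores[0])                       # number of items
    columns = list(zip(*item_scores))             # regroup scores by item
    item_variances = [statistics.variance(col) for col in columns]
    total_scores = [sum(row) for row in item_scores]
    total_variance = statistics.variance(total_scores)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 test takers, 3 items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together, which is what internal consistency means here.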


extent to which the interpretation of test scores is supported by evidence and theory.
Must be reliable.
Specific to individual student.
does it lead to correct inferences about the person being tested?
Does it measure what it is meant to measure?
Correlations. 


extent to which a measure represents all facets of a given social construct
For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioral dimension.


Term
Criterion Related Validity 

Concurrent and Predictive
is a measure of how well one variable or set of variables predicts an outcome based on information from other variables 


Evaluation of construct validity requires examining the correlations of the measure with variables that are known to be related to the construct (purportedly measured by the instrument being evaluated) or for which there are theoretical grounds for expecting it to be related.


formal and informal assessment procedures employed by teachers during the learning process in order to modify teaching and learning activities to improve student attainment 


similar to the grade-equivalent score. A raw score corresponds to a chronological age. A teacher can use a norms table to determine such scores. A portion of a norms table would look something like this:


Age-equivalent score, by raw score (rows) and chronological age (columns)
Raw score    6.0    6.6    7.0    7.6    8.0
20           6.7    6.4    6.2    6.0    5.7
21           6.8    6.5    6.3    6.1    5.8
22           6.9    6.6    6.4    6.2    5.9
23           6.10   6.7    6.5    6.3    5.10



typically denoted by r, it is a measure of the correlation (linear dependence) between two variables X and Y, giving a value between +1 and −1 inclusive. It is widely used in the sciences as a measure of the strength of linear dependence between two variables.
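A minimal sketch of how r can be computed for two lists of paired scores, written out by hand so each step is visible. The function name and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Correlation between paired lists x and y, from +1 to -1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical paired scores: the second list rises with the first, then falls with it.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```

The same calculation applied to scores from two administrations of a test gives a test-retest reliability estimate.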



Term
Assessments: Why are they given? 

Screening: are there unrecognized problems?
Progress monitoring: are they making progress?
Instructional planning
Resource allocation
Eligibility for special education services (diagnostic)
Program evaluation
Accountability



Federal program that authorizes state and local aid. On December 3, 2004, President Bush signed the Individuals with Disabilities Education Improvement Act, a major reauthorization and revision of IDEA. The new law preserves the basic structure and civil rights guarantees of IDEA but also makes significant changes in the law. Most provisions of Public Law (PL) 108-446 go into effect on July 1, 2005. The requirements regarding “highly qualified” special education teachers became effective immediately upon signature.
autism recognized
assessment for students with disabilities (alternative assessment); fairness
test security
confidentiality of information
DO NO HARM (best interest)



Term
MDT (multidisciplinary team) process:

collect at minimum what is federally required
draw information from variety of sources
use well-normed, reliable sources
ETR



interval in which a measurement or trial falls corresponding to a given probability. Usually, the confidence interval of interest is symmetrically placed around the mean, so a 50% confidence interval 


Term
SEM (Standard error of measurement)

The range within which a person's true score may fall (Drummond, 1996). Its interpretation is based upon theories associated with normal distributions.
Basically, because about 68% of a normal distribution falls within 1 standard deviation of the mean, there is about a 68% chance that the true score lies within 1 SEM of the observed score.
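A minimal sketch of how the SEM and the score range around an observed score can be worked out. It assumes the commonly used formula SEM = SD × √(1 − reliability), which is not stated on the card; the function names and example values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and its reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

def score_band(observed, sem, n_sem=1):
    """Range of observed score plus or minus n_sem SEMs (1 SEM is roughly a 68% band)."""
    return (observed - n_sem * sem, observed + n_sem * sem)

# Hypothetical test: SD of 15, reliability of 0.91, observed score of 100.
sem = standard_error_of_measurement(15, 0.91)
low, high = score_band(100, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

A more reliable test gives a smaller SEM and therefore a narrower band around the observed score.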



Term
types of measurement scales

Nominal scales are data that record categories.
Ordinal scales record information about the rank order of scores.
Interval scales tell us about the order of data points, and the size of the intervals in between data points.
A ratio scale is an interval scale with a true zero point.



The literature indicates that the benefit of out-of-level testing is that it is a cost-effective method for increasing the precision with which low-performing students’ ability is measured. However, an unreported downside is that the process by which scores on out-of-level testing are converted back into the scale of the in-level test reduces measurement precision. Further, it has not been demonstrated that more precision is gained than is lost.
Most students with special needs are tested out of level on OAAs and the graduation test.



To understand norms and statistical assessment, one first needs to understand standardization. Standardization is the process of testing a group of people to see the scores that are typically attained. With a standardized test, a participant's score can be compared with the standardization group's performance. The normative group must reflect the population for which the test was designed. The group's performance is the basis for the test's norms.


eligibility decisions
what is the student "entitled" to?

