measurement of the relationship or association between variables 


correlation: levels of measurement 

2 continuous or 1 continuous, 1 dichotomous categorical; MUST be equal interval 


correlation: research question 

do individuals who have a low/high score on 1 variable also have a corresponding low/high score on another variable? for each unit change in X, is there a corresponding unit change in Y? how strong is the relationship between these two variables? 


normality, linearity (HIGHLY susceptible to outliers), equal interval variables, independence of observations


bivariate (2 variables, measured with Pearson's product-moment correlation r, or Spearman's rho for ordinal/high-outlier data) or partial (explore relationship b/t 2 variables while statistically controlling for a third)


benefits of partial correlation 

provides a more clear, accurate depiction of actual relationship between 2 variables 
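A minimal sketch of a first-order partial correlation, assuming you already have the three bivariate correlations. The function name `partial_r` and the input values are made up for illustration; the formula is the standard one for removing a third variable z from the x~y relationship.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    # Remove the part of the x~y relationship that runs through z.
    numerator = r_xy - r_xz * r_yz
    # Rescale by the variance in x and in y that z does not explain.
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations, not taken from any real study.
controlled = partial_r(r_xy=0.60, r_xz=0.50, r_yz=0.40)
print(round(controlled, 3))
```

With these made-up inputs the correlation drops from .60 to about .50 once z is controlled. If z is unrelated to both variables, the partial correlation equals the original r.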


example of correlation write up 

Results show a strong, positive correlation between posting stats shit on Facebook and being a total dork (r = .88, p < .01). This relationship remained significant, even after controlling for number of hours spent drooling over David Klonsky's LCA paper (r = .66, p < .01).


correlation: measurement (2) 

strength/magnitude (how spread are the data around the line of best fit? tight or loose?); direction (+ or -): positive = low~low, high~high; negative = low~high, high~low


calculated using z-scores (need z's because the variables are on different scales, with different means and SDs); cross product = z-score of x × z-score of y; average of the cross products = r


Pearson's correlation (r) 

correlation coefficient r (also the basis of regression) = sum of the cross products of the z-scores / n; ranges from -1 to +1
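The z-score recipe above can be sketched in a few lines. This assumes the population SD (divide by n) is used for the z-scores, which is what makes "sum of cross products / n" come out as r. The data and names below are made up.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values using the population SD (divide by n)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def pearson_r(x, y):
    """Pearson's r as the average cross product of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    cross_products = [a * b for a, b in zip(zx, zy)]
    return sum(cross_products) / len(cross_products)

# Made-up scores on two different scales.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [2, 4, 5, 4, 5]
r = pearson_r(hours_studied, exam_score)
print(round(r, 3))
```

If the z-scores are built with the sample SD (n - 1) instead, the cross products must be divided by n - 1 to get the same r.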


correlation coefficient when using ordinal data or highly skewed (lots of outlier) data 
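A rank-based coefficient such as Spearman's rho can be sketched as Pearson's r computed on ranks. This is a minimal illustration with made-up data; tied values share the average of their ranks, and all function names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data with one extreme outlier in the first variable.
income = [10, 20, 30, 40, 1000]
happiness = [1, 2, 3, 4, 5]
print(round(pearson_r(income, happiness), 3), spearman_rho(income, happiness))
```

Because ranking flattens the outlier, rho stays at 1 for this perfectly monotonic example while Pearson's r is pulled well below 1.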


small, medium, large correlations 

.10 to .29, .30 to .49, .50 and up
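A small helper for applying these cutoffs, reading them as .10, .30, and .50 (the conventional benchmarks). The function name and the "negligible" label for values below .10 are assumptions added for illustration.

```python
def correlation_size(r):
    """Label |r| using the small/medium/large cutoffs of .10, .30, and .50."""
    magnitude = abs(r)  # direction does not affect strength
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "negligible"  # assumption: label for values below the small cutoff

for r in (0.88, -0.35, 0.12):
    print(r, correlation_size(r))
```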


2 methods to test for outliers 

histogram (univariate) scatterplot (multivariate) 


consistency, stability of a measure. necessary but not sufficient for validity of the measure 


degree to which an instrument measures the construct intended, accuracy 


small, medium, large measure of reliability 

.45 to .65 shitty; .70 to .80 acceptable; > .80 optimal; .99 = measuring the same thing? context needed


Likert, how often a behavior or event has occurred, opinion as to how strongly a person feels about something. Assume linear relationship. neutral point? can be treated as interval data in order to use parametric tests. need normality of distribution for certain tests.


empirical data from judges to ensure attitudes/behaviors being measured are spaced along continuum of equal rating 


hierarchical scaling technique that ranks items such that individuals who agree with higher ranked item will also agree with items of lower rank 


should be done w a smaller sample, but large enough to allow for a cronbach's alpha greater than .70, determine which items should be kept/deleted 


extraction, principal component analysis, varimax/oblimin rotation


principal component analysis explores relationship between variables, provides basis for removal of unnecessary items and ID of subscales/domains


factor rotation that maximizes loadings of variables on different subscales. varimax (orthogonal) with uncorrelated factors. oblimin (oblique) with correlated factors. ID number of domains in measure


degree to which an instrument is related to operationally defined theory and concepts 


construct validity:measurements 

contrasted groups: 2 groups known to be high and low, means should differ; hypothesis testing: theoretical; factor analysis: related items kept, exploratory and confirmatory


convergent/discriminant validity: measurements 

multitrait-multimethod approach (MTMM): 2 or more constructs measured with 2 or more methods, correlation matrix for relationships between traits


face and content validity 


items measure the complete range of the attribute under study (e.g. not 4 types of NSSI, but 14); determined by lit review, experts, population sampling; largest item pool, then reduced with factor analysis


instrument looks like it measures the construct of interest, subjective


criterion validity (4 types) 

concurrent, predictive, convergent, discriminant; relationship between measurement and construct based on performance on another variable 


scores on a measure correlated to a related criterion at the same point in time 


concurrent validity: measurements 

exploratory factor analysis, eigenvalue > 1; confirmatory factor analysis, loading on each subscale > .4; principal components analysis (5 participants per variable) ex. ISAS and SITBI


degree to which scores predict performance on some future criterion 


predictive validity: measurements 

correlations, regressions ex. ISAS behavioral forecast predicts NSSI 


correspondence between constructs that are theoretically similar. 


convergent validity: measurements 

correlations. ex. ISAS and MSI-BPD


measurement differentiates between constructs that are theoretically different 


discriminant validity: measurements 

MTMM, weak/no correlation. ex. ISAS and SITBI gesture, suicide attempt


content validity: measurements 

content validity ratio (CVR) or content validity index (CVI); depend on number of experts (7+)
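A minimal sketch of a content validity ratio for one item, assuming Lawshe's formulation: CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating the item essential and N is the panel size. The panel numbers are made up.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item; ranges from -1 to +1."""
    half_panel = n_experts / 2
    return (n_essential - half_panel) / half_panel

# Hypothetical panel: 7 of 8 experts rate the item as essential.
cvr = content_validity_ratio(n_essential=7, n_experts=8)
print(cvr)
```

A value of 0 means exactly half the panel rated the item essential. The minimum acceptable CVR depends on panel size, which is why the number of experts matters.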


similar scores at different times (2 weeks to 1 month apart), not due to chance. some states do change over time, so not useful for those


Cronbach's alpha. inter-item correlations to determine if items measure the same construct. how well they "hang" together. can add items to increase. if over .9, items might be measuring the same thing
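A minimal sketch of Cronbach's alpha using the usual variance form: alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The response matrix is made up (5 respondents, 3 items) and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one row per respondent, one column per item."""
    k = len(item_scores[0])
    # Transpose rows of respondents into columns of items.
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(item) for item in items)
    # Variance of each respondent's total score across all items.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Made-up Likert responses.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

These made-up items move together closely, so alpha comes out around .96, high enough that the items may be redundant.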


parallel/alternate forms reliability

different item pools to test the same concepts, i.e. day and night measure.


Magnusson quote as to why we need good measures 

"models are never more accurate, reliable, or valid than the measures you put into them" 


use correlations in a systematic/meaningful way to test hypotheses. predict/ calculate any value of y (DV) from a value of x (IV). based on probabilities 


R squared: squared multiple correlation. how much of the DV's variance is accounted for by the IV. ranges from 0 to 1, always positive. closer to 1 = greater amount of variance explained. tests magnitude of prediction and mediation.


y = B0 + B1(X) + err; y = DV; B0 = intercept, constant; B1 = slope/beta weight/regression weight; X = value of IV; err = random, unsystematic error. USE UNSTANDARDIZED SCORES
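A minimal sketch of fitting the regression line by ordinary least squares on unstandardized scores, then computing R squared. The data and function names are made up; `b0` and `b1` stand for the intercept and slope.

```python
def fit_line(x, y):
    """Return (b0, b1): least-squares intercept and slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # change in y per unit change in x
    b0 = mean_y - b1 * mean_x    # predicted y when x = 0
    return b0, b1

def r_squared(x, y, b0, b1):
    """Proportion of the DV's variance accounted for by the IV."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

# Made-up IV and DV scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
predicted = b0 + b1 * 6   # predict y for a new x value
print(round(b0, 2), round(b1, 2), round(r_squared(x, y, b0, b1), 2), round(predicted, 2))
```

For this made-up data the intercept is 2.2, the slope is 0.6, and R squared is 0.6, meaning the IV accounts for 60% of the variance in the DV. With a single IV, R squared is simply the square of Pearson's r.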


linearity/normality, independence and homogeneity of error, homoscedasticity (variability at one point is similar to variability at a different point; heteroscedasticity is the violation).


predict DV from IV OR >1 IV from 1 DV (mediation) 


how far case is from other cases 


how in line with linear trend 


regressions standardized coefficients 

interpreted as correlation strength, direction, significance. 


measures are overly related, may be measuring the same thing. everything is significant when testing reliability


variable centered approach 

given prior behavior, scores, genetic markers, contextual risk and protective factors, individuals are interchangeable units who apart from random error, do not differ qualitatively or quantitatively from each other. 



how individuals change or behave. how individuals function, holistic. do not assume normality, less sensitive than parametric tests and may fail to detect differences that actually exist (type II error)



developmental processes are active, integrated, complex, dynamic, adaptive in relation to greater system 


drawing false conclusions about individual behavior from population behavior 


drawing false conclusions about aggregate behavior from individuals 


types of person centered analyses 

classification (cluster and LCA); hybrid classification (growth mixture model); single-subject methods (dynamic factor analysis); variable-oriented methods: latent growth curve modeling


examine how groups of individuals are similar to one another, and different from individuals in other groups/clusters. exploratory in nature 


hierarchical cluster analysis 

small >50 data sets standardized continuous data 


moderate sample size, standardized continuous data, assign cluster then recheck means to maximize differences b/t groups. may or may not ask SPSS to create a specific number of groups 
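The assign-then-recheck-means loop described above resembles k-means clustering. A minimal sketch, assuming the data are already standardized and the number of groups is requested up front. Seeding the centers with evenly spaced cases is an assumption made here for reproducibility, not what SPSS does, and the points are made up.

```python
def k_means(points, k, iterations=20):
    """Return (labels, centers) after repeated assign/update passes."""
    # Assumption: seed centers with evenly spaced cases from the data.
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: put each case in the cluster with the nearest mean.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            labels[i] = distances.index(min(distances))
        # Update step: recompute each cluster mean from its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers

# Made-up cases on two variables, forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centers = k_means(points, k=2)
print(labels)
```

The squared Euclidean distance used here only makes sense when the variables share a scale, which is why standardized continuous data are required.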


large data sets, computer selects number of clusters based on most relevant variables, maximize differences bt clusters, can use continuous, discrete nominal data. step 1= preclustering put new in with preclusters or make a new one. step 2= use hierarchical techniques to make two clusters then assign to clusters 


Schwarz's Bayesian information criterion (BIC); change of more than one = best number of clusters. exactly how much more likely it is to get an individual from group a vs group b


log-likelihood estimates (MLE or LSE) with a mix of variable types; Euclidean distance when all variables are the same type and standardized


ID unique classes of cases like cluster analysis with categorical data. estimation uses MLE. assumes variables are uncorrelated within classes, although often not the case 


determine latent variables (ones that cannot be measured directly; drawn as circles) based on observed/indicator variables (drawn as boxes). need 4


Time spent ogling David Klonsky's LCA is predictive of desire to TA stats for smelly undergrads, such that more time spent ogling was significantly positively associated with desire to TA smelly undergrads (r = .66, p < .01). Time spent ogling explained 80% of the variance in desire to TA smelly undergrads.

