* Topics in Survey Methodology and Survey Analysis 2012;
* Design-based and model-based analysis of complex survey data;
* Descriptive statistics - Tuesday 25 Sept. 2012;
* SAS data set OHC (Occupational Health Care Survey)
Clustered (Hierarchical, Multilevel) data
Complex sampling design:
Stratified one-stage and two-stage cluster sampling
In the analysis phase, the data are treated as coming from a one-stage
cluster sampling design with workplaces (establishments) as the sample clusters.
This simplifies variance estimation and is the default approach in the SAS,
SPSS and Mplus procedures.
Features of the data set:
H = 5 strata (Industry type and size of workplace)
m = 250 sample clusters (establishments/workplaces)
n = 7841 persons
p = 12 variables
The data are real survey data that have been anonymized and cleaned
for pedagogical purposes (no missing data, weights are constant).
SPSS use: a CSPLAN file (sample plan data set) will be created
in the PC session
;
* Methods
Design-based procedures - accounting for clustering effects
SAS SURVEY procedures
Descriptives - SURVEYMEANS
Test of independence - SURVEYFREQ
Logistic regression - SURVEYLOGISTIC
SPSS Complex Samples module
CSPLAN - Complex samples plan
CSLOGISTIC - Logistic regression
Model-based procedures - hierarchical (multilevel) analysis
Logistic regression: SAS GENMOD (GEE/Exchangeable estimation)
Logistic regression: SAS GLIMMIX (Generalized linear mixed modelling)
Mplus (COMPLEX, TWOLEVEL)
Logistic regression
NOTE: See also VLISS Training Key #298
SAS code will be worked out further during the PC session.
;
options nocenter;
ods html;
* Access to SAS data library:
- Use the "New library" button
- Use the libname statement;
libname a "Z:\Documents\My SAS Files\9.3\Social Statistics Course 2012";
data ohc;
set a.ohc;
run;
* see HELP proc contents;
proc contents data=ohc varnum;
title1 "/*write title*/ ";
title2 "/*write subtitle*/ ";
run;
* see HELP proc surveymeans;
proc surveymeans data=ohc nobs mean;
title1 "/*write title*/ ";
title2 "/*write subtitle*/ ";
var /*select variables */;
strata /*select stratum variable*/ ;
cluster /*select cluster variable*/ ;
run;
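* Illustrative sketch only, not part of the OHC analysis.
The data set TOY and the variable names STRATUM, WORKPLACE, AGE and WGT
are hypothetical and are not taken from the OHC data.
The step shows one way the STRATA, CLUSTER and WEIGHT statements of
PROC SURVEYMEANS can be combined for a stratified one-stage cluster design.
With constant weights, as in OHC, the WEIGHT statement does not change
the estimated mean;
data toy;
input stratum workplace age wgt;
datalines;
1 1 34 1
1 1 45 1
1 2 29 1
1 2 51 1
2 3 38 1
2 3 42 1
2 4 55 1
2 4 31 1
;
run;
proc surveymeans data=toy nobs mean stderr;
title1 "Toy example: design-based mean";
title2 "Hypothetical data, two strata with two clusters each";
var age;
strata stratum;
cluster workplace;
weight wgt;
run;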
* see HELP proc surveyfreq;
proc surveyfreq data=ohc;
title1 "/*write title*/ ";
title2 "/*write subtitle*/ ";
tables /*select cross-classification variables*/ / /*options*/;
strata /*select stratum variable*/ ;
cluster /*select cluster variable*/ ;
run;
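* Illustrative sketch only, not part of the OHC analysis.
The data set TOY2 and the variable names STRATUM, WORKPLACE, SEX and STRESS
are hypothetical and are not taken from the OHC data.
The step shows one way to request a design-based test of independence
for a two-way table. The CHISQ option of the TABLES statement asks
PROC SURVEYFREQ for the Rao-Scott chi-square test, which takes the
stratification and clustering into account;
data toy2;
input stratum workplace sex stress;
datalines;
1 1 1 0
1 1 1 1
1 1 2 0
1 1 2 1
1 2 1 0
1 2 2 1
1 2 2 1
1 2 1 0
1 3 1 1
1 3 2 0
1 3 2 1
1 3 1 0
2 4 1 0
2 4 2 1
2 4 1 1
2 4 2 0
2 5 1 0
2 5 2 1
2 5 2 1
2 5 1 0
2 6 1 0
2 6 2 0
2 6 1 1
2 6 2 1
;
run;
proc surveyfreq data=toy2;
title1 "Toy example: design-based test of independence";
title2 "Hypothetical data, two strata with three clusters each";
tables sex*stress / chisq;
strata stratum;
cluster workplace;
run;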
* Let us carry out the same analysis using SPSS;