Lab 9

POL242 LAB MANUAL:  Lab 9

Index Construction

 

PURPOSE

  • To learn to construct an index.

MAIN POINTS

  • The Reliability procedure tests only the suitability of a set of variables for the construction of an index (internal validation).
  • To construct an index you use the compute command, e.g., COMPUTE NEWNAME=Q1+Q2+Q3+Q4.
  • Recoding your index into fewer categories is often essential to create interpretable crosstabs.
  • Using your index in crosstabulation enables you to examine its relations to other variables. This permits external validation of your index and can also enhance your explanatory research.

EXAMPLE

  • Dataset:
    • CES 2011
  • Concept:
    • Attitude toward reducing inequality (Alpha =.67)
  • Indicators:
    • PES11_41: How much should be done to reduce the gap between the rich and the poor in Canada?
    • mbs11_k2: The government SHOULD ACT / SHOULD NOT ACT to reduce differences in income and wealth.
    • mbs11_b3: The government should see to it that everyone has a decent standard of living / leave people to get ahead on their own.

Syntax

*Preparing indicators of Attitudes re Inequality*.

*declare missing values on pes11_41*.
missing values pes11_41 (8,9).

*reverse scoring on pes11_41 and make it range from 0-1*.
recode PES11_41 (1=1) (2=.75) (3=.5) (4= .25) (5=0) into undogap.
value labels undogap 0 'muchless' .25 'someless' .5 'asnow' .75 'somemore' 1 'muchmore'.

*rescale mbs11_k2 from 0-10 to 0-1 and reverse its scoring*.
missing values mbs11_k2 (-99).
compute govact = (((mbs11_k2 * -1) +10)/10).
value labels govact 0 'not act' 1 'gov act'.

*recode and re-label mbs11_b3 and pes11_52b*.
recode mbs11_b3 (1=1) (2=0) into goveqch.
value labels goveqch 1 'decent living' 0 'leave alone'.
recode pes11_52b (1=0) (3=.33) (5=.66) (7=1) into NDPnohurt.
value labels NDPnohurt 0 'strdisagree' .33 'disagree' .66 'agree' 1 'stragree'.

reliability variables = undogap govact goveqch
     /scale (fixineq) = undogap govact goveqch
     /summary = all.

*create a new indexed variable and run a frequency.
compute rawindex3 = undogap + govact + goveqch.
fre var rawindex3
     /stats = mode median mean std skewness kurtosis.

*recode the new index into three categories*.
recode rawindex3 (0 thru 2.10=0)(2.15 thru 2.50=.5)
(2.55 thru 3= 1) into fixineq3.
value labels fixineq3 0 'low' .5 'med' 1 'hi'.
fre var fixineq3.

*preparing IV indicators*.
*create language group indicator*.
recode cps_intlang11 (1=0) (5=1) into french.

*create finance measures.
missing values cps11_66 (8,9).
recode cps11_66 (1=1) (3=0) (5=.5) into finances.
variable labels finances 'personal finances'.
value labels finances 0 'worse' .5 'same' 1 'better'.

crosstabs tables = fixineq3 by french finances
     /cells = column count
     /statistics = btau ctau.

*Preparing X indicator-party identification*.
recode cps11_71 (2=1) (1=2) (4=3) (3=4) into PID4.
value labels PID4 1 'Cons' 2 'Lib' 3 'BQ' 4 'NDP'.

crosstabs tables = fixineq3 by PID4
     /cells = column count
     /statistics = phi.
  • Syntax Legend
    • Comments can be inserted between asterisks *. . .*; each comment must end with a period.
    • The compute command is where the index is constructed.
    • Recoding an index is essential to produce effective tables. Here recodes place about 1/3 of the cases in each category, using the cumulative percent column of the frequency analysis as a guide.
    • Recode the index into a new name, because it is useful to retain both the complete raw form and the recoded form of an index.
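The three-category recode in the syntax leaves small gaps between its ranges (for example, between 2.10 and 2.15), and any score that fell into a gap would be set to system-missing in the new variable. A minimal sketch of a gap-free variant, assuming the index scores are multiples of .05 as they are in this example, places the cut points midway between observed scores. The name fixineq3b is hypothetical and is used only to avoid overwriting fixineq3.

```spss
*gap-free version of the three-category recode (fixineq3b is a hypothetical name).
*a score that matches two ranges is recoded by the first range that applies.
recode rawindex3 (lowest thru 2.125=0) (2.125 thru 2.525=.5)
     (2.525 thru highest=1) into fixineq3b.
value labels fixineq3b 0 'low' .5 'med' 1 'hi'.
fre var fixineq3b.
```

Because no observed score sits exactly on 2.125 or 2.525, this variant should sort the cases into the same three groups as the original recode.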
  • Output
rawindex3
Frequency Percent Valid Percent Cumulative Percent
Valid .00 10 .2 .8 .8
.10 1 .0 .1 .9
.20 1 .0 .1 1.0
.25 5 .1 .4 1.4
.30 1 .0 .1 1.5
.35 5 .1 .4 1.9
.45 2 .0 .2 2.1
.50 13 .3 1.1 3.2
.55 4 .1 .3 3.5
.60 6 .1 .5 4.0
.65 2 .0 .2 4.2
.70 24 .6 2.0 6.2
.75 2 .0 .2 6.4
.80 18 .4 1.5 7.9
.85 5 .1 .4 8.3
.90 14 .3 1.2 9.5
.95 9 .2 .8 10.3
1.00 20 .5 1.7 11.9
1.05 6 .1 .5 12.4
1.10 4 .1 .3 12.8
1.15 7 .2 .6 13.4
1.20 6 .1 .5 13.9
1.25 17 .4 1.4 15.3
1.30 5 .1 .4 15.7
1.35 14 .3 1.2 16.9
1.40 1 .0 .1 17.0
1.45 10 .2 .8 17.8
1.50 10 .2 .8 18.7
1.55 2 .0 .2 18.8
1.60 6 .1 .5 19.3
1.65 2 .0 .2 19.5
1.70 16 .4 1.3 20.8
1.75 11 .3 .9 21.8
1.80 18 .4 1.5 23.3
1.85 9 .2 .8 24.0
1.90 15 .3 1.3 25.3
1.95 9 .2 .8 26.1
2.00 39 .9 3.3 29.3
2.05 17 .4 1.4 30.8
2.10 21 .5 1.8 32.5
2.15 32 .7 2.7 35.2
2.20 19 .4 1.6 36.8
2.25 75 1.7 6.3 43.1
2.30 27 .6 2.3 45.4
2.35 77 1.8 6.5 51.8
2.40 19 .4 1.6 53.4
2.45 79 1.8 6.6 60.1
2.50 61 1.4 5.1 65.2
2.55 43 1.0 3.6 68.8
2.60 40 .9 3.4 72.2
2.65 21 .5 1.8 73.9
2.70 81 1.9 6.8 80.8
2.75 19 .4 1.6 82.4
2.80 78 1.8 6.6 88.9
2.90 52 1.2 4.4 93.3
3.00 80 1.9 6.7 100.0
Total 1190 27.6 100.0
Missing System 3118 72.4
Total 4308 100.0

 

Statistics
rawindex3
N Valid 1190
Missing 3118
Mean 2.1661
Median 2.3500
Mode 2.70
Std. Deviation .70098
Skewness -1.147
Kurtosis .507

 

Crosstabulation of support for action against inequality by personal finances

fixineq3     personal finances
             worse     same      better
low          22.0%     32.0%     50.0%
med          30.7%     35.8%     22.2%
hi           47.2%     32.2%     27.8%
Total        254       754       180

Tau-b = -.160

 

Crosstabulation of support for action against inequality by partisan identity

 

fixineq3     PID4
             Cons      Lib       BQ        NDP
low          57.0%     28.4%     12.1%     17.5%
med          26.2%     38.0%     40.4%     29.9%
hi           16.8%     33.6%     47.5%     52.6%
Total        321       324       99        154

Cramer’s V = .276

Interpretation

    • The newly computed index variable has so many categories that crosstabs will be unwieldy. Therefore recode into fewer categories.
    • The recoded variable is more manageable.
    • Crosstabs permit calculation of measures of association between the recoded index and other variables. This can be useful for both external validation and explanatory research.
    • The frequency analysis for the index shows that scores range from zero through three. This makes sense since the index is composed of three items each of which is scored between zero and one.
    • The mean (2.17) and median (2.35) are both above 2, which suggests that on average Canadians score somewhat above the mid-point of the index (1.5).
    • The standard deviation shows that scores typically deviate from the mean by about .7.
    • The negative skewness (-1.147) shows a tail of cases with low values, while the positive kurtosis (.507) indicates a distribution somewhat more peaked than a normal curve rather than a flat one.
    • The index is recoded into three categories using the cumulative percentages as a guide in finding the 33% and 66%.
    • The index is crosstabulated with language group (not shown), personal finances and party identification, in line with the hypotheses that francophones, those in difficult financial straits and supporters of left-leaning parties are likely to be more supportive of taking action against inequality.
    • The crosstabs offer weak to moderate support for the hypotheses.
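The range check described in the interpretation can be run directly. The sketch below assumes the three recoded items and rawindex3 already exist in the active dataset; the reported minimum and maximum should be 0 and 1 for each item and 0 and 3 for the index. The N column is also worth reading: COMPUTE with the + operator returns system-missing whenever any one item is missing, so the index can have no more valid cases than its least-answered item.

```spss
*check that each item runs from 0 to 1 and the index from 0 to 3.
*the N for rawindex3 cannot exceed the smallest N among the three items.
descriptives variables = undogap govact goveqch rawindex3
     /statistics = mean min max.
```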

INSTRUCTIONS

  1. Use the data set and questions you worked with in Lab 8.
  2. Having found a combination of questions that produces an alpha greater than .60, ensure that the questions have similar ranges.  This ensures that no item is over- or under-represented in the index.  For example, if the first question ranges from 1 to 3 and the second from 1 to 100, the second will be disproportionately over-represented.  Recode all the questions so that their ranges are similar, though not necessarily identical.
  3. To create the index, combine the questions into a new measure using a compute command in the following form:
    • Compute rawindex=V12+V14+V18+V23.
  4. Run a frequency distribution of the new indexed variable and determine whether it is suitable for further data analysis.
  5. Recode the index into appropriate categories as necessary.
  6. The new index can be used in crosstabulation like any other variable. This enables you both to investigate the external validity of your measure and to use it in explanatory research. For example, crosstabulate your index with an independent or dependent variable and calculate the appropriate measures of association.
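Step 2 can be sketched with a linear rescaling that maps each question onto a 0-1 range before summing. The names V12 and V14 come from the form shown in step 3, but the ranges assigned to them here (1 to 3 and 1 to 100) are assumptions borrowed from the example in step 2, and the names V12r, V14r and rawindex2 are hypothetical. Substitute the actual minimum and maximum of your own questions, and declare missing values first.

```spss
*rescale each item to 0-1 with (value - minimum) / (maximum - minimum).
*assumed ranges: V12 runs 1 to 3 and V14 runs 1 to 100.
compute V12r = (V12 - 1) / (3 - 1).
compute V14r = (V14 - 1) / (100 - 1).
*sum the rescaled items and inspect the distribution of the result.
compute rawindex2 = V12r + V14r.
fre var rawindex2
     /stats = mode median mean std skewness kurtosis.
```

If a question needs its scoring reversed as well, subtract the rescaled value from 1, which is equivalent to what the example syntax does for mbs11_k2.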

QUESTIONS FOR REFLECTION

  • How does the relationship between your index and an independent variable differ from what you would obtain using each element of the index to produce a crosstab?
  • How is the relationship produced with the index affected by the choices in recoding the indexed variable?

DISCUSSION

  • An index often leads to stronger relationships because the measurement errors in each of the constituent indicators tend to balance out.
  • Proper recoding of your index requires careful consideration of the possibilities and attention to the substantive meaning of your categories.
  • Depending on your coding choices the strength of the relationship in your table may increase, decrease or stay roughly the same.
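One way to explore the effect of coding choices is to recode the raw index a second way and compare the measures of association. The sketch below assumes rawindex3, fixineq3 and finances from the example syntax already exist; fixineq2 is a hypothetical name, and the cut point of 2.325 is simply a value near the median (2.35) that falls between two observed scores.

```spss
*split the raw index into two categories near the median (hypothetical alternative).
recode rawindex3 (lowest thru 2.325=0) (2.325 thru highest=1) into fixineq2.
value labels fixineq2 0 'low' 1 'hi'.
*compare tau-b for the three-category and two-category versions.
crosstabs tables = fixineq3 fixineq2 by finances
     /cells = column count
     /statistics = btau ctau.
```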

Advanced Topics

  • In this lab only three of the four indicators considered in Lab 8 are used to create an index. Including the fourth indicator produces an index whose measure of association with Party ID exceeds .4. This suggests that the two variables may be measuring the same thing.
  • One can create standardized scores (or z-scores) for the indicators used to create an index in this lab using the following SPSS syntax:
  • descriptives variables = undogap govact goveqch /save.
  • This will create three new standardized variables in the data set: zundogap zgovact zgoveqch. Their existence can be confirmed by looking at the dataset or by running a frequency analysis on each of these variables. These new variables can be used to create a standardized index using the same procedures employed in this lab. Doing so will ensure that all variables are equally weighted in the index. For our purposes, coding our indicators on a common range of values will suffice.
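A minimal sketch of the standardized index described above follows. It assumes the saved z-score variables carry the names given in this section (the original names prefixed with z); zindex3 is a hypothetical name for the new index.

```spss
*save z-scores for the three indicators, then sum them into a standardized index.
descriptives variables = undogap govact goveqch /save.
compute zindex3 = zundogap + zgovact + zgoveqch.
*notable suppresses the long frequency table and prints only the statistics.
fre var zindex3
     /format = notable
     /stats = mean std skewness kurtosis.
```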