POL242 LAB MANUAL: Lab 7
Crosstabulation with Ordinal Variables
PURPOSE
- To learn how to interpret the results of an ordinal X ordinal crosstabulation.
- To learn how to apply and interpret Kendall’s Tau-b and Tau-c.
MAIN POINTS
Crosstabulation
- When at least one of the variables in a crosstab is nominal (Lab 6), a relationship can be evident from differences in the column-percentages in just one or two rows of the table. For an ordinal X ordinal crosstab, however, such differences alone are not sufficient to draw any conclusions.
- For ordinal X ordinal crosstabs, a relationship is demonstrated only when the value of the dependent variable either increases or decreases as the independent variable increases in value. When the dependent variable increases in value as the independent variable increases in value, then the relationship is positive. For example, as the level of education rises the income-level also increases. On the other hand, when the dependent variable decreases in value as the independent variable increases in value then we have a negative relationship.
- A practical technique that students sometimes employ to identify whether a relationship exists in an ordinal X ordinal crosstab begins by finding the bulge in each row. The bulge is the cell (or cells) in each row with a disproportionately high column-percentage. The column-percentage for the bulge should differ substantially from that of the other cells in the row.
- For the relationship to be properly identified, it is necessary that your ordinal variables be properly coded (from lowest to highest). If there is a positive relationship, then the bulges will follow a pattern that leads diagonally downward (top-left to bottom-right). Or in the case of a negative relationship, the bulges will follow a pattern that leads diagonally upward (bottom-left to top-right).
POSITIVE RELATIONSHIP
(columns: INDEPENDENT VARIABLE; rows: DEPENDENT VARIABLE)
     | Low | Mid | High |
Low  | XXX |     |      |
Mid  |     | XXX |      |
High |     |     | XXX  |
NEGATIVE RELATIONSHIP
(columns: INDEPENDENT VARIABLE; rows: DEPENDENT VARIABLE)
     | Low | Mid | High |
Low  |     |     | XXX  |
Mid  |     | XXX |      |
High | XXX |     |      |
Kendall’s Tau
- Kendall’s Tau, which is built on the notion of like and different ordered pairs of cases, is a measure of association that calculates the strength of the relationship between two ordinal variables.
- Unlike Cramer’s V, Kendall’s Tau also indicates the direction of the relationship. A positive Tau value means that as the first variable increases in value, the second variable also increases in value. A negative Tau value means that as the first variable increases, the second decreases, and vice-versa.
- Use Tau-b for square crosstabs (where the number of categories is the same for both variables, e.g. 3 X 3 or 5 X 5). Use Tau-c for rectangular crosstabs (where the variables have different numbers of categories, e.g. 3 X 5).
- Once again, the strength of the Tau value can be interpreted in the same way as the other measures of association (see Lab 6).
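For readers who want to see what lies behind the Tau values, the usual textbook definitions can be written out as below. This is a sketch rather than part of the lab itself: the symbols C, D, T_X, T_Y, n and m are notation introduced here, and the small 2 X 2 table used in the worked lines is hypothetical, not drawn from any dataset.

```latex
% Notation (introduced here for illustration):
%   C   = number of like-ordered (concordant) pairs of cases
%   D   = number of different-ordered (discordant) pairs of cases
%   T_X = pairs tied on the independent variable only
%   T_Y = pairs tied on the dependent variable only
%   n   = number of valid cases
%   m   = the smaller of the number of rows and the number of columns
\[
\tau_b = \frac{C - D}{\sqrt{(C + D + T_X)\,(C + D + T_Y)}}
\qquad
\tau_c = \frac{2m\,(C - D)}{n^{2}\,(m - 1)}
\]

% Hypothetical 2 X 2 table (rows = dependent, columns = independent):
%            IV low   IV high
%   DV low     30       10
%   DV high    10       30
\[
C = 30 \cdot 30 = 900, \quad
D = 10 \cdot 10 = 100, \quad
T_X = 30 \cdot 10 + 10 \cdot 30 = 600, \quad
T_Y = 30 \cdot 10 + 10 \cdot 30 = 600
\]
\[
\tau_b = \frac{900 - 100}{\sqrt{1600 \cdot 1600}} = .50
\qquad
\tau_c = \frac{2 \cdot 2 \cdot (900 - 100)}{80^{2} \cdot (2 - 1)} = .50
\]
```

Both measures are positive here because like-ordered pairs outnumber different-ordered pairs. If the cases were concentrated on the other diagonal, D would exceed C and both values would be negative.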
INSTRUCTIONS
- Select a dataset for the purpose of this exercise, perhaps one of the 2013 Forum Research studies.
- Using the Codebook for the chosen dataset, hypothesize a relationship between two indicators measured at the ordinal level, such that one of the variables is the dependent variable and the other is the independent variable. Indicators measured at the interval level may be used as long as you recode them into ordinal variables with three to five categories. This will make the crosstab easier to interpret.
- For example, Education (independent) explains the variation in attitudes toward decreasing the gap between rich and poor (dependent), i.e., the more educated a person is, the more that person favours reducing the gap.
- Using SPSS, perform separate trial runs of the Frequency distribution for each of the variables (i.e., perform Frequency analyses without entering any recodes or missing values). Based on the Frequency output, decide how to recode each variable (if necessary) and identify the missing values. If the number of categories of your variable is large, you may want to combine them into 3 to 5 categories to make the crosstab easier to interpret.
- Again using SPSS, perform a Crosstabulation, making certain to enter the dependent variable first, followed by the independent variable. Remember to account for the missing values.
- On the /statistics subcommand, specify btau to generate “Kendall’s Tau-b” or ctau to generate “Kendall’s Tau-c”.
- Based on the column-percentages in the crosstabular output, determine whether there is a relationship between the variables. Identify a relationship as either positive or negative. Also judge the strength of the relationship and determine whether it meets the standards set out in the previous lab (Lab 6).
- Repeat the process until you find a relationship that meets or exceeds the standards set out above. Continue to look for other interesting crosstab relationships between variables.
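The steps above can be sketched as a syntax template. The names depvar, indvar and indvar3, the missing-value codes, and the recode cut points are all placeholders invented for illustration; replace them with the variable names and codes found in your own Codebook and Frequency output.

```spss
* Trial-run frequencies, before any recodes or missing values (placeholder names).
FREQUENCIES VARIABLES = depvar indvar.

* Declare the missing-value codes seen in the frequency output (hypothetical codes).
MISSING VALUES depvar (8, 9).
MISSING VALUES indvar (98, 99).

* Collapse the independent variable into three ordered categories, lowest to highest.
* The cut points are illustrative only. Choose yours from the cumulative percentages.
RECODE indvar (1 thru 3 = 1) (4 thru 6 = 2) (7 thru 10 = 3) INTO indvar3.
VALUE LABELS indvar3 1 'low' 2 'mid' 3 'high'.

* Crosstab with the dependent variable first, then the independent variable.
CROSSTABS /TABLES = depvar BY indvar3
  /CELLS = COLUMN COUNT
  /STATISTICS = BTAU CTAU.
```

Requesting both BTAU and CTAU costs nothing. Read tau-b when the resulting table is square and tau-c when it is rectangular.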
EXAMPLE #1
- Dataset
- CES 2011
- Dependent Variable
- PES11_41: How much should be done to reduce gap between the rich and the poor in Canada? Much more, somewhat more, the same as now, somewhat less, much less.
- Independent Variable
- CPS11_20: How do you feel about the federal NDP?
- Arrow Diagram:
- Warm Feelings toward NDP → reduce rich/poor gap
- Syntax:
fre var pes11_41.
missing values pes11_41 (8,9).
recode PES11_41 (1=1) (2=.5) (3,4,5=0) into undogap3.
value labels undogap3 0 'less or same' .5 'somemore' 1 'muchmore'.
fre var cps11_20.
missing values cps11_20 (996, 998, 999).
recode cps11_20 (0 thru 40 =0) (41 thru 65= .5) (66 thru hi = 1) into NDPFeel.
value labels NDPFeel 0 'cool' .5 'neutral' 1 'warm'.
crosstabs tables = undogap3 by NDPFeel
  /cells = column count
  /statistics btau ctau.
- Syntax Legend
- Missing values and recodes are decided upon based on the frequencies trial runs. In this case both variables are recoded into three categories. Note that the last three categories of the dependent variable were combined (collapsed) due to small numbers in each category. Recodes for the independent variable were decided upon with reference to the cumulative percentages in the frequency distribution.
- The crosstabs command lists the dependent variable first, then the independent variable.
- /cells subcommand tells SPSS to put column percentages and frequencies in each cell.
- /statistics is the syntax subcommand that needs to be included in order to calculate the Measures of Association. In this case we want to calculate Kendall’s Tau.
- With a square table specify btau. If the table is rectangular specify ctau.
- Output
undogap3 by NDPFeel (% within NDPFeel)
              | cool  | neutral | warm  |
less or same  | 34.6% | 21.6%   | 15.7% |
somemore      | 36.5% | 40.2%   | 40.0% |
muchmore      | 28.9% | 38.1%   | 44.2% |
Total (count) | 1123  | 1041    | 1049  |
Measures of Association
                                    | Value |
Ordinal by Ordinal: Kendall’s tau-b | .161  |
Ordinal by Ordinal: Kendall’s tau-c | .159  |
N of Valid Cases                    | 3213  |
- Interpretation of Crosstab:
- First, look at the top and bottom rows of the table and identify the pattern of results. In the “less or same” row the column percentages decrease from 34.6% to 21.6% to 15.7%. In the bottom row the percentages increase from 28.9% to 38.1% to 44.2%.
- There is clearly a noticeable pattern to the results. Warmer feelings toward the NDP are associated with less support for doing the same or less (top row) about the gap between rich and poor. And warmer feelings are associated with greater support for doing more (bottom row).
- Students often find it useful to identify the bulge in each row, that is, the cell (or cells) with a disproportionately high column-percentage. Here the bulge is taken to be the cell with the highest column-percentage in its row.
- For the “less or same” row, the bulge (34.6%) is under the “cool” category of the independent variable. For the “somemore” row, the bulge (40.2%) is under “neutral”, although it barely exceeds the 40.0% under “warm”. For the “muchmore” row, the bulge (44.2%) is under “warm”.
- Since the bulges follow a pattern running diagonally downward along the main diagonal, there is a positive relationship between the independent and dependent variable.
- We can conclude that the more warmly a respondent feels toward the NDP, the more they are likely to favour reducing the gap between rich and poor.
- Interpretation of Tau
- Since the crosstab is square (3 X 3), we use the tau-b measure of .161 rather than the tau-c measure.
- The tau-b value is positive, meaning that the more warmly a respondent feels toward the NDP (independent variable), the more likely they are to favour reducing the gap between rich and poor (dependent variable). This confirms the conclusion reached by the column-percentage analysis performed above.
- Using the interpretative standards from the table in Lab 6, however, we see that the relationship is weak and only minimally acceptable.
EXAMPLE #2
- Dataset:
- CES 2011
- Dependent Variable
- PES11_41: How much should be done to reduce gap between the rich and the poor in Canada? Much more, somewhat more, the same as now, somewhat less, much less.
- Independent Variable
- CPS11_18: How do you feel about the federal Conservative party?
- Arrow Diagram
- Warm Feelings toward Conservatives → reduce rich/poor gap
- Syntax
missing values pes11_41 (8,9).
recode PES11_41 (1=1) (2=.5) (3,4,5=0) into undogap3.
value labels undogap3 0 'less or same' .5 'somemore' 1 'muchmore'.
fre var cps11_18.
missing values cps11_18 (996, 998, 999).
recode cps11_18 (0 thru 35 =0) (36 thru 69= .5) (70 thru hi = 1) into ConFeel.
value labels ConFeel 0 'cool' .5 'neutral' 1 'warm'.
crosstabs tables = undogap3 by ConFeel
  /cells = column count
  /statistics btau ctau.
- Syntax Legend
- Note that the recodes used to create ConFeel differ from those used for NDPFeel. Again, looking at the cumulative frequency column helped in deciding on the recodes.
- Output
undogap3 by ConFeel (% within ConFeel)
              | cool  | neutral | warm  |
less or same  | 14.1% | 21.0%   | 36.7% |
somemore      | 34.7% | 45.1%   | 37.2% |
muchmore      | 51.2% | 33.9%   | 26.2% |
Total (count) | 1071  | 998     | 1143  |
Measures of Association
                                    | Value |
Ordinal by Ordinal: Kendall’s tau-b | -.229 |
Ordinal by Ordinal: Kendall’s tau-c | -.227 |
N of Valid Cases                    | 3212  |
- Interpretation of Crosstab
- Reading across the first row of the table, the column percentages increase from 14.1% to 21.0% to 36.7%.
- In the bottom row the percentages decrease from 51.2% to 33.9% to 26.2%.
- There is clearly a noticeable pattern to the results. Warmer feelings toward the Conservatives are associated with more support for doing the same or less (top row) about the gap between rich and poor. And warmer feelings are associated with less support for doing more (bottom row).
- Note that the percentage differences across the rows of this table are greater than those in the previous table, suggesting a stronger (more predictable) relationship.
- For the “less or same” row, the bulge (36.7%) is under the “warm” category of the independent variable. For the “somemore” row, the bulge (45.1%) is under “neutral”, and for the “muchmore” row, the bulge (51.2%) is under “cool”.
- Since the bulges run along the off diagonal (bottom-left to top-right), there is a negative relationship between the independent and dependent variables.
- We can conclude that the more warmly a respondent feels toward the Conservatives, the less they are likely to favour reducing the gap between rich and poor.
- Interpretation of Tau
- Since both variables are ordinal and the table is square the appropriate measure of association is again Tau-b.
- The tau-b value is negative, meaning that the more warmly a respondent feels toward the Conservatives (independent variable), the less they favour doing more to reduce the gap between rich and poor (dependent variable). This confirms the conclusion reached by the column-percentage analysis performed above.
- Using the interpretative standards from the table in Lab 6, we see that the relationship is moderate and regarded as acceptable.
- Note that this relationship is stronger than the previous one.
EXAMPLE #3
- Dataset:
- CES 2011
- Dependent Variable
- PES11_41: How much should be done to reduce gap between the rich and the poor in Canada? Much more, somewhat more, the same as now, somewhat less, much less.
- Independent Variable
- CPS11_78: In what year were you born?
- Arrow Diagram
- Young → reduce rich/poor gap
- Syntax
missing values pes11_41 (8,9).
recode PES11_41 (1=1) (2=.5) (3,4,5=0) into undogap3.
value labels undogap3 0 'less or same' .5 'somemore' 1 'muchmore'.
fre var cps11_78.
missing values cps11_78 (9998, 9999).
compute age = (2011 - cps11_78).
recode age (lo thru 39=1) (40 thru 50 =2) (51 thru 60=3) (61 thru 70=4) (71 thru hi=5) into age5.
value labels age5 1 '<40' 2 '40-50' 3 '50-60' 4 '60-70' 5 '71+'.
crosstabs tables = undogap3 by age5
  /cells = column count
  /statistics btau ctau.
- Syntax Legend
- Age is computed from the respondents’ birth years. Cumulative frequencies help in arriving at the recodes. Note that the value labels '50-60' and '60-70' actually cover ages 51 to 60 and 61 to 70, since the recode assigns age 50 to the second category and age 60 to the third.
- Output (edited to fit the available space)
undogap3 by age5 (% within age5)
              | <40   | 40-50 | 50-60 | 60-70 | 71+   |
less or same  | 22.5% | 25.6% | 23.2% | 24.7% | 25.6% |
somemore      | 43.3% | 35.0% | 37.7% | 41.1% | 36.6% |
muchmore      | 34.2% | 39.4% | 39.1% | 34.2% | 37.9% |
Total (count) | 644   | 634   | 741   | 628   | 536   |
Measures of Association
                                    | Value |
Ordinal by Ordinal: Kendall’s tau-b | -.004 |
Ordinal by Ordinal: Kendall’s tau-c | -.004 |
N of Valid Cases                    | 3183  |
- Interpretation of Crosstab:
- Reading across the rows of the table, there is little apparent variation.
- There is not much of a noticeable pattern to the results.
- The bulges do not form a pattern.
- We can conclude that age is not related to the DV.
- Interpretation of Tau
- Since both variables are ordinal and the table is rectangular the appropriate measure of association is Tau-c.
- The tau-c value is negative, but scarcely differs from zero. This confirms the conclusion reached by the column-percentage analysis performed above.
- Using the interpretative standards from the table in Lab 6, the relationship is very weak, verging on no relationship at all.
- This relationship is weaker than the previous ones.
QUESTIONS FOR REFLECTION
- Do the direction and the strength of the relationship depend on how you code the variables?
- Is a stronger relationship always better?
- How can measures of association help you determine the relative strength of the relationships?
DISCUSSION
- The direction of the relationship, and hence the interpretation of any ordinal X ordinal crosstab, depends upon the manner in which the variables were coded. If we took one variable from a pair of positively related variables and recoded it so that its categories ran in reverse order, the relationship would become negative. Moreover, declaring missing values or recoding variables may affect the measures of association. Since the interpretation of the crosstab depends on the way you code the variables, be sure to label and code the variables carefully. If you have a variable called Education, for example, arrange the categories of its indicator from the lowest to the highest level of education, not from highest to lowest. Age variables based on birth year can present a problem, because a higher birth year means a younger respondent; Example #3 avoids this by computing age from birth year before recoding. Coding variables appropriately makes it easier for the reader of your work to understand your results.
- Remember that relationships stronger than .4 may indicate that your two variables measure essentially the same thing.
- Summary measures of association enable you to compare the relative influence of several IVs on the same DV. For example, the following table was created using the syntax at the end of this lab with the CES 2011 data.
Predicting attitudes toward reducing the gap between the rich and poor (pes11_41).
Variable | Indicator | Measure | Association |
Education | cps11_79 | Tau-b | -.024 |
Age | cps11_78 | Tau-c | -.004 |
Income | cps11_92 & cps11_93 | Tau-c | -.122 |
Personal Finance Improving | cps11_66 | Tau-b | -.179 |
Economy Improving | cps11_39 | Tau-b | -.179 |
Government improved Econ | cps11_40 | Tau-b | -.163 |
Feelings toward NDP | cps11_20 | Tau-b | .161 |
Feelings toward Conservatives | cps11_18 | Tau-b | -.229 |
Source: CES 2011
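To see in practice the point made in the first discussion item about coding direction, one possible check is to reverse-code one variable and rerun the crosstab. The sketch below assumes that undogap3 and NDPFeel have already been created as in Example #1; NDPFeelRev is a hypothetical variable name introduced here for illustration.

```spss
* Reverse the category order of NDPFeel (NDPFeelRev is a hypothetical name).
RECODE NDPFeel (0 = 1) (.5 = .5) (1 = 0) INTO NDPFeelRev.
VALUE LABELS NDPFeelRev 0 'warm' .5 'neutral' 1 'cool'.

* Run the original and the reversed version side by side.
CROSSTABS /TABLES = undogap3 BY NDPFeel NDPFeelRev
  /CELLS = COLUMN COUNT
  /STATISTICS = BTAU.
```

Reversing the category order turns every like-ordered pair into a different-ordered pair and vice versa, while leaving tied pairs unchanged. The tau-b for NDPFeelRev should therefore have the same magnitude as the tau-b for NDPFeel, with the opposite sign.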
Syntax
*Selecting an interesting survey item-Cdn Attitudes re Inequality*.
missing values pes11_41 (8,9).
recode PES11_41 (1=1) (2=.5) (3,4,5=0) into undogap3.
value labels undogap3 0 'less or same' .5 'somemore' 1 'muchmore'.
fre var cps11_20.
missing values cps11_20 (996, 998, 999).
recode cps11_20 (0 thru 40 =0) (41 thru 65= .5) (66 thru hi = 1) into NDPFeel.
value labels NDPFeel 0 'cool' .5 'neutral' 1 'warm'.
fre var cps11_18.
missing values cps11_18 (996, 998, 999).
recode cps11_18 (0 thru 35 =0) (36 thru 69= .5) (70 thru hi = 1) into ConFeel.
value labels ConFeel 0 'cool' .5 'neutral' 1 'warm'.
crosstabs tables = undogap3 by NDPFeel ConFeel
  /cells = column count
  /statistics btau ctau.
*Create Income measure*.
missing values cps11_92 (998, 999).
missing values cps11_93 (98,99).
numeric income.
if (cps11_93 =1) or (cps11_92 lt 30) income = 1.
if (cps11_93 =2) or ((cps11_92 ge 30) and (cps11_92 lt 60)) income = 2.
if (cps11_93 =3) or ((cps11_92 ge 60) and (cps11_92 lt 90)) income = 3.
if (cps11_93 =4) or ((cps11_92 ge 90) and (cps11_92 lt 110)) income = 4.
if (cps11_93 =5) or (cps11_92 ge 110) income = 5.
value labels income 1 '<$30k' 2 '$30k-$59k' 3 '$60k-$89k' 4 '$90k-$109k' 5 '$110k+'.
*Create Age indicator*.
missing values cps11_78 (9998, 9999).
compute age = (2011 - cps11_78).
recode age (lo thru 39=1) (40 thru 50 =2) (51 thru 60=3) (61 thru 70=4) (71 thru hi=5) into age5.
value labels age5 1 '<40' 2 '40-50' 3 '50-60' 4 '60-70' 5 '71+'.
*Create 3 value education measure*.
missing values cps11_79 (98,99).
recode cps11_79 (1 thru 5 = 1) (6,7,8=2) (9,10,11=3) into educ3.
value labels educ3 1 'secondary' 2 'some postsec' 3 'compl univ'.
fre var educ3.
*Crosstabulate Att re Ineq by educ income agecat.
crosstabs tables = undogap3 by educ3 income age5
  /cells = column count
  /statistics btau ctau.
*Create economy and personal finance measures, then crosstabulate.
missing values cps11_39 (8,9).
recode cps11_39 (1=1) (3=0) (5=.5) into Econ.
value labels Econ 0 'worse' .5 'same' 1 'better'.
missing values cps11_66 (8,9).
recode cps11_66 (1=1) (3=0) (5=.5) into Finances.
value labels Finances 0 'worse' .5 'same' 1 'better'.
crosstabs tables = undogap3 by cps11_40 Econ Finances
  /cells = column count
  /statistics btau ctau.