Lab 2


Crosstabulation with Non-Interval Variables


  • To learn how to perform a crosstabulation and practice formulating hypotheses.
  • To appreciate how crosstabulation allows us to make comparisons relevant to our hypotheses.
  • To introduce the logic of comparison.



  • Crosstabulation brings together the indicators for two variables and displays the relationship between them in a single table. Each column in the crosstab corresponds to a category of the independent variable, and each row corresponds to a category of the dependent variable. Hence the dependent variable goes down the left side, and the independent variable goes across the top.
  • Each cell represents a unique combination of categories from each of the variables. For example, in the table below, the cell “G” represents all the respondents who selected Category I for the independent variable and Category III for the dependent variable.
  • The percentage in each cell is calculated by dividing the number of respondents in the cell by the total number of respondents for the column. Note: the cell-percentage values will be affected by whether or not we treat some categories of our indicators as missing values. Pay attention to the percentages in each cell rather than the number (n) of respondents in each cell.
  • To interpret a crosstab, compare the column-percentages across each row to see whether they differ. For instance, in the table below, compare the percentage values for cells A, B, and C, then compare D, E, and F, and finally compare G, H, and I. If the column-percentages of cells A-B-C, and/or D-E-F, and/or G-H-I differ substantially from one another, then you may have found a relationship.
  • Crosstabulation does not work effectively if either variable has a great many value categories.
                Category I   Category II   Category III
Category I      A            B             C
Category II     D            E             F
Category III    G            H             I
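
The column-percentage arithmetic described above can be sketched outside SPSS. The short Python snippet below is only an illustration: the cell counts are invented, and the function name `column_percentages` is a hypothetical helper, not part of SPSS or of the lab materials.

```python
# Invented cell counts for a 3x3 crosstab (for illustration only).
# Rows are dependent-variable categories; columns are independent-variable
# categories, matching the layout of the example table.
counts = [
    [30, 10, 5],   # Category I   (cells A, B, C)
    [60, 50, 20],  # Category II  (cells D, E, F)
    [30, 40, 25],  # Category III (cells G, H, I)
]


def column_percentages(table):
    """Divide each cell by its column total and express it as a percentage."""
    column_totals = [sum(column) for column in zip(*table)]
    return [
        [round(100 * cell / total, 1) for cell, total in zip(row, column_totals)]
        for row in table
    ]


percentages = column_percentages(counts)
for row in percentages:
    print(row)
```

Note that the three columns are based on different numbers of respondents (120, 100, and 50 in this invented example), which is why the percentages, rather than the raw counts, are compared across each row.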


Crosstabulating Variables

  1. Select an appropriate data set such as the CES 2011, the ANES 2012 or one of the many Forum 2013 data sets.
  2. Open the codebook for the dataset you have chosen.
  3. Hypothesize a relationship between two variables in the dataset.
    • For example, you might think that attitudes toward inequality vary by partisanship.
  4. In order to avoid corrupting your data, lock your data set prior to beginning your analyses.
  5. To make certain there is some variation on the variables, use SPSS to perform a frequency analysis for each variable.
  6. In the Analyze menu of SPSS, select Descriptive Statistics and then Crosstabs. Place your dependent variable in the Row(s) box and your independent variable in the Column(s) box. Click the “Cells” button and, under Percentages, select Column.
  7. Click on the “Paste” button. Select the syntax and run it.
  8. Determine whether there is a relationship between the variables based on the column-percentages in the crosstab.
  9. Repeat the analysis until you find a set of variables with a relationship.
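
The logic behind steps 5 and 6 (check each variable for variation, then count the respondents in each combination of categories) can be sketched in Python. This is not SPSS output: the records are invented, and the assumption that codes 8 and 9 mark missing answers is made only for this sketch.

```python
from collections import Counter

# Invented respondent records: (independent variable code, dependent variable code).
# Codes 8 and 9 are assumed to mark missing answers in this sketch.
MISSING = {8, 9}
records = [(1, 1), (1, 2), (2, 2), (2, 3), (1, 1), (2, 9), (8, 2), (2, 3)]

# Drop any record with a missing value on either variable.
valid = [(iv, dv) for iv, dv in records if iv not in MISSING and dv not in MISSING]

# Frequency check: each variable needs cases in more than one category.
iv_freq = Counter(iv for iv, _ in valid)
dv_freq = Counter(dv for _, dv in valid)
has_variation = len(iv_freq) > 1 and len(dv_freq) > 1

# Cell counts: one entry per (column category, row category) combination.
cell_counts = Counter(valid)

print("IV frequencies:", dict(iv_freq))
print("DV frequencies:", dict(dv_freq))
print("Variation on both variables:", has_variation)
print("Cell counts:", dict(cell_counts))
```

Two of the eight invented records are dropped as missing, so every count and percentage in the resulting crosstab would be based on the remaining six.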


  • Dataset:
    • CES 2011
  • Dependent Variable:
    PES11_41: How much should be done to reduce gap between the rich and the poor in Canada?
  • Independent Variable:
    CPS11_71: In federal politics, do you usually think of yourself as a: Liberal, Conservative, N.D.P, Bloc Québécois, Green Party, or none of these?


  • Arrow Diagram:
    • X → Y
    • Party identification → Attitude toward inequality reduction
  • Syntax:

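    *Preparing the DV*.
    * Treat codes 8 and 9 as missing.
    missing values PES11_41 (8,9).
    * Rescale the five responses to run from 0 (much less) to 1 (much more).
    recode PES11_41 (1=1) (2=.75) (3=.5) (4=.25) (5=0) into undogap.
    value labels undogap 0 'muchless' .25 'someless' .5 'asnow' .75 'somemore' 1 'muchmore'.
    * Check the distribution of the new variable.
    fre var undogap.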
    *Preparing the IV*.
    recode cps11_71 (2=1) (1=2) (4=3) (3=4) (5=5) (6 =6) into PID.
    value labels PID 1 'Cons' 2 'Lib' 3 'BQ' 4 'NDP' 5 'Green' 6 'None'.
    fre var PID.
    *Running the Crosstabulation*.
    crosstabs tables = undogap by PID
         /cells = column count.
  • Output:

Crosstabulation of Attitudes toward Inequality by Party Identification

Action on Inequality   Conserv   Liberal   BQ      NDP     Green   None
Much less              3.7%      1.2%      0.4%    1.1%    -       2.2%
Some less              5.1%      2.5%      0.4%    -       1.2%    1.7%
As now                 31.5%     19.1%     7.5%    8.2%    4.7%    17.1%
Some more              37.7%     39.4%     45.7%   34.6%   37.6%   40.0%
Much more              22.1%     37.9%     46.1%   56.1%   56.5%   39.0%
N=                     903       808       254     353     85      590

Source: CES 2011

  • Interpretation of Crosstab:
    • The number in each cell is a column-percentage.  At the bottom of each column is the number of cases on which the column percentages are based. The column percentages are key in interpreting your findings.
    • Comparing the column-percentages for the cells across each row of the table, we can see that there are differences among the partisan groups.
    • In particular, Conservatives seem least supportive of government action on inequality, whereas NDP and Green partisans are most supportive.
    • The Liberals and those with no partisan identification fall in the middle ground.
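
The row-by-row comparison in this interpretation can also be expressed as a small calculation. The Python sketch below copies the column-percentages from the two rows of the output table that have a value for every party and reports the lowest column, the highest column, and the gap between them. The helper name `row_spread` is hypothetical, and how large a gap counts as a substantial difference remains a judgment call.

```python
# Column percentages copied from two complete rows of the output table.
parties = ["Conserv", "Liberal", "BQ", "NDP", "Green", "None"]
rows = {
    "As now": [31.5, 19.1, 7.5, 8.2, 4.7, 17.1],
    "Much more": [22.1, 37.9, 46.1, 56.1, 56.5, 39.0],
}


def row_spread(percentages):
    """Return (lowest column, highest column, gap in percentage points)."""
    lowest = min(range(len(percentages)), key=percentages.__getitem__)
    highest = max(range(len(percentages)), key=percentages.__getitem__)
    gap = round(percentages[highest] - percentages[lowest], 1)
    return parties[lowest], parties[highest], gap


spreads = {label: row_spread(values) for label, values in rows.items()}
for label, (low, high, gap) in spreads.items():
    print(f"{label}: lowest {low}, highest {high}, gap {gap} points")
```

In both rows the Conservative and Green columns sit at opposite ends, which matches the reading given in the bullets above.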


Did you discover a relevant relationship in your crosstab based on the column-percentages? If so, was it evident in only one row of the table or in all rows?


When you find a cell that has a substantially different column-percentage from the other cells in that row, there are usually other rows in the table that also show a difference. For example, if you find a difference in the column-percentages for cells A-B-C, then there is probably also a difference among D-E-F or G-H-I. This happens because the column-percentages in each column must sum to 100%, so a higher percentage in one cell means a lower percentage in at least one other cell of that column.

                Category I   Category II   Category III
Category I      A            B             C
Category II     D            E             F
Category III    G            H             I
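
A minimal Python sketch, using invented percentages, shows why a difference in one row tends to come with a difference in another. Because each column sums to 100%, the row-by-row differences between any two columns must cancel out.

```python
# Invented column percentages for two categories of the independent variable.
# Each column lists dependent-variable Categories I, II, and III in order.
column_1 = [50.0, 30.0, 20.0]
column_2 = [20.0, 30.0, 50.0]

# Row-by-row differences between the two columns, in percentage points.
differences = [round(b - a, 1) for a, b in zip(column_1, column_2)]

print("Column totals:", sum(column_1), sum(column_2))
print("Differences by row:", differences)
print("Sum of differences:", round(sum(differences), 1))
```

Here the 30-point gap in the Category I row is offset by an equal and opposite gap in the Category III row, while the Category II row shows no difference at all.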