
POLI 101 LAB 19

Control Tables


  • To learn how to control for a third variable with crosstabulations
  • To learn how to compare partial relationships against the original and to distinguish among types of relationships. Part 1 will discuss replication and specification, the most common results with control tables. Part 2 will discuss understanding spurious relationships as either explanation or interpretation.



  • Control tables allow us to examine the relationship between two variables while controlling for a third variable. We do this in order to determine whether the control variable has an effect on the original relationship.
  • When we use a control variable the original relationship is separated into partial relationships. The number of partial relationships obtained corresponds to the number of categories of the control variable.
    • For example, if we crosstabulate attitudes regarding recreational marijuana (the dependent variable) with political ideology (the independent variable) while controlling for gender, we will obtain two partial tables.
    • Assuming gender is coded with Female as the high score (1) and Male as the low score (0), the first partial table would show the relationship between Marijuana attitude and Political Ideology among Male respondents.
    • The second partial would show the relationship between the two variables among Female respondents.
  • Partial tables can reveal several different types of results. The most common results are replication and specification. They are discussed in this part of the lab. Explanation and Interpretation will be discussed in the next part.
    • With Replication: The partial relationships are roughly the same as in the original crosstabulation. This means that the original relationship holds true even when taking into account the control variable.
      • For example, if the partial relationships between Marijuana attitude and Political Ideology for both Male and Female respondents are essentially the same as the original relationship, then we would conclude that the result is replication.
      • Thus we might find that Ideology predicts Marijuana attitudes equally well for both Male and Female respondents.
    • With Specification: One (or more) of the partial relationships becomes stronger and the other(s) become(s) weaker. This means that we have specified the conditions under which the original relationship occurs.
      • For an example, let’s return to the relationship between Marijuana attitude and Political Ideology. If we control for the effect of being a parent and find that the relationship is more pronounced among the non-parents than among parents, then we have found specification.
      • This means that we have ‘specified’ certain conditions under which the relationship is stronger.

EXAMPLE 1: Replication

Variables, Missing Values & Recodes

  • Dataset: PPIC October 2016
  • Independent Variable: Political Ideology
  • Dependent Variable: Attitudes regarding Recreational Marijuana
  • Control Variable: Gender (recoded into the dummy variable female: 1 = Female, 0 = Male)

Control Table Syntax

*Weighting the Data*.
weight by weight.
*Recoding MJ Index Items*.
recode q21 (1=1) (2=0) into MJPropD.
value labels MJPropD 1 'yes' 0 'no'.
recode q36 (1=1) (2=0) into MJLegalD.
value labels MJLegalD 1 'yes' 0 'no'.
recode q36a (1=1) (2=.5) (3=.0) into MJTry.
value labels MJTry 1 'recent' .5 'not recent' 0 'no'.

*Constructing an Index with alpha = .777*.
compute RawMJ3 = (MJPropD + MJLegalD + MJTry).

*recoding the index*.
recode RawMJ3 (0, .5=0) (1 thru 2= .5) (2.5, 3 =1) into MJ3.
value labels MJ3 0 'low' .5 'med' 1 'hi'.

*Creating IV Indicator for Ideology*.
recode q37 (1,2=1) (3=.5) (4,5= 0) into liberal.
value labels liberal 1 'liberal' .5 'middle' 0 'conserv'.
recode q37 (1=1) (2=.75) (3=.5) (4=.25) (5=0) into liberal5.
value labels liberal5 1 'vlib' .75 'liberal' .5 'middle' .25 'conserv' 0 'vcons'.

*creating dummied control variable*.
recode gender (1=0) (2=1) into female.
value labels female 1 'female' 0 'male'.
recode d4 (1=1) (2=0) into parent.
value labels parent 1 'yes' 0 'no'.

*crosstabulation with gender control*.
crosstabs tables = mj3 by liberal
 /mj3 by liberal by female
 /cells = column count
 /statistics = d btau corr chisq.

crosstabs tables = mj3 by liberal
 /mj3 by liberal by parent
 /cells = column count
 /statistics = d btau chisq.

Selected Output

Attitude toward Recreational Marijuana by Ideology


           conserv    middle    liberal
low         57.3%     31.3%     19.3%
medium      20.6%     32.3%     32.4%
high        22.1%     36.5%     48.3%
Total N      335       288       358

Taub = .282; Chi-square = 116.3, 4 df; p = .000.

Summary Statistics 

All respondents
Taub = .282 (.027); p = .000

Male respondents
Taub = .253 (.039); p = .000

Female respondents
Taub = .322 (.038); p = .000

  • Interpretation of Results
    • At first glance, the resulting measures of association appear to differ from the original relationship of .282. The relationship for Males seems weaker and that for Females rather stronger.
    • Some authors suggest using a ‘rule of thirds’ as a rough and ready technique for determining whether or not the relationship is a replication. This rule of thumb suggests looking to see whether the original relationship has changed by one third (or more) of its original value. In this case, 1/3 of .282 = .094. Neither the relationship for Males nor the relationship for Females differs from the original by this much. The relationship among Males is .029 smaller. For Females it is .040 larger.
    • Moreover, the relationship remains statistically significant among both gender groups.
    • A more statistically rigorous technique is to use the asymptotic standard error (reported by SPSS as ASE1, and shown in parentheses above) that accompanies ordinal or interval measures of association. Although these are technically inappropriate here, calculating them suggests the coefficients do not differ. As you will recall from our discussion of statistical significance, 95% of the cases in a normal distribution fall within about two (1.96) standard errors on each side of a point estimate. Therefore, multiplying the standard error by two, and then adding the result to and subtracting it from the coefficient, gives a range that can be used to judge whether measures of association differ significantly.
    • By any of these standards, this is a case of replication. So we can conclude that the original relationship between Political Ideology and Marijuana attitude holds among both Male and Female respondents.
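
The rule-of-thirds arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not SPSS output. The function name rule_of_thirds and the group labels are hypothetical, and the Taub values are the ones reported in the summary statistics for this example.

```python
def rule_of_thirds(original, partials):
    """Compare each partial coefficient with the original one.

    Returns the one-third threshold and, for each group, the change from
    the original coefficient plus a flag showing whether that change is
    at least one third of the original value.
    """
    threshold = abs(original) / 3
    report = {}
    for group, value in partials.items():
        change = value - original
        report[group] = (round(change, 3), abs(change) >= threshold)
    return round(threshold, 3), report


# Taub values reported above for Example 1 (gender control).
threshold, report = rule_of_thirds(0.282, {"male": 0.253, "female": 0.322})
print("threshold:", threshold)
for group, (change, exceeds) in report.items():
    print(group, "change:", change, "exceeds one third:", exceeds)
```

When no partial crosses the threshold, as with the gender control here, the rule of thumb points toward replication.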

EXAMPLE 2: Specification

    • Preliminary Hypothesis: Political Ideology is more related to Marijuana Attitude among non-parents.
    • Partial Syntax
*create parental dummy*.
recode d4 (1=1) (2=0) into parent.
value labels parent 1 'yes' 0 'no'.

*crosstab with parental control*.
crosstabs tables = mj3 by liberal
  /mj3 by liberal by parent
  /cells = column count
  /statistics = d btau chisq.

Control Table Statistical Output

Summary Statistics

All respondents
Taub = .282 (.027); p = .000

Non-parents
Taub = .351 (.032); p = .000

Parents
Taub = .182 (.048); p = .000

Interpreting the Control Table Results

  • The resulting measures of association for the subgroups differ from the original relationship. In particular, as indicated by the Taub values, the relationship is substantially stronger among those without children (.351) than among those with them (.182).
  • Using the rough and ready ‘rule of thirds’ suggests that, as measured by Taub, the relationship among parents differs from the original value by more than one third of the original value. In this case 1/3 of .282 = .094. The relationship for those with children falls below the original relationship by more than this (.282 - .182 = .100). The relationship for those without children is .069 larger than the original (.351 - .282), which is short of the threshold.
  • Nevertheless, the relationship remains significant among both subgroups.
  • In using the ordinal tau measure of association, another more rigorous technique is available. The estimated standard errors reported with Taub can be used to calculate confidence intervals for ordinal measures of association. In this case, multiplying the standard error of .032 for the non-parents group by 1.96 (or roughly two) gives about .063. Subtracting this from the non-parents’ Taub of .351 gives a lower bound of about .288, which is well above the parents’ Taub of .182. This suggests that .351 likely differs from .182.
  • By both the more rigorous and the informal rule of thirds standards, this appears to be a case of specification. So we can conclude that the original relationship between Views on Marijuana and Ideology is significantly weaker among parents.
  • Note however that the relationship is statistically significant among both groups.
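
The standard-error comparison described above can be worked through as a short Python sketch. It assumes the value in parentheses after each Taub is its standard error, as the discussion above does. The helper names interval and overlap are hypothetical. Checking whether two 95% intervals overlap is a rough screen rather than a formal test of the difference between coefficients.

```python
def interval(coef, ase, z=1.96):
    """Approximate 95% interval: coefficient plus or minus z standard errors."""
    margin = z * ase
    return (coef - margin, coef + margin)


def overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Example 2 (parental control): Taub and standard error for each subgroup.
parents = interval(0.182, 0.048)
non_parents = interval(0.351, 0.032)
print("parents:", [round(x, 3) for x in parents])
print("non-parents:", [round(x, 3) for x in non_parents])
print("intervals overlap:", overlap(parents, non_parents))

# Example 1 (gender control), for contrast.
male = interval(0.253, 0.039)
female = interval(0.322, 0.038)
print("male/female intervals overlap:", overlap(male, female))
```

Intervals that do not overlap, as with parents and non-parents here, are consistent with specification. Overlapping intervals, as with the gender control, are consistent with replication.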

INSTRUCTIONS – Part 1: attempting control

  1. Use SPSS to access an appropriate dataset and run a crosstabulation between a Dependent and Independent variable, selecting the appropriate measure of association and statistical significance.
  2. Note the strength of the original relationship by looking at the measure of association and check the significance using either the p-value for the measure of association or chi-square.
  3. Edit the syntax of your crosstabs tables = command to add a control variable by adding a second “by” and then the name of the control variable. The addition of “by” and the third variable is the only change required to create control tables.
  4. Examine the strength of the new partial relationships by comparing the measure of association (like Kendall’s Tau or the Correlation Coefficient) for each partial relationship to the original measure of association.
  5. Determine whether the results indicate replication or specification.
  6. Carefully explain what factors led you to your conclusion.
  7. Repeat with a new control variable until you find an instance of replication and one of specification.

Part 2:


    • While the most common results using control tables are replication and specification, partial tables can also be used to examine more complex three-variable relationships.
    • If all of the partial relationships are substantially weaker than the original relationship, then the relationship may be either partly or wholly spurious. Recall that in a spurious relationship, the original relationship is revealed to be due to the influence of a third variable (Z) used as a control. We can often better understand statistically spurious relationships by analyzing the theoretical relations among the three variables.
    • In order to understand a spurious relationship it is often useful to determine theoretically whether the control variable is antecedent to the other two variables or intervening between them.
      • An antecedent variable is logically (or temporally) prior to both of the original X and Y variables. Where the control variable is antecedent to both the independent and dependent variables, the finding of a spurious relationship is termed “explanation.” Symbolically, Z –> X, Y. The idea is that the control variable explains why X and Y are related. X and Y are not related because they are a cause and an effect; they are related because both are affected by Z.
      • An intervening variable is one that is logically (or temporally) prior to one of the variables, but not both of them. Where the control variable is intervening, a finding of a spurious relationship is called “interpretation” because it clarifies (in whole or part) the process through which the relationship between X and Y functions. This is represented symbolically as: X –> Z –> Y
        • In some cases, certain variables could not plausibly be considered to be antecedent. In the case of a spurious relationship between gender and income, the control variable, education, cannot possibly be antecedent to both gender and income because education cannot be the cause of gender.
        • In other cases, you can determine whether the control variable is intervening or antecedent by observing which variable(s) the control variable influences.
      • There are, of course, other possible outcomes in using control tables such as suppression and distortion as mentioned in lecture.

EXAMPLE 3: Interpretation and Explanation

Several unsuccessful attempts were made to uncover an instance of interpretation or explanation for the Ideology and Marijuana attitude relationship. In each instance either replication or specification was found.

Efforts to find a suitable example will continue.

INSTRUCTIONS – Part 2: using additional control variables

  1. Using a data set of interest, select a dependent variable, an independent variable, and a control variable that may help you to explain or interpret the original IV-DV relationship.
  2. Select appropriate measures of association and statistical significance.
  3. Perform the simple crosstab and note the strength of the original relationship by looking at the measure of association and check the significance using the p-value for the measure of association or chi-square.
  4. Add a control variable to the analysis by declaring the appropriate missing values and making the relevant recodes. This is important, because it will determine the number and composition of the control tables. Remember that it is also a good idea to add value labels for newly recoded variables in order to facilitate reading your tables.
  5. Examine the strength of the new partial relationships. Determine whether the results indicate replication, specification, or a spurious relationship.
  6. Carefully explain what factors led you to your conclusion.
  7. Repeat with a second control variable.


  • Try to understand each set of control tables you run: do the results indicate replication, specification, interpretation, or explanation?
  • Why are replication and specification more common than interpretation and explanation?