new Lab 19 | Data Art

POLI 101 LAB 19

Control Tables

PURPOSE

To learn how to control for a third variable with crosstabulations
To learn how to compare partial relationships against the original and to distinguish among types of relationships. Part 1 will discuss replication and specification, the most common results with control tables. Part 2 will discuss understanding spurious relationships as either explanation or interpretation.

MAIN POINTS

Control tables allow us to examine the relationship between two variables while controlling for a third variable. We do this in order to determine whether the control variable has an effect on the original relationship.
When we use a control variable the original relationship is separated into partial relationships. The number of partial relationships obtained corresponds to the number of categories of the control variable.
- For example, if we crosstabulate attitudes regarding recreational marijuana (the dependent variable) with the independent variable political party affiliation while controlling for Gender, we will obtain two partial tables.
- Assuming gender is coded with Female as the high score (1) and Male as the low score (0), the first partial table would show the relationship between Marijuana attitude and Party Affiliation among Male respondents.
- The second partial would show the relationship between the two variables among Female respondents.
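The way a control variable splits one table into partial tables can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS workflow: the six records are hypothetical toy data, and the function name `partial_tables` is an assumption made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (marijuana attitude, party, female).
# female is coded 1 = Female, 0 = Male, as in the example above.
records = [
    ("hi", "Democ", 0), ("low", "Repub", 0), ("med", "Indep", 0),
    ("hi", "Indep", 1), ("low", "Repub", 1), ("hi", "Democ", 1),
]


def partial_tables(rows):
    """Build one table of (DV, IV) cell counts per category of the control variable."""
    tables = defaultdict(Counter)
    for dv, iv, control in rows:
        tables[control][(dv, iv)] += 1
    return dict(tables)


tables = partial_tables(records)
for control_value in sorted(tables):
    print("female =", control_value, dict(tables[control_value]))
```

Because the control variable has two categories, the sketch yields two partial tables, one for each value of `female`.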
Partial tables can reveal several different types of results. The most common results are replication and specification. They are discussed in this part of the lab. Explanation and Interpretation will be discussed in the next part.
- With Replication: The partial relationships are roughly the same as in the original crosstabulation. This means that the original relationship holds true even when taking into account the control variable.
  - For example, if the partial relationships between Marijuana attitude and Party Affiliation for both Male and Female respondents are essentially the same as the original relationship, then we would conclude that the result is replication.
  - Thus we might find that Party Affiliation predicts Marijuana attitudes equally well for both Male and Female respondents.
- With Specification: One (or more) of the partial relationships becomes stronger and the other(s) become(s) weaker. This means that we have specified the conditions under which the original relationship occurs.
  - For an example, let’s return to the relationship between Marijuana attitude and Political Party Affiliation. If we control for the effect of being a parent and find that the relationship is more pronounced among the non-parents than among parents, then we have found specification.
  - This means that we have ‘specified’ certain conditions under which the relationship is stronger.

EXAMPLE 1: Replication

Variables, Missing Values & Recodes

Dataset: PPIC October 2016
Independent Variable: Party Affiliation
Dependent Variable: Attitudes regarding Recreational Marijuana
Control Variable: Gender (coded as Female)

Control Table Syntax

*Weighting the Data*.
weight by weight.
*Recoding MJ Index Items*.
recode q21 (1=1) (2=0) into MJPropD.
value labels MJPropD 1 'yes' 0 'no'.
recode q36 (1=1) (2=0) into MJLegalD.
value labels MJLegalD 1 'yes' 0 'no'.
recode q36a (1=1) (2=.5) (3=.0) into MJTry.
value labels MJTry 1 'recent' .5 'not recent' 0 'no'.

*Constructing an Index with alpha = .777*.
compute RawMJ3 = (MJPropD + MJLegalD + MJTry).

*recoding the index*.
recode RawMJ3 (0, .5=0) (1 thru 2= .5) (2.5, 3 =1) into MJ3.
value labels MJ3 0 'low' .5 'med' 1 'hi'.
execute.

*Creating IV Indicator for Party Affiliation*.

recode q40c (1=0) (3=.5) (2=1) into Democrat3.
value labels Democrat3 1 'Democ' .5 'Indep' 0 'Repub'.

*creating dummied control variables*.
recode gender (1=0) (2=1) into female.
value labels female 1 'female' 0 'male'.
recode d4 (1=1) (2=0) into parent.
value labels parent 1 'yes' 0 'no'.

*crosstabulation with gender control*.
crosstabs tables = mj3 by Democrat3
 /mj3 by Democrat3 by female
 /cells = column count
 /statistics = d btau corr chisq.

Selected Output

Attitude toward Recreational Marijuana by Party Affiliation

                       Party
Support Rec MJ    Repub    Indep    Democ
low               54.4%    28.0%    30.6%
medium            23.5%    28.0%    32.2%
high              19.1%    43.9%    37.1%
Total (N)         230      371      369

Taub = .147; p =.000.

Summary Statistics

All respondents
Taub= .147 (.028) ; p =.000

Male respondents
Taub= .194 (.040); p =.000

Female respondents
Taub= .138 (.039); p =.000

Interpretation of Results
- The resulting measures of association appear at first glance to differ from the original relationship of 0.147. The relationship for Males seems stronger and that for Females a bit weaker.
- Some authors suggest using a ‘rule of thirds’ as a rough and ready technique for determining whether or not the relationship is a replication. This rule of thumb suggests looking to see whether the original relationship has changed by one third (or more) of its original value. In this case, 1/3 of .147 = .049. Neither the relationship for males nor the relationship for Females differs from the original by this much. The relationship among males is .047 larger. For females it is .009 smaller.
- Moreover, the relationship remains statistically significant among both gender groups.
- A more statistically rigorous technique is to refer to the standard error, reported as ASE1 alongside ordinal or interval measures of association. Although confidence intervals may be considered technically inappropriate with ordinal data, calculating them here suggests the coefficients do not differ. As you will recall from our discussion of statistical significance, 95% of the cases in a normal distribution fall within about two (1.96) standard errors on each side of a point estimate. Therefore, multiplying the standard error by two, then adding the result to and subtracting it from the coefficient, gives a range that can be used to determine significant differences between measures of association.
- By any of these standards, this is a case of replication. So we can conclude that the original relationship between Support for Marijuana and Party Affiliation holds among both Male and Female respondents.
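The two checks described above can be worked through with a short Python sketch. The coefficients and standard errors are copied from the Summary Statistics; the function names are assumptions made for this illustration, and the calculation is a rough aid to reading the output, not a formal test.

```python
# Values copied from the Summary Statistics above: Taub and its standard error (ASE1).
ORIGINAL_TAU = 0.147
PARTIALS = {"Male": (0.194, 0.040), "Female": (0.138, 0.039)}


def exceeds_one_third(original, partial):
    """Rule of thirds: has the partial moved by one third (or more) of the original?"""
    return abs(partial - original) >= abs(original) / 3


def interval(coefficient, std_error, z=1.96):
    """Range of the coefficient plus and minus z standard errors."""
    margin = z * std_error
    return coefficient - margin, coefficient + margin


results = {}
for group, (tau, se) in PARTIALS.items():
    low, high = interval(tau, se)
    results[group] = {
        "difference": tau - ORIGINAL_TAU,
        "changed_by_third": exceeds_one_third(ORIGINAL_TAU, tau),
        "original_in_interval": low <= ORIGINAL_TAU <= high,
    }
    print(group, round(tau - ORIGINAL_TAU, 3), round(low, 3), round(high, 3),
          results[group]["changed_by_third"], results[group]["original_in_interval"])
```

Neither partial moves by a third of the original value, and the original coefficient of .147 falls inside the range around each partial, which is consistent with reading this example as replication.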

EXAMPLE 2: Specification

- Preliminary Hypothesis: Party Affiliation is more related to Marijuana Attitude among the politically interested.
- Partial Syntax

*creating measure of interest.
recode q38 (1=1) (2=2) (3,4=3) into interest3.
value labels interest3 1 'great deal' 2 'some' 3 'little or none'.

*crosstab with interest control*.
crosstabs tables = mj3 by Democrat3
  /mj3 by Democrat3 by interest3
  /cells = column count
  /statistics = d btau chisq.

Control Table Statistical Output

Summary Statistics

All respondents
Taub= .147 (.028); p =.000;

low interest
Taub= -.045 (.052); p =.385.

medium interest
Taub= .163 (.045); p =.000

high interest
Taub= .295 (.049); p=.000

Interpreting the Control Table Results

The resulting measures of association for the subgroups differ from the original relationship. In particular, as indicated by the Taub values, the relationship is substantially stronger among those who are highly interested in politics.
Using the rough and ready ‘rule of thirds’ suggests that, as measured by Taub, the relationships for both the low and high interest groups differ from the original value by more than one third of the original value. In this case 1/3 of .147 = .049, so a partial relationship would have to fall below .098 (.147 - .049) or above .196 (.147 + .049). The low interest value of -.045 and the high interest value of .295 both fall outside this range.
Moreover, the relationship is no longer significant among the low interest group.
With the ordinal tau measure of association, another more rigorous technique is available. The estimated standard errors reported with Taub can be used to calculate confidence intervals for ordinal measures of association. In this case, multiplying the standard error of .052 for the low interest group by 1.96 (or roughly two) and adding this to the low interest Taub value gives an upper bound of about .057, which suggests that -.045 likely differs from both .163 and .295. Moreover, the relationship for the medium interest group also likely differs from that for the high interest group because .163 + 1.96 (.045) = .251, which is below .295.
By both the more rigorous and the informal rule of thirds standards, this appears to be a case of specification. So we can conclude that the original relationship between Views on Marijuana and Party Affiliation is significantly weaker among the uninterested and stronger among those with high interest.
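The standard error comparison for the three interest groups can be reproduced with a small Python sketch. The coefficients and standard errors are taken from the Summary Statistics above; the helper name `upper_bound` is an assumption for this example, and comparing an upper bound with the next coefficient is the informal approach described in the text, not a formal significance test.

```python
# Taub and standard error for each interest group, from the Summary Statistics above.
PARTIALS = {
    "low": (-0.045, 0.052),
    "medium": (0.163, 0.045),
    "high": (0.295, 0.049),
}


def upper_bound(coefficient, std_error, z=1.96):
    """Coefficient plus z standard errors."""
    return coefficient + z * std_error


low_upper = upper_bound(*PARTIALS["low"])
medium_upper = upper_bound(*PARTIALS["medium"])

medium_tau = PARTIALS["medium"][0]
high_tau = PARTIALS["high"][0]

print("low interest upper bound:", round(low_upper, 3))
print("medium interest upper bound:", round(medium_upper, 3))
print("low below medium and high:", low_upper < medium_tau < high_tau)
print("medium below high:", medium_upper < high_tau)
```

The upper bound for the low interest group stays below the medium and high interest coefficients, and the upper bound for the medium interest group stays below the high interest coefficient.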

INSTRUCTIONS – Part 1: attempting control

Use SPSS to access an appropriate dataset and run a crosstabulation between a Dependent and Independent variable, selecting the appropriate measure of association and statistical significance.
Note the strength of the original relationship by looking at the measure of association and check the significance using either the p-value for the measure of association or chi-square.
Edit the syntax of your crosstabs tables = command to add a control variable by adding a second “by” and then the name of the control variable. The addition of “by” and the third variable is the only change required to create control tables.
Examine the strength of the new partial relationships by comparing the measure of association (like Kendall’s Tau or the Correlation Coefficient) for each partial relationship to the original measure of association.
Determine whether the results indicate replication or specification.
Carefully explain what factors led you to your conclusion.
Repeat with a new control variable until you find an instance of replication and one of specification.

Part 2:

USING CONTROL VARIABLES TO UNDERSTAND SPURIOUS RELATIONSHIPS: Explanation and Interpretation

- While the most common results using control tables are replication and specification, partial tables can also be used to examine more complex three-variable relationships.
- If all of the partial relationships are substantially weaker than the original relationship, then the relationship may be either partly or wholly spurious. Recall that in a spurious relationship, the original relationship is revealed to be due to the influence of a third variable (Z) used as a control. We can often better understand statistically spurious relationships by analyzing the theoretical relations among the three variables.
- In order to understand a spurious relationship it is often useful to determine theoretically whether the control variable is antecedent to the other two variables or intervening between them.
  - An antecedent variable is logically (or temporally) prior to both of the original X and Y variables. Where the control variable is antecedent to both the independent and dependent variables, the finding of a spurious relationship is termed “explanation.” Symbolically, Z –> X,Y. The idea is that the control variable explains why X and Y are related. X and Y are not related because they are a cause and an effect; they are related because both are affected by Z.
  - An intervening variable is one that is logically (or temporally) prior to one of the variables, but not both of them. Where the control variable is intervening, a finding of a spurious relationship is called “interpretation” because it clarifies (in whole or part) the process through which the relationship between X and Y functions. This is represented symbolically as: X–> Z –> Y
    - In some cases, certain variables could not plausibly be considered to be antecedent. In the case of a spurious relationship between gender and income, the control variable education cannot possibly be antecedent to both gender and income because education cannot be the cause of gender.
    - In other cases, you can determine whether the control variable is intervening or antecedent by observing which variable(s) the control variable influences.
  - There are, of course, other possible outcomes in using control tables such as suppression and distortion as mentioned in lecture.

EXAMPLE 3: Interpretation and Explanation

The bivariate relationship between Party ID and attitudes toward marijuana, using the three-value party measure Democrat3 and the three-value measure MJ3, produces a Taub = .147; p = .000.
Controlling for Ideology (Liberal3) reduces Taub to non-significance among all three ideological groupings:

Conservative Taub= .014; p = .784;
Middle of the Road Taub= .085; p = .111;
Liberal Taub= -.011; p = .982.

Using the 5 value measures of ideology (liberal5) and Party Id (democrat5) with the 7 value RawMJ3 index produces similar results:

All respondents Tauc= .134; p = .000;

Very Conservative Tauc= -.012; p = .856;
Conservative Tauc= -.033; p = .542;
Middle Tauc= .030; p = .524;
Liberal Tauc= .033; p= .552;
Very Liberal Tauc= .074; p= .239.

These results suggest the effect of Party Identification on Attitudes toward Marijuana is due to the influence of Ideology. And since the empirical effects of interpretation and explanation are indistinguishable, we can understand these results as showing that the original relationship between Party Id and Marijuana attitude arises either because Ideology affects both Party Id and Marijuana attitude (explanation) or because Party Id affects Ideology, which in turn affects Marijuana attitude (interpretation).
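As a rough way to compare the three examples in this lab, the rule of thirds can be turned into a small Python helper. This is a sketch under stated assumptions: the function `classify` and its decision rules are a simplification invented for illustration, it ignores statistical significance and standard errors, and it cannot tell explanation from interpretation, which depends on theory and not on the statistics.

```python
def classify(original, partials):
    """Rough reading of control table results using the rule of thirds."""
    threshold = abs(original) / 3
    # Has each partial moved by at least one third of the original value?
    changed = [abs(p - original) >= threshold for p in partials]
    # Is each partial weaker than the original by at least one third?
    weaker = [abs(p) <= abs(original) - threshold for p in partials]
    if not any(changed):
        return "replication"
    if all(weaker):
        return "possibly spurious (explanation or interpretation)"
    return "specification"


# Coefficients copied from the three examples in this lab.
gender_result = classify(0.147, [0.194, 0.138])
interest_result = classify(0.147, [-0.045, 0.163, 0.295])
ideology_result = classify(0.147, [0.014, 0.085, -0.011])

print("gender control:", gender_result)
print("interest control:", interest_result)
print("ideology control:", ideology_result)
```

With the values reported in this lab, the helper labels the gender control as replication, the interest control as specification, and the ideology control as possibly spurious.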

INSTRUCTIONS – Part 2: using additional control variables

Using a data set of interest, select a dependent variable, an independent variable, and a control variable that may help you to explain or interpret the original IV-DV relationship.
Select appropriate measures of association and statistical significance.
Perform the simple crosstab and note the strength of the original relationship by looking at the measure of association, and check the significance using the p-value for the measure of association or chi-square.
Add a control variable to the analysis, declaring the appropriate missing values and making the relevant recodes. This is important, because it will determine the number and composition of the control tables. Remember that it is also a good idea to add value labels for newly recoded variables in order to facilitate reading your tables.
Examine the strength of the new partial relationships. Determine whether the results indicate replication, specification, or a spurious relationship.
Carefully explain what factors led you to your conclusion.
Repeat with a second control variable.

QUESTIONS FOR REFLECTION

Try to understand each set of control tables you run: do the results indicate replication, specification, interpretation, or explanation?
Why are replication and specification more common than interpretation and explanation?