new Lab 1



  • Learn how to use variables and their indicators.
  • Become familiar with searching out indicators for variables in codebooks.


  • Social research entails systematic observation and the understanding of patterns. Systematic observations are made using surveys, experiments, available data, and field research; this exercise uses survey data. Patterns are understood by identifying and explaining variation in our observations, so an essential part of social research is making certain that our observations actually vary. Observations that show little or no variation are regarded as constants, which are of limited value for most social research. So we look for variation.
  • Variables are abstract concepts used to describe sets of attributes or characteristics. Some simple examples are gender, age and race, but social researchers are also interested in more complex variables such as ideological orientation or view of water use reduction policies.
  • Indicators are used to measure variables. For example, to measure gender we could use the indicator coded by the interviewer or the answer to a direct question. To measure views of the recreational use of marijuana we might look at a question such as, “Proposition 64 is called the ‘Marijuana Legalization. Initiative Statute’ … If the election were held today, would you vote yes or no on Proposition 64?”
  • Hypotheses express an expected relationship between two variables. In causal explanations an independent variable (IV) is generally identified as X and a dependent variable (DV) is identified as Y. Symbolically, X→Y or X↔Y.

Stage I


  1. Find a data set from either the 2016 or 2017 Statewide Surveys made available by the Public Policy Institute of California (PPIC).
    They are available through the data link on the PPIC website.
  2. Download the .zip file containing the appropriate codebook and data. The example shown below uses the October 2016 data.
  3. Open the codebook, browse through the questions in the survey and select an indicator of a variable which interests you.
  4. Place the data file on your desktop or in another appropriate place such as a folder where you intend to keep your data sets.
  5. On a PC, right-click the data set icon, select Properties, and lock the data set by checking the box labeled “Read Only.” On a Mac, right-click (two-finger click) the icon, select Get Info, and lock the data set by checking the box labeled “Locked.”
  6. Open the dataset using SPSS.
  7. Under the Analyze menu, select Descriptive Statistics and then Frequencies.
  8. Locate the item you have chosen for analysis and move it into the Variables area.
  9. Click Paste and review the syntax that appears in the syntax window.
  10. Select the syntax.
  11. Go either to the Run menu and choose the green triangle or click on the green triangle in the row of icons.
  12. Review the results and ask yourself: Is there variation on this question? As a rough standard, no more than 85% of the cases should be in a single value category.
  13. If there is variation, try to think of an explanation for it by looking at the other questions available in the survey.
  14. Formulate a hypothesis in the form X→Y.
  15. Repeat steps 7-12 to examine variation on the hypothesized explanatory (X) variable.
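
The 85% rule of thumb in step 12 can also be checked outside SPSS. The following is a minimal Python sketch, not part of the SPSS procedure. It assumes the rule is applied to valid (non-missing) responses, which the lab does not state explicitly. The function names and the sample responses are hypothetical and used only for illustration.

```python
from collections import Counter


def largest_category_share(responses):
    """Return (category, share) for the most common non-missing response.

    Missing responses are assumed to be coded as None (an assumption).
    """
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("no valid responses")
    category, count = Counter(valid).most_common(1)[0]
    return category, count / len(valid)


def has_variation(responses, threshold=0.85):
    """True if no single category holds more than `threshold` of valid cases."""
    _, share = largest_category_share(responses)
    return share <= threshold


# Hypothetical responses, invented for illustration only.
mostly_yes = ["yes"] * 90 + ["no"] * 10   # 90% in one category
split = ["yes"] * 55 + ["no"] * 45        # 55% in the largest category

print(has_variation(mostly_yes))  # False: too close to a constant
print(has_variation(split))       # True: enough variation to analyze
```

An indicator that fails this check behaves almost like a constant, so another question from the codebook is probably a better choice.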

EXAMPLE (using PPIC data)

  • Questionnaire
    • Statewide Survey October 2016
  • Y Variable
    • Marijuana Initiative
  • Indicator for Y
    Q21. “Proposition 64 is called the ‘Marijuana Legalization. Initiative Statute’ … If the election were held today, would you vote yes or no on Proposition 64?”
  • Possible Explanation (X)
    • Gender
  • Indicator for X
    • Respondents’ Gender (variable gender in the data set)
  • SPSS Syntax

    FREQUENCIES VARIABLES=q21 gender.
  • Formatted Tables Produced from Output

    Q21. Proposition 64 is called the ‘Marijuana Legalization. Initiative Statute.’ If the election were held today, would you vote yes or no on Proposition 64?

                                 Frequency   Percent   Valid Percent   Cumulative Percent
    Valid     yes                      712      41.8            52.3                 52.3
              no                       575      33.7            42.2                 94.6
              [VOL] don’t know          74       4.3             5.4                100.0
              Total                   1361      79.9           100.0
    Missing   [VOL] refuse              12        .7
              System                   331      19.4
              Total                    343      20.1
    Total                             1704     100.0
  • Respondents’ Gender

                       Frequency   Percent   Valid Percent   Cumulative Percent
    Valid     Male           857      50.3            50.3                 50.3
              Female         847      49.7            49.7                100.0
              Total         1704     100.0           100.0

    Source: PPIC October 2016
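
    The Percent, Valid Percent, and Cumulative Percent columns for Q21 can be reproduced from the frequencies alone. The following Python sketch is only an illustration of the arithmetic, not SPSS output. The function name frequency_rows is hypothetical, and the counts are copied from the Q21 table above.

```python
# Frequencies copied from the Q21 table (PPIC October 2016).
valid_counts = {"yes": 712, "no": 575, "[VOL] don't know": 74}
missing_counts = {"[VOL] refuse": 12, "System": 331}


def frequency_rows(valid, missing):
    """Build (label, frequency, percent, valid percent, cumulative percent) rows.

    Percent uses all cases as its base; Valid Percent and Cumulative
    Percent use only the valid (non-missing) cases.
    """
    n_valid = sum(valid.values())
    n_total = n_valid + sum(missing.values())
    rows = []
    running = 0
    for label, count in valid.items():
        running += count
        rows.append((
            label,
            count,
            round(100 * count / n_total, 1),    # Percent
            round(100 * count / n_valid, 1),    # Valid Percent
            round(100 * running / n_valid, 1),  # Cumulative Percent
        ))
    return rows


for row in frequency_rows(valid_counts, missing_counts):
    print(row)
```

    Because 343 of the 1,704 cases are missing on Q21, Percent and Valid Percent differ. When judging variation, it is worth noting which base is being used; here the largest valid category ("yes") holds 52.3% of valid cases, well under the 85% rough standard.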

    Is there variation on both the dependent and independent variables that form your hypothesis?
    It is essential that there is variation in the indicators of both X and Y. Otherwise, they are unlikely to be useful for further analysis.
  • Give some thought to other possible IVs such as Language of Interview.
  • Modify the Syntax to produce graphics: