Univariate Statistics and Crosstabulation Analysis (HW1)
Date Assigned: Week 2
Due Date: Week 3
Students are strongly advised to work with a partner and prepare a joint submission for this assignment. Both partners' names should appear on the submission.
For this assignment you may use any of the 2016 or 2017 Statewide data sets available from the Public Policy Institute of California (PPIC). All the PPIC data are available for download at: http://www.ppic.org/main/dataSet.asp?i=1823
Some of these data sets are also available through the Data link on the DataArt.ca website.
Please note that while each of the PPIC data sets contains the results of a full-scale survey sample of the California population, the number of variables in each data set is relatively small. This makes them easy to work with for this first assignment, but they may not contain a sufficient range of variables to complete all the assignments for POLI 101. So keep in mind that this is a first effort and that you are likely to investigate further topics at a later date.
Students may find that the PEW surveys and the World Values Survey (WVS) data sets available on the DataArt.ca website are good alternatives that provide a broader range of variables. If you do choose to work with the WVS, please work with the data from only one country. Other data sets, such as the ANES 2016 or the 2016 Pretest data, may be used with explicit, written approval from the instructor.
In this and all assignments, examples provided in lectures or labs may not be used.
This assignment provides you (and your partner) with an opportunity to select an indicator for a dependent variable (DV) and indicators for two independent variables (IVs). In choosing indicators with which to work, please review the relevant codebook for the data set selected. Then work with SPSS to ensure that all three selected indicators (one DV & 2 IVs) have sufficient variation for analysis. Ideally, responses should be relatively evenly divided among the values (or categories) of a variable’s indicator. In any case, no more than 85 percent of the cases should be concentrated in one value. Details on working with SPSS are available in the online Labs on the course website, DataArt.ca.
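The 85 percent rule is simple arithmetic on a frequency distribution, and it can help to see it written out. The following is only an illustrative sketch in Python, not part of the required SPSS work. The responses are invented, the function name modal_share is hypothetical, and the sketch assumes that the share is calculated over valid (non-missing) cases.

```python
from collections import Counter

def modal_share(responses):
    """Return the most common value and its share of the valid cases."""
    # Assumption: None marks a missing value and is excluded from the base.
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    value, n = counts.most_common(1)[0]
    return value, n / len(valid)

# Invented responses for illustration only (not taken from any PPIC data set).
responses = (
    ["approve"] * 45
    + ["disapprove"] * 40
    + ["don't know"] * 15
    + [None] * 5
)

value, share = modal_share(responses)
has_enough_variation = share <= 0.85  # the threshold stated in the assignment

print(f"Modal category: {value} ({share:.0%} of valid cases)")
print("Sufficient variation:", has_enough_variation)
```

In SPSS, the same information comes from the Valid Percent column of a frequency table for the indicator.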
Once you have selected your indicators, please describe them using univariate statistics; then hypothesize expected relationships between your dependent and independent variables; run the relevant crosstabulations using SPSS; and finally summarize your findings in terms of the support they offer for your hypotheses.
Examples of such tables and instructions on constructing them are contained in the labs.
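To make the logic of a crosstabulation concrete, here is a minimal Python sketch of how column percentages are calculated. It assumes the usual layout with the independent variable defining the columns and the dependent variable defining the rows. The data, the variable labels, and the function name column_percentages are all invented for illustration; your own tables should come from SPSS.

```python
from collections import Counter

def column_percentages(pairs):
    """Given (x, y) pairs, return {x: {y: percent}} computed within each x."""
    pairs = list(pairs)
    cell_counts = Counter(pairs)                  # count of each (x, y) cell
    column_totals = Counter(x for x, _ in pairs)  # count of each x category
    table = {}
    for (x, y), n in cell_counts.items():
        table.setdefault(x, {})[y] = 100 * n / column_totals[x]
    return table

# Invented data: x = age group (IV), y = response to a yes/no item (DV).
pairs = (
    [("18-34", "yes")] * 30
    + [("18-34", "no")] * 20
    + [("35+", "yes")] * 20
    + [("35+", "no")] * 30
)

table = column_percentages(pairs)
for x in sorted(table):
    for y in sorted(table[x]):
        print(f"{x:>6}  {y:<3}  {table[x][y]:.1f}%")

# Compare percentages across the columns, within the same row of the DV.
difference = table["18-34"]["yes"] - table["35+"]["yes"]
print(f"Difference in 'yes' across age groups: {difference:.1f} points")
```

Because each column sums to 100 percent, the columns can be compared directly even when the groups differ in size.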
There are several steps to this assignment. Please refer to them as you complete your report:
1) Select an indicator to measure your dependent variable (Y). In describing the indicator, be sure to:
- Specify the dataset you are using in this assignment.
- Include the indicator’s name, a short description of it and the full wording of the indicator. Clearly state the concept that you think the indicator is measuring.
- Be sure to explain and declare any missing values and make appropriate recodes to the selected indicator of your dependent variable (Y). If there are no missing values to be declared or recodes to be made, explain why. If your indicator has more than 5 categories, consider recoding into fewer categories in order to simplify the interpretation of your crosstabulations. If this recoding alters the concept your indicator is measuring, be sure to reflect this change in your discussion.
2) Run a frequency distribution, and describe the distribution of scores/responses for your indicator of Y with reference to its level of measurement, and appropriate measures of central tendency, dispersion and shape. Be sure to report only relevant statistics. What do these measures reveal, and what ‘story’ do they enable you to tell?
3) Select and describe indicators for two independent variables (X1 & X2) that may explain variation in the dependent variable (Y).
- Again, include their names, a description and full question wording. And clearly state which concepts X1 & X2 are measuring.
- Explain any missing values or recodes in the X1 & X2 indicators. While a full frequency analysis for each IV may be useful, you need not include these in your report.
4) State a hypothesis about the expected relationship between each independent variable and your dependent variable, as well as the rationale behind each hypothesis. While it is not necessary to cite any of the relevant research literature for this particular assignment, it is generally useful to read what others have written relating to your topic.
5) While properly formatted crosstabulations of your DV (Y) and IVs (Xs) are always preferable, for this assignment you may cut and paste into your report the tables and summary measures produced by SPSS. Describe your results in a clear yet concise manner. In doing so be sure to:
- Observe the column percentages, noting any differences across the rows and whether there is an observable pattern/relationship.
- Identify the appropriate measure(s) of association and why they are appropriate.
- Specify the values of the selected measures of association and explain what they tell us about each relationship.
6) Briefly state what the crosstabulations enable you to say about your hypotheses, and why this is the case. Focus on the ‘story’ that these results enable you to tell.
7) Which of the independent variables for which you have indicators (X1 or X2) do you think offers the better explanation of variation in the indicator of your dependent variable (Y)? Refer to the evidence available in your crosstabulations as well as to your thinking about the underlying causal mechanisms and broader theoretical considerations. A small summary table similar to that presented near the end of Lab 7 may be useful.
8) Include all your SPSS syntax as an attachment to your written assignment.
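As an aid to interpreting the measures of association in step 5, the sketch below shows how one such measure, Cramér's V, is derived from the chi-square statistic of a crosstabulation. This is an illustration under stated assumptions, not a required part of the assignment: the counts are invented, the function name cramers_v is hypothetical, and Cramér's V suits nominal-level indicators. For ordinal indicators a measure such as gamma or Kendall's tau, which SPSS also reports, is usually the better choice.

```python
import math

def cramers_v(table):
    """Return (chi-square, Cramér's V) for a table of observed counts.

    table is a list of rows; each row is a list of cell counts.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0]))  # the smaller of rows and columns
    return chi2, math.sqrt(chi2 / (n * (k - 1)))

# Invented 2 x 2 table: rows are DV categories, columns are IV categories.
observed = [
    [30, 20],
    [20, 30],
]

chi2, v = cramers_v(observed)
print(f"Chi-square = {chi2:.2f}, Cramér's V = {v:.2f}")
```

Cramér's V ranges from 0 (no association) to 1 (perfect association), so a value computed for X1 can be set beside the value for X2 when weighing the two explanations in step 7.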