Sociology 104/205a: Quantitative Methods for Sociological Research

Professor:  James Moody (moody.77@sociology.osu.edu)

Office Hours:  Tuesday & Thursday 1:45 – 3:00

TA: Bob Ngo (bobngo@umail.ucsb.edu)

Office Hours:  Tuesday 10:00 – 12:00, Ellison 2626


Meeting times: Tuesday & Thursday 12:30 to 1:45

Meeting Place:  Ellison 2626 (Lab)

 

Overview:  This course is designed to introduce you to the basic graphical and statistical techniques for analyzing quantitative sociological data.  The approach we take is “hands on”: we will read, interpret, and critique a wide variety of current work that uses these methods, and we will learn the practical skills necessary to perform basic quantitative analyses.  The course will focus on both the “art” and “science” of quantitative methods by identifying the tricks, tips, and techniques needed to draw careful insights from analyses (rather than rote recitation).  Our focus, then, is on using quantitative methods to tell compelling sociological stories.  Since most quantitative work can be reduced to variants on a few key building blocks, we focus on those building blocks in this course and use the substantive readings to appreciate the variants on these foundations.

 

Goals:  At the end of this course you should: (a) feel comfortable reading most papers published in ASR and AJS, (b) be able to perform basic data analyses, ranging from simple descriptive graphics to multiple regression modeling, (c) understand the strengths and weaknesses of quantitative approaches to sociology.

 

Class Schedule:  The course meets on Tuesdays and Thursdays.  Typically we will use Tuesday for lecture and discussion and Thursday will be a hands-on day in the lab.  Attendance is required.

 

Course products:  There are three types of graded products for this seminar:

  1. Reading responses. (25%)  Each week, we will read a paper published in ASR, AJS or Social Forces that uses a quantitative technique.  Four times over the quarter you will be required to turn in a two-page report with the following information:
    1. A summary of the problem or question that motivates the paper we read
    2. A description of the data (source, collection method, sample size)
    3. A description of the key measures (not every measure, just the ones critical to the argument).  This includes the unit of measure (person, state, organization, etc.) and the type of measure (continuous scale, dichotomy, etc.).
    4. A description of the technique used (name and key feature)
    5. A description of the set of numbers (table and row) the author used to draw the main conclusion
    6. Your evaluation of the argument: is it well supported?

 

Reading responses must be sole authored and turned in the day we discuss the reading in class.  You must turn in 4 of them over the quarter and at least 2 by class 10.

 

  2. Exercises (25%).  These are short exercises designed to make sure you understand the material and can use the computer program effectively.  These involve (a) responses to questions from the text and (b) sets of computer program problems.  These should be short, and you are simply required to turn in the answers for text questions.  For lab questions, you will turn in your computer code, log, output and a few paragraphs describing the results.  You can work with others on these homework problems, but the honor system implies that everyone working in the group could do the work by themselves if they had to.  If you work together as a group, turn in one copy with everyone’s names.

 

  3. Final research paper (50%). The final research paper for this class should be a good first draft of a paper that could form a publication or master’s thesis.  This paper should use some form of multiple regression (though I will entertain other quantitative techniques on a case-by-case basis) to answer a real question.  That is, the paper should use real data to answer a question we do not already know the answer to (though replication of other work is allowed if you are using different data).  Details for the paper will be handed out later.  Briefly, the requirements are:

 

    1. The paper can be coauthored (up to three authors).  Any coauthored paper must include a brief description of the division of labor used for the paper (i.e. who did what).  All coauthors will be given the same grade.

    2. The paper should include enough prior literature to motivate the question and justify the data.

    3. So long as the analyses are substantively new, you can use a paper from a prior course as the foundation for this paper.

    4. You can use any data source you want, though typically you’ll want to use something that is already archived (such as the GSS), since it’s unlikely that you’ll have time to collect your own data for this project.

    5. The paper is due the last day of class (Mar 16).  To help avoid a last-minute rush, there are 4 staged deadlines for the final paper:

 

        i.   topic and data source (class 6)

        ii.  univariate descriptive statistics (class 9)

        iii. bivariate statistics (class 15)

        iv.  final draft (last class)

 

For every week an assignment is late, the final paper grade will be lowered by half a letter.


Readings

Required: 

·        The course text will be Agresti & Franklin (2006) Statistics: The Art and Science of Learning from Data. Prentice-Hall.  Information at:  http://esminfo.prenhall.com/agrestiinfo/toc.html

·        I’ve pulled together a set of articles in electronic format from JSTOR and other web sources.  These are compiled in a single folder here: http://www.soc.duke.edu/~jmoody77/205a/ECP.htm   where you can download them. 

·        You can also find all programs and data links here: http://www.soc.duke.edu/~jmoody77/205a/Datafiles/datasets.htm

 

Recommended & Supplementary

We are focusing on the use and practices of quantitative methods.  We leave out many details and complications.  Here I list a set of supplementary sources and guides that you might find useful for filling in the details.

 

·        Kennedy  A Guide to Econometrics.   This book presents each topic at multiple levels, starting with intuition and moving through the deeper math of each technique.  This is an excellent resource.

·        The Little SAS Book.  A good primer for using SAS.

·        Kranzler & Moursund.  Statistics for the Terrified.    This little book walks you through some of the basic steps in statistics at an intro level.  Good for getting your head into things.

·        Blalock.  Social Statistics (various editions).  This is a classic text that should probably be on any sociologist’s shelf.  Excellent research reference for looking up the appropriate measure or technique given the type of data you have.

·        Cohen & Cohen.  Applied multiple regression / correlation analysis for the Behavioral Sciences.  This is a focused text on the practical uses of regression and correlation. It includes lots of detail in a readable text and covers everything from the basics to relatively advanced multiple equation models.  Another great resource.

·        Gujarati.  Basic Econometrics.  This is a ‘hard-core’ econometrics text, including derivations and all the math you’d need to make even the most geeky among us happy (myself included).  It is the perfect reference when you really need to know the details (like when you’re arguing with a reviewer!).

·        Ultimately, any sociological analysis only begins with the computer output.  You need to be able to interpret that output and then communicate it to an audience.  Some good works on writing:

o       Thomas & Turner.  Clear and Simple as the Truth: Writing Classic Prose.  (Great discussion of writing style)

o       Lamott.  Bird by Bird: Some Instructions on Writing and Life

o       Zinsser.  On Writing Well. A classic text on writing non-fiction.

o       Strunk & White.  The Elements of Style.  This is a classic.

Class Calendar

 

Class 1. (Tuesday, Jan 10)

Topic:  Introduction to the class, strengths & weaknesses of quantitative data, overview.  Goals & logic of empirical analyses in sociology.

 

Goal:  Get to know each other and the plan for the course. 

 

Reading: None

 

Assignment Due:  None

 

Supplementary reading:  There are many good critical papers on the processes of empirical analyses in sociology.  A few that are worth your time:

·   Lieberson, Stanley. 1992.  “Einstein, Renoir, and Greeley: Some Thoughts About Evidence in Sociology” American Sociological Review 57:1-15.

·   Leifer, Eric.  “Denying the Data: Learning from the Accomplished Sciences” Sociological Forum 7:283-299

·   Andrew Abbott. 1998.  "The Causal Devolution"  Sociological Methods and Research 27:148-181

·   Levine, John H. 1983.  Exceptions are the Rule: An Inquiry into Methods in the Social Sciences  Westview Press, Chapter 2 is particularly salient.

·   Lieberson, Stanley (1991) "Small N's and Big Conclusions: An Examination of the Reasoning in Comparative Studies Based on a Small Number of Cases" Social Forces, 71:307-320

·   Raftery (2001) “Statistics in Sociology, 1950-2000: A Selective Review”  Sociological Methodology 31:1-45

Class 2. (Thursday, Jan 12)

Topic: LAB DAY.  Introduction to SAS.  We will cover user interface, basic SAS setup, writing programs, reading archived datasets, converting data from Excel to SAS, entering data by hand.

 

Goal: To familiarize ourselves with the general way SAS works and get comfortable importing data from various sources.  Use of Proc Print, Proc Contents.

 

Reading: A&F Preface & Chapter 1 (this is catch-up for Tuesday)

 

Assignment Due: None

 

Supplementary reading:

·        You might find this intro useful for working through SAS programming: http://darkwing.uoregon.edu/~robinh/sas.html

·        http://www.ats.ucla.edu/stat/sas/  (Very good site, lots of detail).

 

 

Class 3. (Tuesday, Jan 17)

Topic:  Graphic Display & Presentation of Summary Data & Distributions.

 

Goal:  (1) Review summary location (Mean, median, mode, proportions) and distribution (Variance, skew, quantiles) statistics.  (2) Identify best practices for graphical display of data.  We will review a number of graphic displays in the literature and critique them for quality, readability, information content and so forth.
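As a rough illustration of the location and distribution statistics listed in this goal, here is a minimal sketch.  It is written in Python purely for illustration (the labs use SAS), and the data values and the `describe` helper are made up for this example rather than taken from the course materials.

```python
import statistics

# Hypothetical sample: years of schooling for nine respondents (made-up values).
years_of_schooling = [12, 12, 12, 13, 14, 16, 16, 18, 20]


def describe(values):
    """Return location and spread summaries for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "quartiles": (q1, q2, q3),
    }


summary = describe(years_of_schooling)
for name, value in summary.items():
    print(name, value)
```

Comparing the mean with the median is a quick informal check for skew: here a few respondents with many years of schooling pull the mean above the median.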

 

Reading:

·  A & F Chapter 2

·  DiMaggio, Evans & Bryson (1996) “Have Americans’ Social Attitudes Become More Polarized?”  American Journal of Sociology 102:690-755

 

Assignment Due: 

·        Identify one graphic example from a sociology paper to share with the class for discussion.  Please email me an electronic copy if possible, else make enough copies for class.

·        Text questions for chapters 1 & 2.

 

Supplementary reading: 

·  Tufte.  The Visual Display of Quantitative Information

·  Moody, McFarland, & Bender deMoll (2005) “Dynamic Network Visualization: Methods for Meaning with Longitudinal Network Movies” American Journal of Sociology 110:1206-1241

·  Handcock & Morris. 1998.  “Relative Distribution Methods” Sociological Methodology 28:53-97

 

Class 4. (Thursday, Jan 19)

Topic:  LAB DAY.  Using SAS for Graphic display of data.

 

Goal:  To introduce us to the many options for presenting and exploring data graphically.  We will cover SAS Graph concepts, the interactive data analysis window, graph-n-go, and Excel options for plotting & smoothing.

 

Reading:

·     Curtis.  Nd. “Are histograms giving you fits?” (ECP)

·     Mason. Nd.  “Introduction to SAS Graph” (ECP)

 

Assignment Due:  SAS Intro & Data exercises

 

Supplementary reading: 

·        Some great SAS graphics tips, sample code & examples can be found here: http://www.ats.ucla.edu/stat/sas/topics/graphics.htm

·        More Examples: http://support.sas.com/techsup/sample/sample_graph.html

·        http://support.sas.com/rnd/datavisualization/

 

Class 5. (Tuesday, Jan 24)

Topic:  Association & Contingency.     

 

Goal:  To identify a set of descriptive measures for determining whether two variables are related to each other.  At this point, we are focusing on the descriptive aspect & interpretation, not the population inference aspects.  We will identify measures for both categorical and continuous variables.

 

Reading:

·        A&F Chapter 3.

·        Budig & England. 2001. “The Wage Penalty for Motherhood” American Sociological Review 66:204-225

 

Assignment Due:

·        Text questions for chapter 3

·        After reading the chapter, visit: http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/  and try to draw a regression line by hand.   How well do you do?  Turn in a screen shot (press alt-prtscn and then paste into a word file) of two trials and a paragraph describing how you did.

 

Supplementary reading: 

·        Categorical Data Analysis using the SAS System.  A nice treatment of contingency tables and their associated measures in SAS.

·        Goodman & Hout (2001) “Statistical Methods and Graphical Displays for Analyzing How the Association between Two Qualitative Variables Differs Among Countries, Among Groups, or Over Time. Part II” Sociological Methodology  31:189-221

 

Class 6. (Thursday, Jan 26)

Topic:  LAB DAY.  Association and Contingency

Goal:  Using SAS to measure association. Go over Proc Corr,  Proc Freq,  Proc Reg, Proc Plot & Proc Report.

 

Reading:  None

 

Assignment Due:

·        SAS Graphing & Summary Statistics exercise

·        1-page topic summary for final paper due.

 

 

Supplementary reading: 

·        A good guide to the multitude of different association measures and how to implement them can be found here: http://www.ats.ucla.edu/stat/stata/whatstat/default.htm

Class 7. (Tuesday, Jan 31)

Topic:  Probability

 

Goal:  To review the basic rules of probability and the properties of probability distributions, apply them to our problems, and see how they relate to substantive sociological questions.

 

Reading:

·        A&F Chapter 5 & 6.  Focus most of your time on 5, but be sure to know the basic rules of probabilities (figuring joint occurrences, meaning of conditional, etc.)

·        Rytina and Morgan. 1982.  “Arithmetic of Social Relations” American Journal of Sociology 88:88-113

 

Assignment Due:  Text questions, Chapter 5 & 6

 

Supplementary reading: 

·        Rudas (2004) Probability Theory  Vol 142 Sage Quantitative Applications in the Social Sciences series.

·        Lieberson (1997) “Modeling Social Processes: Some Lessons from Sports”  Sociological Forum 12:13-65

·        White (1997) “Can Mathematics Be Social? Flexible Representations for Interaction Process and Its Sociocultural Constructions” Sociological Forum 12: 53-71.

 

Class 8. (Thursday, Feb 2)

Topic:  Lab Day.  Probabilities & distributions.

 

Goal:  Explore the different probability distribution functions in SAS, both to generate and to test draws from distributions.  This will be important for later regression-like models (logit, negative binomial, etc.) for categorical dependent variables.  Work on data cleaning with our own data, and generally catch up on your own dataset.

 

Reading:  None

 

Assignment Due:  SAS Association Exercises

 

Supplementary reading: 

 

 


Class 9. (Tuesday, Feb 7)

Topic:  Statistical Inference: Confidence Intervals & Significance Tests

 

Goal:  To be able to identify the distinction between sample differences and population differences, effects of various sample & estimation issues on tests of statistical significance, and generally sensitize ourselves to the differences between statistical and substantive significance.
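To make the link between sample size and statistical significance concrete, here is a minimal sketch of a confidence interval for a mean.  It is written in Python purely for illustration (the labs use SAS), it uses a large-sample normal approximation (a t-based interval would be somewhat wider for the small sample), and the data and the `mean_confidence_interval` helper are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, confidence=0.95):
    """Large-sample (normal approximation) confidence interval for a mean."""
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_error, mean + z * std_error


# Hypothetical data: weekly hours worked (made-up values).
small_sample = [38, 40, 42, 45, 35, 40, 41, 39, 44, 36]
# Repeating the same values mimics a sample with the same spread but 40 times
# as many cases; it is a device for illustration, not a real larger sample.
large_sample = small_sample * 40

low_s, high_s = mean_confidence_interval(small_sample)
low_l, high_l = mean_confidence_interval(large_sample)
print("n = 10: ", round(low_s, 2), round(high_s, 2))
print("n = 400:", round(low_l, 2), round(high_l, 2))
```

The sample mean is the same in both cases, but the interval narrows as n grows.  With enough cases even a substantively trivial difference from some benchmark becomes statistically significant, which is why the two kinds of significance need to be judged separately.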

 

Reading:

·        A&F Chapter 7 & 8.  You’ll probably focus attention most on chapter 8, but the logic of C7 is needed to get there.

·        Hout & Fischer (2002).  “Why More Americans Have No Religious Preference: Politics & Generations”  American Sociological Review 67:165-190

 

Assignment Due: 

·        Sample & Descriptive Statistics for final paper due

·        Text questions, Chapter 7 & 8.

 

Supplementary reading: 

·        Berk, Western & Weiss (1995) “Statistical Inference for Apparent Populations” Sociological Methodology 25:421-458. 

There were 4 responses to this paper in the same issue that raise a set of good questions about significance tests:

o       Bollen “Apparent and nonapparent Significance Tests” SM 25:459-468

o       Firebaugh “Will Bayesian Inference Help? A Skeptical View” SM 25: 469-472

o       Rubin “Bayes, Neyman and Calibration” SM 25:473-479

o       Berk, Western & Weiss “Reply to Bollen, Firebaugh and Rubin” SM 25:481-485

Class 10. (Thursday, Feb 9)

Topic:  Lab Day.  Focus on significance tests and constructing confidence intervals for location hypotheses (mean values), correlations & contingency tables

 

Goal:  Be able to construct and test a given hypothesis about univariate and bivariate relations, and examine confidence interval plots of various sorts.

 

Reading:  None

 

Assignment Due: 

·        SAS Probability Exercises

·        At least 2 of your article summaries must be turned in by now

 

Supplementary reading:  None


Class 11. (Tuesday, Feb 14)

Topic:  Comparing Groups

 

Goal:  To be able to apply the appropriate statistical test to compare differences between two or more groups.

 

Reading:

·        A&F Chapter 9 & 13.  Here we extend the logic of significance tests worked out in C7 & C8 to tests for comparing groups.  The logic of ANOVA (C13) is crucial for understanding experiments as well as the foundations for statistical tests in regression.

·        Bearman & Bruckner (2002) “Opposite-Sex Twins and Adolescent Same-Sex Attraction” American Journal of Sociology 107:1179-1205

 

Assignment Due: Text questions, C 9 & 13.

 

Supplementary reading: 

           

 

Class 12. (Thursday, Feb 16)

Topic:  Lab Day, on comparing groups.

 

Goal:  Use of SAS Proc Ttest, Proc Anova.  Extensions of data cleaning & coding tips, general catch-up on our own projects.

 

Reading: None

 

Assignments due:  SAS exercise for significance tests

 

Supplementary reading: 

 


Class 13. (Tuesday, Feb 21)

Topic:  Associations in categorical variables

 

Goal:  Much of the data we collect come to us in sets of categories, and thus various tabular representations are natural.  How can we tell if two categorical variables are related to each other? 
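One common way to approach the question posed in this goal is the chi-square statistic, which compares the observed cell counts of a table with the counts expected if the two variables were independent.  The sketch below is illustrative only: it is written in Python rather than SAS, the `chi_square` helper is hypothetical, and the counts are made up.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Hypothetical 2 x 2 table: rows are two groups, columns are "yes" / "no" answers.
observed_counts = [[30, 10],
                   [10, 30]]

statistic, dof = chi_square(observed_counts)
print("chi-square =", statistic, "with", dof, "degree(s) of freedom")
```

A statistic of zero means the observed counts match independence exactly; larger values, judged against the chi-square distribution with the given degrees of freedom, indicate stronger evidence of association.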

 

Reading: 

·        A&F Chapter 10.

·        Gould (2000) “Revenge as Sanction and Solidarity Display: An analysis of Vendettas in Nineteenth-Century Corsica” American Sociological Review 65:682-704

 

Assignment Due: Text questions, chapter 10

 

Supplementary reading: 

·        Sobel, Becker & Minick “Origins, Destinations, and Association in Occupational Mobility”  American Journal of Sociology 104:687-721

·        Goodman (1996) “A Single General Method for the Analysis of Cross-Classified Data”  Journal of the American Statistical Association 91:408-428

·        Sloane & Morgan (1996) “An introduction to Categorical Data Analysis”  Annual Review of Sociology 22:351-375

·        Yamaguchi “Models for Comparing Mobility Tables: Toward Parsimony and Substance” American Sociological Review 52:482-494

 

 

Class 14. (Thursday, Feb 23)

Topic:  Lab Day.  Associations in categorical variables

 

Goal:  Using SAS to assess categorical association, introduction of rank procedures – which fall somewhere between categories and continuous variables, advanced data manipulation techniques (transpose, aggregation, etc.)

 

Reading:  None

 

Assignment Due: SAS on comparing groups, Anova, Ttests.

 

Supplementary reading: 

 


Class 15. (Tuesday, Feb 28)

Topic:  Associations in quantitative variables: Regression Analyses. 

 

Goal:  Extend our introduction of “descriptive” regression to formal regression models.  Link regression to ANOVA, introduce ways of dealing with curves (nonlinear relationships), and understand the assumptions of OLS.
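As a minimal sketch of what a bivariate OLS fit computes, and of how it connects to the ANOVA-style split of variation into explained and residual parts, consider the example below.  It is written in Python purely for illustration (the labs use Proc Reg in SAS); the `ols_fit` helper and the data are hypothetical.

```python
def ols_fit(x, y):
    """Least-squares slope, intercept, and R-squared for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Slope = covariation of x and y divided by variation in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # ANOVA-style decomposition: total variation vs. residual variation.
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - residual_ss / total_ss
    return slope, intercept, r_squared


# Hypothetical data: years of schooling and income in $1,000s (made-up values).
schooling = [8, 10, 12, 12, 14, 16, 16, 18]
income = [18, 22, 30, 27, 35, 44, 40, 52]

slope, intercept, r_squared = ols_fit(schooling, income)
print("slope:", round(slope, 2))
print("intercept:", round(intercept, 2))
print("R-squared:", round(r_squared, 3))
```

R-squared is the share of the total sum of squares that the fitted line accounts for, which is the same partition of variation that ANOVA uses.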

 

Reading: 

·        A&F C11

·        Fox (2004) “The Changing Color of Welfare? How Whites’ Attitudes toward Latinos Influence their Support for Welfare”  American Journal of Sociology 110:580-625

 

Assignment Due: 

·        Bivariate descriptive tables for final paper due

·        Text questions, chapter 11.

 

Supplementary reading: 

·        Kennedy  A Guide to Econometrics

·        Bollen and Jackman.  1990.  "Regression Diagnostics: An Expository Treatment of Outliers and Influential Cases."  Pp. 257-291 in Modern Methods of Data Analysis, edited by J. Fox and S. Long

·        Fox, John. 1991. Regression Diagnostics. Thousand Oaks, CA: Sage. (Quantitative Applications in the Social Sciences #79.)

·        Long, J. Scott and Laurie H. Ervin.  2000.  "Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model."  The American Statistician 54:217-224.

·        Francois Nielsen teaches an excellent course devoted to regression; his notes are online: http://www.unc.edu/~nielsen/soci209/

 

 

Class 16. (Thursday, Mar 2)

Topic:  Regression Lab Day.

 

Goal:  Use of Proc Reg and SAS Analyst for regression, effectively using model diagnostics.

 

Reading:  None

 

Assignment due: SAS exercise on chi square, categorical association, aggregation

 

Supplementary reading: 

 


Class 17. (Tuesday, Mar 7)

Topic:  Multiple Regression & Interactions

 

Goal:  Continue our discussion of the OLS regression model by extending to (a) multiple independent variables and (b) understanding interaction terms.  We will also introduce multi-level models, though estimation of these models is beyond the scope of the class.

 

Reading:

·        A&F C12

·        Loftus (2001) “America’s Liberalization in Attitudes toward Homosexuality, 1973 – 1998” American Sociological Review. 66:762-782

 

Assignments Due: Text questions, C12

 

Supplementary reading:

See class 15

·        Jaccard & Turrisi Interaction Effects in Multiple Regression (Sage #72)

·        Pan & Frank (2004).  “A probability index of the robustness of a causal inference.”  Journal of Educational and Behavioral Statistics 28: 315-337

·         Frank (2000). “Impact of a Confounding Variable on the Inference of a Regression Coefficient.” Sociological Methods and Research, 29(2), 147-194.

 

 

Class 18. (Thursday, Mar 9)

Topic:  Multiple Regression

 

Goal:  To effectively use Proc Reg (and associated procedures) to fit multiple regression models with diagnostics for assumption violations.

 

Reading:  None

 

Assignment Due:  SAS Exercises on multiple regression & Outlier diagnostics.

 

Supplementary reading: 

·        Bollen and Jackman.  1990.  "Regression Diagnostics: An Expository Treatment of Outliers and Influential Cases."  Pp. 257-291 in Modern Methods of Data Analysis, edited by J. Fox and S. Long

 


Class 19. (Tuesday, Mar 14)

Topic:  Logistic Regression

 

Goal:  OLS regression is not appropriate for non-continuous dependent variables.  Logistic regression combines our discussion of regression models with our discussion of categorical measures of association to provide a flexible tool for fitting “regression-like” models with qualitative dependent variables.  We will cover the basic use & interpretation of these models.
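To give a feel for how logistic regression coefficients are read, here is a minimal sketch that converts a fitted equation on the log-odds scale into predicted probabilities and an odds ratio.  It is written in Python purely for illustration (the labs use SAS), and the coefficient values and the `predicted_probability` helper are hypothetical, not estimates from any real data.

```python
import math


def predicted_probability(intercept, slope, x):
    """Convert a linear predictor on the log-odds scale into a probability."""
    log_odds = intercept + slope * x
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical coefficients from a model of a yes/no outcome on one predictor.
intercept, slope = -2.0, 0.5

# Each one-unit increase in x multiplies the odds of "yes" by exp(slope).
odds_ratio = math.exp(slope)
print("odds ratio per unit of x:", round(odds_ratio, 3))

for x in (0, 2, 4, 6, 8):
    print("x =", x, "-> P(yes) =", round(predicted_probability(intercept, slope, x), 3))
```

The odds ratio is constant across values of x, but the change in predicted probability is not: it is largest near a probability of 0.5 and flattens toward 0 and 1, which is why predicted probabilities always stay within bounds, unlike OLS predictions.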

 

Reading: 

·        A&F C12 (last section)

·        Eggebeen (2005) “Cohabitation and Exchanges of Support”  Social Forces 83:1097-1110.

 

Assignment Due: 

·        Text questions C12

·        SAS Exercises on Multiple Regression. Note change of pattern here!

 

Supplementary reading: 

·        Stokes, Davis & Koch. (1995)  Categorical Data Analysis using the SAS System

·        Menard () Applied Logistic Regression Analysis  (Sage Volume #106)

 

 

Class 20. (Thursday, Mar 16)

Topic:  Last Day; presentation of research results.

 

Goal:  To wrap up the class, discuss each other’s projects and think about extensions to new types of analyses (simulations, networks, spatial analyses, cluster analyses, etc).

 

Reading: None

 

Assignment due: 

·        Final Papers are due

·        All 4 of your summaries must be in by now

·        Prepare a brief (5-10 minute) summary of your paper to share with the class.

 

Supplementary reading: