Sociology 205a: Data sources

 

We will make use of a number of data examples (many from the text CD) for exercises in the class.   Whenever a dataset is not from the CD, I will index it on this page.

 

You will also need to find data for your final paper.  I have listed some of the more common dataset web pages here, as well as links to general data archives.

 

Note that data are more common than people typically imagine:  you don’t always need to work with a survey.  So, for example, if you were interested in the changing nature of sociology production, all of the bibliography information on JSTOR would form a rich data source that, with a small amount of work, you could read into SAS.   So be creative!
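As a rough sketch of the bibliography idea above, the snippet below reads a small tab-separated list of articles and counts them by decade. It is written in Python rather than SAS, and the file layout, column names (`year`, `journal`, `title`) and rows are all made up for illustration; an actual JSTOR export would need its own parsing.

```python
import csv
import io
from collections import Counter

# Hypothetical bibliography export: one article per line, tab-separated.
# The column names and rows are placeholders for illustration only.
RAW = (
    "year\tjournal\ttitle\n"
    "1975\tJournal A\tFirst placeholder title\n"
    "1978\tJournal B\tSecond placeholder title\n"
    "1984\tJournal A\tThird placeholder title\n"
    "1991\tJournal A\tFourth placeholder title\n"
)


def articles_per_decade(text):
    """Count articles by decade of publication."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    counts = Counter()
    for row in reader:
        decade = (int(row["year"]) // 10) * 10
        counts[decade] += 1
    return dict(sorted(counts.items()))


decade_counts = articles_per_decade(RAW)
for decade, n in decade_counts.items():
    print(f"{decade}s: {n} article(s)")
```

Once the records are in rows and columns like this, they can be treated like any other dataset, for example tabulated by journal or by year.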

 

 

Class Exercise Data (this will be updated on an ongoing basis)

| Exercise | File Name | File Type | Description | Program File |
|---|---|---|---|---|
| In class data read | Evangel_fert.xls | Excel | Data taken from the American Religion Data Archive on the number of evangelical adherents and fertility rates. | ef_read.sas |
| | | | Example of writing data directly into a SAS program | Data_read1.sas |
| | | | Example of simulating data with a dataset. | Data_read2.sas |
| | redwine1.sas7bdat | SAS Data | A set of Wine Spectator scores for a sample of red wines | redwine_read.sas |
| Lab HW 1 | Heights.xls | Excel | From A&F, a sample of men (gender=0) and women (gender=1) and their height in inches. | |
| Graphs & Regression | house_selling_prices.xls | Excel | Small sample of house prices and attributes | Hprices_read.sas |
| Merge Example | 2004_pctbsh.xls, pop_density1.xls, pctwhitecnty.xls, cnty_income.xls, evangel_fert.xls, gundeaths.xls, military_recruit2000.xls, us_statewide_crime.xls | Excel | Files on State/County to test election results. | Election_2004_combine.sas |
| SAS Homework | Crime.sas7bdat | SAS | SAS version of the US statewide crime data. Use this to evaluate the relation between poverty and violent crime. | Crime_problem.sas |
| SAS Exercise | hcidedat.sas7bdat | | SAS version of some GSS data on people knowing a homicide victim. This program will read in the data and get you started with the code values. | Cideknow.sas |
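
The Merge Example files are combined on a shared state or county identifier. The sketch below shows the general idea of such a match merge in Python, assuming two tiny tables keyed by a state code. The codes, column names and values are placeholders and are not taken from the class files or from Election_2004_combine.sas; in SAS the same idea is usually written by sorting each dataset and using MERGE with a BY statement in a data step.

```python
# Two tiny made-up tables keyed by a state code. The codes, column names and
# values are placeholders, not the contents of the class Excel files.
vote_rows = [
    {"state": "AA", "pct_vote": 55.0},
    {"state": "BB", "pct_vote": 48.5},
    {"state": "CC", "pct_vote": 51.2},
]
density_rows = [
    {"state": "AA", "pop_density": 40.1},
    {"state": "BB", "pop_density": 310.7},
    {"state": "DD", "pop_density": 12.3},
]


def merge_by_key(left, right, key):
    """Keep only records whose key appears in both tables (one-to-one match)."""
    right_lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_lookup.get(row[key])
        if match is not None:
            # Combine the columns from both tables into one record.
            merged.append({**row, **match})
    return merged


merged = merge_by_key(vote_rows, density_rows, "state")
for row in merged:
    print(row)
```

States that appear in only one table ("CC" and "DD" here) are dropped, so it is worth checking how many records survive each merge before running any analysis.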

 

 

Commonly Used Datasets

The General Social Survey – GSS

            The GSS is a broad-based survey of individuals that covers a host of issues that interest sociologists.  Among its greatest strengths is its long history: it has covered the attitudes, beliefs and behaviors of a representative sample of the US population since 1972.  Note that you can do a good deal of analysis on the GSS online, using their web-based tools.

            NORC GSS Web page

            ICPSR GSS Web Page – click on the “analyze” link at top to do on-line analyses.

 

Integrated Public Use Microdata Series – IPUMS

            The IPUMS provides individual-level data from the census, providing one of the most comprehensive sources for national demographic data.

 

 

The Current Population Survey - CPS

            The CPS is conducted by the Census Bureau for the BLS to generate national economic data.  Provides excellent information on work, income & employment.  While not as often used, the file does have a panel aspect, which allows you to link people from one year to the following year, which is good for judging short-term changes in employment.
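
To illustrate the panel aspect, here is a minimal Python sketch of linking people across two consecutive years and counting changes in employment status. The keys (`p1`, `p2`, ...) stand in for a hypothetical person identifier and the records are made up; linking the real CPS files requires the identifiers described in the CPS documentation.

```python
from collections import Counter

# Made-up records for two consecutive years. The keys stand in for a
# hypothetical person identifier; the real CPS uses its own linking variables.
year1 = {"p1": "employed", "p2": "unemployed", "p3": "employed"}
year2 = {"p1": "employed", "p2": "employed", "p4": "unemployed"}


def employment_transitions(first, second):
    """Count (year 1 status, year 2 status) pairs for people seen in both years."""
    linked_ids = first.keys() & second.keys()
    return Counter((first[pid], second[pid]) for pid in linked_ids)


transitions = employment_transitions(year1, year2)
for (before, after), n in sorted(transitions.items()):
    print(f"{before} -> {after}: {n}")
```

People who appear in only one of the two years (`p3` and `p4` here) cannot be linked and drop out of the transition counts.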

 


The National Longitudinal Surveys

            The NLS is also funded by the BLS, but the survey is conducted by CHRR in Columbus.  The project has followed the same people since the mid-1970s, providing one of the richest datasets out there on changes over the life course.  The NLS includes the NLSY (Y=youth), which started with a younger sample.  This is a somewhat complicated survey design and dataset, so you’ll have to spend some time getting to know it!

 

 

The National Longitudinal Study of Adolescent Health – Add Health

            Add Health is a large longitudinal survey of adolescents and their social contexts.  Great for people interested in health, youth, education and social networks.  You will need a data contract to use the full data, but there are online analysis tools and you can get a public-use subsample to play with.

 

 

United States Census Data

            The US Census provides a good deal of data for general use, including direct access to aggregated results, allowing state or county level analyses.

            Products from the 2000 Census

            The County and City Data Book

            The Statistical Abstract

 

 

World Values Survey

            The World Values Survey is a data collection effort aimed at replicating the same survey in many countries.  It provides great comparative data on numerous topics of interest to sociologists.

 

World Development Report

            Provides information on key nation-level indicators about social and economic outcomes.  Published by the World Bank.

 

Social Science Data Archives

ICPSR: Inter-University Consortium for Political and Social Research

            Probably the most comprehensive data archive out there.  Has huge holdings of data on just about any topic you could want.  Spend some time exploring this site!

 

NORC – National Opinion Research Center. 

            Some data archived here, but mainly a center for conducting survey research.  Good descriptions of current work and data sources.

 

The Odum Institute Data Archive

            Another archive of social science data.  The site also has links to other sources of social science data.

 

SDA Archive

            Another general archive, with lots of online analysis tools.

 

Columbia Electronic Data Service (EDS). 

            Another general archive, with a great collection of longitudinal survey links.  Follow the topic links.

 

The American Religion Data Archive

            A great topic-specific archive. Holds a number of datasets that relate to religion, values, and so forth.  You can also do a good deal of analysis from the web page itself.