New insights sought with census database

Peter Kauffner
Staff Reporter

For historians seeking numerical answers to questions about America’s past, census records are the ultimate resource.
“The census is the only source we have that’s available consistently over a long period of time,” said history professor Steven Ruggles.
Ruggles heads a project to create a computerized database of 68 million census returns, which will help social science researchers gain new insights from census data. The database contains a 5-percent sample of census records since 1970 and a 1-percent sample of every census from 1850 to 1960 (except 1890, whose records were destroyed in a fire).
“That includes very large samples from 1970, 1980, and 1990, where we didn’t have to do the data entry,” Ruggles said. “We’ve done about 2 million records so far where we’ve actually typed in the data.”
The project, called the Integrated Public Use Microdata Series, has received about $7 million in grants from the National Institutes of Health and the National Science Foundation. It should be completed by 2007.
“We can’t start the 1930 census until 2002 and it will take about five years to do that one,” Ruggles said.
Privacy rules prevent the U.S. Census Bureau from releasing the names of census respondents until 72 years after the data is collected.
The census records for 1920 and earlier have already been released to the public with the respondents’ names attached. For 1940 and later, the Census Bureau has made records available with the names stripped.
“For the early years we have to take new samples,” Ruggles said. “We bought about 6,500 microfilm reels from the National Archives. We buy the full set of every census we’ve done so far.”
Those records are donated to Wilson Library as the project finishes with them. The 1880 census was recently made available in this way.
“It’s the only complete set of any census year available in the state,” Ruggles said.
The project’s data can be downloaded by researchers over the World Wide Web. Although Internet users looking for their ancestors often download the data, most come away disappointed.
“The whole thing is 25 gigabytes, which is bigger than most people can deal with on their computers,” Ruggles said. “So what we are working on is an extraction system which allows you to select the particular census years and variables you are interested in.”
Although the data is still quite spotty for some years, it has already produced some blockbuster results.
In a 1994 journal article, Ruggles used the database to examine changes in African-American family structure over time. He found that as long ago as 1880, black children were 2.4 times as likely to live as orphans or in single-parent families as white children were.
Ruggles said that cultural differences rather than social policy are responsible for the gap between black and white illegitimacy and divorce rates.
“You’ve got to remember that (Aid to Families with Dependent Children) beneficiaries are only about 2 percent of the population,” Ruggles said. “So it seems implausible on the face of it that the 2 percent on AFDC are going to make the big difference (on divorce rates).”