U, ancestry website to develop database using 1940 census data

Their goal is to create a database that includes all of the information collected on the 132 million Americans recorded in the 1940 census

by Andrew Krammer

The University of Minnesota and Ancestry.com announced a partnership Monday to create a population database with the recently released U.S. Census data from 1940.

Their goal is to create a database that includes all of the information collected — demographic and economic statistics — on the 132 million Americans recorded in the census that year.

The U.S. National Archives and Records Administration released the complete census data online Monday, 72 years after it was taken. The waiting period was required before any personally identifiable information, such as names, could be released.

More than 21 million people in the U.S. and Puerto Rico who were recorded in that census are still alive.

The National Archives website crashed within a few hours of going live Monday morning, overwhelmed by the 1.9 million users who tried to access the data.

The records are free and open to the public. Still, the National Archives database is not searchable by name and only shows images of the records.

That’s where the University project comes in. It will index the data differently so the database is searchable by name. Ancestry.com will take on the transcription of 7.8 billion keystrokes of data, while the Minnesota Population Center at the University will further aggregate it for scientific purposes.

Those involved with the University project say it’s one of the largest collaborations between genealogy and academic research.

While most population datasets contain only a 1 to 10 percent sample, the 1940 dataset is intended to capture 100 percent of the recorded census.

Researchers see 1940 as a pivotal time in the country’s history.

The end of the Great Depression, the westward migration of Dust Bowl families and the beginning of World War II make the complete dataset a crucial base for studying social and economic change throughout the 20th century.

The research is an extension of what the MPC has been doing for years, Steven Ruggles, director of the Minnesota Population Center, told the Minnesota Daily.

“We have the world’s largest collections of data on human population,” Ruggles said. “We’re currently disseminating 860 million records describing individuals.”

To put that number in context, Facebook states it currently has 845 million monthly active users.

“We’re a little bit bigger,” Ruggles said.

He said the first data of the five-year project will be available in about two years, with annual releases for each of the three years following.

“Each additional release will have additional variables and features for researchers.”

However, Ruggles said Ancestry.com will have the information out sooner for its genealogical purposes, while the University will disseminate the data for scientific purposes.

Ancestry.com has already begun indexing the records and hopes to release a substantial portion of them this year, said Todd Godfrey, senior director of U.S. content at Ancestry.com.

Godfrey said when the company first approached the U.S. Census Bureau about the project, the bureau referred it to Ruggles and the population center.

“It was really a no-brainer for us to work with the University,” Godfrey said.


-The Associated Press contributed to this report.