Progress Report, October 2008 to May 2009

The first phase of NAPP was funded to harmonize and distribute the complete-count census data from Canada, Great Britain, Iceland, Norway and the United States. The second phase of NAPP, which we call NAPP-II, will (1) incorporate all the samples of census data for these five countries between 1850 and 1930, and add new complete-count censuses from Iceland (1703, 1835, and 1845) and Norway (1801); (2) add complete-count censuses from Sweden (1890 and 1900); and (3) create samples of individuals and couples linked between the samples and the complete-count censuses. (See the proposal for NAPP-II.)

In October 2008, we released new versions of the existing NAPP datasets: 1871 Canada, 1881 Canada, 1901 Canada, 1881 Great Britain (England and Wales, and Scotland), 1865 Norway, 1875 Norway, 1900 Norway, and 1880 United States. We also released the complete-count Swedish dataset for 1900. A sample of 1851 Great Britain was released in May 2009. We have made significant changes to our web interface to accommodate the growing number of samples and variables, and to give users greater control while browsing variables or defining a data extract.

The major change to the NAPP web interface is a distinction between unharmonized, or source, variables (data provided by the host country) and integrated, or harmonized, variables (data that researchers at the Minnesota Population Center have made consistent across samples). Source variables are sometimes unique to a country, for example Birthplace in Norway (bplno) or Occupation, 1881 codes UK (occ81gb). Integrated variables, by contrast, exist in all the samples. The most common include sex, age, and marital status (marst). Other integrated variables created by Minnesota Population Center staff include dwelling position number (dwpos), mother's location in household (momloc), and number of own siblings in household (nsibs). The new data extraction system should give researchers added flexibility and help them choose precisely the variables they want to work with.

In addition, users can customize the size of their extract by selecting the number of households or persons they want from each dataset. The extract system draws a subset of households that matches the desired case count or sample fraction and generates syntax files that adjust the weight variables accordingly. This is especially helpful for the NAPP datasets, which are complete-count datasets and can be too large to handle. While browsing the documentation, users can also save variables to include in a data extract later in the same web session.
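
As a rough illustration of the household-level subsetting described above, the following Python sketch keeps a random fraction of whole households and scales each person weight by the inverse of the fraction actually kept. It is not the extract system's own code. The function name subsample_households and the field names serial (household identifier) and perwt (person weight) are assumptions made for this example, and simple random sampling of households is only one plausible way such a subset could be drawn.

```python
import random


def subsample_households(records, fraction, seed=0):
    """Keep a random subset of whole households and rescale person weights.

    records  -- list of dicts, one per person, each with a hypothetical
                'serial' (household id) and 'perwt' (person weight) field
    fraction -- desired share of households to keep, in (0, 1]
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not records:
        return []

    # Sample at the household level so that households stay intact.
    household_ids = sorted({r["serial"] for r in records})
    n_keep = max(1, round(len(household_ids) * fraction))
    rng = random.Random(seed)
    kept = set(rng.sample(household_ids, n_keep))

    # Scale weights by the inverse of the realized sampling fraction,
    # so that weighted household counts still sum to the full total.
    scale = len(household_ids) / n_keep

    # Return copies so the input records are left unchanged.
    return [
        dict(r, perwt=r["perwt"] * scale)
        for r in records
        if r["serial"] in kept
    ]
```

For example, drawing half of ten households in which every person has weight 1.0 would return five complete households with each person's weight raised to 2.0.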

We have also added the original census enumeration forms and enumeration instructions from the respective countries, together with English translations, as an additional resource for researchers.
