Linked Samples Download

To access the files, users must log in with their NAPP email address and password. Click here to register for an account.

The table below provides data files in SAS, SPSS, and Stata formats. Detailed descriptions of the linked data samples can be found here. For an explanation of some of the variables unique to the linked datasets, scroll down.


Data file format

Sample                                                        SAS                  SPSS                 Stata
1865-1875 (Norway)                                            men[1], couples[1]   men[1], couples[1]   men[1], couples[1]
1875-1900 (Norway)                                            men[1], couples[1]   men[1], couples[1]   men[1], couples[1]
1865-1900 (Norway)                                            men[1], couples[1]   men[1], couples[1]   men[1], couples[1]
All samples, all years (Norway) (in a *.zip archive)          all samples[1]       all samples[1]       all samples[1]
1850-1880 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1860-1880 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1870-1880 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1880-1900 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1880-1910 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1880-1920 (United States)                                     men, women, couples  men, women, couples  men, women, couples
1880-1930 (United States)                                     men, women, couples  men, women, couples  men, women, couples
All samples, all years (United States) (in a *.zip archive)   all samples          all samples          all samples

[1] Dataset updated: 30th July, 2010

We created five new variables specifically for the linked datasets: MARSTCH, MARSTCHD, OCCDIF*, MIGRANT* and MILEMIG*. These variables describe how an individual's occupation, marital status, and place of residence compare between year 1 and year 2. Each contains data for the primary links only.
(*Applies to the US linked samples only.)

OCCDIF describes the change in the IPUMS variable OCCSCORE between year 1 and year 2 in four categories:

1 OCCSCORE decreased by more than 10% from early period to later period
2 OCCSCORE did not change by more than 10% from early period to later period
3 OCCSCORE increased by more than 10% from early period to later period
9 N/A (person did not have an occupation in at least one of the two periods)
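
The four OCCDIF categories can be illustrated with a short Python sketch. It is not the code used to build the files. It assumes the 10% threshold is measured relative to the early-period OCCSCORE, and that a missing or zero OCCSCORE means the person had no occupation; neither detail is stated above. The function name `occdif` is hypothetical.

```python
def occdif(occscore_early, occscore_late):
    """Classify the change in OCCSCORE into the OCCDIF codes 1, 2, 3, or 9.

    Assumptions (not confirmed by the documentation): the change is
    measured relative to the early-period score, and None or 0 means
    "no occupation".
    """
    if not occscore_early or not occscore_late:
        return 9  # N/A: no occupation in at least one period

    change = occscore_late - occscore_early
    # |change| / early <= 10%, written without division to avoid
    # floating-point rounding at the exact 10% boundary.
    if abs(change) * 10 <= occscore_early:
        return 2  # did not change by more than 10%
    return 3 if change > 0 else 1  # 3 = increased, 1 = decreased
```

Under these assumptions, a change of exactly 10% falls in category 2, because the categories for increase and decrease both require a change of more than 10%.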

MARSTCH describes how marital status compares between the two years with the following codes:

1 Unmarried in both periods
2 Unmarried in early period, married in later period
3 Married in both periods
4 Married in early period, widowed or divorced in later period
5 Other (enumeration error or potentially inaccurate link)
6 Marital status indeterminate/unknown in at least one period

MARSTCHD describes marital status changes in more detail:

10 Unmarried in both periods
11 Single in both periods
12 Widowed or divorced in both periods
13 Single in early period, widowed or divorced in later period
20 Unmarried in early period, married in later period
21 Single in early period, married in later period
22 Widowed or divorced in early period, married in later period
30 Married in both periods
31 Married in both periods, spouse linked
32 Married in both periods, spouse not linked
40 Married in early period, widowed or divorced in later period
50 Other (enumeration error or potentially inaccurate link)
60 Marital status indeterminate/unknown in at least one period

NOTE: The "other" category for both MARSTCH and MARSTCHD contains individuals whose MARST values changed in nonsensical ways, for example from married to single. Falling into this category may indicate a false link, but it could instead reflect false information or data entry errors.
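
Comparing the two code lists above, each MARSTCHD code shares its tens digit with the matching MARSTCH code (for example, 31 and 32 both fall under 3). The Python sketch below collapses a detailed code to the general one on that basis. This relationship is inferred from the code lists rather than stated as a rule, and the function name `marstch_from_detail` is hypothetical.

```python
# Detailed codes exactly as listed for MARSTCHD.
VALID_MARSTCHD = {10, 11, 12, 13, 20, 21, 22, 30, 31, 32, 40, 50, 60}

def marstch_from_detail(marstchd):
    """Return the general MARSTCH code (1-6) for a detailed MARSTCHD code.

    Assumes the tens digit of MARSTCHD equals the MARSTCH code, as the
    two code lists suggest.
    """
    if marstchd not in VALID_MARSTCHD:
        raise ValueError(f"Unknown MARSTCHD code: {marstchd}")
    return marstchd // 10
```

A check of this kind can also flag records where the two variables disagree after recoding.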

MIGRANT describes how state and county of residence compare between years in five categories:

1 Same county, same boundary [not a migrant]
2 Same county, boundary changed [probably not a migrant]
3 Different county within state; boundary changes between counties [migrant status indeterminate]
4 Different county within state; no boundary changes [migrant]
5 Different county and state [migrant]

MILEMIG provides an estimate of how far the person moved in miles. We derived these estimates by measuring distances between NHGIS county centroids (center points) in GIS software. We did not calculate migration distances for people who moved to or from Alaska or Hawaii, or who were categorized as "overseas military" in at least one year; those cases are coded as 9999 in MILEMIG.
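
To give a feel for a centroid-to-centroid distance, the Python sketch below uses the haversine great-circle formula. This is a stand-in chosen for illustration: the exact measurement method used in the GIS software is not described here, so results need not match MILEMIG. The function names and the Earth-radius constant are assumptions. The second helper shows one way to exclude the 9999 code before summarizing distances.

```python
import math

EARTH_RADIUS_MILES = 3958.8    # mean Earth radius; an assumed constant
MILEMIG_NOT_CALCULATED = 9999  # code for cases with no distance calculated

def centroid_distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees.

    Haversine formula; illustrative only, not the documented method.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def mean_milemig(values):
    """Mean of MILEMIG values, skipping the 9999 'not calculated' code."""
    valid = [v for v in values if v != MILEMIG_NOT_CALCULATED]
    return sum(valid) / len(valid) if valid else None
```

Leaving 9999 in a mean or regression would treat it as a move of 9,999 miles, so filtering it out first is the important step.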

NOTE: These migration variables are available only for the United States linked samples and differ from the NAPP variable MIGRANT. The NAPP variable compares a person's birth place to their current place of residence and is included in the U.S. and Norwegian linked files as MIGRANT_1 and MIGRANT_2.

LINKTYPE explains the reason the individual was included in the Linked Sample.

0 Primary linked person
1 Primary link in another iteration of this household. This person will have a LINKTYPE value of 0 in a repeat of this household in the dataset.
5 Additional linked person who was only linked after the representative linkage process. This person is not identified as a primary link in any household. These links were made for the convenience of researchers interested in the primary linked persons' households.
9 Unlinked person, present in the household of a primary linked person during only one census year.
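
Because the constructed change variables contain data for primary links only, analyses often start by selecting records with LINKTYPE 0. A minimal Python sketch follows, assuming each record has already been loaded as a dictionary keyed by variable name; that layout and the function name `primary_links` are assumptions for illustration.

```python
# Short labels paraphrasing the LINKTYPE codes described above.
LINKTYPE_LABELS = {
    0: "Primary linked person",
    1: "Primary link in another iteration of this household",
    5: "Additional linked person (linked after the representative process)",
    9: "Unlinked person present in only one census year",
}

def primary_links(records):
    """Keep only the records whose LINKTYPE is 0 (primary linked persons)."""
    return [r for r in records if r["LINKTYPE"] == 0]
```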

HHSEQ is the linked-sample household identification number, to be used in conjunction with SERIAL. Each primary linked person has a unique combination of SERIAL and HHSEQ, which is shared by all members of that person's household in both census years. HHSEQ values range from 1 to 9.
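
Since a household is identified by SERIAL and HHSEQ together, grouping on SERIAL alone would merge repeated iterations of the same household. The Python sketch below groups records on the combined key. It assumes records are dictionaries keyed by variable name, and the function name `group_households` is hypothetical.

```python
from collections import defaultdict

def group_households(records):
    """Group person records by the (SERIAL, HHSEQ) household key."""
    households = defaultdict(list)
    for record in records:
        key = (record["SERIAL"], record["HHSEQ"])
        households[key].append(record)
    return dict(households)
```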

Additional variable information for the Norwegian linked files: We created new serial numbers for all years and new person numbers for the earlier years in the Norwegian linked files. We did so after splitting extremely large households into smaller units to make the linked files easier to use. We preserved the original serial and person numbers in the variables listed below. We also renamed the pointer variables from the earlier years as a reminder to use them with the original person numbers.

SERIAL_ORIG_1 and SERIAL_ORIG_2 are the original serial numbers from the earlier and later year respectively.

PERNUM_ORIG_1 is the original person number from the earlier year.

We renamed the pointer variables MOMLOC_1, POPLOC_1 and SPLOC_1 to MOMLOC_ORIG_1, POPLOC_ORIG_1 and SPLOC_ORIG_1. Use PERNUM_ORIG_1 when using MOMLOC_ORIG_1, POPLOC_ORIG_1 or SPLOC_ORIG_1. NOTE: PERNUM_2 contains the original person numbers for the later years and can be used with the pointer variables MOMLOC_2, POPLOC_2 and SPLOC_2.
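
As an illustration of pairing the renamed pointers with the original person numbers, the Python sketch below looks up a person's mother in the earlier year. It rests on two assumptions not stated above: that a pointer value of 0 means no mother is present, and that the pointer refers to a person within the same original household, identified by SERIAL_ORIG_1. Records are assumed to be dictionaries keyed by variable name, and the function name `find_mother_year1` is hypothetical.

```python
def find_mother_year1(person, records):
    """Return the earlier-year record of a person's mother, or None.

    Matches MOMLOC_ORIG_1 against PERNUM_ORIG_1 among records that share
    the person's SERIAL_ORIG_1 (assumed to identify the original household).
    """
    momloc = person["MOMLOC_ORIG_1"]
    if momloc == 0:  # assumption: 0 means no mother in the household
        return None
    for candidate in records:
        if (candidate["SERIAL_ORIG_1"] == person["SERIAL_ORIG_1"]
                and candidate["PERNUM_ORIG_1"] == momloc):
            return candidate
    return None
```

The same pattern would apply to POPLOC_ORIG_1 and SPLOC_ORIG_1 by swapping the pointer variable.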