Questions about 1970 Census Data

If I am going to use 1970 census data for analysis, is it appropriate to combine the first four datasets (1970 1% Form 1 State, 1970 1% Form 2 State, 1970 Form 1 Metro, and 1970 Form 2 Metro) and analyze them together?

Technically, since these samples were drawn simultaneously (and therefore contain none of the same cases), you could combine all of the 1970 samples into one data file. However, each sample contains a different set of variables, and the geographic definitions vary across the samples. So, practically speaking, a combined file is not well suited to most analyses. A list of the key details that differentiate these samples is available on the samples description page. Additionally, a fully detailed description of the 1970 samples can be found here.

I found another reply by IPUMS Staff here, who said something that seems opposite to what you meant (he said these samples were NOT drawn simultaneously within each form). Could you tell me which one is correct? Thank you!

Arguably, both Jeff and Brandon were right. The detailed description of the 1970 samples linked in Jeff’s reply provides information about the 1% samples. I have also included the information on the 1% samples (and bolded the particularly relevant section) here for your reference:

Three one-in-a-hundred samples were actually drawn from 15% sample records. Thus three sets of counters were used, one for each sample. For the second sample the random start numbers were incremented by 33 and for the third sample, by 66. The 15 percent weight of each 15 percent sample unit was cumulated in the appropriate stratum for each of the three samples. When the cumulation passed a multiple of 100 in one of the sets of counters that sample unit became part of the associated 1 percent sample. Since no 15 percent weight ever was as large as 34 it was impossible for a single sample unit to be selected into more than one of the 1 percent samples, and thus the samples are mutually exclusive.

The three one-in-a-hundred samples drawn from 5 percent data were selected in exactly the same manner using the 5 percent weights assigned to the sample units. For practical purposes, these three 1 percent samples may also be considered mutually exclusive since the probability that a given case will appear in more than one public use sample is less than .001.

As you can see, the documentation indicates that it is appropriate to consider the samples mutually exclusive, although there is a minuscule chance that a case may be included in more than one sample.
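
To make the counter mechanism in the quoted documentation more concrete, here is a rough Python sketch of it. This is only an illustration, not the Census Bureau's actual procedure: the function name `draw_three_samples`, the single random start, and the randomly generated weights are all assumptions, and stratification is ignored entirely. It cumulates each unit's weight in three counters offset by 33 and 66 and selects a unit whenever a counter passes a multiple of 100.

```python
import random


def draw_three_samples(weights, start=0):
    """Illustrative only: three systematic 1-in-100 draws with offset counters.

    weights: integer sampling weights, one per unit (hypothetical data).
    start:   random start for the first counter; the second and third
             counters begin 33 and 66 higher, as in the quoted text.
    """
    counters = [start, start + 33, start + 66]
    selected = [[], [], []]
    for unit_index, weight in enumerate(weights):
        for s in range(3):
            before = counters[s]
            counters[s] += weight
            # The unit joins sample s when its weight carries the
            # cumulated total past a multiple of 100.
            if counters[s] // 100 > before // 100:
                selected[s].append(unit_index)
    return selected


# Hypothetical weights, all below 34 as the documentation states.
rng = random.Random(1970)
weights = [rng.randint(1, 33) for _ in range(10_000)]
sample_a, sample_b, sample_c = draw_three_samples(weights, start=rng.randrange(100))

# Units that landed in more than one of the three samples.
overlap = (
    (set(sample_a) & set(sample_b))
    | (set(sample_a) & set(sample_c))
    | (set(sample_b) & set(sample_c))
)
```

With every weight below 34, no single unit can push two of the offset counters past a multiple of 100, so `overlap` comes out empty in this sketch. That is the mutual-exclusivity argument the documentation makes for the samples drawn from the 15 percent records.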

Thank you!
Do we need to do anything about the weight variable after combining these 1% 1970 samples? In a different thread, Jeff once mentioned that if we pool ACS data from different years together, we can simply divide the weight in the pooled data by the number of years. Does this rule also apply to the current question (divide the weight in the pooled 1970 data by the number of 1% samples used in pooling)?

As noted on the PERWT comparability tab, if you are combining multiple samples from the same year, you need to adjust PERWT to get accurate population estimates. If you were to combine all six 1% samples, you would need to divide the PERWT values in each sample by 6. Keep in mind, though, that because the variables and geographic definitions vary across the samples, using them all together isn’t well suited to most analyses.
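
As a small illustration of the weight adjustment, here is a Python sketch that pools several same-year samples and divides each PERWT by the number of samples pooled. The helper name `pool_samples`, the `PERWT_POOLED` and `SAMPLE_NAME` fields, and the toy records are hypothetical; only PERWT comes from the actual data. It assumes every pooled sample is a 1% sample of the same population, so dividing by the count of samples (6 for all six, or fewer if you pool only some of them) keeps the weighted total at the population size.

```python
def pool_samples(samples):
    """Pool same-year samples and rescale person weights.

    samples: dict mapping a sample label to a list of person records,
             each a dict with a "PERWT" entry (toy structure, assumed).
    Returns one list of records with PERWT divided by the number of
    samples pooled, stored in a new "PERWT_POOLED" field.
    """
    n_samples = len(samples)
    pooled = []
    for sample_name, records in samples.items():
        for record in records:
            adjusted = dict(record)  # leave the input records untouched
            adjusted["SAMPLE_NAME"] = sample_name
            adjusted["PERWT_POOLED"] = record["PERWT"] / n_samples
            pooled.append(adjusted)
    return pooled


# Toy data: two 1% samples, each with two people weighted 100,
# so each sample on its own estimates a population of 200.
samples = {
    "form1_state": [{"PERWT": 100}, {"PERWT": 100}],
    "form2_state": [{"PERWT": 100}, {"PERWT": 100}],
}
pooled = pool_samples(samples)
estimated_population = sum(r["PERWT_POOLED"] for r in pooled)  # 200.0
```

Without the division, the pooled file in this toy case would estimate a population of 400, double the population that either sample represents.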
