1910/1940 parents' mother tongue

Hi everyone, I am merging the 1910-1940 population data for the whole US, but I found that the two variables for parents’ mother tongue are mostly missing.
Is there any linking protocol for finding a person’s parents’ mother tongue?


Why is the quality of this part of the data so problematic, given that everyone’s own mother tongue is known? And is it feasible to recover parents’ mother tongue by linking people to their parents through family-level data?

Direct questions about the mother tongue of parents (IPUMS variables FMTONGUE & MMTONGUE) were only included in the 1910 and 1920 census forms; note that these were only asked of people with foreign-born parents. These direct questions were not included in 1930 and 1940.

A person’s own mother tongue (MTONGUE) was asked in each census from 1910 to 1940. You could leverage the family interrelationship variables (MOMLOC & POPLOC, in tandem with the attach characteristics feature) to indirectly capture the mother tongue of parents, that is, to create your own versions of FMTONGUE and MMTONGUE, if the parent lived in the same household. However, this approach will miss parents’ mother tongue for persons who do not live with their parents. Because there may be systematic differences between those who do and do not have coresident parents, I would be cautious about using this indirect measure.
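For anyone who prefers to do this linkage on an extract rather than through attach characteristics, here is a minimal sketch in plain Python. It assumes each person record is a dict carrying YEAR, SERIAL (household ID), PERNUM (person number within the household), MOMLOC, POPLOC, and MTONGUE, and that MOMLOC/POPLOC hold the parent's PERNUM, with 0 meaning no coresident parent. The function name, the output names MTONGUE_MOM and MTONGUE_POP, and the MTONGUE codes in the toy data are made up for illustration.

```python
def attach_parent_mtongue(persons):
    """Return copies of the person records with coresident parents' MTONGUE attached."""
    # Index every person by (YEAR, SERIAL, PERNUM) so a parent can be looked up
    # from the child's MOMLOC / POPLOC pointer.
    by_position = {(p["YEAR"], p["SERIAL"], p["PERNUM"]): p for p in persons}
    out = []
    for p in persons:
        row = dict(p)
        for loc_var, new_var in (("MOMLOC", "MTONGUE_MOM"), ("POPLOC", "MTONGUE_POP")):
            # A pointer of 0 matches no PERNUM, so the lookup returns None.
            parent = by_position.get((p["YEAR"], p["SERIAL"], p[loc_var]))
            row[new_var] = parent["MTONGUE"] if parent else None
        out.append(row)
    return out


# Toy household: two parents and one child (MTONGUE codes are illustrative only).
household = [
    {"YEAR": 1930, "SERIAL": 1, "PERNUM": 1, "MOMLOC": 0, "POPLOC": 0, "MTONGUE": 11},
    {"YEAR": 1930, "SERIAL": 1, "PERNUM": 2, "MOMLOC": 0, "POPLOC": 0, "MTONGUE": 22},
    {"YEAR": 1930, "SERIAL": 1, "PERNUM": 3, "MOMLOC": 2, "POPLOC": 1, "MTONGUE": 1},
]
linked = attach_parent_mtongue(household)
```

Persons whose parents are not in the household end up with None in both new fields, which is exactly the coverage gap described above.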

Thank you for your answers.

I have two questions.

  1. According to your system, the mother tongue question was asked of foreign-born people in 1910, but when I downloaded the data, every mother tongue value was null. Could you check whether this part of the data exists?

  2. Regarding the 1940 data, does mother tongue_1940 cover the foreign-born, sample-line persons, or everyone? The variable descriptions were confusing.

Thanks for the additional questions and my apologies for the slow reply.

The 1910 mother tongue information is only included in the sample files, not the full count files. While this is clear in the availability tab, the variable page says little about why. I will request that the IPUMS USA team augment the documentation to note this difference in availability between the 1910 files. I am sharing the relevant information from the revision history:

MTONGUE and QMTONGUE were removed from the 1910 full count sample due to a data transcription error. MTONGUE information from the original Census forms was not transcribed into our digital version of the complete count 1910 data file. The derivation of this variable led to incorrectly high rates of English as a MTONGUE value. Only the 1910 full count file was affected by this error.

Regarding 1940, the universe is sample-line persons; you can identify these with the variable SLREC. Sample-line persons were selected to answer additional questions, including MTONGUE. There are no universe restrictions about whether the person is foreign- or native-born, only whether or not they were a sample-line person. Although only sample-line persons were asked, this variable should still be representative of the entire population; for analyses of MTONGUE in 1940, you should use SLWT to get appropriate estimates.
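As a rough illustration of the weighting advice, the sketch below computes SLWT-weighted shares of each MTONGUE code among sample-line persons. The function name and the toy records are hypothetical, and the SLREC value that marks a sample-line person (2 here) is an assumption that should be checked against the SLREC codes page before use.

```python
from collections import defaultdict


def weighted_mtongue_shares(persons, sample_line_code=2):
    """Return SLWT-weighted shares of each MTONGUE code among sample-line persons."""
    totals = defaultdict(float)
    for p in persons:
        # Keep only sample-line persons; the code value is an assumption.
        if p["SLREC"] != sample_line_code:
            continue
        totals[p["MTONGUE"]] += p["SLWT"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {code: weight / grand_total for code, weight in totals.items()}


# Toy records: MTONGUE codes and weights are illustrative only.
records_1940 = [
    {"SLREC": 2, "SLWT": 30.0, "MTONGUE": 1},
    {"SLREC": 2, "SLWT": 10.0, "MTONGUE": 11},
    {"SLREC": 1, "SLWT": 0.0, "MTONGUE": 0},
]
shares = weighted_mtongue_shares(records_1940)
```

Records that are not sample-line persons are skipped entirely, so they do not enter the denominator.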