Can surnames be made available in the 1900 Census 5% sample?

According to the report of the 1900 Census, persons born in Canada were classified as born in either French Canada (Quebec) or English Canada (all other provinces, even for francophones). Looking at the original forms on FamilySearch, I see that many were listed only as ‘Canada’. I would like to determine how the ‘Canada’ entries were treated (presumably all went into English Canada), and the proportion of those in “English Canada” and “Canada” with French Canadian or Acadian surnames. Unless I am mistaken, your 5% sample of the 1900 census does not include surnames. FamilySearch (with all surnames) does not allow tabulations of the data, and in Ancestry there is essentially only the ‘Canada’ category. Is it possible to have surnames added to your 5% sample of the 1900 Census?

Thanks a lot

Jacques Pepin

IPUMS USA provides public use versions of historical census data, including a 5% subsample of the 1900 census data. The public use data do not include names or other original string variables such as addresses. IPUMS also provides licenses to access restricted use versions of full count census data from 1790-1950, which include full names. The linked page provides an overview of the data and our licensing process. If you are interested in applying for a license to access the data, please send us an email with more information about your research.

With regard to how Canadian birthplaces in the 1900 census are coded in the data IPUMS provides, I obtained the following information from one of our historical demographers here at IPUMS:

We code birthplace strings based on the information transcribed from the birthplace field, so if the string says “Canada English” it gets coded as English Canada (15010-15079), but if it is ambiguous or only says “Canada”, it gets the generic Canada code (15000). French Canada gets coded as 15080-15083. The key point is that anything that was likely Canada but ambiguous about where in Canada gets coded as generic Canada.
Sometimes data transcription processes do not capture the full original string accurately, so generic Canada is overrepresented compared to English Canada and French Canada. It’s likely that some cases have the transcription “Canada” where the original entry said “Canada English” or “Canada French.” Most 20th-century census years asked enumerators to distinguish between the two.
Another key detail is that we employ combo dictionaries for coding some variables (e.g., there are two birthplace variables, entered separately by Ancestry and FamilySearch, and we use both when coding strings in our dictionaries). More details on our “combo dictionary” approach can be found in our working paper (link; pages 22-25 are most applicable to your particular question).
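
If it helps when tabulating an extract, the code ranges quoted above could be grouped roughly as follows. This is only a minimal Python sketch: the function names and category labels are made up for illustration, and it assumes the extract carries the detailed birthplace code as an integer (in IPUMS USA this is presumably the detailed version of the birthplace variable, but please check the variable documentation).

```python
from collections import Counter

# Code ranges taken from the description above; labels are illustrative only.
GENERIC_CANADA = 15000
ENGLISH_CANADA = range(15010, 15080)   # 15010-15079 inclusive
FRENCH_CANADA = range(15080, 15084)    # 15080-15083 inclusive


def canada_category(code: int) -> str:
    """Group a detailed birthplace code into the Canada categories."""
    if code == GENERIC_CANADA:
        return "Canada (unspecified)"
    if code in ENGLISH_CANADA:
        return "English Canada"
    if code in FRENCH_CANADA:
        return "French Canada"
    return "Other"


def canada_shares(codes):
    """Share of each Canada category among Canadian-born records only."""
    counts = Counter(canada_category(c) for c in codes)
    counts.pop("Other", None)  # ignore non-Canadian birthplaces
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical codes, not real data.
    sample = [15000, 15000, 15010, 15080, 45000]
    for category, share in canada_shares(sample).items():
        print(f"{category}: {share:.2f}")
```

The shares here are unweighted counts; with the 5% sample you would normally apply the person weights, and the surname comparison would still require the restricted full count data.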

Thanks a lot Isabel, I will apply for the full count data.