Clarification on County and Subcounty-Level Data

Dear Sir/Madam,

I’m currently working with Kenya census microdata accessed through the Kenya National Data Archive (KeNADA) and the IPUMS International platforms. I began my analysis using the original data from KeNADA, and I noticed that the county locations match perfectly between the 2009 and 2019 datasets.

However, when I use data from IPUMS International, I’m a bit confused by some of the geographic classifications, particularly the differences between districts (38 in 2009) and counties (47 in 2019). I’m also trying to work with more granular geographic levels and would ideally like to use subcounty data.

From what I can see, subcounty-level data is available in the 2019 census but not in 2009. I’m especially curious about how the subcounty variable was constructed for 2019. For example, I noticed around 345 sub-locations in the 2019 data, which led me to wonder whether these were derived from combinations of provinces (8) and districts (44), potentially resulting in 352 combinations. Also, could you kindly explain how the variable geo2_ke2009 was created with 158 units?

I’m not very familiar with the administrative geography of Kenya, so I would really appreciate any clarification or guidance you can provide. This detail is quite important for my classification and estimation work. If there are any relevant documents or published references you could share, I would be very grateful. Thank you so much in advance for your time and support.

Sincerely,

Onur Biyik

The changes in geographic regions and levels can be hard to interpret over time, so I understand your confusion. A good visual aid for understanding these changes is the set of year-specific geography maps we make available for Kenya 2009 and Kenya 2019. You can find the full list of year-specific and harmonized geographic variables, along with the year-specific geographic maps, on our website. As to why things changed so drastically in Kenya between these samples, it is likely due to changes to the Kenyan constitution, as described in the comparability section of these geographic variables (e.g. GEO3_KE2019):

Kenya passed a new constitution in 2010 which nullified all geography changes that occurred after 1992. As a result, the former Provinces were abolished, and the former second level administrative units, formerly Districts, were elevated to the new first level administrative units and renamed Counties. The 47 districts present in 1992, with the same boundaries present at that point, now represent the 47 counties that make up the first geographic level. Despite sub-counties now being the second administrative level, sub-counties are used as the third level of geography identified in GEO3_KE2019 to preserve comparability between samples.

As to your question on the construction of GEO2_KE2009: on the Codes page of each variable, below the code table, there is a link labeled “Explore how IPUMS created this variable”. The pop-up it opens contains a link to a Harmonization Table, which displays the codes that were available in the source data we received from the country's statistical office and how we harmonized them to the codes available in IPUMS.
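To make the idea of a harmonization table concrete, here is a minimal Python sketch, not IPUMS's actual procedure. It assumes the table can be read as a mapping from source codes to harmonized codes. Every code, and the name `group_by_harmonized`, is made up for illustration; none of them are real Kenya codes. The point is that if several source units share one harmonized code, the harmonized variable will have fewer distinct units than the source data.

```python
from collections import defaultdict

# Hypothetical mapping: source code -> harmonized code.
# These values are invented for illustration only.
source_to_harmonized = {
    "src_A": 1,
    "src_B": 1,  # two source units sharing one harmonized code
    "src_C": 2,
    "src_D": 3,
}

def group_by_harmonized(mapping):
    """Collect the source codes that feed into each harmonized code."""
    groups = defaultdict(list)
    for source_code, harmonized_code in mapping.items():
        groups[harmonized_code].append(source_code)
    return dict(groups)

groups = group_by_harmonized(source_to_harmonized)
n_source = len(source_to_harmonized)  # number of distinct source units
n_harmonized = len(groups)            # number of distinct harmonized units
print(n_source, n_harmonized)
```

Counting distinct codes on each side of the real Harmonization Table in the same way should show how the source units relate to the units reported in the harmonized variable.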
