That is right: the harmonized codes are scrambled at the second level of geography. Our regionalization-harmonization algorithm renumbers units whenever boundaries change between census years or when units have to be combined for confidentiality reasons.
If you are looking at harmonized geography across several census years, the codes will not match the original ones, because boundaries change and units are merged. However, if you need the original codes to work on a specific year, you can still get most of them from the GIS shapefiles. Download the year-specific shapefile for the second level of geography for the country you need; the last column in the attribute table should correspond to the codes that were given to us by the National Statistical Office. For example, for Mexico 2010, “MUNI2010” corresponds to the 2010 municipio codes given to us by INEGI. If no units were combined for confidentiality, you will get all the codes. But if units were combined for confidentiality, “MUNI2010” will reflect the code of the most populous of the combined units.
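To make the last point concrete, here is a minimal Python sketch of that rule. The codes, populations, and group labels are made up for illustration (they are not real INEGI or IPUMS values), the function names are hypothetical, and this is not our production algorithm; it only shows why some original codes will be missing from the shapefile when units are combined.

```python
from collections import defaultdict

# Hypothetical records: (original_code, population, combined_group).
# Units sharing a group label are assumed to have been merged for confidentiality.
units = [
    ("001", 1200, "A"),
    ("002", 45000, "A"),
    ("003", 800, "A"),
    ("004", 98000, "B"),  # not combined with anything
]

def reported_codes(units):
    """Return, for each group, the code of its most populous member."""
    groups = defaultdict(list)
    for code, population, group in units:
        groups[group].append((code, population))
    # Ties (an assumption of this sketch) go to the first unit listed.
    return {
        group: max(members, key=lambda member: member[1])[0]
        for group, members in groups.items()
    }

def missing_codes(units):
    """Return the original codes that no longer appear after combining."""
    kept = set(reported_codes(units).values())
    return sorted(code for code, _, _ in units if code not in kept)

print(reported_codes(units))
print(missing_codes(units))
```

In this made-up example, group A would be reported under code "002", and codes "001" and "003" would not appear in the attribute table at all.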
We plan to distribute a .csv file that pairs the original codes from the statistical offices with the IPUMS regionalized-harmonized, year-specific geography codes, but for now the shapefiles are the best place to go. If there is a specific sample or country you are interested in, we can send you the .csv file as a separate attachment.
With respect to the data release this summer, please refer to Bob’s answer on “Which samples will be released in 2016 and when?”.