Your interpretation is correct. A coding of METRO=4 means the respondent’s household is within a metro area, but whether it lies inside or outside the principal city cannot be determined. For a full explanation of why the exact METRO status of such a large percentage of respondents cannot be identified, I recommend reading this User’s Note on the difficulties of identifying metro areas. In short, the lowest level of identifiable geography in the ACS is the PUMA. Because PUMAs must contain a minimum number of persons for confidentiality reasons, they often straddle geographic boundaries such as metro area and principal city lines. This makes it difficult to determine where a respondent’s household is located, especially within geographies as small as a typical metro area.
As a result, working with the METAREA/MET2013 and METRO variables introduces two sources of measurement error into your estimates. First, as the table in the User’s Note above shows, several MSAs cannot themselves be completely identified. For example, 17% of the Oklahoma City MSA population cannot be confidently identified as living within the MSA. Second, even if an MSA is completely, or almost completely, identified, a large percentage of its residents may still be coded METRO=4. For example, the Chicago MSA is 99% identified in the ACS samples, but approximately 30% of its residents are coded METRO=4. In both cases, you cannot assume that the identified portion of a geographic unit is a representative sample of the complete unit.
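To get a feel for how much the second source of error matters for a given MSA, one option is to bound the principal-city share rather than estimate it as a single number. Below is a minimal Python sketch of that idea, not an official IPUMS method. It assumes the standard METRO coding (2 = in principal city, 3 = in metro area but outside principal city, 4 = principal city status indeterminable) and weights by PERWT. The records, the metro-area codes, and the function name `principal_city_bounds` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (MET2013 code, METRO code, PERWT person weight).
# These values are invented placeholders, not real ACS estimates.
records = [
    (16980, 2, 120.0),
    (16980, 3, 230.0),
    (16980, 4, 150.0),
    (36420, 2, 90.0),
    (36420, 3, 60.0),
    (36420, 4, 50.0),
]


def principal_city_bounds(rows):
    """Return {met2013: (lower, upper)} bounds on the weighted principal-city share.

    Lower bound: treat every METRO=4 person as living outside the principal city.
    Upper bound: treat every METRO=4 person as living inside the principal city.
    """
    total = defaultdict(float)
    in_city = defaultdict(float)
    indeterminate = defaultdict(float)
    for met, metro, perwt in rows:
        if metro not in (2, 3, 4):
            continue  # skip records not identified as inside a metro area
        total[met] += perwt
        if metro == 2:
            in_city[met] += perwt
        elif metro == 4:
            indeterminate[met] += perwt
    return {
        met: (
            in_city[met] / total[met],
            (in_city[met] + indeterminate[met]) / total[met],
        )
        for met in total
    }


bounds = principal_city_bounds(records)
for met, (low, high) in sorted(bounds.items()):
    print(f"MET2013 {met}: principal-city share between {low:.2f} and {high:.2f}")
```

A wide gap between the two bounds suggests that METRO=4 cases dominate for that MSA. Note that this only addresses the second source of error: residents of an incompletely identified MSA who carry no metro code at all never enter the calculation.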
Bottom line: METRO is still the best available method to identify “city” versus “suburbs” given the confidentiality restrictions on identifiable geographies, but it is subject to error. The accuracy with which metro areas and central cities can be identified varies considerably across individual MSAs, so whether METRO yields a tolerable level of error depends on the geography of interest and is ultimately left to the researcher’s discretion.
Hope this helps.