I’d like to group the OCC2010 variable into roughly 90 categories based on the first 2 digits of each code. For instance, within computer and mathematical I’d like to split the 10s, the 11s, and the 12s into 3 groups. The 400+ categories make the bundles a bit small for my analysis, but 27 is too few. But I’m really curious whether I’m misusing the data if I do that.
The 27-category scheme we provide is a commonly used method for organizing OCC2010; however, users can certainly choose to group the occupations differently. There is no inherent danger in using the first two digits of OCC2010 to aggregate the data into a greater number of categories.
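If it is useful, here is a minimal Python sketch of that two-digit grouping. It assumes OCC2010 is a four-digit code that may arrive as an integer with its leading zeros dropped (so 0010 shows up as 10), which is why it uses integer division rather than slicing the first two characters. The function name `occ2010_two_digit_group` and the sample codes are made up for illustration and do not come from a real extract.

```python
from collections import Counter


def occ2010_two_digit_group(code):
    """Return the two-digit group (0-99) of a four-digit OCC2010 code.

    Assumption: codes are four digits wide, so the "first two digits"
    are the hundreds part. Integer division handles codes below 1000,
    whose leading zeros are lost when the value is stored as a number.
    """
    code = int(code)  # accepts 1240 as well as "1240" or "0010"
    if not 0 <= code <= 9999:
        raise ValueError(f"expected a four-digit OCC2010 code, got {code}")
    return code // 100


# Illustrative codes only, not taken from a real extract.
sample_codes = [10, 1000, 1010, 1100, 1200, 1240]

# Map each code to its group, then count observations per group.
groups = [occ2010_two_digit_group(c) for c in sample_codes]
group_counts = Counter(groups)
```

With this approach the 10s, 11s, and 12s end up in separate groups (10, 11, and 12). Tabulating `group_counts` on your own extract is a quick way to check that each bundle is large enough for your analysis.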
Hope this helps.