Proper use of WTFINL for longitudinal analysis

In general, there is no real consensus about the proper incorporation of sampling weights, particularly when pooling together samples as you describe. I am not certain I completely understand what you are doing, and if I am misunderstanding in any way feel free to provide more detail, but I can offer the following discussion.

I am not certain what you mean by using “county-level data”, and this could influence how you use sampling weights. You could be extracting specific counties, first aggregating labor force statistics by county, or something else entirely. In any case, do note that in most IPUMS CPS samples the COUNTY variable is identifiable for only about 45% of the population. The tricky bit is that when pooling CPS monthly samples, the pooled sample will include multiple observations for some (but not all) households. In principle, issues relating to this detail can be avoided by limiting the analysis to observations with MISH==1, that is, each household's first month in the sample. However, it sounds like you want to exploit the fact that there are repeated observations of households.

If I understand correctly, you are interested in identifying individuals who transition from not in the labor force in one month to being employed in the next month. This being the case, you are limiting your sample to individuals with at least two observations in consecutive months. (Note that this means your sample is limited to individuals with consecutive observations within MISH 1-4 or within MISH 5-8. Because there is an eight-month gap between MISH 4 and MISH 5, a pair of observations that spans this gap does not meet your criteria.) Additionally, unless you are explicitly correcting for this detail, since you are summing by year, the consecutive months are restricted to falling within a single calendar year. Put differently, individuals not in the labor force in December and employed in January are likely dropped from your sample. It is these restrictions on your sample that should be accounted for by the sample weight.
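
To make these restrictions concrete, here is a minimal Python sketch of how such transitions might be flagged. It is only an illustration, not a description of your actual code. It assumes each record is a dictionary holding CPSIDP, YEAR, MONTH, MISH, and EMPSTAT, and it assumes that EMPSTAT codes 10 and 12 mean employed and codes 30 through 36 mean not in the labor force; please verify those codes against the IPUMS CPS codebook. The function name and the `within_year` option are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: EMPSTAT codes; check them against the IPUMS CPS codebook.
EMPLOYED = {10, 12}
NILF = set(range(30, 37))


def month_index(year, month):
    """Map (year, month) to a running month counter so gaps are easy to measure."""
    return year * 12 + (month - 1)


def flag_transitions(records, within_year=False):
    """Return (CPSIDP, YEAR, MONTH) for each NILF -> employed move between
    two consecutive calendar months. MONTH/YEAR refer to the second month."""
    by_person = defaultdict(list)
    for r in records:
        by_person[r["CPSIDP"]].append(r)

    transitions = []
    for pid, obs in by_person.items():
        obs.sort(key=lambda r: month_index(r["YEAR"], r["MONTH"]))
        for prev, curr in zip(obs, obs[1:]):
            gap = (month_index(curr["YEAR"], curr["MONTH"])
                   - month_index(prev["YEAR"], prev["MONTH"]))
            # Requiring a one-month gap also excludes MISH 4 -> MISH 5 pairs.
            if gap != 1:
                continue
            # Mimic summing by year: drop December -> January pairs.
            if within_year and prev["YEAR"] != curr["YEAR"]:
                continue
            if prev["EMPSTAT"] in NILF and curr["EMPSTAT"] in EMPLOYED:
                transitions.append((pid, curr["YEAR"], curr["MONTH"]))
    return transitions


# Made-up records, for illustration only.
records = [
    {"CPSIDP": 1, "YEAR": 2019, "MONTH": 11, "MISH": 1, "EMPSTAT": 34},
    {"CPSIDP": 1, "YEAR": 2019, "MONTH": 12, "MISH": 2, "EMPSTAT": 34},
    {"CPSIDP": 1, "YEAR": 2020, "MONTH": 1, "MISH": 3, "EMPSTAT": 10},
    {"CPSIDP": 2, "YEAR": 2019, "MONTH": 4, "MISH": 4, "EMPSTAT": 34},
    {"CPSIDP": 2, "YEAR": 2020, "MONTH": 1, "MISH": 5, "EMPSTAT": 10},
    {"CPSIDP": 3, "YEAR": 2019, "MONTH": 5, "MISH": 1, "EMPSTAT": 36},
    {"CPSIDP": 3, "YEAR": 2019, "MONTH": 6, "MISH": 2, "EMPSTAT": 10},
]

all_transitions = flag_transitions(records)
same_year_only = flag_transitions(records, within_year=True)
```

In this toy data, person 2 is never counted because the two observations straddle the gap between MISH 4 and MISH 5, and person 1's December-to-January transition is lost once `within_year=True` is set.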

Finally, a couple of suggestions. Perhaps first limit each monthly sample to only individuals with MISH 1-4. Then create your “transition” dummy variables, as you describe, and sum up the number of transitions for each individual (via CPSIDP). Next, keep only one observation per CPSIDP and apply the sampling weights (WTFINL). If you happen to be first aggregating the data up to the county level, then you should apply the sampling weights within this aggregation. A reasonable way to check whether you’ve applied the sampling weights “correctly” is to calculate the total population size after you’ve applied them. If the weighted population is roughly equivalent to the real population of the US, then you are on the right track.
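
The steps above could look roughly like the following Python sketch. It rests on several assumptions: each record already carries a 0/1 `transition` dummy (a hypothetical name), the weight kept for each person is the WTFINL from that person's first retained month (one possible choice; the discussion above does not settle which month's weight to use), and the function name is made up for illustration.

```python
def collapse_and_weight(records):
    """Keep MISH 1-4, collapse to one row per CPSIDP, and return
    (per-person rows, weighted transition count, sum of weights)."""
    people = {}
    ordered = sorted(records, key=lambda r: (r["CPSIDP"], r["YEAR"], r["MONTH"]))
    for r in ordered:
        if not 1 <= r["MISH"] <= 4:
            continue  # drop MISH 5-8 observations
        # ASSUMPTION: use WTFINL from the person's first retained month.
        person = people.setdefault(
            r["CPSIDP"], {"WTFINL": r["WTFINL"], "n_transitions": 0}
        )
        person["n_transitions"] += r["transition"]

    weighted_transitions = sum(
        p["WTFINL"] * p["n_transitions"] for p in people.values()
    )
    # Sanity check: with real data, compare this total to the US population.
    weighted_population = sum(p["WTFINL"] for p in people.values())
    return people, weighted_transitions, weighted_population


# Made-up rows, for illustration only.
rows = [
    {"CPSIDP": 1, "YEAR": 2019, "MONTH": 5, "MISH": 1, "WTFINL": 1500.0, "transition": 0},
    {"CPSIDP": 1, "YEAR": 2019, "MONTH": 6, "MISH": 2, "WTFINL": 1600.0, "transition": 1},
    {"CPSIDP": 2, "YEAR": 2019, "MONTH": 5, "MISH": 3, "WTFINL": 2000.0, "transition": 0},
    {"CPSIDP": 2, "YEAR": 2019, "MONTH": 6, "MISH": 4, "WTFINL": 2100.0, "transition": 0},
    {"CPSIDP": 3, "YEAR": 2019, "MONTH": 5, "MISH": 5, "WTFINL": 1800.0, "transition": 0},
]

people, weighted_transitions, weighted_population = collapse_and_weight(rows)
```

With your real extract, the last quantity is the one to compare against the known population. If you aggregate to counties first, the same weighted sums would be computed within each county group instead of over the whole sample.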

I also encourage you to look into additional discussion on this topic here and here for more information.