I am successfully running weighted least squares (WLS) regressions on IPUMS-USA data, with standard errors calculated from replicate weights. I have been asked to consider using generalized estimating equations (GEE) to account for correlated disturbances created by clusters of individuals who share the same household-level data.
Has the MPC investigated this issue? Are households big enough to create meaningful correlated disturbances? And would GEE provide a meaningful advantage over WLS, considering that the SEs come from replicate weights and the only difference between the two approaches would be in the coefficients?
Our general advice on this sort of issue is “do both”. More specifically, whether or not correlated household disturbances are meaningful will likely depend on which outcome variable you are using in your regression. You are likely correct that many households are not “big” enough to produce meaningful within-household correlation, and that your standard errors will remain unaffected because they come from replicate weights. However, I am hesitant to provide general advice for or against the need to perform GEE estimation. My advice is to do both and compare. If the results are similar, include one as a robustness check; if the results differ, then perhaps correlated household disturbances are in fact meaningful.