Consistency of DEPARTS, TRANTIME, and ARRIVES

Matthew_Wigginton_Bh · August 11, 2025, 3:18pm

I’m doing some analysis of work start times over time, using 1990-2022 data. 1990 and 2000 do not have the ARRIVES variable, so I compute it from DEPARTS and TRANTIME. To validate my methodology, I did the same computation for ACS data, to ensure my results roughly matched the ARRIVES variable in those years (roughly because the ARRIVES variable is binned, so it may be off by a few minutes). For the most part it does, but for TRANTIME values that were topcoded there appears to be a discrepancy. The ARRIVES values are consistently lower than DEPARTS + TRANTIME. Based on the documentation, it seems like topcoded values for TRANTIME are represented as the state mean of all topcoded TRANTIMEs. I would have expected the ARRIVES variable to be calculated either (1) using this mean value or (2) using the un-topcoded values, but it appears it was calculated with some other number, maybe the minimum of the topcoded values. For example:

SAMPLE	DEPARTS	TRANTIME	ARRIVES	calculated_arrives
2005-2009, ACS 5-year	4:05	184	6:34	7:09
2005-2009, ACS 5-year	7:32	184	10:04	10:36
2005-2009, ACS 5-year	3:05	184	5:34	6:09
2005-2009, ACS 5-year	6:32	184	9:04	9:36

In the first row, we have someone who departs at 4:05 and has a topcoded TRANTIME of 184 minutes (3 hours 4 minutes). By my calculation that implies an arrival time of 7:09, but the data has an ARRIVES time of 6:34, only 2 hours 29 minutes after departure. The topcoded cases seem to consistently report an ARRIVES time that is earlier than what the topcode would imply. That wouldn’t make sense if the original value was used and the topcode value is a mean, because some of the ARRIVES times would then have to be later. Maybe the ARRIVES values are calculated based on the minimum topcoded value rather than the mean?
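For anyone who wants to reproduce the comparison, here is a minimal Python sketch of the calculation for the four rows above. It assumes the times are "H:MM" strings as displayed in the table (the raw extract may encode them differently), and the helper names are my own, not anything from IPUMS.

```python
def to_minutes(clock: str) -> int:
    """Convert an 'H:MM' clock string to minutes after midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def to_clock(total_minutes: int) -> str:
    """Convert minutes after midnight back to 'H:MM', wrapping past midnight."""
    total_minutes %= 24 * 60
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


def calculated_arrives(departs: str, trantime: int) -> str:
    """Naive arrival time: departure plus the reported travel time."""
    return to_clock(to_minutes(departs) + trantime)


# (DEPARTS, TRANTIME, ARRIVES) for the four rows in the table above.
rows = [
    ("4:05", 184, "6:34"),
    ("7:32", 184, "10:04"),
    ("3:05", 184, "5:34"),
    ("6:32", 184, "9:04"),
]

for departs, trantime, arrives in rows:
    calc = calculated_arrives(departs, trantime)
    # Travel time implied by the reported DEPARTS and ARRIVES values.
    implied = to_minutes(arrives) - to_minutes(departs)
    print(f"{departs} + {trantime} min = {calc}; reported {arrives} "
          f"(implied travel time {implied} min)")
```

The implied travel times land around 150 minutes in every row, well below the 184-minute topcode.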

This spreadsheet has all of the examples I’ve found that are not related to allocation in the ACS samples I’m working with.

Ivan_Strahof · August 13, 2025, 6:16pm

I suspect the origin of the discrepancy is in the difference between the topcode threshold value and the topcode value itself. In the ACS, topcodes are typically applied by the Census Bureau to encompass the highest 0.5% of values in each state for each year. Any case with a value above the threshold has its value replaced with the group’s mean value (i.e., all cases above the threshold are assigned the same value, which is the mean of all their original values). Both the threshold and topcode are reported in the original PUMS topcode documentation linked to in our topcode user guide (see the second column in the table).
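To make the threshold/topcode distinction concrete, here is a rough Python sketch of the replacement rule as I described it. The travel times are invented for illustration (chosen so the mean happens to come out at 184), the function name is hypothetical, and treating "above the threshold" as strictly greater than, with the mean rounded to a whole minute, is my assumption rather than something stated in the PUMS documentation.

```python
def apply_topcode(values: list[int], threshold: int) -> list[int]:
    """Replace every value above the threshold with the mean of those values.

    Assumption: 'above' means strictly greater than the threshold, and the
    mean is rounded to a whole minute.
    """
    above = [v for v in values if v > threshold]
    if not above:
        return list(values)
    topcode = round(sum(above) / len(above))
    return [topcode if v > threshold else v for v in values]


# Invented travel times in minutes for a single state and year.
travel_times = [20, 45, 150, 160, 200, 192]

# With a threshold of 150, the three cases above it (160, 200, 192)
# are all assigned their mean, 552 / 3 = 184.
print(apply_topcode(travel_times, threshold=150))
```

The threshold (150 here) and the assigned topcode (184 here) are different numbers, which is the gap I think you are seeing.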

For TRANTIME specifically, JWMNTPCT provides the threshold value and JWMNP is the value assigned to topcoded cases. Using your example of a TRANTIME value of 184 minutes, I am seeing two states in the 2009 1-year file (MN and MO) that have JWMNP = 184 and JWMNTPCT = 150. In the cases you include, I see that they report arriving 2.5 hours (150 minutes) after their departure time. This suggests to me that while TRANTIME has been topcoded with the mean, the arrival and departure times have been masked using the threshold value, though I was unable to find confirmation of this in the documentation. Note that TRANTIME counts the initial minute as time spent traveling. For example, if someone departs at 4:05 and they travel for 5 minutes, they will arrive at 4:09 (minute 1 is 4:05, minute 2 is 4:06, minute 3 is 4:07, minute 4 is 4:08, and minute 5 is 4:09).
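As a quick check of this hypothesis against the rows in the original post, here is a small Python sketch. It assumes "H:MM" strings as displayed in the post and uses helper names of my own; the 150-minute threshold is the JWMNTPCT value I found for MN and MO, and whether ARRIVES is really built this way is unconfirmed.

```python
def to_minutes(clock: str) -> int:
    """Convert an 'H:MM' clock string to minutes after midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def to_clock(total_minutes: int) -> str:
    """Convert minutes after midnight back to 'H:MM', wrapping past midnight."""
    total_minutes %= 24 * 60
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


def inclusive_arrival(departs: str, trantime: int) -> str:
    """Arrival time when the departure minute counts as minute 1 of the trip."""
    return to_clock(to_minutes(departs) + trantime - 1)


THRESHOLD = 150  # JWMNTPCT for the states with JWMNP = 184 (assumed to apply)

# (DEPARTS, reported ARRIVES) for the four example rows.
examples = [("4:05", "6:34"), ("7:32", "10:04"), ("3:05", "5:34"), ("6:32", "9:04")]

for departs, reported in examples:
    expected = inclusive_arrival(departs, THRESHOLD)
    gap = to_minutes(reported) - to_minutes(expected)
    print(f"departs {departs}: threshold implies {expected}, "
          f"reported {reported} (difference {gap} min)")
```

Two of the four rows match the threshold-based arrival exactly and the other two differ by 3 minutes, which is within the range you would expect from the binning of ARRIVES.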
