Hello,
I am using the ipumsr package to create extracts within my code. Today, for the first time, I set geog_levels to "blck_grp", and I got an error saying I need to specify geographic_extents. I can see the geographic_instances in the metadata, but I am unsure how to specify the extent I am interested in (Massachusetts, which is code 250).
Thank you to anyone that might be able to help.
To specify extents, use NHGIS state codes, like "250" for Massachusetts. Pass them as quoted strings, because some codes have leading zeros (e.g., "010" for Alabama) that would be lost if the code were entered as a number.
You can request multiple extents using a character vector, e.g., c("010", "250") for Alabama and Massachusetts.
The table of geographic_instances returned from get_metadata_nhgis identifies these codes in the name field.
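As a rough sketch of that lookup, the snippet below pulls the dataset metadata and searches the geographic_instances table for a state. The helper find_extent_code is hypothetical (not part of ipumsr), and the sketch assumes the table has a description column holding the state name alongside name, and that an API key is stored in the IPUMS_API_KEY environment variable.

```r
# Hypothetical helper (not part of ipumsr): return the extent code(s) whose
# description matches a state name. Assumes `instances` has `name` and
# `description` columns.
find_extent_code <- function(instances, state) {
  instances$name[grepl(state, instances$description, ignore.case = TRUE)]
}

# Only query the API when a key is available in the environment.
if (nzchar(Sys.getenv("IPUMS_API_KEY"))) {
  ds_meta <- ipumsr::get_metadata_nhgis(dataset = "2018_2022_ACS5a")
  print(find_extent_code(ds_meta$geographic_instances, "Massachusetts"))
}
```

The value returned from the name field is what goes into geographic_extents.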
In general, the IPUMS API, which ipumsr interacts with, uses the name field as the key identifier, so in other circumstances where you'd like to specify extract content, you can assume that using the name will work. But this isn't altogether clear in the documentation for geographic_extents. We'll try to add a little more text to the define_extract_nhgis documentation to clarify and/or provide an example.
Hi Jonathan, thanks for the help.
This is how I am specifying the extract:
nhgis_ext_1 <- define_extract_nhgis(
  description = "2018-2022 ACS",
  datasets = ds_spec(
    "2018_2022_ACS5a",
    data_tables = c("B01003", "B25006", "B25003B", "B25003D", "B25003H", "B25003I", "B25035", "B25002"),
    geog_levels = "blck_grp",
    geographic_extents = "250"
  )
)
But it returns an error: unused argument (geographic_extents = "250")
I have also tried brackets around the “250” and still no luck.
Ah, I see now. Geographic extents are requested for an entire extract, not for a specific dataset, so the geographic_extents parameter needs to be placed outside of the ds_spec parameters, like so:
nhgis_ext_1 <- define_extract_nhgis(
  description = "2018-2022 ACS",
  datasets = ds_spec(
    "2018_2022_ACS5a",
    data_tables = c("B01003", "B25006", "B25003B", "B25003D", "B25003H", "B25003I", "B25035", "B25002"),
    geog_levels = "blck_grp"
  ),
  geographic_extents = "250"
)
A potential source of confusion here is that the metadata listing of possible extents is provided by dataset. We’ll try to improve our documentation about that as well. The advantage of this extract definition model is that you could request data from multiple datasets in one request while specifying extents only once for the entire request.
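To illustrate that last point, here is a minimal sketch of a request that combines two datasets under a single extent. The second dataset name ("2017_2021_ACS5a") and the choice of table are assumptions made for illustration; check the metadata to confirm they are available at the block group level before submitting.

```r
library(ipumsr)

# Two datasets in one request; the extent is given once, at the extract
# level, and applies to both. The second dataset is an illustrative assumption.
nhgis_ext_multi <- define_extract_nhgis(
  description = "ACS block groups, Massachusetts",
  datasets = list(
    ds_spec("2018_2022_ACS5a", data_tables = "B01003", geog_levels = "blck_grp"),
    ds_spec("2017_2021_ACS5a", data_tables = "B01003", geog_levels = "blck_grp")
  ),
  geographic_extents = "250"
)

# Submitting requires a registered API key, so it is left commented out here:
# submitted <- submit_extract(nhgis_ext_multi)
```

Defining the extract does not contact the API, so this part can be checked locally before anything is submitted.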