Case of replicate column names differ among years #116

rluedde · 2020-07-13T18:52:37Z

>>> from cenpy.moe import replicate_table_utils as crtu
>>> data_14 = crtu.get_replicate_data_api(["B15002"], 2014, "140", "04")
https://www2.census.gov/programs-surveys/acs/replicate_estimates/2014/data/5-year/140/B15002_04.csv.gz
>>> data_18.columns.levels[0][:3]
Index(['ESTIMATE', 'MOE', 'SE'], dtype='object', name='categories')
>>> data_14.columns.levels[0][:3]
Index(['estimate', 'moe', 'SE'], dtype='object', name='categories')

In the 2014 data, the estimate and moe column names are lowercase, while in the 2018 data (and, I think, all other years) they are uppercase; SE is uppercase in both. This causes issues in (at least) apply_func.

I don't have a good idea for a solution. I imagine it's a .upper() or .lower() call somewhere in read_replicate_file or get_replicate_data. A comprehension over the column names might be needed as well?


dfolch · 2020-07-13T19:35:15Z

Good idea to always convert to the same case. It is probably not worth the extra code to test for capitalization; just run upper on these three column names for every file read in with read_replicate_file.
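A minimal sketch of that normalization, assuming the replicate table has MultiIndex columns whose top level holds the three category names, as the output above suggests. The helper name normalize_category_case and the toy frame (including the "variables" level name and the values) are hypothetical and not part of cenpy; where such a call would sit inside read_replicate_file is not shown here.

```python
import pandas as pd


def normalize_category_case(df, level=0):
    """Return a copy of df with one column level upper-cased.

    Hypothetical helper, not part of cenpy. It upper-cases
    unconditionally, so no test for capitalization is needed.
    """
    return df.rename(columns=str.upper, level=level)


# Toy stand-in for a 2014-style table with mixed-case category names.
cols = pd.MultiIndex.from_product(
    [["estimate", "moe", "SE"], ["B15002_001"]],
    names=["categories", "variables"],
)
toy_14 = pd.DataFrame([[100.0, 12.0, 7.3]], columns=cols)

fixed = normalize_category_case(toy_14)
categories = list(fixed.columns.get_level_values(0))
print(categories)
```

Because rename with a level argument touches only that level, the variable names in the second level are left as they are, and no comprehension is needed.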
