Processed data contains duplicate data for multiple geographies #63

@aboutaaron

Bug/Issue

The census data downloader correctly downloads the raw data but creates a CSV with duplicated data in the processed directory.

Environment

  • Python 3.8
  • Pipenv version 2018.11.27.dev0
  • Latest version of censusdatadownloader

Reproduce

Install the package and then try to download a data set.

pipenv install census-data-downloader
censusdatadownloader --data-dir data/census race states

Expected behavior

A 52-row CSV file in the processed directory with total population by race.

Actual behavior

A 52-row CSV file in the processed directory with the same data in each column.

Possible issues/solutions

It looks like the data is correctly downloaded to the raw directory, which makes me think something's going wrong in the process step. I'm seeing this behavior specifically with the race [geography] arguments.

I noticed the same behavior for internet counties but did get the correct data when I used internet states.
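A minimal sketch for confirming the duplication in a processed file, assuming the output is a plain CSV with a header row. The helper name find_duplicate_columns and the sample rows are hypothetical and only illustrate the check; they are not part of census-data-downloader and the values are not real census output.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_columns(csv_file):
    """Return groups of column names whose values are identical in every row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Transpose the remaining rows so each tuple holds one column's values.
    columns = list(zip(*reader))
    groups = defaultdict(list)
    for name, values in zip(header, columns):
        groups[values].append(name)
    # Only groups with more than one column name indicate duplication.
    return [names for names in groups.values() if len(names) > 1]


# Made-up sample standing in for a processed file (hypothetical values).
sample = io.StringIO(
    "geoid,name,total,white,black\n"
    "01,Alabama,100,100,100\n"
    "02,Alaska,50,50,50\n"
)
print(find_duplicate_columns(sample))
```

To run the same check against a real file, pass an open file handle instead of the sample, for example open(path, newline="") where path points at the CSV in the processed directory. Running it on the matching raw file as well should show whether the duplication only appears after the process step.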

I'll see if I can debug what's happening at the process step but in the meantime I'll rely on the raw data. Thanks for your work on this!
