This is somewhat related to #2.
I find this project to be extremely useful and a great framework for a task that I have to do often. In my projects, I've found myself using the base classes and concepts from this project when I want to download and process data from other Census Bureau API sources.
However, for non-ACS sources, I find myself entirely reimplementing many of the methods on my geotype downloader classes, because the changes I need can't be made by just calling super() and then adding additional logic.
I think adding these methods to BaseGeoTypeDownloader could make adding additional data sources easier, both in this project and for other users in their own projects:
BaseGeoTypeDownloader.get_api_client(): This would be called from the constructor to set self.api and allow subclasses to specify a customized subclass of census.Census that supports additional API endpoints.
BaseGeoTypeDownloader.get_field_type_map(): This would be similar to BaseGeoTypeDownloader.get_raw_field_map() except it would map from raw field names to types that would be passed to pd.Series.astype(). Like BaseGeoTypeDownloader.get_raw_field_map(), this would be called from BaseGeoTypeDownloader.process() when setting the column types after reading in the raw table. The implementation could check for the existence of a FIELD_TYPES attribute on the table configuration class, and if that doesn't exist, default to the existing logic for ACS tables that checks the field name suffix. Adding the ability to explicitly set type conversions allows supporting non-ACS tables that might have field names that don't have the same suffix convention as ACS tables.
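To make the proposal concrete, here is a rough sketch of how the two hooks might fit together. It is not the project's actual code: StubCensusClient stands in for census.Census so the example runs without the package, the RAW_FIELDS attribute and the config classes are hypothetical names, and the suffix regex is only my guess at what "checks the field name suffix" could look like.

```python
import re


class StubCensusClient:
    """Stand-in for census.Census (assumption: lets the sketch run offline)."""

    def __init__(self, api_key):
        self.api_key = api_key


class ExtendedStubClient(StubCensusClient):
    """Hypothetical client subclass that would support extra API endpoints."""


class BaseGeoTypeDownloader:
    """Simplified stand-in for the project's base class."""

    # Assumed ACS convention: estimate/margin fields look like B01001_001E / _001M.
    ACS_NUMERIC_SUFFIX = re.compile(r"_\d+[EM]$")

    def __init__(self, config, api_key):
        self.config = config
        # Proposed hook: the constructor asks the subclass which client to use.
        self.api = self.get_api_client(api_key)

    def get_api_client(self, api_key):
        return StubCensusClient(api_key)

    def get_raw_field_map(self):
        # RAW_FIELDS is a hypothetical attribute name used only in this sketch.
        return self.config.RAW_FIELDS

    def get_field_type_map(self):
        # Prefer explicit types declared on the table configuration class.
        explicit = getattr(self.config, "FIELD_TYPES", None)
        if explicit is not None:
            return dict(explicit)
        # Otherwise fall back to a suffix check on the raw field names.
        return {
            name: float
            for name in self.get_raw_field_map()
            if self.ACS_NUMERIC_SUFFIX.search(name)
        }


class NonAcsDownloader(BaseGeoTypeDownloader):
    def get_api_client(self, api_key):
        # Only the client changes; the rest of the constructor is inherited.
        return ExtendedStubClient(api_key)


class AcsLikeConfig:
    RAW_FIELDS = {
        "B01001_001E": "universe",
        "B01001_001M": "universe_moe",
        "GEO_ID": "geoid",
    }


class NonAcsConfig:
    RAW_FIELDS = {"POP": "population", "DENSITY": "density"}
    FIELD_TYPES = {"POP": int, "DENSITY": float}


acs = BaseGeoTypeDownloader(AcsLikeConfig, "MY_KEY")
other = NonAcsDownloader(NonAcsConfig, "MY_KEY")
```

In process(), the resulting map could then be applied column by column with pd.Series.astype(), so a non-ACS table would only need to declare FIELD_TYPES rather than override process() itself.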