Add synthetic FRS#133
Conversation
Fixes #132
anth-volk
left a comment
There was a problem hiding this comment.
Thanks for this @nikhilwoodruff. I am not 100% sure on what the EULA says around this sort of data usage, but will trust your judgment on it. I highlighted one potential issue. If this dataset were public-facing, I'd probably also have some code questions, but don't think they're necessary for this.
| ) | ||
|
|
||
| data = {} | ||
| for variable in simulation.tax_benefit_system.variables: |
There was a problem hiding this comment.
comment: The EULA seems unclear to me on whether this is valid usage, but I am willing to trust your interpretation here
There was a problem hiding this comment.
I think we're looking at the same text. @MaxGhenis could you weigh in?
| import numpy as np | ||
|
|
||
|
|
||
| def create_synth_frs(): |
There was a problem hiding this comment.
comment, non-blocking: I think the code generating this is sufficient for a non-public dataset like this, but would have some questions around a couple parts of this were it public-facing
There was a problem hiding this comment.
It is public-facing (we want to use it for tests)
| ] | ||
|
|
||
| for file_path in dataset_files: | ||
| public_dataset_files = [ |
There was a problem hiding this comment.
nit: On line 7 above (couldn't link here), I would rename dataset_files to private_dataset_files to make this clearer
|
|
||
| upload_data_files( | ||
| files=public_dataset_files, | ||
| hf_repo_name="policyengine/policyengine-uk-data-public", |
There was a problem hiding this comment.
issue, blocking?: This repo already exists with files from February; do we need to delete those before proceeding here for versioning clarity?
There was a problem hiding this comment.
I think we can leave them there
Fixes #132