In general, we want to keep the highest count. Depending on when the scheduled CI runs, the data for the earliest date might be truncated. merge_csv uses a drop_duplicates approach, which is not ideal with keep="last": the last row seen is not necessarily the one with the highest count.
Some filtering is needed so that, among duplicate rows, the one with the highest value is kept.
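
One way this filtering could look, as a minimal sketch rather than the actual merge_csv code: sort so that the highest value comes last within each key, and only then drop duplicates. The helper name keep_highest and the column names date and count are hypothetical, chosen only for illustration.

```python
import pandas as pd


def keep_highest(df: pd.DataFrame, key: str = "date", value: str = "count") -> pd.DataFrame:
    """Keep, for each key, the row with the highest value.

    Hypothetical helper: the column names are assumptions, not taken
    from merge_csv.
    """
    return (
        # Within each key, rows end up ordered from lowest to highest value.
        df.sort_values([key, value], kind="stable")
        # With that ordering, keep="last" retains the highest value per key.
        .drop_duplicates(subset=key, keep="last")
        .reset_index(drop=True)
    )


# Made-up data: the newer file has a truncated count for the earliest date.
old = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "count": [10, 7]})
new = pd.DataFrame({"date": ["2024-01-01", "2024-01-03"], "count": [4, 9]})

merged = keep_highest(pd.concat([old, new], ignore_index=True))
print(merged)
```

A plain drop_duplicates(subset="date", keep="last") on the concatenated frame would keep the truncated count of 4 for 2024-01-01. Sorting by value first makes keep="last" select the row with count 10 instead.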