Bug: `merge_csv` uses `keep="last"`, which can reduce the recorded number of clones / traffic views

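In general, we want to keep the highest count for each date. Depending on when the scheduled CI job runs, the newly fetched data might be truncated for the earliest date. `merge_csv` uses a `drop_duplicates` approach with `keep="last"`, which is not ideal here: the later, truncated row can replace an earlier row that had a higher count.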

Some filtering is needed so that, among duplicate rows for the same date, the one with the highest value is kept.
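
One possible shape for that filtering, as a rough sketch rather than a drop-in patch: sort so the highest value comes last within each date, then let `drop_duplicates(keep="last")` pick it. The function name `merge_keep_max` and the column names `date` and `count` are assumptions for illustration; the real `merge_csv` signature and CSV columns may differ.

```python
import pandas as pd


def merge_keep_max(old: pd.DataFrame, new: pd.DataFrame,
                   key: str = "date", value: str = "count") -> pd.DataFrame:
    """Merge two frames, keeping the row with the highest `value` per `key`.

    Hypothetical helper: `key` and `value` column names are assumed.
    """
    combined = pd.concat([old, new], ignore_index=True)
    # Within each key, order rows by value ascending so the highest is last.
    # na_position="first" keeps missing values from winning the tie-break.
    combined = combined.sort_values([key, value], na_position="first")
    # keep="last" now selects the highest-valued row for each key.
    merged = combined.drop_duplicates(subset=key, keep="last")
    return merged.sort_values(key).reset_index(drop=True)


if __name__ == "__main__":
    old = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "count": [10, 5]})
    # The new fetch has a truncated (lower) count for the earliest date.
    new = pd.DataFrame({"date": ["2024-01-01", "2024-01-03"], "count": [3, 7]})
    print(merge_keep_max(old, new))
```

Sorting and then dropping duplicates keeps each winning row intact, whereas `groupby(key).max()` would take the maximum of every column independently and could mix values from different rows.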