Pump Chunk Marking #2

@fleerdayo

For each resampled (chunked) pump CSV file, did you mark only one chunk as True?

E.g., if there is a pump at 2019-03-1 17.00 and I split my CSV data into 5-second chunks (considering only the pump day plus one day before and after), I mark only the chunk from 17.00.00 to 17.00.05 as True.
This leaves me with an extremely imbalanced dataset, so a RandomForestClassifier ends up predicting every chunk as False.
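
To make the imbalance concrete, here is a minimal sketch of the labelling described above. It uses only the standard library instead of my actual resampling code: `label_chunks` is a hypothetical helper, and the pump time is assumed to be 2019-03-01 17:00:00.

```python
from datetime import datetime, timedelta

CHUNK = timedelta(seconds=5)  # chunk length from the example above


def label_chunks(pump_time, chunk=CHUNK):
    """Return (chunk_start, is_pump) pairs for the pump day plus one day either side.

    Hypothetical helper for illustration only; not code from the repository.
    """
    day_start = datetime(pump_time.year, pump_time.month, pump_time.day)
    window_start = day_start - timedelta(days=1)  # start of the day before
    window_end = day_start + timedelta(days=2)    # end of the day after (exclusive)
    n_chunks = (window_end - window_start) // chunk

    labels = []
    for i in range(n_chunks):
        start = window_start + i * chunk
        # Only the chunk that contains the pump timestamp is marked True.
        labels.append((start, start <= pump_time < start + chunk))
    return labels


pump_time = datetime(2019, 3, 1, 17, 0, 0)  # assumed reading of "2019-03-1 17.00"
labels = label_chunks(pump_time)
n_true = sum(is_pump for _, is_pump in labels)
print(len(labels), n_true)  # 51840 1
```

So a single pump gives 1 True chunk out of 51840, which is the imbalance I am asking about.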

What am I missing here?

------- Off-topic -----------
Also, thank you for the effort you put into collecting all the data. I enjoyed reading your paper too and got lots of useful information out of it. It's a welcome distraction to fiddle around with your data during all the restrictions :)
