Hi, I'm wondering whether each backend and dataset supports saving the iteration state and resuming later, so that iteration continues from where it stopped.
This feature is required for resuming from a checkpoint during model training.
For example:
    import jax_dataloader as jdl

    # ds and backend stand for any supported dataset/backend combination
    dataloader = jdl.DataLoader(ds, backend, shuffle=True)
    for i, batch in enumerate(dataloader):
        if i == 100:
            # desired API: capture the iteration state mid-epoch
            state = dataloader.state_dict()

    # re-init the dataloader, and then try to resume from state
    dataloader = jdl.DataLoader(ds, backend, shuffle=True)
    dataloader.load_state_dict(state)
    for batch in dataloader:
        ...
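
To make the semantics I have in mind concrete, here is a minimal sketch of a resumable shuffled loader. It is not jax-dataloader's API: `ResumableLoader` and the keys in its state dict are hypothetical names for illustration only, and it assumes the shuffle order can be derived from a seed plus an epoch counter, so the state only needs to record the position within the epoch.

```python
import numpy as np


class ResumableLoader:
    """Hypothetical toy loader (not part of jax-dataloader) with resumable state."""

    def __init__(self, data, batch_size, seed=0):
        self.data = np.asarray(data)
        self.batch_size = batch_size
        self.seed = seed      # base seed for shuffling
        self.epoch = 0        # number of completed epochs
        self.next_batch = 0   # index of the next batch to yield in this epoch

    def __iter__(self):
        # Deriving the permutation from (seed, epoch) makes it reproducible after a restart.
        rng = np.random.default_rng([self.seed, self.epoch])
        perm = rng.permutation(len(self.data))
        n_batches = -(-len(self.data) // self.batch_size)  # ceil division
        while self.next_batch < n_batches:
            start = self.next_batch * self.batch_size
            self.next_batch += 1
            yield self.data[perm[start:start + self.batch_size]]
        # Epoch finished: reset the position and advance the epoch counter.
        self.epoch += 1
        self.next_batch = 0

    def state_dict(self):
        return {"seed": self.seed, "epoch": self.epoch, "next_batch": self.next_batch}

    def load_state_dict(self, state):
        self.seed = state["seed"]
        self.epoch = state["epoch"]
        self.next_batch = state["next_batch"]
```

With something like this, a freshly constructed loader that calls `load_state_dict(state)` would continue with the batch right after the one that was being processed when `state_dict()` was called, in the same shuffled order as the uninterrupted run.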