After installation, you're ready to run the code.
- Default configurations are provided in .toml format:
  - cpu.toml – for running on CPU only
  - cuda.toml – for single-GPU CUDA execution
  - cuda-ddp.toml – for multi-GPU distributed training using DDP (Distributed Data Parallel)
  - cuda_medium_term.toml – for experiments for 1 year into the future
  - inference.toml – for running inference pipelines
  - mps.toml – for Apple Silicon (M1/M2) using Metal Performance Shaders (MPS)
  - transformer.toml – for transformer-specific model configurations
- Files placed in configs/local/ are ignored by Git, making it a safe place for personal or machine-specific configs.
For Distributed Data Parallel (DDP) training, use torchrun to spawn multiple processes.
For non-DDP usage, you can replace torchrun with python.
# Inference on CPU
uv run python3 scripts/inference.py --config-path configs/cuda-tiny.toml --data data/ameland.csv --target-col load --checkpoint data/checkpoint_tiny_99k.pt --show-plots
# Train on CPU
uv run torchrun --standalone scripts/train.py configs/cpu.toml
# Train on CUDA (single GPU)
uv run torchrun --standalone scripts/train.py configs/cuda.toml
# Train on CUDA with DDP (multi-GPU)
uv run torchrun --standalone --nproc_per_node=4 scripts/train.py configs/cuda-ddp.toml
# overriding parameters on CLI
uv run torchrun --standalone scripts/train.py configs/cpu.toml --run.persist_to_wandb_project=main_project
uv run python scripts/train.py configs/cpu.toml --run.persist_to_wandb_project=main_project
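The dotted key in an override such as --run.persist_to_wandb_project can be read as a path into the nested .toml config ([run] section, persist_to_wandb_project field). The environment-variable form appears to follow the same idea, with an S4_ prefix and a double underscore between levels. The sketch below illustrates that mapping using only the standard library. It is not the project's actual parser: the helper names are hypothetical, and the prefix and delimiter rules are assumptions inferred from the examples in this section.

```python
from typing import Any

# Assumption: every environment override starts with this prefix,
# as in the S4_run__persist_to_wandb_project example.
ENV_PREFIX = "S4_"


def set_nested(config: dict, path: list, value: Any) -> None:
    """Write value into config at the nested location given by path."""
    node = config
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


def apply_cli_override(config: dict, arg: str) -> None:
    """Apply one '--section.key=value' argument (hypothetical helper)."""
    dotted, _, value = arg.lstrip("-").partition("=")
    set_nested(config, dotted.split("."), value)


def apply_env_override(config: dict, name: str, value: str) -> None:
    """Apply one 'S4_section__key' variable; other variables are ignored."""
    if name.startswith(ENV_PREFIX):
        set_nested(config, name[len(ENV_PREFIX):].split("__"), value)


config = {"run": {"persist_to_wandb_project": None}}
apply_cli_override(config, "--run.persist_to_wandb_project=main_project")
print(config)  # {'run': {'persist_to_wandb_project': 'main_project'}}
```

Under these assumptions, both override styles end up setting the same field, so either can be used to change a single value without editing the config file.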
# overriding parameters via env variables
export S4_run__persist_to_wandb_project="main_project"
uv run torchrun --standalone scripts/train.py configs/cpu.toml
Datasets are converted to our format by the scripts/format_dataset.py script. The process is fully described in notebooks/00_data_preparation.ipynb. The data used in the following examples is created by running a test once. Note that all our data is timestamped in UTC.
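Because the data is expected in UTC, raw measurements recorded with a local offset need to be converted before formatting. The following is a minimal sketch of such a conversion using the standard library. The to_utc helper is hypothetical and is not part of scripts/format_dataset.py; it assumes timestamps are ISO 8601 strings that carry an explicit UTC offset.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset the source timezone is unknown, so refuse to guess.
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc).isoformat()


print(to_utc("2023-01-01T12:00:00+01:00"))  # 2023-01-01T11:00:00+00:00
```

Rejecting timestamps without an offset is a deliberate choice in this sketch: silently treating them as UTC could shift the series by an hour or more.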
uv run python3 scripts/format_dataset.py --folder data/tests/sinusoid_data_raw --output_dir data/example/output_test/ --target_col measurements --time_col timestamp
# To include location metadata (longitude and latitude), add the locations_file argument
uv run python3 scripts/format_dataset.py --folder data/tests/sinusoid_data_raw --output_dir data/example/output_test/ --target_col measurements --time_col timestamp --locations_file data/tests/sinusoid_locations/locations.csv
To incorporate weather data, it must first be downloaded manually from Open-Meteo using scripts/download_weather_data.py. Once downloaded, it can be referenced in your .toml configuration.
# example command to download weather data from open-meteo
uv run python3 scripts/download_weather_data.py --coords_csv data/tests/sinusoid_locations/locations.csv --start_date 2023-01-01 --end_date 2023-06-01 --output_dir data/example/weather_output
If you need more API calls than the default allows, put your Open-Meteo API key in a local .env file; see .env.example for reference.
Enable logging to W&B by adding this to your .toml:
[run]
persist_to_wandb_project = "forecasting-s4"
wandb_notes = "<SOME NOTES ABOUT THE RUN>"
[authentication]
wandb_api_key = "<API-KEY-FROM-WANDB>"
Notes:
- W&B collects metrics, system stats, and artifacts so you can view runs on wandb.ai.
- The API key authenticates your session; store it securely (e.g., in configs/local/*.toml or env vars).
- If these fields are not set, W&B logging stays disabled and logging is done locally only.
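The rule in the last note can be pictured as a small check on the parsed config. This is an illustrative sketch only: wandb_enabled is a hypothetical helper, and treating both the project name and the API key as required is an assumption, not confirmed behavior of the training script.

```python
def wandb_enabled(config: dict) -> bool:
    """Return True only if both the W&B project and the API key are set."""
    project = config.get("run", {}).get("persist_to_wandb_project")
    api_key = config.get("authentication", {}).get("wandb_api_key")
    # Assumption: a missing or empty value in either field disables W&B logging.
    return bool(project) and bool(api_key)


config = {
    "run": {"persist_to_wandb_project": "forecasting-s4"},
    "authentication": {"wandb_api_key": ""},
}
print(wandb_enabled(config))  # False
```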
When using the CodeEditor application in SageMaker, always start your run inside a screen session. The terminal may freeze after a few hours of inactivity, and screen helps you resume your work safely.
- Install screen (required after each reboot):
sudo apt-get install screen
- Start a screen session:
screen
- Detach from a session:
Press CTRL+A, then D
- Reconnect to a session:
screen -r
- List active sessions:
screen -ls