We welcome all forms of contributions, including but not limited to the following.
- Introduce new geospatial foundation models
- Incorporate downstream datasets
- Add new decoder heads
- Fix typos or bugs
- Fork and pull the latest repository
- Checkout a new branch (do not use the main branch for PRs)
- Commit your changes
- Create a PR
Note: For significant modifications or any bugs you spot, please consider opening an issue for discussion beforehand.
In `engine`, the basic modules of the training pipeline are defined, including the `data_preprocessor`, `trainer`, and `evaluator`.
- `data_preprocessor` selects the bands needed by an encoder, pads unavailable bands with zeros, and applies augmentations.
- `trainer` supports mixed-precision/distributed training and prints training stats and metrics in real time.
- `evaluator` can be called independently, evaluates a model (also in a distributed way), and computes per-class metrics.
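The band-selection and zero-padding idea behind `data_preprocessor` can be illustrated with a minimal sketch. This is a simplified stand-in, not the actual implementation: the helper name `select_and_pad` and the flat per-band lists are illustrative assumptions.

```python
def select_and_pad(image_bands, required_bands, band_size):
    """Keep only the bands an encoder needs; fill missing ones with zeros.

    image_bands: dict mapping band name -> flat list of pixel values.
    required_bands: band names the encoder expects, in order.
    band_size: number of pixels per band (used for zero padding).
    """
    stacked = []
    for band in required_bands:
        if band in image_bands:
            stacked.append(image_bands[band])
        else:
            # Band unavailable in this dataset: pad with zeros.
            stacked.append([0.0] * band_size)
    return stacked

# A dataset providing only B2 and B4, for an encoder expecting B2, B3, B4;
# the middle entry becomes a zero-padded placeholder for the missing B3.
out = select_and_pad({"B2": [1.0, 2.0], "B4": [3.0, 4.0]}, ["B2", "B3", "B4"], 2)
```

The real preprocessor operates on tensors and also applies augmentations, but the band-matching logic follows this pattern.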
- The implementations are simplified and standardized.
- Dataset metadata are read from configs, including newly added classes (names), `ignore_index`, and so on.
- Check the example below to quickly start contributing.
In `encoders`, you can find all the supported (foundation) models.
- Multi-stage outputs, which may be needed by segmentors, are supported and specified by `output_layers` in the encoder config.
- Check the example below to quickly start contributing.
In `decoders`, you can find all the supported decoders.
- The UPerNet implementation is based on mmsegmentation.
- We support UPerNet for uni-temporal semantic segmentation, UPerNetCD for change detection, and MTUPerNet for multi-temporal semantic segmentation.
- For multi-temporal inputs, L-TAE and linear projection are supported as time-merging strategies.
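As a rough illustration of the linear-projection time-merging strategy, the sketch below collapses T per-time-step feature vectors into one via learned scalar weights. This is a simplified stdlib sketch under that assumption, not the repo's implementation (which works on full tensors).

```python
def linear_time_merge(features, weights):
    """Collapse the temporal dimension with a linear projection.

    features: list of T per-time-step feature vectors (each a list of C floats).
    weights: list of T scalars (the learned projection weights).
    Returns a single feature vector of length C.
    """
    num_channels = len(features[0])
    merged = [0.0] * num_channels
    for w, frame in zip(weights, features):
        for c in range(num_channels):
            # Weighted sum over the time axis, channel by channel.
            merged[c] += w * frame[c]
    return merged

# Three time steps, two channels, merged with fixed weights:
merged = linear_time_merge([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0.5, 0.25, 0.25])
```

L-TAE replaces the fixed weights with a lightweight temporal attention mechanism, but the output shape is reduced in the same way: the T dimension disappears.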
We have designed the repo to allow for benchmarking your own model with minimal effort. Follow the steps below to integrate your model:
1. **Implement an Encoder Class:**
   - In `pangaea/encoders/`, create a new Python file named after your model (e.g., `my_model_encoder.py`).
   - Implement a class that inherits from `Encoder`. You can check it in `pangaea/encoders/base.py`.
   - Be sure that your encoder is instantiated with all the required parameters from `Encoder`. You can also add new parameters or fix parameters from `Encoder` that do not change in your model (e.g., `multi_temporal`).
   - Implement the required methods: `__init__`, `load_encoder_weights`, and `forward`.
   - Example:
```python
import torch
import torch.nn as nn
from pathlib import Path

from pangaea.encoders.base import Encoder


class MyModel(Encoder):
    def __init__(
        self,
        encoder_weights: str | Path,
        input_size: int,
        input_bands: dict[str, list[str]],
        output_layers: int | list[int],
        in_chans: int,  # newly added parameter
    ) -> None:
        super().__init__(
            model_name="my_model_name",
            encoder_weights=encoder_weights,
            input_bands=input_bands,
            input_size=input_size,
            embed_dim=768,  # my_model_embed_dim, fixed parameter
            output_dim=768,  # my_model_output_dim, fixed parameter
            multi_temporal=False,  # whether multi-temporal input is supported, fixed parameter
            multi_temporal_output=False,  # whether the model's output has a temporal dimension
        )
        self.in_chans = in_chans  # newly added parameter
        self.output_layers = output_layers  # specify output layers if applicable

        # Initialize your model architecture here, for example:
        self.backbone = nn.Sequential(
            nn.Conv2d(in_chans, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            # Add more layers as needed
        )

    def load_encoder_weights(self, pretrained_path: str) -> None:
        # Load pretrained weights
        state_dict = torch.load(pretrained_path, map_location="cpu")
        self.load_state_dict(state_dict, strict=False)
        print(f"Loaded encoder weights from {pretrained_path}")

    def forward(self, x: dict[str, torch.Tensor]) -> list[torch.Tensor]:
        """Forward pass of the encoder.

        Args:
            x (dict[str, torch.Tensor]): encoder's input structured as a dictionary:
                x = {modality1: tensor1, modality2: tensor2, ...},
                e.g. x = {"optical": tensor1, "sar": tensor2}.
                If the encoder is multi-temporal (self.multi_temporal == True), the
                input tensor shape is (B C T H W), with C the number of bands required
                by the encoder for the given modality and T the number of time steps.
                If the encoder is not multi-temporal, the input tensor shape is (B C H W).

        Returns:
            list[torch.Tensor]: list of the embeddings for each modality. For
                single-temporal encoders, the list's elements are of shape
                (B, embed_dim, H', W'). For multi-temporal encoders, the list's
                elements are of shape (B, C', T, H', W') with T the number of time
                steps if the encoder does not have a time-merging strategy, else
                (B, C', H', W') if it does (where C' == self.output_dim).
        """
        x = x["optical"]
        outputs = []
        # Forward pass through the model, collecting intermediate outputs
        for idx, layer in enumerate(self.backbone):
            x = layer(x)
            if idx in self.output_layers:
                outputs.append(x)
        return outputs
```
2. **Create an Encoder Configuration File:**
   - In `configs/encoder/`, create a new YAML file named after your model (e.g., `my_model.yaml`).
   - Define model-specific parameters, including `encoder_weights`, `input_bands`, `input_size`, and any model architecture arguments. Specifically, set `_target_` to point to your implemented model.
   - Example:

```yaml
_target_: pangaea.encoders.my_model_encoder.MyModel
encoder_weights: ./pretrained_models/my_model_weights.pth
download_url: https://path.to.your.model/weights.pth
input_size: 120
in_chans: 3
embed_dim: 768
patch_size: 16
num_heads: 12
depth: 12
mlp_ratio: 4
input_bands:
  optical:
    - B2
    - B3
    - B4
output_layers:
  - 3
  - 5
  - 7
  - 11
```
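Conceptually, the `_target_` key is a dotted import path that Hydra resolves to your class at instantiation time, passing the remaining config keys as constructor arguments. A simplified stdlib sketch of that resolution step (Hydra's actual mechanism is more featureful):

```python
import importlib

def resolve_target(target: str):
    """Resolve a dotted '_target_' path, Hydra-style: split off the final
    attribute name and import the containing module."""
    module_path, _, attr_name = target.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)

# For example, resolving a class from the standard library:
OrderedDict = resolve_target("collections.OrderedDict")
```

This is why `_target_` must be the full Python path to your encoder class, matching the file and class name you created in `pangaea/encoders/`.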
3. **Run Training with Your Model:**
   - Use the `run.py` script with your encoder configuration.
   - Example command:

```bash
torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=hlsburnscars \
  encoder=my_model \
  decoder=seg_upernet \
  preprocessing=seg_default \
  criterion=cross_entropy \
  task=segmentation
```
We have designed the repo to allow for using your own datasets with minimal effort. Follow the steps below to integrate your dataset:
1. **Implement a Dataset Class:**
   - In the `pangaea/datasets/` directory, create a new Python file named after your dataset (e.g., `my_dataset.py`).
   - Implement a class that inherits from `RawGeoFMDataset`. You can check it in `pangaea/datasets/base.py`.
   - Be sure that your dataset is instantiated with all the required parameters from `RawGeoFMDataset`. You can also add new parameters.
   - Implement the required methods: `__init__`, `__len__`, `__getitem__`, and `download` (if applicable; otherwise a `NotImplementedError` is raised).
   - Example:
```python
import torch

from pangaea.datasets.base import RawGeoFMDataset


class MyDataset(RawGeoFMDataset):
    def __init__(
        self,
        split: str,
        dataset_name: str,
        multi_modal: bool,
        multi_temporal: int,
        root_path: str,
        classes: list,
        num_classes: int,
        ignore_index: int,
        img_size: int,
        bands: dict[str, list[str]],
        distribution: list[int],
        data_mean: dict[str, list[str]],
        data_std: dict[str, list[str]],
        data_min: dict[str, list[str]],
        data_max: dict[str, list[str]],
        download_url: str,
        auto_download: bool,
        temp: int,  # newly added parameter
    ):
        super(MyDataset, self).__init__(
            split=split,
            dataset_name=dataset_name,
            multi_modal=multi_modal,
            multi_temporal=multi_temporal,
            root_path=root_path,
            classes=classes,
            num_classes=num_classes,
            ignore_index=ignore_index,
            img_size=img_size,
            bands=bands,
            distribution=distribution,
            data_mean=data_mean,
            data_std=data_std,
            data_min=data_min,
            data_max=data_max,
            download_url=download_url,
            auto_download=auto_download,
        )
        self.temp = temp  # newly added parameter
        # Initialize file lists or data structures here

    def __len__(self):
        # Return the total number of samples
        return len(self.file_list)

    def __getitem__(self, index):
        """Returns the index-th item of the dataset.

        Args:
            index (int): index of the item.

        Returns:
            dict[str, torch.Tensor | dict[str, torch.Tensor]]: output dictionary
                following the format
                {"image": {
                    "optical": torch.Tensor of shape (C T H W) (T=1 for single-temporal datasets),
                    "sar": torch.Tensor of shape (C T H W) (T=1 for single-temporal datasets),
                 },
                 "target": torch.Tensor of shape (H W), of type torch.int64 for
                    segmentation and torch.float for regression datasets,
                 "metadata": dict}.
        """
        # Load your data and labels here
        image = ...  # Load image
        target = ...  # Load target label or mask

        # Convert to tensors
        image = torch.tensor(image, dtype=torch.float32)
        target = torch.tensor(target, dtype=torch.long)

        return {
            "image": {"optical": image},
            "target": target,
            "metadata": {},
        }

    def download(self, silent=False):
        # Implement if your dataset requires downloading
        pass
```
2. **Create a Dataset Configuration File:**
   - Navigate to `configs/dataset/` and create a new YAML file named after your dataset (e.g., `my_dataset.yaml`).
   - Indicate your implemented dataset class in `_target_`.
   - Define all necessary dataset parameters such as `dataset_name`, `root_path`, `img_size`, `bands`, `data_mean`, `data_std`, `num_classes`, and class labels. Check the `GeoFMDataset` class in `pangaea/datasets/base.py` for more details.
   - Example:

```yaml
_target_: pangaea.datasets.my_dataset.MyDataset
dataset_name: MyDataset
root_path: ./data/my_data_dir
download_url: None
auto_download: False
img_size: 256
multi_temporal: 6
multi_modal: False
ignore_index: -1
num_classes: 3
classes:
  - Class1
  - Class2
  - Class3
distribution:
  - 0.2
  - 0.4
  - 0.4
bands:
  optical:
    - B1
    - B2
    - B3
data_mean:
  optical:
    - 0.485
    - 0.456
    - 0.404
data_std:
  optical:
    - 0.229
    - 0.224
    - 0.225
data_min:
  optical:
    - 0.
    - 0.
    - 0.
data_max:
  optical:
    - 1.
    - 1.
    - 1.
```
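The `distribution` field lists the relative frequency of each class, which is useful for weighting losses on imbalanced datasets. Assuming labels are stored as integer masks, it could be estimated with a sketch like this (`class_distribution` is a hypothetical helper, not part of the repo):

```python
from collections import Counter

def class_distribution(masks, num_classes, ignore_index=-1):
    """Estimate per-class pixel frequencies from flat integer label masks,
    skipping pixels equal to ignore_index."""
    counts = Counter()
    for mask in masks:
        counts.update(label for label in mask if label != ignore_index)
    total = sum(counts.values())
    return [counts.get(c, 0) / total for c in range(num_classes)]

# Two tiny masks with 3 classes; the ignore_index pixel (-1) is excluded:
dist = class_distribution([[0, 1, 1, -1], [2, 2, 1, 2]], num_classes=3)
```

For a real dataset you would run this once over all training masks and copy the resulting frequencies into the YAML file.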
3. **Adjust the Augmentation Pipeline:**
   - If your dataset requires specific preprocessing or augmentations, create or modify an augmentation configuration file in `configs/preprocessing/`.
   - Ensure that all preprocessing steps (e.g., normalization, resizing) match your dataset's requirements.
   - If your specific preprocessing or augmentations are not implemented, please implement them in `pangaea/engine/data_preprocessor.py`.
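For instance, the standard normalization step uses the per-band `data_mean` and `data_std` values from the dataset config. A minimal stdlib sketch of the idea (the helper names and flat-list representation are illustrative, not the actual `data_preprocessor` code):

```python
def normalize_band(values, mean, std):
    """Standardize one band's pixel values with the dataset's statistics."""
    return [(v - mean) / std for v in values]

def normalize_image(image, data_mean, data_std):
    """Apply per-band normalization, matching bands to their stats by position.

    image: dict modality -> list of bands (each a flat list of pixel values).
    data_mean / data_std: dict modality -> list of per-band statistics,
    as declared in the dataset YAML.
    """
    return {
        modality: [
            normalize_band(band, m, s)
            for band, m, s in zip(bands, data_mean[modality], data_std[modality])
        ]
        for modality, bands in image.items()
    }

# One optical band with mean 2.0 and std 2.0:
out = normalize_image({"optical": [[0.0, 2.0, 4.0]]}, {"optical": [2.0]}, {"optical": [2.0]})
```

If the statistics in your YAML do not match your data's actual range, this step silently produces badly scaled inputs, which is why the config values deserve a sanity check.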
4. **Run Training:**
   - Use the `run.py` script with your dataset and augmentation configurations.
   - Example command:

```bash
torchrun --nnodes=1 --nproc_per_node=1 pangaea/run.py \
  --config-name=train \
  dataset=my_dataset \
  encoder=prithvi \
  decoder=seg_upernet_mt_ltae \
  preprocessing=seg_default \
  criterion=cross_entropy \
  task=segmentation
```