Hi,
I ran into a problem with the SoftmaxWithLoss layer recently that I wanted to share, because it does not appear to have been fixed yet.
The softmax loss layer can have two outputs: the first holds the loss, the second the probabilities computed by the attached softmax layer.
However, the attached softmax layer itself can only have one top blob, which is why setting up the loss layer goes like this:
- Create the loss layer.
- Duplicate the loss layer's LayerParameter into a new object, softmax_param, which is used to build the internal softmax layer.
- Create new top and bottom vectors, where you push_back() the first bottom blob of the loss layer (the blob holding the data) and the first top blob of the loss layer.
- Create the softmax layer from softmax_param and set it up with that bottom/top pair.
The problem is that when the loss layer has two top blobs, the layer_param_ object used to build it can also contain two loss_weight entries. This breaks the setup of the internal softmax layer: in SetLossWeights(), it expects exactly one loss_weight per top blob and enforces this with CHECK_EQ(). Since the duplicated softmax_param carries both loss_weights but the softmax layer has only one top blob, the check fails and layer creation aborts.
So, a quick fix would be to change SetLossWeights() to accept at least as many loss_weights as there are top blobs (using only the first ones), instead of requiring an exactly equal count. However, this is messy as well, so a better way is to alter softmax_param itself and retain only the first loss_weight parameter.
EDIT: Changing the LayerParameter softmax_param is easy and takes only three additional lines:
const Dtype loss_weight = layer_param_.loss_weight(0);  // keep only the first weight
softmax_param.clear_loss_weight();                      // drop both copied weights
softmax_param.add_loss_weight(loss_weight);             // re-add the retained one
I hope this description helps and that the issue has not been reported before; at least I did not find it.