Excuse me again!
When I use KLDivLoss (./block/models/criterions/kl_divergence.py) as the criterion instead of cross-entropy, the loss is 0 and the accuracy is worse, as the following picture shows:

But in the MFH paper, KLD loss performs better than cross-entropy. So how could this be? Thanks!