https://github.com/IBM/pytorch-seq2seq/blob/f146087a9a271e9b50f46561e090324764b081fb/seq2seq/models/TopKDecoder.py#L83

I think teacher_forcing should not be present in beam decoding, since the ground-truth tokens are not known during inference.
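
To illustrate the point, here is a minimal, framework-free sketch of beam search in which each step is fed only the decoder's own previous predictions, so no target sequence is needed. It is not the TopKDecoder implementation: `beam_search`, `step_fn`, and `toy_step` are hypothetical names made up for this example, and `toy_step` is a stand-in for a single decoder step.

```python
import math


def beam_search(step_fn, sos, eos, beam_width, max_len):
    """Toy beam search: every step consumes the hypotheses' own previous
    tokens, so no ground-truth targets (and no teacher forcing) are involved."""
    beams = [([sos], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:
                # Finished hypothesis: carry it over unchanged.
                candidates.append((seq, score))
                continue
            # step_fn (hypothetical) maps a prefix to {token: log-probability}.
            for tok, logp in step_fn(seq).items():
                candidates.append((seq + [tok], score + logp))
        # Keep only the beam_width highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams


def toy_step(prefix):
    """Hypothetical model: a fixed distribution that forces EOS once the
    prefix holds three tokens."""
    if len(prefix) >= 3:
        return {"</s>": math.log(1.0)}
    return {"a": math.log(0.6), "b": math.log(0.3), "</s>": math.log(0.1)}


beams = beam_search(toy_step, sos="<s>", eos="</s>", beam_width=2, max_len=5)
best_sequence, best_score = beams[0]
```

Nothing in the loop takes a target sequence as an argument, which is why a teacher-forcing option has nothing to act on at inference time.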