
Beam Search 11.Debug beam 2

Higepon Taro Minowa edited this page Jul 9, 2017 · 31 revisions

what I've done so far

  • changed beam_attention_decoder to remove output_projection, because the caller does not expect it.
  • enabled fast build = True for faster debugging

error

tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [5] vs. [2]

  • 5 is beam size

log_perp_list.append(crossent * weight)

  • this is the line that raises the error
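
To make the error message concrete, here is a minimal sketch of the broadcasting rule that an elementwise product such as crossent * weight has to satisfy. It is plain Python and does not call TensorFlow; broadcast_shape is a hypothetical helper written for this note, and the shapes (5,) and (2,) are taken from the error above.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Usual elementwise rule: align shapes on the right; each pair of
    dimensions must be equal, or one of them must be 1.
    """
    # Pad the shorter shape with leading 1s so both have the same rank.
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)
    result = []
    for da, db in zip(a, b):
        if da == db or da == 1 or db == 1:
            result.append(max(da, db))
        else:
            raise ValueError(
                "Incompatible shapes: %s vs. %s" % (list(a), list(b)))
    return tuple(result)


beam_size = 5   # from the notes
batch_size = 2  # from the notes: the [2] in the error

crossent_shape = (beam_size,)  # what the beam decoder path produced
weight_shape = (batch_size,)   # what the caller feeds in

try:
    broadcast_shape(crossent_shape, weight_shape)
    message = "ok"
except ValueError as err:
    message = str(err)

print(message)
```

Neither dimension is 1, so 5 and 2 cannot be broadcast against each other; the two tensors have to agree on the leading dimension before the multiply.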

facts

  • happening after model creation

questions

  • DONE what is [2] => batch size
  • what is the right shape of
    • crossent (5,)
    • weight (?, )
  • DONE let's think and have expected shape of weight, logit and target.
  • DONE loop size is 10, length of target defined by bucket, which is 10
  • DONE How can I set a debug point in this?
    • DONE What is the for loop size? = 10, or beam_size?
    • DONE wait, I made beam_size=5?
    • DONE where are weights coming from
    • DONE They come from (5, 10), one of the buckets
  • the function comment says weights = "[batch_size x num_decoder_symbols]"
    • but we see [beam_size * num_decoder_symbols]
    • let's try beam_search = False: (?, 6) ? = batch_size and 6 is num_vocab
    • beam_search=True: (5, 6) OR (?, 6)
      • batch_size=3 in config, so this is wrong.
      • let's check its stack trace
   bucket_outputs, *_ = seq2seq(encoder_inputs[:bucket[0]],
                               decoder_inputs[:bucket[1]])

This is returning [beam_size, num_vocab]
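
As a rough check of the expected shapes, the sketch below assumes the convention I believe the legacy seq2seq loss uses per decoder step: logits are 2-D with one row per example, and targets and weights are 1-D with one entry per example. That convention is an assumption here, not something confirmed in these notes, and check_step_shapes is a hypothetical helper that only compares shape tuples.

```python
def check_step_shapes(logits_shape, target_shape, weight_shape):
    """Return a list of shape problems for one decoder step (empty if fine)."""
    problems = []
    if len(logits_shape) != 2:
        problems.append("logits should be 2-D, got %r" % (tuple(logits_shape),))
        return problems
    rows = logits_shape[0]  # assumed to be the number of examples
    for name, shape in (("target", target_shape), ("weight", weight_shape)):
        if tuple(shape) != (rows,):
            problems.append(
                "%s shape %r does not match logits rows %d"
                % (name, tuple(shape), rows))
    return problems


# Non-beam path: everything is sized by the batch (2 in the error above).
ok = check_step_shapes((2, 6), (2,), (2,))

# Beam path: logits come back as [beam_size, num_vocab] = (5, 6),
# while targets and weights are still sized by the batch.
bad = check_step_shapes((5, 6), (2,), (2,))

print(ok)
print(bad)
```

Under that assumption the beam path fails for both targets and weights, which matches the loss being the first place where the two sizes meet.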

steps

  • DONE check actual outputs logic
    • DONE cell_output is (?, 6) for the first time and (5, 6) 2nd time.
    • DONE first time inp (?, 2), cell_output (?, 6)
    • DONE second time inp (5, 2), cell_output (5, 6)
    • DONE my guess is that loop_function should return prev with the same size as its input, but it returns a beam_size-based shape.
  • DONE identify loop_function
    • DONE _extract_beam_search
  • DONE is this bug in loop_function
    • emb_prev = tf.reshape(emb_prev, [beam_size, embedding_size])
    • so this is intentional
  • DONE let's check the other loop_function?
    • it does not reshape
  • Or should we use a different code path?
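
The steps above can be summarized as shape bookkeeping. This is only a sketch: decode_shapes is a hypothetical function that tracks the leading dimension of cell_output across steps and runs no TensorFlow code. It assumes the first step is fed by the real decoder input and every later step is fed by loop_function, and it uses 2 as a stand-in for the unknown batch size "?".

```python
def decode_shapes(batch_size, beam_size, num_vocab, steps, beam_loop_function):
    """Return the cell_output shape at each decoder step."""
    shapes = []
    leading = batch_size  # step 0 uses the real decoder input
    for _ in range(steps):
        shapes.append((leading, num_vocab))
        if beam_loop_function:
            # Mirrors tf.reshape(emb_prev, [beam_size, embedding_size]):
            # from the next step on, the input has beam_size rows.
            leading = beam_size
    return shapes


with_beam = decode_shapes(batch_size=2, beam_size=5, num_vocab=6,
                          steps=3, beam_loop_function=True)
without_beam = decode_shapes(batch_size=2, beam_size=5, num_vocab=6,
                             steps=3, beam_loop_function=False)

print(with_beam)     # [(2, 6), (5, 6), (5, 6)]
print(without_beam)  # [(2, 6), (2, 6), (2, 6)]
```

This reproduces the observation that cell_output is batch-sized on the first step and (5, 6) afterwards, so the loss sees beam-sized rows for every step after the first unless the beam path gets its own handling.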
