logs18:RL test train seq2seq first
Higepon Taro Minowa edited this page May 17, 2018 · 4 revisions
| Log Type | Detail |
|---|---|
| 1: What specific output am I working on right now? | In logs16, RL seemed to work when it starts from scratch. In this experiment, we test whether it can still converge when it is trained with seq2seq first. This should ensure that the most likely reply with the best length appears in the result. |
| 2: Thinking out loud - hypotheses about the current problem - what to work on next - how can I verify | If it converges to len == 2 with a reasonable reply, it's working. |
| 3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer | |
| 4: Results of runs and conclusion | Same as logs17:RL test len equals 2 is the best: it got stuck with all rewards at -1. |
| 5: Next steps | See if some kind of random negative reward makes it better. |
| 6: mega.nz | rl_test_20180517111949 |
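The setup described above (seq2seq pretraining followed by RL fine-tuning toward a length-based reward) can be illustrated with a toy REINFORCE update. This is a minimal sketch, not the repository's actual training code: the policy here is just a categorical distribution over reply lengths, and the reward shape (+1 for len == 2, -1 otherwise) is an assumption mirroring the all-`-1`-reward failure mode noted in the log.

```python
import numpy as np

# Toy REINFORCE sketch: a policy over reply lengths 0..4 stands in for
# the seq2seq decoder, and RL fine-tuning should push it toward len == 2.
rng = np.random.default_rng(0)
MAX_LEN = 5
logits = np.zeros(MAX_LEN)   # "pretrained" policy parameters (near-uniform)
LEARNING_RATE = 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reward(length):
    # Assumed reward: +1 at the target length, -1 otherwise. If len 2 is
    # never sampled, every reward is -1 -- the stuck case from the log.
    return 1.0 if length == 2 else -1.0

for step in range(2000):
    probs = softmax(logits)
    length = rng.choice(MAX_LEN, p=probs)   # sample a reply length
    r = reward(length)
    # REINFORCE gradient for a categorical policy: (one-hot - probs) * r
    grad = -probs
    grad[length] += 1.0
    logits += LEARNING_RATE * r * grad

print(int(np.argmax(softmax(logits))))   # should settle on length 2
```

Because the policy starts near-uniform, length 2 is sampled often enough to collect positive reward; a policy that never samples length 2 would only ever see -1 and has no signal pointing at the target, which matches the stuck behavior recorded in row 4.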
| seq2seq | |
|---|---|
|  |  |

| RL | |
|---|---|
|  |  |

hparam (src, seq2seq): `{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28, 'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1248, 'model_path': 'model/tweet_large'}`

hparam (dst, RL): `{'machine': 'client2', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28, 'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large_rl'}`