Potential bug in timing embedding

Hi, 

There might be a small bug here:

https://github.com/tensorflow/tensor2tensor/blob/ef1fccebe8d2c0cf482f41f9d940e2938c816c78/tensor2tensor/layers/common_attention.py#L445-L449

I think in the last line the `exp` term should be divided by `min_timescale` rather than multiplied by it, since these are inverse timescales. Usually `min_timescale` is 1, so it doesn't matter. But if, for example, you fix `max_timescale` and change `min_timescale`, the inverse timescale that should correspond to `max_timescale` changes as well: for `num_timescales > 1` the last entry comes out as `min_timescale**2 / max_timescale` instead of `1 / max_timescale`.
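To make the effect concrete, here is a minimal plain-Python sketch that mirrors the linked lines without TensorFlow. The function names `inv_timescales_current` and `inv_timescales_divided` are made up for illustration and are not part of tensor2tensor.

```python
import math

def inv_timescales_current(min_timescale, max_timescale, num_timescales):
    # Plain-Python mirror of the linked lines: multiplies by min_timescale.
    log_timescale_increment = (
        math.log(float(max_timescale) / float(min_timescale)) /
        max(num_timescales - 1, 1))
    return [min_timescale * math.exp(i * -log_timescale_increment)
            for i in range(num_timescales)]

def inv_timescales_divided(min_timescale, max_timescale, num_timescales):
    # Same computation, but dividing by min_timescale as proposed in this issue.
    log_timescale_increment = (
        math.log(float(max_timescale) / float(min_timescale)) /
        max(num_timescales - 1, 1))
    return [math.exp(i * -log_timescale_increment) / min_timescale
            for i in range(num_timescales)]

current = inv_timescales_current(10.0, 1.0e4, 4)
divided = inv_timescales_divided(10.0, 1.0e4, 4)

# Timescales implied by each version (reciprocals of the inverse timescales).
print([1.0 / x for x in current])
print([1.0 / x for x in divided])
```

With `min_timescale=10` and `max_timescale=1e4`, the multiplied version implies timescales of roughly 0.1 to 100, while the divided version gives roughly 10 to 10000, which is the requested range.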

A simpler implementation could be roughly something like this:
```
inv_timescales = exp(-linspace(log(min_timescale), log(max_timescale), num_timescales))
```
From this you can derive the current implementation, except with division instead of multiplication. It could be even simpler with a `logspace` function, but TF seems to have that only as experimental.
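The derivation is essentially `exp(-(log(min_timescale) + i * increment)) = exp(-i * increment) / min_timescale`. A rough way to check it numerically, again in plain Python rather than TF (the `linspace` and `all_close` helpers below are hypothetical stand-ins written for this sketch, not library functions):

```python
import math

def linspace(start, stop, num):
    # Hypothetical helper: num evenly spaced values from start to stop, inclusive.
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def all_close(xs, ys):
    # Hypothetical helper: element-wise comparison with a small relative tolerance.
    return all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(xs, ys))

min_timescale, max_timescale, num_timescales = 2.0, 1.0e4, 8

# linspace form from the snippet above.
via_linspace = [math.exp(-x) for x in linspace(
    math.log(min_timescale), math.log(max_timescale), num_timescales)]

# Range-based form, as in the current code but dividing by min_timescale.
log_timescale_increment = (
    math.log(max_timescale / min_timescale) / max(num_timescales - 1, 1))
via_range = [math.exp(i * -log_timescale_increment) / min_timescale
             for i in range(num_timescales)]

# logspace-style form: geometric spacing from 1/min_timescale to 1/max_timescale.
via_logspace = [10.0 ** x for x in linspace(
    -math.log10(min_timescale), -math.log10(max_timescale), num_timescales)]

print(all_close(via_linspace, via_range))
print(all_close(via_linspace, via_logspace))
```

Both comparisons should print `True`, and the first and last entries are `1 / min_timescale` and `1 / max_timescale` regardless of how `min_timescale` is chosen.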

Let me know if this makes sense.

Thanks a lot!




Referenced snippet (`common_attention.py`, lines 445-449 at the commit linked above):

```
log_timescale_increment = (
    math.log(float(max_timescale) / float(min_timescale)) /
    tf.maximum(tf.to_float(num_timescales) - 1, 1))
inv_timescales = min_timescale * tf.exp(
    tf.to_float(tf.range(num_timescales)) * -log_timescale_increment)
```


Potential bug in timing embedding #1923
