Recently the deep learning community has shown great interest in attention mechanisms for training neural networks: the network attends to only certain parts of the input, or to certain parts of its own structure, at each step of processing. However, making this work well requires efficient algorithms for jointly learning the network parameters and the attention mechanism.
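
To make the idea concrete, below is a minimal sketch of soft (dot-product) attention in plain NumPy. The function names, the dot-product scoring rule, and the toy dimensions are our own illustrative choices, not a specific published model or the method proposed here.

```python
# A minimal sketch of soft (dot-product) attention, for illustration only.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(query, inputs):
    """Weight each input element by its relevance to the query.

    query:  vector of shape (d,)
    inputs: matrix of shape (n, d), one row per input element
    Returns a convex combination of the input rows (the "context").
    """
    scores = inputs @ query      # relevance of each element to the query
    weights = softmax(scores)    # attention distribution over elements
    return weights @ inputs      # attend: weighted sum of the inputs

# Example: attend over three input vectors with a random query.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(3, 4))
query = rng.normal(size=4)
context = soft_attention(query, inputs)
```

Because the attention weights here are a differentiable function of the parameters, they can be learned jointly with the rest of the network by gradient descent; the harder algorithmic questions arise when attention is made discrete, as discussed below.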

The main work in this proposal will be twofold:

  1. Better algorithms for attention, possibly based on reinforcement learning (see the sketch after this list); and
  2. Transfer learning using attention.
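
As a hypothetical illustration of item 1, the following sketch treats hard attention as a stochastic policy over input locations and estimates its gradient with REINFORCE (the score-function estimator). The toy reward, the logit parameterization, and the learning rate are assumptions made for illustration; this is not the proposed algorithm itself.

```python
# Hypothetical sketch: hard attention as a stochastic policy, trained
# with REINFORCE. Assumed toy task: the "right" location to attend to
# is index 2, and the reward is 1 when the sampled location matches it.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

TARGET = 2
n_locations = 4
theta = np.zeros(n_locations)   # logits of the attention policy
lr = 0.5

for step in range(200):
    probs = softmax(theta)                   # attention distribution
    loc = rng.choice(n_locations, p=probs)   # sample a hard attention location
    reward = 1.0 if loc == TARGET else 0.0
    # REINFORCE: grad of log pi(loc) for a softmax policy is one_hot(loc) - probs
    grad_logp = -probs
    grad_logp[loc] += 1.0
    theta += lr * reward * grad_logp         # ascend the expected reward

print(np.round(softmax(theta), 3))  # mass should concentrate on index 2
```

Sampling makes the attention choice non-differentiable, which is exactly why policy-gradient estimators of this kind are a natural starting point for the algorithms we propose to develop.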

We will demonstrate the success of this approach in two domains: natural language generation, and transfer in reinforcement learning.