Recently the Deep Learning community has shown great interest in attention mechanisms for training neural networks: at any given instant, the network attends to only certain parts of the input, or to certain parts of its own structure. However, making this work well requires efficient algorithms for jointly learning the network parameters and the attention mechanism. The main work in this proposal will be three-fold.
The three directions are: better algorithms for attention, possibly based on reinforcement learning; transfer learning using attention; and more efficient implementations of attention mechanisms.
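To make the core idea concrete, the following is a minimal sketch of soft attention in NumPy: the network scores each element of the input against a query, normalizes the scores into a probability distribution, and takes the weighted combination of the inputs. The function names (`softmax`, `soft_attention`) and the dot-product scoring rule are illustrative assumptions, not part of the proposal itself.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(query, keys, values):
    # Score each input element against the query (dot product),
    # normalize to attention weights, and return the weighted sum of values.
    scores = keys @ query          # shape (n,)
    weights = softmax(scores)      # shape (n,), sums to 1
    context = weights @ values     # shape (d,)
    return context, weights

# Example: 4 input elements, 3-dimensional representations.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 3))
values = rng.normal(size=(4, 3))
query = rng.normal(size=3)
context, weights = soft_attention(query, keys, values)
```

Because the weights form a differentiable distribution over the input, the attention mechanism can be trained jointly with the network parameters by ordinary backpropagation; "hard" attention, which samples a single element instead, is one place where reinforcement-learning-based algorithms become relevant.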