Recurrent Neural Networks (RNNs) are commonly used for sequential tasks. They are Turing complete and therefore theoretically capable of computing any task if correctly configured. However, there is a long-standing debate about the systematicity of neural network learning, and there has recently been increased interest in the ability of RNNs to learn systematic tasks such as counting. In this talk we address RNN learning of counting behaviour from both an empirical and a theoretical perspective. We formalise counting as Dyck-1 acceptance and focus on generalisation to long sequences. In the empirical approach, we experimentally evaluate the learning and generalisation of counting behaviour with different RNN models, configurations and parametrisations. We find that RNN models generally do not learn exact counting and fail on longer sequences. We also find that weights correctly initialised for Dyck-1 acceptance are unlearned during training, and further analysis of the results reveals different failure modes for different models. In the theoretical approach, we propose two theorems for single-cell RNNs, one for linear and one for ReLU networks, establishing Counter Indicator Conditions (CICs) on their weights that result in exact counting behaviour. We formally prove in both cases that the CICs are necessary and sufficient for exact counting to be realised. However, our experiments show that the CICs are not found during training and are even unlearned in correctly initialised models. A plausible explanation is that, for ReLU RNNs, there is a mismatch between the CICs and the loss function, such that the CICs do not coincide with the minimal loss value. This indicates that gradient-descent-based optimisation is unlikely to reach exact counting behaviour with a standard setup.
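To make the counting construction concrete, below is a minimal sketch (not the talk's exact formulation or its Counter Indicator Conditions) of a single-cell RNN whose weights are hand-set so that the hidden state tracks the running bracket count, together with the standard Dyck-1 acceptance criterion. The weight names `w_h`, `w_x`, the +1/-1 input encoding, and the acceptance check are illustrative assumptions.

```python
def single_cell_rnn_count(sequence, w_h=1.0, w_x=1.0, activation=lambda z: z):
    """Run a single-cell RNN h_t = f(w_h * h_{t-1} + w_x * x_t) over a bracket
    string, with '(' encoded as +1 and ')' as -1. With w_h = w_x = 1 and the
    identity activation, the hidden state equals the running bracket count.
    (Illustrative parametrisation; the talk's formal conditions may differ.)"""
    h = 0.0
    states = []
    for symbol in sequence:
        x = 1.0 if symbol == "(" else -1.0
        h = activation(w_h * h + w_x * x)
        states.append(h)
    return states

def dyck1_accept(sequence):
    """Dyck-1 acceptance via the counting cell: every prefix count must stay
    non-negative and the final count must be zero."""
    states = single_cell_rnn_count(sequence)
    return all(s >= 0.0 for s in states) and (not states or states[-1] == 0.0)

print(dyck1_accept("(()(()))"))  # True: balanced, never dips below zero
print(dyck1_accept("())("))      # False: a prefix count dips below zero
print(dyck1_accept("(()"))       # False: final count is not zero
```

A ReLU variant of the same cell clips negative pre-activations to zero, which is why the prefix condition has to be tracked separately here; the talk's theorems characterise when such cells realise exact counting.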
Post Talk Link: Click Here
Passcode: 1ZfZ.dB%
Nadine El Naggar is a final-year doctoral researcher in Artificial Intelligence at City, University of London. She holds MSc (2017) and BSc (2013) degrees, both with Distinction, in Computer Engineering from the Arab Academy for Science, Technology and Maritime Transport in Alexandria, Egypt, where she also worked as a Teaching Assistant until 2019. Her research focuses on mathematical analysis and empirical evaluation of systematic generalisation of Recurrent Neural Networks (RNNs).