• Media type: Electronic Thesis; Doctoral Thesis; E-Book
  • Title: Towards an Empirically Guided Understanding of the Loss Landscape of Neural Networks
  • Contributor: Benzing, Frederik [Author]
  • Imprint: ETH Zurich, 2022
  • Language: English
  • DOI: https://doi.org/20.500.11850/572885; https://doi.org/10.3929/ethz-b-000572885
  • Keywords: computer science ; Gradient Descent ; Data processing ; Machine Learning ; Continual learning ; Optimization
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: One of the most important and ubiquitous building blocks of machine learning is gradient-based optimization. While it has contributed, and continues to contribute, to the vast majority of recent successes of deep neural networks, it comes both with some limitations and with the potential for further improvements. Catastrophic forgetting, which is the subject of the first two parts of this thesis, is one such limitation. It refers to the observation that when gradient-based learning algorithms are asked to learn different tasks sequentially, they overwrite knowledge from earlier tasks. In the machine learning community, several different ideas and formalisations of this problem are being investigated. One of the most difficult versions is a setting in which the use of data from earlier distributions is strictly forbidden. In this domain, an important line of work is so-called regularisation-based algorithms. Our first contribution is to unify a large family of these algorithms by showing that they all rely on the same theoretical idea to limit catastrophic forgetting. Not only was this connection previously unknown, but we also show that it is an accidental feature of at least some of the algorithms. To demonstrate the practical impact of these insights, we also show how they can be used to make some algorithms more robust and performant across a variety of settings. The second part of the thesis uses tools from the first part and tackles a similar problem, but does so from a different angle. Namely, it focusses on the phenomenon of catastrophic forgetting, also known as the stability-plasticity dilemma, from the viewpoint of neuroscience. It proposes and analyses a simple synaptic learning rule based on the stochasticity of synaptic signal transmission, and shows how this learning rule can alleviate catastrophic forgetting in model neural networks. Moreover, the learning rule's effects on energy-efficient information processing are investigated, extending prior work which explores computational roles of the aforementioned and somewhat mysterious ...
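  • Illustration: For readers unfamiliar with the regularisation-based family of continual-learning algorithms named in the abstract, the sketch below shows the basic idea behind one well-known member of that family (an EWC-style quadratic penalty): parameters are anchored to the values learned on the previous task, weighted by an importance estimate such as a diagonal Fisher approximation. This is a minimal illustration of the general technique, not code from the thesis; the names `model`, `old_params`, `fisher_diag`, and the default `lam` are assumptions introduced here for the example.

```python
import torch

def ewc_penalty(model, old_params, fisher_diag, lam=100.0):
    """Quadratic penalty used by EWC-style regularisation algorithms.

    old_params  : dict mapping parameter names to tensors saved after the previous task
    fisher_diag : dict mapping the same names to diagonal Fisher (importance) estimates
    lam         : regularisation strength (hypothetical default for illustration)
    """
    penalty = 0.0
    for name, p in model.named_parameters():
        # Penalise movement away from the old solution, scaled by importance.
        penalty = penalty + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# During training on a new task, the penalty is simply added to the task loss,
# discouraging changes to parameters that were important for earlier tasks:
#   loss = task_loss(model(x), y) + ewc_penalty(model, old_params, fisher_diag)
```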
  • Access State: Open Access
  • Rights information: In Copyright - Non-commercial Use Permitted