Footnote:
Diese Datenquelle enthält auch Bestandsnachweise, die nicht zu einem Volltext führen.
Description:
Computational efficiency is an essential factor that influences the applicability of computer vision algorithms. Although deep neural networks have reached state-of-the-art performances in a variety of computer vision tasks, there are a couple of efficiency related problems of the deep learning based solutions. First, the overparameterization of deep neural networks results in models with millions of parameters, which lowers the parameter efficiency of the designed networks. To store the parameters and intermediate feature maps during the computation, a large device memory footprint is required. Secondly, the massive computation in deep neural networks slows down their training and inference. This limits the application of deep neural networks to latency-demanding scenarios and low-end devices. Thirdly, the massive computation consumes significant amount of energy, which leaves a large carbon footprint of deep learning models. The aim of this thesis is to improve the computational efficiency of current deep neural networks. This problem is tackled from three perspective including neural network compression, neural architecture optimization, and computational procedure optimization. In the first part of the thesis, we reduce the model complexity of neural networks by network compression techniques including filter decomposition and filter pruning. The basic assumption for filter decomposition is that the ensemble of filters in deep neural networks constitutes an overcomplete set. Instead of using the original filters directly during the computation, they can be approximated by a linear combination of a set of basis filters. The contribution of this thesis is to provide a unified analysis of previous filter decomposition methods. On the other hand, a differentiable filter pruning method is proposed. To achieve differentiability, the layers of neural networks is reparameterized by a meta network. Sparsity regularization is applied to the input of the meta network, i.e. latent vectors. Optimizing with the introduced ...