Receptive Field Analysis for Optimizing Convolutional Neural Network Architectures Without Training
- The number of layers in a convolutional neural network (CNN) is often overshot when an architecture is designed for an image-based task. Such CNN architectures are therefore unnecessarily costly to train and deploy, and the added network complexity yields diminishing returns in predictive quality. The receptive field of a convolutional layer strictly limits the features that layer can process. By analyzing how the receptive field expands over the network, we can consistently predict unproductive layers, i.e., layers that will not contribute qualitatively to the test performance of a given CNN architecture. Since the receptive field is a property of the architecture itself, this analysis does not require training the model. We refer to this analysis technique as Receptive Field Analysis (RFA). In this work, we demonstrate that RFA can guide the optimization of CNN architectures by predicting the presence of unproductive layers. We show that RFA allows the deduction of design decisions and simple design strategies that reliably improve the parameter efficiency of a model on a given task. We further demonstrate that these RFA-guided strategies can reliably improve predictive performance, computational efficiency, or strike a balance between the two. Finally, we show that RFA can also be used to define an interval of feasible input resolutions for any modern architecture, within which the model operates with high efficiency while still being able to extract any pattern from the image. This allows practitioners to pick efficient input resolutions when adapting models to novel tasks.
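As a rough illustration of the layer-wise analysis described in the abstract, the sketch below applies the standard receptive-field recurrence for sequential convolution and pooling layers and flags layers whose receptive field already covers the entire input. The function name, the example architecture, and the input size are illustrative assumptions and are not taken from the chapter.

```python
# Minimal sketch of receptive-field expansion analysis (not the authors' code).
# Standard recurrence for receptive field r and jump j of sequential layers:
#   r_l = r_{l-1} + (k_l - 1) * j_{l-1},   j_l = j_{l-1} * s_l

def receptive_field_growth(layers, input_size):
    """layers: list of (name, kernel_size, stride); returns per-layer receptive field."""
    r, j = 1, 1  # receptive field and jump (cumulative stride) at the input
    report = []
    for name, k, s in layers:
        r = r + (k - 1) * j
        j = j * s
        # A layer whose receptive field already covers the whole input can no
        # longer integrate new spatial context -- a candidate unproductive layer.
        report.append((name, r, r >= input_size))
    return report

# Hypothetical VGG-style stack evaluated at a 32x32 input (CIFAR-sized images).
example = [("conv1", 3, 1), ("conv2", 3, 2), ("conv3", 3, 1),
           ("conv4", 3, 2), ("conv5", 3, 1), ("conv6", 3, 2), ("conv7", 3, 1)]
for name, r, saturated in receptive_field_growth(example, input_size=32):
    print(f"{name}: receptive field {r}x{r}" + ("  <-- saturated" if saturated else ""))
```

Running this sketch, the receptive field reaches 45x45 at `conv7`, exceeding the 32x32 input, which marks that layer as a candidate for removal or modification under an RFA-style analysis.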
| Author: | Mats L. Richter, Julius Schöning, Anna Richter (geb. Wiedenroth), Ulf Krumnack |
|---|---|
| Title (English): | Receptive Field Analysis for Optimizing Convolutional Neural Network Architectures Without Training |
| DOI: | https://doi.org/10.1007/978-981-19-6153-3_10 |
| ISBN: | 978-981-19-6152-6 |
| Parent Title (English): | Deep Learning Applications, Volume 4 |
| Publisher: | Springer Nature |
| Place of publication: | Singapore |
| Document Type: | Part of a Book |
| Language: | English |
| Year of Completion: | 2023 |
| electronic ID: | Available for viewing in scinos |
| Release Date: | 2024/08/12 |
| Tag: | Computational efficiency; Input resolution; Neural architecture design; Optimization; Receptive field size; Trainable parameter |
| First Page: | 235 |
| Last Page: | 261 |
| Note: | Access within the university network |
| Faculties: | Fakultät IuI |
| DDC classes: | 000 General works, computer science, information science / 004 Computer science |
| Review Status: | Peer Reviewed |