Variational Dropout for Differentiable Neural Architecture Search
Abstract
Differentiable neural architecture search (NAS) greatly accelerates architecture search while retaining a sufficiently large search space. However, existing differentiable NAS methods distinguish candidate operations only by the relative magnitude of architectural parameters, which is ambiguous and leads to instability and low performance. In this paper, we propose VDNAS, a novel probabilistic framework for differentiable NAS that leverages variational dropout to reformulate the search as super-net sparsification. We propose a hierarchical structure that simultaneously enables operation sampling and explicit topology optimization via variational dropout. Specifically, for operation sampling, we develop semi-implicit variational dropout to enable the selection of multiple operations and to suppress the over-selection of the skip-connect operation. We introduce an embedded Sigmoid relaxation to alleviate the biased gradient estimation in semi-implicit variational dropout, ensuring stable architecture sampling and architectural-parameter optimization. Furthermore, we design operation reparameterization to aggregate the multiple sampled operations on an edge, mitigating the shallow and wide architectures induced by multiple-operation sampling and enhancing transferability to large-scale datasets. Experimental results demonstrate that the proposed approaches achieve state-of-the-art performance, i.e., top-1 error rates of 2.45% and 15.76% on CIFAR-10 and CIFAR-100, respectively. Remarkably, when transferred to ImageNet, the architectures searched on CIFAR-10 outperform existing methods searched directly on ImageNet at only 10% of the search cost.
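To make the gating idea concrete, the following is a minimal, illustrative sketch of sigmoid-relaxed stochastic gates over the candidate operations of a single super-net edge, in the spirit of variational-dropout-style operation sampling. It is not the paper's exact semi-implicit variational dropout or operation reparameterization; the class name `MixedEdge`, the gate parameter `log_alpha`, and the `temperature` value are assumptions introduced purely for illustration.

```python
# Illustrative sketch (assumed names, not the authors' exact VDNAS formulation):
# each candidate operation on an edge is scaled by a sigmoid-relaxed stochastic
# gate, so gradients flow to the gate logits and gates near zero prune operations.
import torch
import torch.nn as nn

class MixedEdge(nn.Module):
    def __init__(self, ops, temperature=0.5):
        super().__init__()
        self.ops = nn.ModuleList(ops)                          # candidate operations
        self.log_alpha = nn.Parameter(torch.zeros(len(ops)))   # per-operation gate logits
        self.temperature = temperature

    def forward(self, x):
        if self.training:
            # Reparameterized sampling: add logistic noise to the logits and
            # squash with a tempered sigmoid to obtain relaxed Bernoulli gates.
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            logistic_noise = torch.log(u) - torch.log(1 - u)
            gates = torch.sigmoid((self.log_alpha + logistic_noise) / self.temperature)
        else:
            # At evaluation time, keep only operations with high gate probability.
            gates = (torch.sigmoid(self.log_alpha) > 0.5).float()
        # Gated sum of candidate operations on this edge.
        return sum(g * op(x) for g, op in zip(gates, self.ops))
```

In such a sketch, a sparsity-inducing regularizer on the gate probabilities (e.g., a KL term as in variational dropout) would drive most gates toward zero, which is the sense in which the search reduces to super-net sparsification.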