DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning

International Journal of Computer Vision

Abstract

Neural Architecture Search (NAS) has demonstrated state-of-the-art performance on various computer vision tasks. Despite the superior performance achieved, the efficiency and generality of existing methods remain questionable due to their high computational complexity and low generality. In this paper, we propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning, which facilitates a theoretical bound on accuracy and efficiency. In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs. With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints, which is practical for on-device models across diverse search spaces and constraints. The architectures searched by our method achieve remarkable top-1 accuracies of 97.56% and 77.2% on CIFAR-10 and ImageNet (mobile setting), respectively, with the fastest search process, i.e., only 1.8 GPU hours on a Tesla V100. Code for searching and network generation is available at: https://openi.pcl.ac.cn/PCL_AutoML/XNAS.
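
To make the search procedure described in the abstract more concrete, the following is a minimal, illustrative sketch of the dynamic-distribution-pruning idea: architectures are sampled from a per-edge categorical distribution, the distribution is updated from the observed rewards of the sampled architectures, and the lowest-probability operation on each edge is pruned every few epochs. The edge/operation counts, the reward surrogate, the update rule, and the pruning schedule here are assumptions made for illustration only; the authors' actual implementation is in the linked repository.

```python
import numpy as np

NUM_EDGES, NUM_OPS = 14, 8                              # e.g., a DARTS-like cell (assumed sizes)
probs = np.full((NUM_EDGES, NUM_OPS), 1.0 / NUM_OPS)    # per-edge categorical distribution
alive = np.ones((NUM_EDGES, NUM_OPS), dtype=bool)       # candidate ops not yet pruned

def sample_architecture(rng):
    """Draw one operation per edge from the current (pruned) distribution."""
    arch = []
    for e in range(NUM_EDGES):
        p = np.where(alive[e], probs[e], 0.0)
        p = p / p.sum()
        arch.append(int(rng.choice(NUM_OPS, p=p)))
    return arch

def evaluate(arch, rng):
    """Placeholder for the validation reward of a sampled architecture."""
    return rng.random()

rng = np.random.default_rng(0)
for epoch in range(30):
    # Sample a few architectures and accumulate per-operation rewards.
    rewards = np.zeros((NUM_EDGES, NUM_OPS))
    counts = np.zeros((NUM_EDGES, NUM_OPS))
    for _ in range(8):
        arch = sample_architecture(rng)
        acc = evaluate(arch, rng)
        for e, op in enumerate(arch):
            rewards[e, op] += acc
            counts[e, op] += 1

    # Move the distribution toward operations with higher observed reward (illustrative rule).
    mean_reward = np.divide(rewards, counts, out=np.zeros_like(rewards), where=counts > 0)
    probs = 0.9 * probs + 0.1 * mean_reward
    probs = np.where(alive, probs, 0.0)
    probs /= probs.sum(axis=1, keepdims=True)

    # Every few epochs, prune the lowest-probability surviving op on each edge.
    if (epoch + 1) % 5 == 0:
        for e in range(NUM_EDGES):
            if alive[e].sum() > 1:
                live_idx = np.flatnonzero(alive[e])
                worst = live_idx[np.argmin(probs[e, live_idx])]
                alive[e, worst] = False

# The final architecture takes the most probable surviving op on each edge.
best = [int(np.argmax(np.where(alive[e], probs[e], -1.0))) for e in range(NUM_EDGES)]
print("selected op per edge:", best)
```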

Notes

  1. The details of the theoretical proof and the experiments on the Gaussian assumption are provided in the supplementary material.

  2. According to prior work (Liang et al., 2019), after a certain number of search epochs the number of skip-connections in the selected architecture increases dramatically, which results in poor performance; a small counting sketch follows these notes.

  3. https://github.com/quark0/darts/blob/master/cnn/operations.py
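
As a concrete illustration of note 2, the following is a minimal sketch of counting skip-connections in a DARTS-style genotype, a simple signal for the early-stopping heuristic discussed by Liang et al. (2019). The genotype layout, operation names, function name, and threshold are illustrative assumptions, not taken from the paper's code.

```python
# Minimal sketch, assuming a DARTS-style genotype: a list of (op_name, input_node)
# pairs for the normal cell. Note 2 observes that skip-connections proliferate after
# a certain number of search epochs; counting them gives a simple sanity check.

def count_skip_connects(genotype_normal):
    """Count 'skip_connect' operations in the normal cell of a genotype."""
    return sum(1 for op_name, _ in genotype_normal if op_name == "skip_connect")

# Example usage with a hypothetical genotype:
genotype_normal = [
    ("sep_conv_3x3", 0), ("skip_connect", 1),
    ("skip_connect", 0), ("sep_conv_5x5", 2),
    ("dil_conv_3x3", 1), ("skip_connect", 3),
]
MAX_SKIPS = 2  # illustrative threshold; stop or re-search if exceeded
n_skips = count_skip_connects(genotype_normal)
if n_skips > MAX_SKIPS:
    print(f"warning: {n_skips} skip-connections; architecture may collapse")
```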

References

  • Baker, B., Gupta, O., & Naik, N., et al. (2016). Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167.

  • Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.

  • Cai, H., Chen, T., & Zhang, W., et al. (2018a). Efficient architecture search by network transformation. In AAAI.

  • Cai, H., Yang, J., & Zhang, W., et al. (2018b). Path-level network transformation for efficient architecture search. arXiv preprint arXiv:1806.02639.

  • Cai, H., Zhu, L., & Han, S. (2019). Proxylessnas: Direct neural architecture search on target task and hardware. In ICLR.

  • Cai, H., Gan, C., & Han, S. (2020) Once for all: Train one network and specialize it for efficient deployment. In ICLR.

  • Chen, H., Zhuo, L., Zhang, B., et al. (2021). Binarized neural architecture search for efficient object recognition. International Journal of Computer Vision, 129(2), 501–516.

  • Chen, L. C., Papandreou, G., Kokkinos, I., et al. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.

  • Chen, X., Xie, L., & Wu, J., et al. (2019). Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In ICCV.

  • Chu, X., Wang, X., & Zhang, B., et al. (2020a). Darts-: robustly stepping out of performance collapse without indicators. arXiv preprint arXiv:2009.01027.

  • Chu, X., Zhou, T., & Zhang, B., et al. (2020b). Fair darts: Eliminating unfair advantages in differentiable architecture search. In: European conference on computer vision. (pp. 465–480) Springer.

  • Cordts, M., Omran, M., & Ramos, S., et al. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).

  • DeVries, T., & Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552.

  • Dong, X., & Yang, Y. (2019). Searching for a robust neural architecture in four GPU hours. In CVPR.

  • Fan, A., Grave, E., & Joulin, A. (2020). Reducing transformer depth on demand with structured dropout. In ICLR. https://openreview.net/forum?id=SylO2yStDr

  • Ghiasi, G., Lin, T. Y., & Le, Q. V. (2019). Nas-fpn: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7036–7045).

  • He, K., Zhang, X., & Ren, S., et al. (2016). Deep residual learning for image recognition. In CVPR.

  • Howard, A., Sandler, M., & Chu, G., et al. (2019). Searching for mobilenetv3. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1314–1324).

  • Howard, A. G., Zhu, M., & Chen, B., et al. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.

  • Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In CVPR.

  • Huang, G., Liu, Z., & Van Der Maaten, L., et al. (2017). Densely connected convolutional networks. In CVPR.

  • Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In NeurIPS.

  • Li, L., & Talwalkar, A. (2019). Random search and reproducibility for neural architecture search. arXiv preprint arXiv:1902.07638.

  • Li, L., Khodak, M., & Balcan, M. F., et al. (2020). Geometry-aware gradient algorithms for neural architecture search. arXiv preprint arXiv:2004.07802.

  • Liang, H., Zhang, S., & Sun, J., et al. (2019). Darts+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035.

  • Liu, C., Zoph, B., & Neumann, M., et al. (2018). Progressive neural architecture search. In ECCV.

  • Liu, C., Chen, L. C., & Schroff, F., et al. (2019a). Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation. In CVPR.

  • Liu, H., Simonyan, K., & Yang, Y. (2019b). DARTS: Differentiable architecture search. In ICLR.

  • Ma, N., Zhang, X., & Zheng, H. T., et al. (2018). Shufflenet v2: Practical guidelines for efficient cnn architecture design. In ECCV.

  • Mehta, S., Rastegari, M., & Caspi, A., et al. (2018). Espnet: Efficient spatial pyramid of dilated convolutions for semantic segmentation. In Proceedings of the european conference on computer vision (ECCV), (pp. 552–568).

  • Mehta, S., Rastegari, M., & Shapiro, L., et al. (2019). Espnetv2: A light-weight, power efficient, and general purpose convolutional neural network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 9190–9200).

  • Nayman, N., Noy, A., & Ridnik, T., et al. (2019). Xnas: Neural architecture search with expert advice. In Advances in neural information processing systems (pp. 1977–1987).

  • Park, H., Yoo, Y., & Seo, G., et al. (2018). Concentrated-comprehensive convolutions for lightweight semantic segmentation. arXiv preprint arXiv:1812.04920.

  • Paszke, A., Gross, S., & Chintala, S., et al. (2017). Automatic differentiation in pytorch. In NeurIPS.

  • Pham, H., Guan, M. Y., & Zoph, B., et al. (2018). Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268.

  • Qian, N. (1999). On the momentum term in gradient descent learning algorithms. Neural Networks, 12(1), 145–151.

  • Real, E., Aggarwal, A., & Huang, Y., et al. (2019). Regularized evolution for image classifier architecture search. In AAAI.

  • Russakovsky, O., Deng, J., & Su, H., et al. (2015). Imagenet large scale visual recognition challenge. IJCV.

  • Sandler, M., Howard, A., & Zhu, M., et al. (2018). Mobilenetv2: Inverted residuals and linear bottlenecks. In CVPR (pp. 4510–4520).

  • Sanh, V., Debut, L., & Chaumond, J., et al. (2019). Distilbert, a distilled version of bert: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.

  • Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.

  • Sun, S., Cheng, Y., & Gan, Z., et al. (2019). Patient knowledge distillation for bert model compression. In EMNLP.

  • Tan, M., Chen, B., & Pang, R., et al. (2019). Mnasnet: Platform-aware neural architecture search for mobile. In CVPR (pp. 2820–2828).

  • Turc, I., Chang, M. W., & Lee, K., et al. (2019). Well-read students learn better: On the importance of pre-training compact models. arXiv preprint arXiv:1908.08962.

  • Wang, A., Singh, A., Michael, J., et al. (2018). Glue: A multi-task benchmark and analysis platform for natural language understanding. EMNLP, 2018, 353.

  • Wu, B., Dai, X., & Zhang, P., et al. (2019). Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 10734–10742).

  • Xie, L., & Yuille, A. (2017). Genetic cnn. In ICCV.

  • Xu, Y., Xie, L., & Zhang, X., et al. (2019). Pc-darts: Partial channel connections for memory-efficient architecture search. In ICLR.

  • Ying, C., Klein, A., & Real, E., et al. (2019). Nas-bench-101: Towards reproducible neural architecture search. In ICML.

  • Yu, J., & Huang, T. (2019). Network slimming by slimmable networks: Towards one-shot architecture search for channel numbers. arXiv preprint arXiv:1903.11728.

  • Zhang, X., Hou, P., & Zhang, X., et al. (2021). Neural architecture search with random labels. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10907–10916).

  • Zheng, X., Ji, R., & Tang, L., et al. (2019). Multinomial distribution learning for effective neural architecture search. In ICCV.

  • Zheng, X., Ji, R., Chen, Y., et al. (2021). Migo-nas: Towards fast and generalizable neural architecture search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(9), 2936–2952.

  • Zhou, Q., Wang, Y., Fan, Y., et al. (2020). Aglnet: Towards real-time semantic segmentation of self-driving images via attention-guided lightweight network. Applied Soft Computing, 96(106), 682.

  • Zoph, B., & Le, Q. V. (2016). Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578.

  • Zoph, B., Vasudevan, V., & Shlens, J., et al. (2018). Learning transferable architectures for scalable image recognition. In CVPR.

Acknowledgements

This work was supported by the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. U22B2051, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, No. 62002305 and No. 62272401), Guangdong Basic and Applied Basic Research Foundation (No. 2019B1515120049), China National Postdoctoral Program for Innovative Talents (BX20220392), China Postdoctoral Science Foundation (2022M711729) and the Natural Science Foundation of Fujian Province of China (No. 2021J01002, No. 2022J06001).

Author information

Corresponding author

Correspondence to Rongrong Ji.

Additional information

Communicated by Frederic Jurie.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, X., Yang, C., Zhang, S. et al. DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning. Int J Comput Vis 131, 1234–1249 (2023). https://doi.org/10.1007/s11263-023-01753-6
