Abstract
Neural Architecture Search (NAS) has demonstrated state-of-the-art performance on various computer vision tasks. Despite this superior performance, existing methods suffer from high computational complexity and limited generality, which constrain both their efficiency and their range of application. In this paper, we propose an efficient and unified NAS framework, termed DDPNAS, based on dynamic distribution pruning, which admits a theoretical bound on accuracy and efficiency. In particular, we first sample architectures from a joint categorical distribution. The search space is then dynamically pruned and its distribution is updated every few epochs. With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints, which is practical for on-device models across diverse search spaces and constraints. The architectures searched by our method achieve remarkable top-1 accuracies of 97.56% and 77.2% on CIFAR-10 and ImageNet (mobile setting), respectively, with the fastest search process, i.e., only 1.8 GPU hours on a Tesla V100. Code for searching and network generation is available at: https://openi.pcl.ac.cn/PCL_AutoML/XNAS.
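To make the search procedure concrete, below is a minimal sketch in Python/NumPy of the dynamic distribution pruning idea described above: per-layer categorical distributions over candidate operations, architecture sampling, a score-driven distribution update, and pruning of the least likely candidate every few epochs. The toy proxy objective, the update rule, the pruning interval, and all names (e.g., `toy_score`) are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch of dynamic distribution pruning (not the authors' exact
# method): each layer keeps a categorical distribution over candidate ops;
# architectures are sampled, scored with a toy proxy, the distributions are
# updated toward the best sample, and the least likely op is pruned periodically.
import numpy as np

rng = np.random.default_rng(0)
num_layers, num_ops = 4, 5                          # toy search space
probs = np.full((num_layers, num_ops), 1.0 / num_ops)
alive = np.ones((num_layers, num_ops), dtype=bool)  # surviving candidates


def toy_score(arch):
    # Stand-in for the validation accuracy of a sampled architecture.
    return -np.sum((arch - 2) ** 2) + rng.normal(scale=0.1)


for epoch in range(1, 13):
    # Sample a batch of architectures from the joint categorical distribution.
    samples, scores = [], []
    for _ in range(16):
        arch = np.array([rng.choice(num_ops, p=probs[l]) for l in range(num_layers)])
        samples.append(arch)
        scores.append(toy_score(arch))

    # Shift each layer's distribution toward the best-scoring sample.
    best = samples[int(np.argmax(scores))]
    for l in range(num_layers):
        probs[l, best[l]] += 0.1
        probs[l] = np.where(alive[l], probs[l], 0.0)
        probs[l] /= probs[l].sum()

    # Every few epochs, prune the least likely surviving op in each layer.
    if epoch % 4 == 0:
        for l in range(num_layers):
            if alive[l].sum() > 1:
                masked = np.where(alive[l], probs[l], np.inf)
                alive[l, int(np.argmin(masked))] = False
                probs[l] = np.where(alive[l], probs[l], 0.0)
                probs[l] /= probs[l].sum()

# The final architecture keeps the most probable surviving op per layer.
print([int(np.argmax(probs[l])) for l in range(num_layers)])
```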
Notes
The details of the theoretical proof and of the experiment on the Gaussian assumption are provided in the supplementary material.
According to prior work (Liang et al., 2019), after a certain number of search epochs the number of skip-connects in the selected architecture increases dramatically, which results in poor performance.
References
Baker, B., Gupta, O., & Naik, N., et al. (2016). Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Cai, H., Chen, T., & Zhang, W., et al. (2018a). Efficient architecture search by network transformation. In AAAI.
Cai, H., Yang, J., & Zhang, W., et al. (2018b). Path-level network transformation for efficient architecture search. arXiv preprint arXiv:1806.02639.
Cai, H., Zhu, L., & Han, S. (2019). Proxylessnas: Direct neural architecture search on target task and hardware. In ICLR.
Cai, H., Gan, C., & Han, S. (2020). Once for all: Train one network and specialize it for efficient deployment. In ICLR.
Chen, H., Zhuo, L., Zhang, B., et al. (2021). Binarized neural architecture search for efficient object recognition. International Journal of Computer Vision, 129(2), 501–516.
Chen, L. C., Papandreou, G., Kokkinos, I., et al. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.
Chen, X., Xie, L., & Wu, J., et al. (2019). Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In ICCV.
Chu, X., Wang, X., & Zhang, B., et al. (2020a). Darts-: Robustly stepping out of performance collapse without indicators. arXiv preprint arXiv:2009.01027.
Chu, X., Zhou, T., & Zhang, B., et al. (2020b). Fair darts: Eliminating unfair advantages in differentiable architecture search. In European conference on computer vision (pp. 465–480). Springer.
Cordts, M., Omran, M., & Ramos, S., et al. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
DeVries, T., & Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552.
Dong, X., & Yang, Y. (2019). Searching for a robust neural architecture in four GPU hours. In CVPR.
Fan, A., Grave, E., & Joulin, A. (2020). Reducing transformer depth on demand with structured dropout. In ICLR. https://openreview.net/forum?id=SylO2yStDr
Ghiasi, G., Lin, T. Y., & Le, Q. V. (2019). Nas-fpn: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7036–7045).
He, K., Zhang, X., & Ren, S., et al. (2016). Deep residual learning for image recognition. In CVPR.
Howard, A., Sandler, M., & Chu, G., et al. (2019). Searching for mobilenetv3. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1314–1324).
Howard, A. G., Zhu, M., & Chen, B., et al. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.
Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In CVPR.
Huang, G., Liu, Z., & Van Der Maaten, L., et al. (2017). Densely connected convolutional networks. In CVPR.
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Tech. rep.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In NeurIPS.
Li, L., & Talwalkar, A. (2019). Random search and reproducibility for neural architecture search. arXiv preprint arXiv:1902.07638.
Li, L., Khodak, M., & Balcan, M. F., et al. (2020). Geometry-aware gradient algorithms for neural architecture search. arXiv preprint arXiv:2004.07802.
Liang, H., Zhang, S., & Sun, J., et al. (2019). Darts+: Improved differentiable architecture search with early stopping. arXiv preprint arXiv:1909.06035.
Liu, C., Zoph, B., & Neumann, M., et al. (2018). Progressive neural architecture search. In ECCV.
Liu, C., Chen, L. C., & Schroff, F., et al. (2019a). Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation. In CVPR.
Liu, H., Simonyan, K., & Yang, Y. (2019b). DARTS: Differentiable architecture search. In ICLR.
Ma, N., Zhang, X., & Zheng, H. T., et al. (2018). Shufflenet v2: Practical guidelines for efficient cnn architecture design. In ECCV.
Mehta, S., Rastegari, M., & Caspi, A., et al. (2018). Espnet: Efficient spatial pyramid of dilated convolutions for semantic segmentation. In Proceedings of the european conference on computer vision (ECCV), (pp. 552–568).
Mehta, S., Rastegari, M., & Shapiro, L., et al. (2019). Espnetv2: A light-weight, power efficient, and general purpose convolutional neural network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 9190–9200).
Nayman, N., Noy, A., & Ridnik, T., et al. (2019). Xnas: Neural architecture search with expert advice. In Advances in neural information processing systems (pp. 1977–1987).
Park, H., Yoo, Y., & Seo, G., et al. (2018). Concentrated-comprehensive convolutions for lightweight semantic segmentation. arXiv preprint arXiv:1812.04920.
Paszke, A., Gross, S., & Chintala, S., et al. (2017). Automatic differentiation in pytorch. In NeurIPS.
Pham, H., Guan, M. Y., & Zoph, B., et al. (2018). Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268.
Qian, N. (1999). On the momentum term in gradient descent learning algorithms. Neural Networks, 12(1), 145–151.
Real, E., Aggarwal, A., & Huang, Y., et al. (2019). Regularized evolution for image classifier architecture search. In AAAI.
Russakovsky, O., Deng, J., & Su, H., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.
Sandler, M., Howard, A., & Zhu, M., et al. (2018). Mobilenetv2: Inverted residuals and linear bottlenecks. In CVPR (pp. 4510–4520).
Sanh, V., Debut, L., & Chaumond, J., et al. (2019). Distilbert, a distilled version of bert: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.
Sun, S., Cheng, Y., & Gan, Z., et al. (2019). Patient knowledge distillation for bert model compression. In EMNLP.
Tan, M., Chen, B., & Pang, R., et al. (2019). Mnasnet: Platform-aware neural architecture search for mobile. In CVPR (pp. 2820–2828).
Turc, I., Chang, M. W., & Lee, K., et al. (2019). Well-read students learn better: On the importance of pre-training compact models. arXiv preprint arXiv:1908.08962.
Wang, A., Singh, A., Michael, J., et al. (2018). GLUE: A multi-task benchmark and analysis platform for natural language understanding. In EMNLP (p. 353).
Wu, B., Dai, X., & Zhang, P., et al. (2019). Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 10734–10742).
Xie, L., & Yuille, A. (2017). Genetic cnn. In ICCV.
Xu, Y., Xie, L., & Zhang, X., et al. (2019). Pc-darts: Partial channel connections for memory-efficient architecture search. In ICLR.
Ying, C., Klein, A., & Real, E., et al. (2019). Nas-bench-101: Towards reproducible neural architecture search. In ICML.
Yu, J., & Huang, T. (2019). Network slimming by slimmable networks: Towards one-shot architecture search for channel numbers. arXiv preprint arXiv:1903.11728.
Zhang, X., Hou, P., & Zhang, X., et al. (2021). Neural architecture search with random labels. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10907–10916).
Zheng, X., Ji, R., & Tang, L., et al. (2019). Multinomial distribution learning for effective neural architecture search. In ICCV.
Zheng, X., Ji, R., Chen, Y., et al. (2021). Migo-nas: Towards fast and generalizable neural architecture search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(9), 2936–2952.
Zhou, Q., Wang, Y., Fan, Y., et al. (2020). Aglnet: Towards real-time semantic segmentation of self-driving images via attention-guided lightweight network. Applied Soft Computing, 96, 106682.
Zoph, B., & Le, Q. V. (2016). Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578.
Zoph, B., Vasudevan, V., & Shlens, J., et al. (2018). Learning transferable architectures for scalable image recognition. In CVPR.
Acknowledgements
This work was supported by the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. U22B2051, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, No. 62002305 and No. 62272401), Guangdong Basic and Applied Basic Research Foundation (No. 2019B1515120049), China National Postdoctoral Program for Innovative Talents (BX20220392), China Postdoctoral Science Foundation (2022M711729) and the Natural Science Foundation of Fujian Province of China (No. 2021J01002, No. 2022J06001).
Additional information
Communicated by Frederic Jurie.
About this article
Cite this article
Zheng, X., Yang, C., Zhang, S. et al. DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning. Int J Comput Vis 131, 1234–1249 (2023). https://doi.org/10.1007/s11263-023-01753-6