LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent Activation Function for Neural Networks

Roy, Swalpa Kumar; Manna, Suvojit; Dubey, Shiv Ram; Chaudhuri, Bidyut Baran

arXiv:1901.05894 (cs)

[Submitted on 1 Jan 2019 (v1), last revised 17 Feb 2023 (this version, v4)]

Abstract: The activation function in a neural network introduces the non-linearity required to deal with complex tasks. Several activation (non-linearity) functions have been developed for deep learning models. However, most existing activation functions suffer from the dying gradient problem and do not utilize large negative input values. In this paper, we propose a Linearly Scaled Hyperbolic Tangent (LiSHT) for Neural Networks (NNs) by scaling the Tanh linearly. The proposed LiSHT is non-parametric and tackles the dying gradient problem. We perform experiments on benchmark datasets of different types, such as vector data, image data and natural language data. We observe superior performance using a Multi-Layer Perceptron (MLP), a Residual Network (ResNet) and a Long Short-Term Memory (LSTM) network for data classification, image classification and tweet classification tasks, respectively. The accuracy on the CIFAR100 dataset using the ResNet model with LiSHT is improved by 9.48, 3.40, 3.16, 4.26, and 1.17% as compared to Tanh, ReLU, PReLU, LReLU, and Swish, respectively. We also show qualitative results using the loss landscape, weight distributions and activation maps in support of the proposed activation function.
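
The abstract describes LiSHT only as "scaling the Tanh linearly". A minimal sketch of that idea, assuming the scaling factor is the input itself, so that LiSHT(x) = x * tanh(x), is given here. The function names `lisht` and `lisht_grad` are illustrative and not taken from the authors' code, and the derivative is worked out by hand from that assumed definition.

```python
import math


def lisht(x: float) -> float:
    """Assumed LiSHT form: tanh(x) scaled linearly by its own input."""
    return x * math.tanh(x)


def lisht_grad(x: float) -> float:
    """Derivative of x * tanh(x): tanh(x) + x * (1 - tanh(x)**2)."""
    t = math.tanh(x)
    return t + x * (1.0 - t * t)


if __name__ == "__main__":
    # Large negative inputs map to large positive outputs, and the
    # gradient magnitude stays near 1 instead of collapsing to 0.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  lisht={lisht(x):.4f}  grad={lisht_grad(x):+.4f}")
```

Under this assumed form the function is symmetric about zero and non-negative, and its gradient tends to -1 for large negative inputs and +1 for large positive inputs. This is one way to read the abstract's claims about using large negative values and avoiding dying gradients. No learnable parameter appears, consistent with the "non-parametric" description.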

Comments:	Accepted in 7th International Conference on Computer Vision and Image Processing (CVIP), 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1901.05894 [cs.CV]
	(or arXiv:1901.05894v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1901.05894

Submission history

From: Shiv Ram Dubey
[v1] Tue, 1 Jan 2019 02:24:06 UTC (1,610 KB)
[v2] Thu, 6 Aug 2020 10:51:23 UTC (1,619 KB)
[v3] Wed, 25 May 2022 07:03:45 UTC (1,619 KB)
[v4] Fri, 17 Feb 2023 01:49:12 UTC (889 KB)
