Activation Functions: A Look at Some Newer Activation Functions

GELU (Gaussian Error Linear Unit)

The GELU nonlinearity is the expected transformation of a stochastic regularizer that randomly applies either the identity or the zero map to a neuron’s input. Concretely, each input is weighted by the probability that a standard Gaussian random variable falls below it, which gives the closed-form expression below.
1. Equation:

GELU(x) = x · Φ(x), where Φ(x) is the standard Gaussian CDF.
A common tanh approximation is GELU(x) ≈ 0.5 x (1 + tanh[√(2/π) (x + 0.044715 x³)]).
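
Below is a minimal NumPy/SciPy sketch of both the exact form and the tanh approximation; the function names are my own illustration, not code from the paper or the linked repository.

    import numpy as np
    from scipy.stats import norm

    def gelu_exact(x):
        # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
        return x * norm.cdf(x)

    def gelu_tanh(x):
        # Tanh approximation given in the GELU paper.
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    x = np.linspace(-3.0, 3.0, 7)
    print(gelu_exact(x))
    print(gelu_tanh(x))  # closely matches the exact values
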
2. GELU Experiments
Classification Experiment: MNIST classification

Autoencoder Experiment: MNIST Autoencoder

Reference
https://arxiv.org/pdf/1606.08415.pdf
https://github.com/hendrycks/GELUs
LiSHT (Linearly Scaled Hyperbolic Tangent Activation)

1. Equation:

LiSHT(x) = x · tanh(x)
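
A short NumPy sketch of LiSHT (illustrative only; the names are mine, not from the paper’s code):

    import numpy as np

    def lisht(x):
        # LiSHT: linearly scaled hyperbolic tangent, x * tanh(x).
        return x * np.tanh(x)

    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(lisht(x))  # output is non-negative and symmetric in x
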
2. LiSHT Experiments
Classification Experiment: MNIST & CIFAR10

Sentiment Classification Results Using an LSTM

Reference
https://arxiv.org/pdf/1901.05894.pdf
SWISH

1. Equation:

Swish(x) = x · σ(βx), where σ is the sigmoid function and β is either a fixed constant (β = 1 gives the SiLU) or a trainable parameter.
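
A minimal NumPy sketch, with β exposed as an argument so both the fixed and trainable variants can be mimicked (the names here are illustrative):

    import numpy as np

    def swish(x, beta=1.0):
        # Swish: x * sigmoid(beta * x); beta = 1 recovers the SiLU.
        return x / (1.0 + np.exp(-beta * x))

    x = np.linspace(-3.0, 3.0, 7)
    print(swish(x))             # beta fixed at 1
    print(swish(x, beta=2.0))   # larger beta gives a sharper gate
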
2. SWISH Experiments
Machine Translation

Reference
https://arxiv.org/pdf/1710.05941.pdf
Mish

1. Equation:

Mish(x) = x · tanh(softplus(x)) = x · tanh(ln(1 + e^x))
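
A minimal NumPy sketch of Mish (illustrative; log1p(exp(x)) is fine for this example range, though a numerically stable softplus is preferable for large inputs):

    import numpy as np

    def mish(x):
        # Mish: x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x).
        return x * np.tanh(np.log1p(np.exp(x)))

    x = np.linspace(-3.0, 3.0, 7)
    print(mish(x))  # smooth, non-monotonic, and unbounded above
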
2. Mish Experiments
Output Landscape of a Random Neural Network

Test Accuracy vs. Number of Layers on MNIST

Test Accuracy vs. Batch Size on CIFAR-10

Reference
https://arxiv.org/pdf/1908.08681.pdf
Other Activation Functions
Rectified Activations: https://arxiv.org/pdf/1505.00853.pdf
Sparsemax: https://arxiv.org/pdf/1602.02068.pdf
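
For reference, a rough NumPy sketch of the two linked ideas, with Leaky ReLU standing in for the rectified-activation family and sparsemax written for a 1-D input vector; this is my own illustration, not the authors’ code.

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Leaky ReLU: identity for positive inputs, small slope alpha otherwise.
        return np.where(x > 0, x, alpha * x)

    def sparsemax(z):
        # Sparsemax: Euclidean projection of z onto the probability simplex,
        # which can assign exactly zero probability to low-scoring entries.
        z_sorted = np.sort(z)[::-1]
        cumsum = np.cumsum(z_sorted)
        k = np.arange(1, z.size + 1)
        support = 1 + k * z_sorted > cumsum   # prefix of coordinates kept
        k_z = k[support][-1]
        tau = (cumsum[support][-1] - 1.0) / k_z
        return np.maximum(z - tau, 0.0)

    z = np.array([1.5, 0.4, -1.0])
    print(sparsemax(z))    # sums to 1, with exact zeros for the small logits
    print(leaky_relu(z))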