⚡ Activation Functions in Deep Learning
How neurons "fire" and learn complex patterns!
Function | Formula | Usage |
---|---|---|
ReLU ⚡ | f(x) = max(0, x) | Fast and popular for hidden layers |
Sigmoid ➰ | f(x) = 1 / (1 + e^(-x)) | Good for binary classification (0 or 1) |
Tanh 🔵 | f(x) = (e^x - e^(-x)) / (e^x + e^(-x)) | Better than sigmoid for hidden layers (output range -1 to 1) |
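Here is a minimal NumPy sketch of the three formulas from the table, just to show how they behave on a few sample inputs (the function names and sample values are illustrative, not from any particular framework):

```python
import numpy as np

def relu(x):
    # max(0, x): negatives become 0, positives pass through unchanged
    return np.maximum(0, x)

def sigmoid(x):
    # 1 / (1 + e^(-x)): squashes any real input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # (e^x - e^(-x)) / (e^x + e^(-x)): squashes input into (-1, 1)
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("ReLU   :", relu(x))     # [0.   0.   0.   0.5  2. ]
print("Sigmoid:", sigmoid(x))  # roughly [0.12 0.38 0.50 0.62 0.88]
print("Tanh   :", tanh(x))     # roughly [-0.96 -0.46 0.00 0.46 0.96]
```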
⚡ ReLU (Rectified Linear Unit)
- ReLU sets negative values to 0 and keeps positive values as-is.
- Very simple, yet very powerful!
Pros:
🔹 Fast and efficient
🔹 Reduces chances of vanishing gradients
Cons:
🔸 Dead neurons: a ReLU neuron can get stuck outputting 0 for every input, so it stops learning (see the sketch below)
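A minimal sketch of why ReLU neurons can "die": once the pre-activation is negative for every input, both the output and the gradient are 0, so gradient descent never updates that neuron's weights. The weights and inputs below are made-up values for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # derivative of ReLU: 1 where x > 0, 0 elsewhere
    return (x > 0).astype(float)

inputs = np.array([0.5, 1.0, 2.0, 3.0])   # all-positive inputs
w, b = -1.0, -0.5                         # weights that drifted far negative

pre_activation = w * inputs + b           # always negative here
print(relu(pre_activation))               # [0. 0. 0. 0.]  -> neuron is silent
print(relu_grad(pre_activation))          # [0. 0. 0. 0.]  -> no gradient, no learning
```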
➰ Sigmoid Function
- Squashes input into (0,1) range.
- Perfect when you need probabilities (like yes/no outputs).
Pros:
🔹 Good for binary outputs
Cons:
🔸 Vanishing gradient problem: gradients shrink toward 0 for large inputs (see the sketch below)
🔸 Outputs are not centered around zero (always positive)
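A minimal sketch of the sigmoid vanishing-gradient problem: the derivative is sigmoid(x) * (1 - sigmoid(x)), which peaks at 0.25 and collapses toward 0 for large |x|, so gradients shrink quickly as they flow back through many layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # derivative of sigmoid: sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}  sigmoid = {sigmoid(x):.4f}  gradient = {sigmoid_grad(x):.6f}")
# x =   0.0  sigmoid = 0.5000  gradient = 0.250000
# x =   2.0  sigmoid = 0.8808  gradient = 0.104994
# x =   5.0  sigmoid = 0.9933  gradient = 0.006648
# x =  10.0  sigmoid = 1.0000  gradient = 0.000045
```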
🔵 Tanh Function
- Scales input between -1 and 1.
- Often better than sigmoid for hidden layers.
Pros:
🔹 Centered around zero
🔹 Stronger gradients than sigmoid
Cons:
🔸 Still suffers from vanishing gradients at the extremes (see the sketch below)
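A minimal sketch comparing gradients: tanh's derivative (1 - tanh(x)^2) peaks at 1.0 versus sigmoid's 0.25, so tanh passes stronger gradients near zero, but both still vanish for large |x|.

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    # derivative of tanh: 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

for x in [0.0, 1.0, 3.0, 6.0]:
    print(f"x = {x:4.1f}  tanh grad = {tanh_grad(x):.6f}  sigmoid grad = {sigmoid_grad(x):.6f}")
# x =  0.0  tanh grad = 1.000000  sigmoid grad = 0.250000
# x =  1.0  tanh grad = 0.419974  sigmoid grad = 0.196612
# x =  3.0  tanh grad = 0.009866  sigmoid grad = 0.045177
# x =  6.0  tanh grad = 0.000025  sigmoid grad = 0.002466
```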
🎯 Quick Challenge!
Which activation function outputs between -1 and 1?
By Darchums Technologies Inc - April 28, 2025