Can you explain why we sometimes use RL and other times supervised learning? What’s the fundamental difference between the two, and which one fits which situations?
Common things (sketch below):
in both cases, a model acts given an input
supervised: a model f, acting on input x, outputs a prediction y
RL: a policy p, acting on state s, outputs an action a
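A minimal sketch of that shared shape — the class names and the linear/argmax internals are illustrative placeholders, not any particular library’s API:

```python
import numpy as np

# Supervised model: parameters map input x -> prediction y.
class SupervisedModel:
    def __init__(self, w):
        self.w = w

    def __call__(self, x):
        return self.w @ x  # f(x) = y, here just a linear map

# RL policy: parameters map state s -> action a.
class Policy:
    def __init__(self, w):
        self.w = w

    def __call__(self, s):
        scores = self.w @ s            # score each candidate action
        return int(np.argmax(scores))  # a = argmax over action scores

f = SupervisedModel(np.ones((2, 3)))
p = Policy(np.eye(3))
print(f(np.array([1.0, 2.0, 3.0])))  # [6. 6.]
print(p(np.array([0.1, 0.9, 0.2])))  # 1
```

Structurally the two are the same kind of object; the difference is entirely in how the parameters get trained.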
Differences:
labels:
supervised: each input x has a true label y directly associated with it
RL: for a state s, there is no “true” action a
instead, a reward r is associated with a whole rollout of states and actions, and that reward may arrive immediately or only much later (see the returns sketch below)
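A sketch of the consequence: supervised learning gets a per-example target and an immediate loss, while RL has to turn delayed rewards into a per-step signal. Discounted returns are one common way to do that (function names here are illustrative):

```python
import numpy as np

# Supervised: the target y is available per example, so the loss is immediate.
def supervised_loss(y_pred, y_true):
    return float(np.mean((y_pred - y_true) ** 2))  # e.g. squared error

# RL: no per-step target, only rewards along a rollout. Fold delayed rewards
# into a per-step learning signal via discounted returns.
def discounted_returns(rewards, gamma=0.99):
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running  # reward now + discounted future
        returns[t] = running
    return returns

# A sparse reward arriving only at the end still credits earlier steps:
print(discounted_returns([0.0, 0.0, 0.0, 1.0]))  # [0.970299 0.9801 0.99 1.]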
Why Deep Learning works now
dead neurons can occur, e.g. if all incoming weights and the bias of a ReLU neuron are negative while its inputs are all positive: the pre-activation is always negative, so the ReLU outputs 0, its gradient is 0, the weights never update, and the neuron never learns (demo below)
see vanishing gradients
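A minimal demonstration of a dead ReLU neuron under exactly those conditions (NumPy; the specific values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A ReLU neuron whose incoming weights and bias are all negative.
w = -np.abs(rng.normal(size=4))
b = -0.5

def relu(z):
    return np.maximum(z, 0.0)

for _ in range(5):
    x = np.abs(rng.normal(size=4))  # inputs are all positive
    z = w @ x + b                   # pre-activation: always negative
    a = relu(z)                     # output: always 0
    grad_w = (z > 0) * x            # d(relu)/dz is 0 whenever z <= 0
    print(a, grad_w)                # 0.0 and an all-zero gradient every time
```

With a zero gradient on every example, no gradient-based update can ever revive the neuron.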
Reasons why we no longer need to be as careful as before about correct inits and vanishing/exploding gradients:
better default initializations (Xavier/Glorot, He) that keep activation scale stable across layers (sketch below)
normalization layers (BatchNorm, LayerNorm) that re-standardize activations
residual/skip connections that give gradients a direct path through deep networks
ReLU-family activations, which saturate less than sigmoid/tanh
adaptive optimizers (e.g. Adam) that rescale per-parameter updates
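A minimal sketch of the initialization point, assuming a plain deep ReLU stack: a fixed small init scale makes activations vanish with depth, while He init (std = sqrt(2/fan_in)) keeps them at a stable scale (NumPy, illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_std(scale_fn, depth=50, width=256):
    """Push a random input through `depth` ReLU layers and return the
    standard deviation of the activations at the last layer."""
    x = rng.normal(size=width)
    for _ in range(depth):
        fan_in = x.shape[0]
        W = rng.normal(size=(width, fan_in)) * scale_fn(fan_in)
        x = np.maximum(W @ x, 0.0)  # ReLU
    return x.std()

naive = forward_std(lambda fan_in: 0.01)                  # fixed small scale
he = forward_std(lambda fan_in: np.sqrt(2.0 / fan_in))    # He init for ReLU

print(f"naive init: {naive:.3e}")  # activations shrink toward 0 with depth
print(f"He init:    {he:.3e}")     # activations stay at a healthy scale
```

The same variance-preservation argument is why modern frameworks ship sensible init defaults, so getting this wrong by hand is much rarer than it used to be.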