🤖 Harold's Notes

Quantization basics

Sep 02, 2024

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization

Symmetric quantization

The full-precision zero maps exactly to the quantized zero, so the quantized range is symmetric around zero.

Absolute maximum (absmax) quantization

Given a list of values, we take the highest absolute value, $α$, as the range of the linear mapping, so $[-α, α]$ is mapped onto $[-ν, ν]$
- e.g. FP32 → INT8
$ν$ = max representable value of the target format (e.g. 127 for INT8)
scale factor $s = \frac{ν}{α}$
$x_{quantized} = \text{round}(s \cdot x)$
$x_{dequantized} = x_{quantized} / s$
quantization error = $x - x_{dequantized}$ (in original precision)
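
The absmax steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the names `absmax_quantize` and `dequantize`, the `num_bits` parameter, and the sample values are assumptions made for the example, and the target is assumed to be a signed integer format with $ν = 2^{bits - 1} - 1$.

```python
def absmax_quantize(values, num_bits=8):
    """Symmetric (absmax) quantization of a list of floats (illustrative sketch)."""
    nu = 2 ** (num_bits - 1) - 1          # max representable value, 127 for INT8
    alpha = max(abs(v) for v in values)   # highest absolute value in the input
    if alpha == 0:
        return [0 for _ in values], 1.0   # all-zero input: any scale works
    s = nu / alpha                        # scale factor s = nu / alpha
    # Note: Python's round() rounds exact ties to the nearest even integer.
    quantized = [round(s * v) for v in values]
    return quantized, s


def dequantize(quantized, s):
    """Map quantized integers back to the original precision."""
    return [q / s for q in quantized]


values = [-3.0, 0.0, 0.5, 4.0]            # made-up sample data
quantized, s = absmax_quantize(values)
restored = dequantize(quantized, s)
errors = [v - r for v, r in zip(values, restored)]

print(quantized)   # [-95, 0, 16, 127]
print(errors)
```

Here $α = 4$, so $s = 127 / 4 = 31.75$; zero stays at zero, and each error is at most $0.5 / s$ in magnitude because rounding moves a scaled value by at most half a step.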

Asymmetric quantization

It maps the minimum ($β$) and maximum ($α$) values from the float range to the minimum and maximum values of the quantized range.
Example with zeropoint quantization
- $ν$ = max representable value of your format (e.g. 127 for INT8)
- $η$ = min representable value of your format (e.g. -128 for INT8)
- scale factor $s = \frac{ν - η}{α - β}$
- zeropoint $z = \text{round}(-s \cdot β) + η$
- $x_{quantized} = \text{round}(s \cdot x + z)$
  - e.g. $α$ gets mapped to $ν$
    - $α_{quantized} = \text{round}\left(α \cdot \frac{ν - η}{α - β} + \text{round}\left(-\frac{ν - η}{α - β} \cdot β\right) + η\right) \approx (α - β) \cdot \frac{ν - η}{α - β} + η = ν$
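
The zeropoint formulas above can be sketched as follows. The function names, the `num_bits` parameter, and the sample values are assumptions for illustration. Two details are not in the notes and are added here as assumptions: the dequantization step $x_{dequantized} = (x_{quantized} - z) / s$, which is the algebraic inverse of the quantization formula, and a clamp to $[η, ν]$ that guards against rounding pushing a value just outside the range.

```python
def zeropoint_quantize(values, num_bits=8):
    """Asymmetric (zeropoint) quantization of a list of floats (illustrative sketch)."""
    nu = 2 ** (num_bits - 1) - 1      # max representable value, 127 for INT8
    eta = -(2 ** (num_bits - 1))      # min representable value, -128 for INT8
    alpha = max(values)               # maximum of the float range
    beta = min(values)                # minimum of the float range
    if alpha == beta:
        raise ValueError("need at least two distinct values to define a range")
    s = (nu - eta) / (alpha - beta)   # scale factor
    z = round(-s * beta) + eta        # zeropoint
    # Clamp is an added safeguard (assumption), not part of the formulas above.
    quantized = [min(max(round(s * v + z), eta), nu) for v in values]
    return quantized, s, z


def zeropoint_dequantize(quantized, s, z):
    """Assumed inverse mapping: subtract the zeropoint, then undo the scale."""
    return [(q - z) / s for q in quantized]


values = [-1.0, 0.0, 2.0, 4.0]        # made-up sample data
quantized, s, z = zeropoint_quantize(values)
restored = zeropoint_dequantize(quantized, s, z)

print(s, z)        # 51.0 -77
print(quantized)   # [-128, -77, 25, 127]
```

With $β = -1$ and $α = 4$, the scale is $255 / 5 = 51$ and the zeropoint is $-77$, so $β$ lands on $η = -128$, $α$ lands on $ν = 127$, and the float zero lands on $z$ rather than on the integer zero.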

Outliers

Clipping

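If your vector has an outlier, “naive” quantization stretches the range to cover it, so most values get mapped to the same few spots in the band
Clipping sets a narrower dynamic range instead, e.g. $[-5, 5]$ in FP32; any value outside it is mapped to the nearest end of the quantized range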
How do you choose the clipping range?
- For weights and biases
  - Manually choose a percentile of the input range
  - Minimize the mean squared error (MSE) between the original and quantized weights
  - Minimize the KL divergence (relative entropy) between the original and quantized values
- For activations
  - Unlike weights, activations vary with each input fed into the model during inference, so it is harder to pick a range that quantizes them accurately.
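
To make the effect of an outlier concrete, here is a small sketch of the percentile option from the list above, combined with absmax quantization. Everything named here is an assumption for illustration: the helper names, the nearest-rank percentile rule, the 0.99 fraction, and the synthetic data (101 evenly spaced values in $[-0.5, 0.5]$ plus one outlier at 100).

```python
import math


def percentile_clip(values, fraction=0.99):
    """Clip threshold covering `fraction` of the absolute values (nearest rank)."""
    ranked = sorted(abs(v) for v in values)
    index = max(0, math.ceil(fraction * len(ranked)) - 1)
    return ranked[index]


def clipped_absmax_quantize(values, clip, num_bits=8):
    """Clip values to [-clip, clip], then apply absmax quantization."""
    if clip <= 0:
        raise ValueError("clip must be positive")
    nu = 2 ** (num_bits - 1) - 1      # 127 for INT8
    s = nu / clip                     # the clip threshold plays the role of alpha
    clipped = [min(max(v, -clip), clip) for v in values]
    return [round(s * v) for v in clipped], s


def max_abs_error(values, quantized, s):
    """Largest reconstruction error over the given values."""
    return max(abs(v - q / s) for v, q in zip(values, quantized))


inliers = [i / 100 for i in range(-50, 51)]   # synthetic data
values = inliers + [100.0]                    # one large outlier

# "Naive": the range is stretched to cover the outlier.
naive_q, naive_s = clipped_absmax_quantize(values, clip=max(abs(v) for v in values))

# Clipped: the range is chosen from the 99th percentile of absolute values.
clip = percentile_clip(values, fraction=0.99)
clipped_q, clipped_s = clipped_absmax_quantize(values, clip)

# zip() stops at the shorter list, so only the inliers are compared here.
naive_err = max_abs_error(inliers, naive_q, naive_s)
clipped_err = max_abs_error(inliers, clipped_q, clipped_s)

print(len(set(naive_q)), len(set(clipped_q)))
print(naive_err, clipped_err)
```

In this setup the naive mapping squeezes all inliers into a handful of levels, while clipping spreads them across the band. The price is that the outlier itself is reconstructed as the clip threshold, so its own error becomes large; that trade-off is what the MSE and KL-divergence criteria try to balance.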

