Importance matrix

The importance matrix of a model's weights assigns each parameter a value that quantifies how large a change in performance is expected from a small change in that parameter.
Using a calibration dataset, one can obtain the importance matrix of a model in one of two ways:
- (well-motivated) compute the gradients; the squared gradient can be used as the importance
- (a bit more heuristic) use the square of the activation value
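
As a rough illustration of the two options, here is a minimal NumPy sketch for a single linear layer `y = W x`. The function names, the squared-error loss, and the target array `T` are assumptions made for the example and are not taken from any particular library; in a real model the gradients would come from the actual training loss via autodiff.

```python
import numpy as np

def activation_importance(X):
    """Heuristic variant: mean squared activation per input channel.

    X: (n_samples, n_in) calibration activations feeding the layer.
    Returns an array of shape (n_in,), one importance per weight column.
    """
    return np.mean(X ** 2, axis=0)

def squared_gradient_importance(W, X, T):
    """Gradient variant: mean squared gradient of a per-sample loss w.r.t. W.

    Assumes (for illustration only) the loss 0.5 * ||W x - t||^2.
    W: (n_out, n_in), X: (n_samples, n_in), T: (n_samples, n_out).
    Returns an array with the same shape as W.
    """
    residual = X @ W.T - T  # (n_samples, n_out)
    # The per-sample gradient is outer(residual_i, x_i); squaring it
    # elementwise gives outer(residual_i**2, x_i**2), which is then
    # averaged over the calibration samples.
    return (residual ** 2).T @ (X ** 2) / X.shape[0]
```

Note that the activation variant yields one value per input channel (shared by every weight in that column), while the gradient variant yields one value per individual weight.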
One can then use these importance weights in a weighted RMSE minimization when quantizing the tensor.
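
A minimal sketch of what such a weighted minimization could look like, assuming symmetric round-to-nearest quantization with a single scale per tensor and a simple grid search over candidate scales. The helper names (`quantize`, `weighted_mse`, `best_scale`) and the search range are hypothetical choices for this example, not the scheme of any specific quantizer. Minimizing the weighted MSE is equivalent to minimizing the weighted RMSE, since the square root is monotonic.

```python
import numpy as np

def quantize(x, scale, n_bits=4):
    """Symmetric round-to-nearest quantization, returned in dequantized form."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def weighted_mse(x, x_hat, importance):
    """Importance-weighted mean squared quantization error."""
    return np.sum(importance * (x - x_hat) ** 2) / np.sum(importance)

def best_scale(x, importance, n_bits=4, n_candidates=64):
    """Pick the candidate scale with the lowest importance-weighted error."""
    qmax = 2 ** (n_bits - 1) - 1
    base = np.max(np.abs(x)) / qmax  # scale that maps the largest |x| to qmax
    if base == 0.0:
        return 1.0  # all-zero tensor: any scale reproduces it exactly
    # Assumed search range: shrink the scale by up to 2x, which clips
    # outliers in exchange for finer resolution on the remaining weights.
    candidates = base * np.linspace(0.5, 1.0, n_candidates)
    errors = [weighted_mse(x, quantize(x, s, n_bits), importance)
              for s in candidates]
    return candidates[int(np.argmin(errors))]
```

With uniform importance this reduces to an ordinary MSE scale search; with a non-uniform importance array the chosen scale shifts toward reproducing the high-importance weights more accurately.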

🤖 Harold's Notes