- In the infinite-width limit, every gradient update changes the parameters so little that the naive first-order Taylor expansion holds, i.e. $f(x; \theta_{t+1}) \approx f(x; \theta_t) + \nabla_\theta f(x; \theta_t)^\top (\theta_{t+1} - \theta_t)$.
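- So, in function space, an SGD step on the sample $(x_t, y_t)$ gives $f_{t+1}(x) - f_t(x) \approx -\eta\, K(x, x_t)\, \mathcal{L}'(f_t(x_t), y_t)$, where $f_t$ is the network function after $t$ updates, $\mathcal{L}$ is the loss (with $\mathcal{L}'$ its derivative in the first argument), $y_t$ is the label, $\eta$ is the learning rate, and $K(x, x') = \langle \nabla_\theta f(x), \nabla_\theta f(x') \rangle$ is the NTK.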
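- The NTK limit does not learn features! The kernel stays frozen at its initialization value, so training reduces to kernel regression with a fixed kernel.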
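
The points above can be checked numerically with a minimal sketch. Everything in it is an assumption made for illustration, not something taken from these notes: a two-layer tanh network in NTK parameterization, $f(x) = a^\top \tanh(Wx) / \sqrt{n}$, a squared loss, one full-batch gradient step, random toy data, and the hypothetical helper names `forward`, `jacobian`, and `one_step_report`. It compares the true change in the outputs after one step with the linearized prediction $-\eta\, K\, (f_0 - y)$, where $K = J J^\top$ is the empirical NTK built from the parameter Jacobian $J$.

```python
import numpy as np


def forward(W, a, X):
    """Two-layer tanh net, NTK parameterization: f(x) = a . tanh(W x) / sqrt(n)."""
    n = W.shape[0]
    return np.tanh(X @ W.T) @ a / np.sqrt(n)


def jacobian(W, a, X):
    """Jacobian of the outputs w.r.t. all parameters, shape (num_points, num_params)."""
    n = W.shape[0]
    H = np.tanh(X @ W.T)                       # (N, n) hidden activations
    J_a = H / np.sqrt(n)                       # df/da_j
    # df/dW_jk = a_j * (1 - tanh^2(w_j . x)) * x_k / sqrt(n)
    J_W = ((1.0 - H**2) * a)[:, :, None] * X[:, None, :] / np.sqrt(n)
    # Flatten W-derivatives in the same (row-major) order as W.ravel().
    return np.concatenate([J_W.reshape(len(X), -1), J_a], axis=1)


def one_step_report(width, lr=0.1, seed=0):
    """Take one full-batch gradient step on 0.5 * ||f - y||^2 and report
    (relative parameter change, relative linearization error)."""
    # Toy data (assumed): fixed across widths so results are comparable.
    data_rng = np.random.default_rng(0)
    X = data_rng.standard_normal((8, 3))
    y = data_rng.standard_normal(8)

    rng = np.random.default_rng(seed + 1)
    W = rng.standard_normal((width, 3))
    a = rng.standard_normal(width)

    f0 = forward(W, a, X)
    J = jacobian(W, a, X)
    K = J @ J.T                                # empirical NTK on the training points
    resid = f0 - y                             # dL/df for the squared loss

    theta0 = np.concatenate([W.ravel(), a])
    theta1 = theta0 - lr * (J.T @ resid)       # gradient step in parameter space
    W1 = theta1[: W.size].reshape(W.shape)
    a1 = theta1[W.size:]

    f1 = forward(W1, a1, X)                    # true outputs after the step
    f_lin = f0 - lr * (K @ resid)              # function-space (kernel) prediction

    rel_param_change = np.linalg.norm(theta1 - theta0) / np.linalg.norm(theta0)
    lin_error = np.linalg.norm(f1 - f_lin) / np.linalg.norm(f1 - f0)
    return rel_param_change, lin_error


if __name__ == "__main__":
    for width in (64, 512, 4096):
        change, err = one_step_report(width)
        print(f"width={width:5d}  rel. param change={change:.2e}  linearization error={err:.2e}")
```

Under these assumptions, both reported quantities should shrink as the width grows: the parameters barely move relative to their initial norm, and the kernel update tracks the true update ever more closely. That is the sense in which the hidden representation stays essentially fixed in this regime.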