Şimşek
Interpretability
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
Any continuous function $f^*$ can be approximated arbitrarily well by a neural network with sufficiently many neurons $k$. We consider the case when $f^*$ itself is a neural network with one hidden layer and $k$ neurons. Approximating $f^*$ with a …