We study how permutation symmetries in overparameterized multi-layer neural networks generate ‘symmetry-induced’ critical points. Assuming a network with L layers of minimal widths $r_1^∗, \ldots, r_{L-1}^∗$ reaches a zero-loss minimum at $r_1^∗! · · …

The permutation symmetry of neurons in each layer of a deep neural network gives rise not only to multiple equivalent global minima of the loss function, but also to first-order saddle points located on the path between the global minima. In a …
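
As a minimal sketch of the permutation symmetry mentioned above, the following example relabels the hidden neurons of a small two-layer network and checks that the output is unchanged. The network (`mlp`), its `tanh` activation, the sizes, and the chosen permutation are all illustrative assumptions and are not taken from the papers.

```python
import math
import random

def mlp(x, W1, b1, w2):
    """Hypothetical two-layer scalar network: tanh hidden units, linear readout."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(a * h for a, h in zip(w2, hidden))

random.seed(0)
d_in, width = 3, 4  # illustrative sizes, not from the source

# Random parameters and a random input.
W1 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(width)]
b1 = [random.gauss(0, 1) for _ in range(width)]
w2 = [random.gauss(0, 1) for _ in range(width)]
x = [random.gauss(0, 1) for _ in range(d_in)]

# Relabel the hidden neurons: permute the rows of W1, the entries of b1,
# and the matching entries of the readout weights w2.
perm = [2, 0, 3, 1]
W1_p = [W1[i] for i in perm]
b1_p = [b1[i] for i in perm]
w2_p = [w2[i] for i in perm]

out = mlp(x, W1, b1, w2)
out_p = mlp(x, W1_p, b1_p, w2_p)

# The two parameter vectors differ, yet they implement the same function
# (up to floating-point summation order).
same_output = math.isclose(out, out_p, rel_tol=1e-9, abs_tol=1e-12)

# Number of possible relabelings of a hidden layer of this width.
n_perms = math.factorial(width)
print(same_output, n_perms)
```

Because every relabeling leaves the function unchanged, a minimum of the loss at one parameter vector implies minima of equal loss at its permuted copies, provided the hidden neurons are pairwise distinct.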

© 2024 Şimşek
