Neural Collapse under Mixup: A Closer Look at the Loss Function

Introduction

Neural collapse (NC; [1]) is a phenomenon in deep learning that emerges when a classification network is trained past the point of zero training error, into the terminal phase of training. Under NC, the features of the penultimate layer (the layer before the classification head) converge to the geometric structure of a simplex Equiangular Tight Frame (ETF), and the network's decision rule collapses to a nearest class-center rule: an input is assigned to the class whose mean training feature is closest to the input's penultimate-layer feature.
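
The nearest class-center rule can be stated concretely. The sketch below is illustrative only (not code from this repository) and assumes penultimate-layer features have already been extracted into a NumPy array:

import numpy as np

def class_means_from(features, labels, num_classes):
    """Per-class feature means (the 'class centers' of the NC picture)."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def ncc_predict(features, class_means):
    """Nearest class-center rule: assign each feature to the class
    whose mean training feature is closest in Euclidean distance."""
    # dists[i, c] = squared distance from feature i to the mean of class c
    dists = ((features[:, None, :] - class_means[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

# Placeholder random data; in practice `features` would be penultimate-layer
# activations extracted from a trained network.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 512))
labels = rng.integers(0, 10, size=100)
means = class_means_from(features, labels, num_classes=10)
preds = ncc_predict(features, means)
print("NCC agreement with labels:", (preds == labels).mean())

Under full neural collapse, the network's own predictions agree with this nearest class-center classifier on the training set.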

Mixup [3] is a data augmentation technique that trains on convex combinations of pairs of training examples: given examples (x_i, y_i) and (x_j, y_j) and a mixing coefficient λ drawn from a Beta(α, α) distribution, the model is trained on (λx_i + (1−λ)x_j, λy_i + (1−λ)y_j). Besides regularizing the model, Mixup yields better-calibrated predictions. Fisher et al. [2] investigated whether the success of Mixup can be explained by the prevalence of NC at different stages of training and under different loss functions. Despite these efforts, it remains open whether different loss functions exhibit different NC dynamics, as well as what impact the network architecture has.
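
For concreteness, here is a minimal sketch of the mixup operation as defined in [3]. It is an illustration assuming one-hot (or soft) labels, not the implementation in main.py:

import torch

def mixup_batch(x, y, alpha=1.0):
    """Mix a batch with a shuffled copy of itself, as in Zhang et al. [3].

    x: inputs of shape (batch, ...); y: one-hot labels (batch, classes).
    Returns the mixed inputs and mixed labels.
    """
    # Mixing coefficient drawn from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[perm]
    y_mixed = lam * y + (1 - lam) * y[perm]
    return x_mixed, y_mixed

The mixed batch is then fed to the model, and the training loss is computed against the mixed labels.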

Features

  • Command-line interface for easy configuration and execution.
  • Support for multiple datasets and model architectures.
  • Configurable training parameters such as learning rate, weight decay, and number of epochs.
  • Reproducibility through random seed setting.
  • Visualization of model training progress.

Installation

To install the necessary dependencies, run the following command:

pip install -r requirements.txt

Usage

To train a model, run the following command:

python main.py --dataset cifar10 --model resnet18 --loss cross_entropy --epochs 500

where the dataset, model, loss function, and number of epochs can be varied via the corresponding flags. For the full list of available arguments, refer to main.py.

To plot the last-layer representations and the loss curves from a .pkl file, run the following command:

python utils.py <command_type> ...

References

[1] Papyan, V., Han, X. Y., & Donoho, D. L. (2020). Prevalence of neural collapse during the terminal phase of deep learning training. Proceedings of the National Academy of Sciences, 117(40), 24652-24663.

[2] Fisher, Q., Meng, H., & Papyan, V. (2024). Pushing Boundaries: Mixup's Influence on Neural Collapse. arXiv preprint arXiv:2402.06171.

[3] Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.
