MgNet is a unified model that simultaneously recovers some convolutional neural networks (CNNs) for image classification and multigrid (MG) methods for solving discretized partial differential equations (PDEs). Here is a diagram of its architecture.
For simplicity, we use the following notation to represent different MgNet models with different hyper-parameters:

MgNet[ν_1,ν_2,ν_3,ν_4]-[c_1,c_2,c_3,c_4]-B^{l,i}
These hyper-parameters are defined as follows.
- [ν_1,ν_2,ν_3,ν_4]: The number of smoothing iterations on each grid. For example, [2,2,2,2] means that there are 4 grids, and the number of iterations on each grid is 2.
- [c_1,c_2,c_3,c_4]: The number of channels for the feature tensor u^{l,i} and the data tensor f^{l} on each grid. We mainly consider the case where the two channel numbers are equal on each grid, which suggests the simplified notation [c_1,c_2,c_3,c_4], or even [c] if we further take c_1 = c_2 = c_3 = c_4 = c. For example, MgNet[2,2,2,2]-[256,256,256,256]-B^{l,i} can be abbreviated as MgNet[2,2,2,2]-256-B^{l,i}.
- B^{l,i}: This means that we use a different smoother in each smoothing iteration. Correspondingly, B^{l} means that we share the smoother among the smoothing iterations on each grid, i.e. B^{l,i} = B^{l}.
Note that we always use A^{l}, which depends only on the grid. For example, the notation MgNet[2,2,2,2]-256-B^{l,i} denotes an MgNet model which adopts 4 different grids (feature resolutions), 2 smoothing iterations on each grid, 256 channels for both the feature tensor u^{l,i} and the data tensor f^{l}, and a different smoother B^{l,i} in each smoothing iteration.
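The smoothing iteration behind this notation can be sketched as follows. This is an illustrative NumPy sketch, not the repository's implementation: dense matrices stand in for the 3×3 convolutions of the real model, and the activation σ is ReLU as in the papers.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def mgnet_smoothing(u, f, A, B_list):
    """Sketch of the MgNet smoothing iterations on a single grid l:

        u^{l,i} = u^{l,i-1} + sigma(B^{l,i} * sigma(f^l - A^l * u^{l,i-1}))

    The data-feature mapping A^l is shared by all iterations on the grid;
    the entries of B_list are the smoothers B^{l,i}.  Passing the same
    object len(B_list) times corresponds to the shared-smoother B^{l}
    variant.  Dense matrix products stand in for convolutions here.
    """
    for B in B_list:
        # residual f^l - A^l * u, passed through the activation
        r = relu(f - A @ u)
        # smoother update; iteration-wise B^{l,i} or shared B^{l}
        u = u + relu(B @ r)
    return u
```

With `len(B_list) = 2` on each of 4 grids, this corresponds to the [2,2,2,2] setting above; between grids, the real model restricts u and f to the next coarser resolution.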
Model | Parameters | CIFAR-10 acc. (%) | CIFAR-100 acc. (%) |
---|---|---|---|
AlexNet | 2.5M | 76.22 | 43.87 |
VGG19 | 20.0M | 93.56 | 71.95 |
ResNet18 | 11.2M | 95.28 | 77.54 |
pre-act ResNet1001 | 10.2M | 95.08 | 77.29 |
WideResNet 28×2 | 36.5M | 95.83 | 79.50 |
MgNet[2,2,2,2]-256-B^{l} | 8.2M | 96.00 | 79.94 |
Model | Parameters | ImageNet top-1 acc. (%) |
---|---|---|
AlexNet | 60.2M | 63.30 |
VGG19 | 144.0M | 74.50 |
ResNet18 | 11.2M | 72.12 |
pre-act ResNet200 | 64.7M | 78.34 |
WideResNet 50×2 | 68.9M | 78.10 |
MgNet[3,4,6,3]-[128,256,512,1024]-B^{l,i} | 71.3M | 78.59 |
The results reported in the tables above are taken from the following papers and webpages.
Imagenet classification with deep convolutional neural networks
Very deep convolutional networks for large-scale image recognition
Deep residual learning for image recognition
Identity mappings in deep residual networks
Wide residual networks
This repository contains the PyTorch (1.7.1) implementation of MgNet.
As an example, the following command trains an MgNet with iteration-wise smoothers B^{l,i} on CIFAR-100:
python train_mgnet.py --wise-B --dataset cifar100
For more details about MgNet, we refer to the following two papers. If you find MgNet useful to your research or you use the code published here, please consider citing:
MgNet: A unified framework of multigrid and convolutional neural network
@article{he2019mgnet,
title={MgNet: A unified framework of multigrid and convolutional neural network},
author={He, Juncai and Xu, Jinchao},
journal={Science China Mathematics},
volume={62},
number={7},
pages={1331--1354},
year={2019},
publisher={Springer}
}
Constrained Linear Data-feature Mapping for Image Classification
@article{he2019constrained,
title={Constrained Linear Data-feature Mapping for Image Classification},
author={He, Juncai and Chen, Yuyan and Zhang, Lian and Xu, Jinchao},
journal={arXiv preprint arXiv:1911.10428},
year={2019}
}
This software is free software distributed under the GNU Lesser General Public License (LGPL), version 3.0 or any later version. This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
Jinchao Xu's research group: http://multigrid.org/
Juncai He: jhe AT utexas.edu
Jinchao Xu: xu AT math.psu.edu
Lian Zhang: zhanglian AT multigrid.org
Jianqing Zhu: jqzhu AT emails.bjut.edu.cn
Any discussions, comments, suggestions and questions are welcome!