CodeCamp #62 #1238

Merged
52 commits merged into dev-1.x on Jan 17, 2023
Changes shown are from 29 of the 52 commits.
Commits
cac9705
Network construction finished; inference works correctly
aso538 Nov 30, 2022
bbeab5e
Network construction finished; inference works correctly
aso538 Nov 30, 2022
d6fc8ca
Network construction finished; inference works correctly
aso538 Nov 30, 2022
93d167b
Added model conversion (not yet verified) and config files, but they still fail to run
aso538 Nov 30, 2022
1173f26
Model conversion and structure verification finished; inference produces correct results
aso538 Dec 1, 2022
b0078e8
Inference accuracy matches the original paper; conversion finished
aso538 Dec 1, 2022
50b4ffd
Changed three functions into classes (work in progress)
aso538 Dec 1, 2022
5fa99c1
Finished aligning inference accuracy; error 0.04
aso538 Dec 2, 2022
390d6a5
Temporarily using levit2mmcls
aso538 Dec 2, 2022
ca33151
Merge branch 'open-mmlab:dev-1.x' into dev-1.x
aso538 Dec 7, 2022
2fd1ff8
Training runs end to end; training-related parameters not yet aligned
aso538 Dec 8, 2022
8fb0b78
Align training-related parameters
aso538 Dec 13, 2022
f892b5c
Fix issue where validation during training changed the model structure and it could not be restored
aso538 Dec 16, 2022
a4d4bee
Fix issue where validation during training changed the model structure and it could not be restored
aso538 Dec 16, 2022
8326886
Add Mixup and label smoothing
aso538 Dec 20, 2022
f4ba8cd
Complete the config files
aso538 Dec 26, 2022
c5602cd
Add model conversion
aso538 Dec 26, 2022
1d6c6c7
Add meta file
aso538 Dec 27, 2022
252a39c
Add meta file
aso538 Dec 27, 2022
9c038e5
Delete demo.py test file
aso538 Dec 27, 2022
3745195
Add model README file
aso538 Dec 27, 2022
bc9b7b8
Roll back docs files
aso538 Dec 27, 2022
1e7cda1
model-index: remove trailing whitespace on the last line
aso538 Dec 27, 2022
7c665cb
Update model metafile
aso538 Dec 27, 2022
e0874dc
Update metafile
aso538 Dec 27, 2022
ec45519
Update metafile
aso538 Dec 27, 2022
f15889d
Update README and metafile
aso538 Dec 28, 2022
29eda58
Update model README
aso538 Dec 28, 2022
040d773
Update model metafile
aso538 Dec 28, 2022
7c2e0c5
Delete the model class and get_LeViT_model methods in the mmcls.model…
aso538 Jan 5, 2023
e1e79a5
Change the class name to Google Code Style
aso538 Jan 5, 2023
a9feae6
use arch to provide default architectures
aso538 Jan 5, 2023
fc9e49c
use nn.Conv2d
aso538 Jan 6, 2023
1887ba4
mmcv.cnn.fuse_conv_bn
aso538 Jan 7, 2023
bd1e8ab
modify some details
aso538 Jan 7, 2023
2e8c17c
remove down_ops from the architectures.
aso538 Jan 8, 2023
4a60537
remove init_weight function
aso538 Jan 9, 2023
982b9d8
Modify ambiguous variable names
aso538 Jan 9, 2023
d3d1bf2
Change the drop_path in config to drop_path_rate
aso538 Jan 9, 2023
2da91ec
Add unit test
aso538 Jan 10, 2023
d4ad5a3
remove train function
aso538 Jan 10, 2023
9947b6e
add unit test
aso538 Jan 10, 2023
b57ded5
modify nn.norm1d to build_norm_layer
aso538 Jan 11, 2023
355f24e
update metafile and readme
aso538 Jan 11, 2023
60a3a9c
Merge remote-tracking branch 'origin/dev-1.x' into pr1238/dev-1.x
mzr1996 Jan 17, 2023
a4666f7
Update configs and LeViT implementations.
mzr1996 Jan 17, 2023
159a0bc
Update README.
mzr1996 Jan 17, 2023
7a9c9ec
Add docstring and update unit tests.
mzr1996 Jan 17, 2023
18a405b
Revert irrelevant modification.
mzr1996 Jan 17, 2023
4e370df
Merge remote-tracking branch 'origin/dev-1.x' into pr1238/dev-1.x
mzr1996 Jan 17, 2023
96b7303
Fix unit tests
mzr1996 Jan 17, 2023
f185a2b
minor fix
mzr1996 Jan 17, 2023
82 changes: 82 additions & 0 deletions configs/_base_/datasets/imagenet_bs256_levit_224.py
@@ -0,0 +1,82 @@
dataset_type = 'ImageNet'

data_preprocessor = dict(
    num_classes=1000,
    # RGB format normalization parameters
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    # convert image from BGR to RGB
    to_rgb=True,
)

bgr_mean = data_preprocessor['mean'][::-1]
bgr_std = data_preprocessor['std'][::-1]

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='RandomResizedCrop',
        scale=224,
        backend='pillow',
        interpolation='bicubic'),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(
        type='RandAugment',
        policies='timm_increasing',
        num_policies=2,
        total_level=10,
        magnitude_level=9,
        magnitude_std=0.5,
        hparams=dict(
            pad_val=[round(x) for x in bgr_mean], interpolation='bicubic')),
    dict(
        type='RandomErasing',
        erase_prob=0.25,
        mode='rand',
        min_area_ratio=0.02,
        max_area_ratio=1 / 3,
        fill_color=bgr_mean,
        fill_std=bgr_std),
    dict(type='PackClsInputs'),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='ResizeEdge',
        scale=256,
        edge='short',
        backend='pillow',
        interpolation='bicubic'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackClsInputs'),
]

train_dataloader = dict(
    batch_size=256,
    num_workers=4,
    dataset=dict(
        type=dataset_type,
        data_root=r'E:\imagenet',
        ann_file='meta/val.txt',
        data_prefix='ILSVRC2012_img_val',
        pipeline=train_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=True),
)

val_dataloader = dict(
    batch_size=256,
    num_workers=4,
    dataset=dict(
        type=dataset_type,
        data_root=r'E:\imagenet',
        ann_file='meta/val.txt',
        data_prefix='ILSVRC2012_img_val',
        pipeline=test_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=False),
)
val_evaluator = dict(type='Accuracy', topk=(1, 5))

# If you want standard test, please manually configure the test dataset
test_dataloader = val_dataloader
test_evaluator = val_evaluator
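For readers of this config: the BGR statistics fed to `RandAugment` and `RandomErasing` are just the RGB normalization values reversed. A small standalone check (not part of the committed file) that reproduces the derivation:

```python
# Standalone sketch: derive the BGR statistics and RandAugment pad_val
# exactly as the config above does, from the RGB normalization parameters.
mean = [123.675, 116.28, 103.53]  # RGB mean used by data_preprocessor
std = [58.395, 57.12, 57.375]     # RGB std used by data_preprocessor

bgr_mean = mean[::-1]             # [103.53, 116.28, 123.675]
bgr_std = std[::-1]               # [57.375, 57.12, 58.395]
pad_val = [round(x) for x in bgr_mean]

print(pad_val)                    # [104, 116, 124]
```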
34 changes: 34 additions & 0 deletions configs/_base_/models/levit-256-p16.py
@@ -0,0 +1,34 @@
# model settings
model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='LeViT',
        img_size=224,
        patch_size=16,
        drop_path=0,
        embed_dim=[256, 384, 512],
        num_heads=[4, 6, 8],
        depth=[4, 4, 4],
        key_dim=[32, 32, 32],
        attn_ratio=[2, 2, 2],
        mlp_ratio=[2, 2, 2],
        down_ops=[
            # ('Subsample', key_dim, num_heads, attn_ratio, mlp_ratio, stride)
            ['Subsample', 32, 256 // 32, 4, 2, 2],
            ['Subsample', 32, 384 // 32, 4, 2, 2],
        ],
        out_indices=(2, )),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LeViTClsHead',
        num_classes=1000,
        in_channels=512,
        distillation=False,
        loss=dict(
            type='LabelSmoothLoss', label_smooth_val=0.1, loss_weight=1.0),
        topk=(1, 5),
    ),
    train_cfg=dict(augments=[
        dict(type='Mixup', alpha=0.8),
        dict(type='CutMix', alpha=1.0),
    ]))
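The `down_ops` entries above are positional. As a purely illustrative aid (not part of the PR), the fields can be named following the inline comment; the `DownOp` helper below is a hypothetical name used only for this sketch:

```python
from collections import namedtuple

# Hypothetical helper, only for readability; field order follows the comment
# in the config: (op, key_dim, num_heads, attn_ratio, mlp_ratio, stride).
DownOp = namedtuple(
    'DownOp', ['op', 'key_dim', 'num_heads', 'attn_ratio', 'mlp_ratio', 'stride'])

down_ops = [
    ['Subsample', 32, 256 // 32, 4, 2, 2],  # after the 256-dim stage
    ['Subsample', 32, 384 // 32, 4, 2, 2],  # after the 384-dim stage
]

for op in map(DownOp._make, down_ops):
    print(op)
# DownOp(op='Subsample', key_dim=32, num_heads=8, attn_ratio=4, mlp_ratio=2, stride=2)
# DownOp(op='Subsample', key_dim=32, num_heads=12, attn_ratio=4, mlp_ratio=2, stride=2)
```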
44 changes: 44 additions & 0 deletions configs/_base_/schedules/imagenet_bs1024_adamw_levit.py
@@ -0,0 +1,44 @@
# batch size per GPU is 256, with 4 GPUs in total
# lr = 5e-4 * 256 * 4 / 512 = 0.001
optim_wrapper = dict(
    optimizer=dict(
        type='AdamW',
        lr=5e-4 * 256 * 4 / 512.0,
        weight_decay=0.025,
        eps=1e-8,
        betas=(0.9, 0.999)),
    paramwise_cfg=dict(
        norm_decay_mult=0.0,
        bias_decay_mult=0.0,
        custom_keys={
            '.attention_biases': dict(decay_mult=0.0),
        }),
)

# learning policy
# lr = args.lr * args.batch_size * utils.get_world_size() / 512.0
# start_factor = 1e-6 / lr
param_scheduler = [
    # warm up learning rate scheduler
    dict(
        type='LinearLR',
        start_factor=1e-6 / (5e-4 * 256 * 4 / 512.0),
        by_epoch=True,
        end=5,
        # update by iter
        # convert_to_iter_based=True
    ),
    # main learning rate scheduler
    dict(type='CosineAnnealingLR', eta_min=1e-5, by_epoch=True, begin=5)
]

# train, val, test setting
train_cfg = dict(by_epoch=True, max_epochs=1000, val_interval=5)
val_cfg = dict()
test_cfg = dict()

# NOTE: `auto_scale_lr` is for automatically scaling LR,
# based on the actual training batch size.
# auto_scale_lr = dict(base_batch_size=1024)
model_wrapper_cfg = dict(
    type='MMDistributedDataParallel', find_unused_parameters=True)
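The learning-rate comments above can be verified with a couple of lines of plain Python (a standalone check, not part of the schedule file):

```python
# 4 GPUs x 256 images per GPU, linear scaling rule with a base batch size of 512.
base_lr = 5e-4 * 256 * 4 / 512.0   # = 0.001, matching the comment above
start_factor = 1e-6 / base_lr      # = 1e-3, so warmup starts from lr = 1e-6
print(base_lr, start_factor)       # 0.001 0.001
```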
38 changes: 38 additions & 0 deletions configs/levit/README.MD
@@ -0,0 +1,38 @@
# LeViT

> [LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference](https://arxiv.org/pdf/2104.01136.pdf)

<!-- [ALGORITHM] -->

## Abstract

We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime. Our work exploits recent findings in attention-based architectures, which are competitive on highly parallel processing hardware. We revisit principles from the extensive literature on convolutional neural networks to apply them to transformers, in particular activation maps with decreasing resolutions. We also introduce the attention bias, a new way to integrate positional information in vision transformers. As a result, we propose LeViT: a hybrid neural network for fast inference image classification. We consider different measures of efficiency on different hardware platforms, so as to best reflect a wide range of application scenarios. Our extensive experiments empirically validate our technical choices and show they are suitable to most architectures. Overall, LeViT significantly outperforms existing convnets and vision transformers with respect to the speed/accuracy tradeoff. For example, at 80% ImageNet top-1 accuracy, LeViT is 5 times faster than EfficientNet on CPU.

<div align=center>
<img src="https://raw.githubusercontent.com/facebookresearch/LeViT/main/.github/levit.png" width="90%"/>
</div>

## Results and models

### ImageNet-1k

| Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config |
| :--------: | :-------: | :------: | :-------: | :-------: | :--------------------------------------------------------------------: |
| LeViT-128S | 7.75 | 0.304 | 76.52 | 92.90 | [config](./levit-128s-p16.py) |
| LeViT-128 | 9.19 | 0.405 | 78.58 | 93.94 | [config](./levit-128-p16.py) |
| LeViT-192 | 10.91 | 0.656 | 79.87 | 94.75 | [config](./levit-192-p16.py) |
| LeViT-256 | 18.86 | 1.126 | 81.59 | 95.45 | [config](./levit-256-p16_4xb256_autoaug-mixup-lbs-coslr-1000e_in1k.py) |
| LeViT-384 | 39.07 | 2.350 | 82.59 | 95.96 | [config](./levit-384-p16.py) |
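
## Usage

A minimal inference sketch, assuming the standard `mmcls.apis` helpers available on the `dev-1.x` branch; the checkpoint path below is a placeholder, not a released file.

```python
from mmcls.apis import inference_model, init_model

config_file = 'configs/levit/levit-256-p16_4xb256_autoaug-mixup-lbs-coslr-1000e_in1k.py'
checkpoint_file = 'PATH/TO/levit-256.pth'  # placeholder checkpoint

model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.JPEG')
print(result)
```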

## Citation

```
@InProceedings{Graham_2021_ICCV,
author = {Graham, Benjamin and El-Nouby, Alaaeldin and Touvron, Hugo and Stock, Pierre and Joulin, Armand and Jegou, Herve and Douze, Matthijs},
title = {LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {12259-12269}
}
```
4 changes: 4 additions & 0 deletions configs/levit/deploy/levit-128-p16.py
@@ -0,0 +1,4 @@
_base_ = ['../levit-128-p16.py']

model = dict(
    backbone=dict(deploy=True), head=dict(deploy=True, distillation=True))
4 changes: 4 additions & 0 deletions configs/levit/deploy/levit-128s-p16.py
@@ -0,0 +1,4 @@
_base_ = ['../levit-128s-p16.py']

model = dict(
    backbone=dict(deploy=True), head=dict(deploy=True, distillation=True))
4 changes: 4 additions & 0 deletions configs/levit/deploy/levit-192-p16.py
@@ -0,0 +1,4 @@
_base_ = ['../levit-192-p16.py']

model = dict(
    backbone=dict(deploy=True), head=dict(deploy=True, distillation=True))
@@ -0,0 +1,4 @@
_base_ = ['../levit-256-p16_4xb256_autoaug-mixup-lbs-coslr-1000e_in1k.py']

model = dict(
    backbone=dict(deploy=True), head=dict(deploy=True, distillation=True))
4 changes: 4 additions & 0 deletions configs/levit/deploy/levit-384-p16.py
@@ -0,0 +1,4 @@
_base_ = ['../levit-384-p16.py']

model = dict(
    backbone=dict(deploy=True), head=dict(deploy=True, distillation=True))
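The deploy variants above only set `deploy=True`. Per the "mmcv.cnn.fuse_conv_bn" commit in this PR's history, the intended effect is to fold BatchNorm layers into the preceding convolutions for faster inference. A minimal sketch of that fusion, assuming mmcv's `fuse_conv_bn` utility; it illustrates the idea only and is not the backbone's actual deploy-switch code:

```python
import copy

import torch
from mmcv.cnn import fuse_conv_bn  # utility referenced in the commit history

# A toy conv + BN pair; LeViT pairs every conv/linear with a BatchNorm.
conv_bn = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False),
    torch.nn.BatchNorm2d(8),
).eval()

fused = fuse_conv_bn(copy.deepcopy(conv_bn))  # BN folded into the conv weights

x = torch.randn(1, 3, 224, 224)
assert torch.allclose(conv_bn(x), fused(x), atol=1e-5)
```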
18 changes: 18 additions & 0 deletions configs/levit/levit-128-p16.py
@@ -0,0 +1,18 @@
_base_ = [
    '../_base_/models/levit-256-p16.py',
    '../_base_/datasets/imagenet_bs256_levit_224.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/imagenet_bs1024_adamw_levit.py'
]

model = dict(
    backbone=dict(
        embed_dim=[128, 256, 384],
        num_heads=[4, 8, 12],
        key_dim=[16, 16, 16],
        down_ops=[
            # ('Subsample', key_dim, num_heads, attn_ratio, mlp_ratio, stride)
            ['Subsample', 16, 128 // 16, 4, 2, 2],
            ['Subsample', 16, 256 // 16, 4, 2, 2],
        ]),
    head=dict(in_channels=384, ))
19 changes: 19 additions & 0 deletions configs/levit/levit-128s-p16.py
@@ -0,0 +1,19 @@
_base_ = [
    '../_base_/models/levit-256-p16.py',
    '../_base_/datasets/imagenet_bs256_levit_224.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/imagenet_bs1024_adamw_levit.py'
]

model = dict(
    backbone=dict(
        embed_dim=[128, 256, 384],
        num_heads=[4, 6, 8],
        depth=[2, 3, 4],
        key_dim=[16, 16, 16],
        down_ops=[
            # ('Subsample', key_dim, num_heads, attn_ratio, mlp_ratio, stride)
            ['Subsample', 16, 128 // 16, 4, 2, 2],
            ['Subsample', 16, 256 // 16, 4, 2, 2],
        ]),
    head=dict(in_channels=384, ))
17 changes: 17 additions & 0 deletions configs/levit/levit-192-p16.py
@@ -0,0 +1,17 @@
_base_ = [
    '../_base_/models/levit-256-p16.py',
    '../_base_/datasets/imagenet_bs256_levit_224.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/imagenet_bs1024_adamw_levit.py'
]

model = dict(
    backbone=dict(
        embed_dim=[192, 288, 384],
        num_heads=[3, 5, 6],
        down_ops=[
            # ('Subsample', key_dim, num_heads, attn_ratio, mlp_ratio, stride)
            ['Subsample', 32, 192 // 32, 4, 2, 2],
            ['Subsample', 32, 288 // 32, 4, 2, 2],
        ]),
    head=dict(in_channels=384, ))
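These per-model configs only override the fields that differ from `_base_/models/levit-256-p16.py`. A quick way to inspect the merged result, assuming mmengine's `Config` utility (a standalone sketch, not part of the diff):

```python
from mmengine import Config

# Load the LeViT-192 config; unspecified fields come from the inherited base files.
cfg = Config.fromfile('configs/levit/levit-192-p16.py')
print(cfg.model.backbone.embed_dim)  # [192, 288, 384]  (overridden here)
print(cfg.model.backbone.key_dim)    # [32, 32, 32]     (inherited from the base)
```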