[Refactor] Update analysis tools and documentations. (#1359)
* [Refactor] Update analysis tools and documentations.

* Update migration.md and add unit test.

* Fix print_config.py
mzr1996 committed Feb 15, 2023
1 parent b4ee9d2 commit bedf4e9
Showing 13 changed files with 285 additions and 142 deletions.
2 changes: 1 addition & 1 deletion docs/en/migration.md
@@ -576,7 +576,7 @@ Changes in [heads](mmcls.models.heads):
| :--------------------------: | :-------------------------------------------------------------------------------------------------------------- |
| `collect_env` | No changes |
| `get_root_logger` | Removed, use [`mmengine.logging.MMLogger.get_current_instance`](mmengine.logging.MMLogger.get_current_instance) |
| `load_json_log` | Waiting for support |
| `load_json_log` | The output format changed. |
| `setup_multi_processes` | Removed, use [`mmengine.utils.dl_utils.set_multi_processing`](mmengine.utils.dl_utils.set_multi_processing). |
| `wrap_non_distributed_model` | Removed, we auto wrap the model in the runner. |
| `wrap_distributed_model` | Removed, we auto wrap the model in the runner. |
66 changes: 40 additions & 26 deletions docs/en/useful_tools/log_result_analysis.md
@@ -48,7 +48,7 @@ python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys loss
#### Plot the top-1 accuracy and top-5 accuracy curves, and save the figure to results.jpg.

```shell
python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accuracy_top-1 accuracy_top-5 --legend top1 top5 --out results.jpg
python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accuracy/top1 accuracy/top5 --legend top1 top5 --out results.jpg
```

#### Compare the top-1 accuracy of two log files in the same figure.
@@ -57,11 +57,6 @@ python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accu
python tools/analysis_tools/analyze_logs.py plot_curve log1.json log2.json --keys accuracy_top-1 --legend exp1 exp2
```
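For reference, the key lookup behind `--keys` can be sketched in a few lines of Python. This is a simplified reading of the MMEngine-style `scalars.json` format (one JSON dict per line), not the tool's actual implementation:

```python
import json


def extract_curve(log_lines, key):
    """Collect (step, value) pairs for one metric key from JSON-lines logs.

    A simplified sketch; the real analyze_logs.py adds epoch handling,
    legends and plotting on top of this.
    """
    steps, values = [], []
    for line in log_lines:
        record = json.loads(line)
        if key in record:
            steps.append(record.get("step"))
            values.append(record[key])
    return steps, values


lines = [
    '{"loss": 2.3, "lr": 0.1, "step": 100}',
    '{"accuracy/top1": 32.1, "step": 1}',
    '{"loss": 1.8, "lr": 0.1, "step": 200}',
    '{"accuracy/top1": 50.2, "step": 2}',
]
print(extract_curve(lines, "accuracy/top1"))  # ([1, 2], [32.1, 50.2])
```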

```{note}
The tool will automatically select to find keys in training logs or validation logs according to the keys.
Therefore, if you add a custom evaluation metric, please also add the key to `TEST_METRICS` in this tool.
```

### How to calculate training time

`tools/analysis_tools/analyze_logs.py` can also calculate the training time according to the log files.
@@ -75,18 +70,18 @@ python tools/analysis_tools/analyze_logs.py cal_train_time \
**Description of all arguments**:

- `json_logs` : The paths of the log files, separate multiple files by spaces.
- `--include-outliers` : If set, include the first iteration in each epoch (Sometimes the time of first iterations is longer).
- `--include-outliers` : If set, include the first time record in each epoch (sometimes the first iteration takes longer).

Example:

```shell
python tools/analysis_tools/analyze_logs.py cal_train_time work_dirs/some_exp/20200422_153324.log.json
python tools/analysis_tools/analyze_logs.py cal_train_time work_dirs/your_exp/20230206_181002/vis_data/scalars.json
```

The output is expected to be like the following.

```text
-----Analyze train time of work_dirs/some_exp/20200422_153324.log.json-----
-----Analyze train time of work_dirs/your_exp/20230206_181002/vis_data/scalars.json-----
slowest epoch 68, average time is 0.3818
fastest epoch 1, average time is 0.3694
time std over epochs is 0.0020
@@ -104,41 +99,50 @@ We provide `tools/analysis_tools/eval_metric.py` to enable the user to evaluate the

```shell
python tools/analysis_tools/eval_metric.py \
${CONFIG} \
${RESULT} \
[--metrics ${METRICS}] \
[--cfg-options ${CFG_OPTIONS}] \
[--metric-options ${METRIC_OPTIONS}]
[--metric ${METRIC_OPTIONS} ...]
```

Description of all arguments:

- `config` : The path of the model config file.
- `result`: The Output result file in json/pickle format from `tools/test.py`.
- `--metrics` : Evaluation metrics, the acceptable values depend on the dataset.
- `--cfg-options`: If specified, the key-value pair config will be merged into the config file, for more details please refer to [Learn about Configs](../user_guides/config.md)
- `--metric-options`: If specified, the key-value pair arguments will be passed to the `metric_options` argument of dataset's `evaluate` function.
- `result`: The output result file in pickle format from `tools/test.py`.
- `--metric`: The metric and options used to evaluate the results. At least one metric is required, and you
  can pass `--metric` multiple times to use multiple metrics.

Please refer to the [Metric Documentation](mmcls.evaluation) to find the available metrics and options.

```{note}
In `tools/test.py`, we support using `--out-items` option to select which kind of results will be saved. Please ensure the result file includes "class_scores" to use this tool.
In `tools/test.py`, we support using the `--out-item` option to select which kind of results will be saved.
To use this tool, please ensure `--out-item` is not specified, or is specified as `pred`.
```

**Examples**:

```shell
python tools/analysis_tools/eval_metric.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py your_result.pkl --metrics accuracy --metric-options "topk=(1,5)"
# Get the prediction results
python tools/test.py configs/resnet/resnet18_8xb16_cifar10.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth \
--out results.pkl

# Eval the top-1 and top-5 accuracy
python tools/analysis_tools/eval_metric.py results.pkl --metric type=Accuracy topk=1,5

# Eval accuracy, precision, recall and f1-score
python tools/analysis_tools/eval_metric.py results.pkl --metric type=Accuracy \
--metric type=SingleLabelMetric items=precision,recall,f1-score
```
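The `--metric type=Accuracy topk=1,5` syntax above passes `key=value` pairs that are assembled into a metric config. Below is a minimal sketch of such parsing; the `parse_metric_option` helper is hypothetical, and the real argument handling in `eval_metric.py` may differ:

```python
def parse_metric_option(pairs):
    """Convert ['type=Accuracy', 'topk=1,5'] into a config-style dict.

    A hypothetical sketch of key=value parsing: comma-separated values
    become tuples, and purely numeric values are converted to int.
    """
    cfg = {}
    for pair in pairs:
        key, value = pair.split("=", maxsplit=1)
        if "," in value:
            # e.g. topk=1,5 -> (1, 5); items=precision,recall -> tuple of str
            cfg[key] = tuple(
                int(v) if v.isdigit() else v for v in value.split(","))
        else:
            cfg[key] = int(value) if value.isdigit() else value
    return cfg


print(parse_metric_option(["type=Accuracy", "topk=1,5"]))
# {'type': 'Accuracy', 'topk': (1, 5)}
```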

### How to visualize the prediction results

We can also use this tool `tools/analysis_tools/analyze_results.py` to save the images with the highest scores in successful or failed prediction.
We can use `tools/analysis_tools/analyze_results.py` to save the images with the highest scores in successful or failed prediction.

```shell
python tools/analysis_tools/analyze_results.py \
${CONFIG} \
${RESULT} \
[--out-dir ${OUT_DIR}] \
[--topk ${TOPK}] \
[--rescale-factor ${RESCALE_FACTOR}] \
[--cfg-options ${CFG_OPTIONS}]
```

@@ -148,18 +152,28 @@ python tools/analysis_tools/analyze_results.py \
- `result`: Output result file in json/pickle format from `tools/test.py`.
- `--out-dir`: The directory to store output files.
- `--topk`: The number of top-scoring images to save from the successful and the failed predictions. Defaults to 20 if not specified.
- `--rescale-factor`: Image rescale factor, which is useful if the output images are too large or too small (too small
  images may make the prediction labels too blurry).
- `--cfg-options`: If specified, the key-value pair config will be merged into the config file, for more details please refer to [Learn about Configs](../user_guides/config.md)

```{note}
In `tools/test.py`, we support using `--out-items` option to select which kind of results will be saved. Please ensure the result file includes "pred_score", "pred_label" and "pred_class" to use this tool.
In `tools/test.py`, we support using the `--out-item` option to select which kind of results will be saved.
To use this tool, please ensure `--out-item` is not specified, or is specified as `pred`.
```

**Examples**:

```shell
# Get the prediction results
python tools/test.py configs/resnet/resnet18_8xb16_cifar10.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth \
--out results.pkl

# Save the top-10 successful and failed predictions, and enlarge the sample images 10 times.
python tools/analysis_tools/analyze_results.py \
configs/resnet/resnet50_b32x8_imagenet.py \
result.pkl \
--out_dir results \
--topk 50
configs/resnet/resnet18_8xb16_cifar10.py \
results.pkl \
--out-dir output \
--topk 10 \
--rescale-factor 10
```
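Conceptually, the tool splits samples into successful and failed predictions and keeps the `topk` highest-scoring ones of each. A minimal sketch with hypothetical sample dicts (the field names `pred_label`, `gt_label` and `pred_score` are illustrative, not the tool's actual data structures):

```python
def topk_success_fail(samples, topk):
    """Pick the top-k highest-score samples among correct and wrong predictions.

    Each sample is a dict with hypothetical 'pred_label', 'gt_label' and
    'pred_score' fields.
    """
    success = [s for s in samples if s["pred_label"] == s["gt_label"]]
    fail = [s for s in samples if s["pred_label"] != s["gt_label"]]

    def score(s):
        return s["pred_score"]

    return (sorted(success, key=score, reverse=True)[:topk],
            sorted(fail, key=score, reverse=True)[:topk])


samples = [
    {"pred_label": 1, "gt_label": 1, "pred_score": 0.9},
    {"pred_label": 0, "gt_label": 1, "pred_score": 0.8},
    {"pred_label": 2, "gt_label": 2, "pred_score": 0.6},
]
ok, bad = topk_success_fail(samples, topk=1)
print(ok[0]["pred_score"], bad[0]["pred_score"])  # 0.9 0.8
```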
6 changes: 4 additions & 2 deletions docs/en/useful_tools/print_config.md
@@ -18,8 +18,10 @@ Description of all arguments:

## Examples

Print the complete config of `configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py`

```shell
# Print a complete config
python tools/misc/print_config.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py

# Save the complete config to an independent config file.
python tools/misc/print_config.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py > final_config.py
```
2 changes: 1 addition & 1 deletion docs/zh_CN/migration.md
@@ -564,7 +564,7 @@ visualizer = dict(
| :--------------------------: | :------------------------------------------------------------------------------------------------------------ |
| `collect_env` | 无变动 |
| `get_root_logger` | 移除,使用 [`mmengine.logging.MMLogger.get_current_instance`](mmengine.logging.MMLogger.get_current_instance) |
| `load_json_log` | 待支持 |
| `load_json_log` | 输出格式发生变化。 |
| `setup_multi_processes` | 移除,使用 [`mmengine.utils.dl_utils.set_multi_processing`](mmengine.utils.dl_utils.set_multi_processing) |
| `wrap_non_distributed_model` | 移除,现在 runner 会自动包装模型。 |
| `wrap_distributed_model` | 移除,现在 runner 会自动包装模型。 |
76 changes: 44 additions & 32 deletions docs/zh_CN/useful_tools/log_result_analysis.md
@@ -48,7 +48,7 @@ python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys loss
#### 绘制某日志文件对应的 top-1 和 top-5 准确率曲线图,并将曲线图导出为 results.jpg 文件。

```shell
python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accuracy_top-1 accuracy_top-5 --legend top1 top5 --out results.jpg
python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accuracy/top1 accuracy/top5 --legend top1 top5 --out results.jpg
```

#### 在同一图像内绘制两份日志文件对应的 top-1 准确率曲线图。
@@ -57,11 +57,6 @@ python tools/analysis_tools/analyze_logs.py plot_curve your_log_json --keys accu
python tools/analysis_tools/analyze_logs.py plot_curve log1.json log2.json --keys accuracy_top-1 --legend exp1 exp2
```

```{note}
The tool will automatically select to find keys in training logs or validation logs according to the keys.
Therefore, if you add a custom evaluation metric, please also add the key to `TEST_METRICS` in this tool.
```

### 如何统计训练时间

`tools/analysis_tools/analyze_logs.py` 也可以根据日志文件统计训练耗时。
@@ -74,19 +69,19 @@ python tools/analysis_tools/analyze_logs.py cal_train_time \

**所有参数的说明:**

- `json_logs` : 模型配置文件的路径(可同时传入多个,使用空格分开)。
- `--include-outliers` :如果指定,将不会排除每个轮次中第一轮迭代的记录(有时第一轮迭代会耗时较长)。
- `json_logs`模型配置文件的路径(可同时传入多个,使用空格分开)。
- `--include-outliers`如果指定,将不会排除每个轮次中第一个时间记录(有时第一轮迭代会耗时较长)。

**示例:**

```shell
python tools/analysis_tools/analyze_logs.py cal_train_time work_dirs/some_exp/20200422_153324.log.json
python tools/analysis_tools/analyze_logs.py cal_train_time work_dirs/your_exp/20230206_181002/vis_data/scalars.json
```

预计输出结果如下所示:

```text
-----Analyze train time of work_dirs/some_exp/20200422_153324.log.json-----
-----Analyze train time of work_dirs/your_exp/20230206_181002/vis_data/scalars.json-----
slowest epoch 68, average time is 0.3818
fastest epoch 1, average time is 0.3694
time std over epochs is 0.0020
@@ -95,37 +90,44 @@ average iter time: 0.3777 s/iter

## 结果分析

利用 `tools/test.py``--out` ,w我们可以将所有的样本的推理结果保存到输出 文件中。利用这一文件,我们可以进行进一步的分析。
利用 `tools/test.py` `--out`,我们可以将所有的样本的推理结果保存到输出文件中。利用这一文件,我们可以进行进一步的分析。

### 如何进行离线度量评估

我们提供了 `tools/analysis_tools/eval_metric.py` 脚本,使用户能够根据预测文件评估模型。

```shell
python tools/analysis_tools/eval_metric.py \
${CONFIG} \
${RESULT} \
[--metrics ${METRICS}] \
[--cfg-options ${CFG_OPTIONS}] \
[--metric-options ${METRIC_OPTIONS}]
[--metric ${METRIC_OPTIONS} ...]
```

**所有参数说明**

- `config` : 配置文件的路径。
- `result`: `tools/test.py`的输出结果文件。
- `--metrics` : 评估的衡量指标,可接受的值取决于数据集类。
- `--cfg-options`:额外的配置选项,会被合入配置文件,参考[教程 1:如何编写配置文件](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html)
- `--metric-options`:如果指定了,这些选项将被传递给数据集 `evaluate` 函数的 `metric_options` 参数。
- `result``tools/test.py` 输出的结果文件。
- `--metric`:用于评估结果的指标,请至少指定一个指标,并且你可以通过指定多个 `--metric` 来同时计算多个指标。

请参考[评估文档](mmcls.evaluation)选择可用的评估指标和对应的选项。

```{note}
In `tools/test.py`, we support using `--out-items` option to select which kind of results will be saved. Please ensure the result file includes "class_scores" to use this tool.
在 `tools/test.py` 中,我们支持使用 `--out-item` 选项来选择保存何种结果至输出文件。
请确保没有额外指定 `--out-item`,或指定了 `--out-item=pred`。
```

**示例**:

```shell
python tools/analysis_tools/eval_metric.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py your_result.pkl --metrics accuracy --metric-options "topk=(1,5)"
# 获取结果文件
python tools/test.py configs/resnet/resnet18_8xb16_cifar10.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth \
--out results.pkl

# 计算 top-1 和 top-5 准确率
python tools/analysis_tools/eval_metric.py results.pkl --metric type=Accuracy topk=1,5

# 计算准确率、精确度、召回率、F1-score
python tools/analysis_tools/eval_metric.py results.pkl --metric type=Accuracy \
--metric type=SingleLabelMetric items=precision,recall,f1-score
```

### 如何将预测结果可视化
@@ -138,27 +140,37 @@ python tools/analysis_tools/analyze_results.py \
${RESULT} \
[--out-dir ${OUT_DIR}] \
[--topk ${TOPK}] \
[--rescale-factor ${RESCALE_FACTOR}] \
[--cfg-options ${CFG_OPTIONS}]
```

**所有参数说明:**

- `config` : 配置文件的路径。
- `result`: `tools/test.py`的输出结果文件。
- `--out_dir`:保存结果分析的文件夹路径。
- `--topk`: 分别保存多少张预测成功/失败的图像。如果不指定,默认为 `20`
- `--cfg-options`: 额外的配置选项,会被合入配置文件,参考[教程 1:如何编写配置文件](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html)
- `config`:配置文件的路径。
- `result``tools/test.py`的输出结果文件。
- `--out-dir`:保存结果分析的文件夹路径。
- `--topk`:分别保存多少张预测成功/失败的图像。如果不指定,默认为 `20`
- `--rescale-factor`:图像的缩放系数,如果样本图像过大或过小时可以使用(过小的图像可能导致结果标签非常模糊)。
- `--cfg-options`:额外的配置选项,会被合入配置文件,参考[学习配置文件](../user_guides/config.md)

```{note}
In `tools/test.py`, we support using `--out-items` option to select which kind of results will be saved. Please ensure the result file includes "pred_score", "pred_label" and "pred_class" to use this tool.
在 `tools/test.py` 中,我们支持使用 `--out-item` 选项来选择保存何种结果至输出文件。
请确保没有额外指定 `--out-item`,或指定了 `--out-item=pred`。
```

**示例**:

```shell
# 获取预测结果文件
python tools/test.py configs/resnet/resnet18_8xb16_cifar10.py \
https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth \
--out results.pkl

# 保存预测成功/失败的图像中,得分最高的前 10 张,并在可视化时将输出图像放大 10 倍。
python tools/analysis_tools/analyze_results.py \
configs/resnet/resnet50_b32x8_imagenet.py \
result.pkl \
--out_dir results \
--topk 50
configs/resnet/resnet18_8xb16_cifar10.py \
results.pkl \
--out-dir output \
--topk 10 \
--rescale-factor 10
```
6 changes: 5 additions & 1 deletion docs/zh_CN/useful_tools/print_config.md
@@ -13,12 +13,16 @@ python tools/misc/print_config.py ${CONFIG} [--cfg-options ${CFG_OPTIONS}]
所有参数的说明:

- `config` : 模型配置文件的路径。
- `--cfg-options`::额外的配置选项,会被合入配置文件,参考[教程 1:如何编写配置文件](https://mmclassification.readthedocs.io/zh_CN/latest/tutorials/config.html)
- `--cfg-options`:额外的配置选项,会被合入配置文件,参考[学习配置文件](../user_guides/config.md)

## 示例:

打印`configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py`文件的完整配置

```shell
# 打印完整的配置文件
python tools/misc/print_config.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py

# 将完整的配置文件保存为一个独立的配置文件
python tools/misc/print_config.py configs/t2t_vit/t2t-vit-t-14_8xb64_in1k.py > final_config.py
```
6 changes: 5 additions & 1 deletion mmcls/utils/__init__.py
@@ -1,6 +1,10 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .analyze import load_json_log
from .collect_env import collect_env
from .progress import track_on_main_process
from .setup_env import register_all_modules

__all__ = ['collect_env', 'register_all_modules', 'track_on_main_process']
__all__ = [
'collect_env', 'register_all_modules', 'track_on_main_process',
'load_json_log'
]
43 changes: 43 additions & 0 deletions mmcls/utils/analyze.py
@@ -0,0 +1,43 @@
# Copyright (c) OpenMMLab. All rights reserved.
import json


def load_json_log(json_log):
    """Load and convert json logs to a log dict.

    Args:
        json_log (str): The path of the json log file.

    Returns:
        dict: The result dict contains two items, "train" and "val", for
        the training log and the validation log.

    Example:
        An example output:

        .. code-block:: python

            {
                'train': [
                    {"lr": 0.1, "time": 0.02, "epoch": 1, "step": 100},
                    {"lr": 0.1, "time": 0.02, "epoch": 1, "step": 200},
                    {"lr": 0.1, "time": 0.02, "epoch": 1, "step": 300},
                    ...
                ],
                'val': [
                    {"accuracy/top1": 32.1, "step": 1},
                    {"accuracy/top1": 50.2, "step": 2},
                    {"accuracy/top1": 60.3, "step": 3},
                    ...
                ]
            }
    """
    log_dict = dict(train=[], val=[])
    with open(json_log, 'r') as log_file:
        for line in log_file:
            log = json.loads(line.strip())
            # A hack trick to determine whether the line is a training log:
            # only training logs record the learning rate.
            mode = 'train' if 'lr' in log else 'val'
            log_dict[mode].append(log)

    return log_dict
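A quick, self-contained usage sketch of the helper added above (the function body is repeated here so the demo runs standalone):

```python
import json
import os
import tempfile


# Repeated from mmcls/utils/analyze.py for a self-contained demo.
def load_json_log(json_log):
    log_dict = dict(train=[], val=[])
    with open(json_log, 'r') as log_file:
        for line in log_file:
            log = json.loads(line.strip())
            # Only training logs record the learning rate.
            mode = 'train' if 'lr' in log else 'val'
            log_dict[mode].append(log)
    return log_dict


# Write a tiny JSON-lines log to a temporary file, then load it back.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    f.write('{"lr": 0.1, "time": 0.02, "epoch": 1, "step": 100}\n')
    f.write('{"accuracy/top1": 32.1, "step": 1}\n')
    path = f.name

logs = load_json_log(path)
os.unlink(path)
print(len(logs['train']), len(logs['val']))  # 1 1
```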