# Instructions for evaluating accuracy (mAP) of SSD models
Preparation
-----------
1. Prepare the image data and label ('bbox') file for the evaluation. I used COCO [2017 Val images (5K/1GB)](http://images.cocodataset.org/zips/val2017.zip) and [2017 Train/Val annotations (241MB)](http://images.cocodataset.org/annotations/annotations_trainval2017.zip). You could use your own dataset for the evaluation instead, but you'd need to convert the labels into [COCO Object Detection ('bbox') format](http://cocodataset.org/#format-data) (see the annotation-format sketch after the file list below) if you want to use the code in this repository without modifications.
More specifically, I downloaded the images and labels, and unzipped the files into `${HOME}/data/coco/`.
```shell
$ wget http://images.cocodataset.org/zips/val2017.zip \
-O ${HOME}/Downloads/val2017.zip
$ wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip \
-O ${HOME}/Downloads/annotations_trainval2017.zip
$ mkdir -p ${HOME}/data/coco/images
$ cd ${HOME}/data/coco/images
$ unzip ${HOME}/Downloads/val2017.zip
$ cd ${HOME}/data/coco
$ unzip ${HOME}/Downloads/annotations_trainval2017.zip
```
Later on, I'd be using the following (unzipped) image and annotation files for the evaluation.
```
${HOME}/data/coco/images/val2017/*.jpg
${HOME}/data/coco/annotations/instances_val2017.json
```
2. Install 'pycocotools'. The easiest way is to use `pip3 install`.
```shell
$ sudo pip3 install pycocotools
```
Alternatively, you could build and install it from [source](https://github.com/cocodataset/cocoapi). A quick sanity check of the whole setup is sketched right after these preparation steps.
3. Install additional requirements.
```shell
$ sudo pip3 install progressbar2
```
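Before moving on, you can optionally sanity-check the setup. The sketch below (my own addition, not part of this repository) simply asks pycocotools to index the annotation file prepared in step 1; with val2017 it should report 5000 images and 80 categories.
```python
import os
from pycocotools.coco import COCO

# Sanity check: make sure pycocotools can index the annotation file.
# With val2017 this should report 5000 images and 80 categories.
ann_path = os.path.expanduser('~/data/coco/annotations/instances_val2017.json')
coco_gt = COCO(ann_path)
print('images:    ', len(coco_gt.getImgIds()))
print('categories:', len(coco_gt.getCatIds()))
```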
Evaluation
----------
I've created the [eval_ssd.py](eval_ssd.py) script to do the [mAP evaluation](http://cocodataset.org/#detection-eval).
```
usage: eval_ssd.py [-h] [--mode {tf,trt}] [--imgs_dir IMGS_DIR]
                   [--annotations ANNOTATIONS]
                   {ssd_mobilenet_v1_coco,ssd_mobilenet_v2_coco}
```
The script takes one mandatory argument: either 'ssd_mobilenet_v1_coco' or 'ssd_mobilenet_v2_coco'. In addition, it accepts the following options:
* `--mode {tf,trt}`: to evaluate either the unoptimized TensorFlow frozen inference graph (tf) or the optimized TensorRT engine (trt).
* `--imgs_dir IMGS_DIR`: to specify an alternative directory for reading image files.
* `--annotations ANNOTATIONS`: to specify an alternative annotation/label file.
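
The script itself is not reproduced here, but the numbers it reports evidently come from the standard pycocotools evaluation flow (the 'Running per image evaluation...' lines in the output below are printed by COCOeval). The minimal sketch below walks through that flow; the single dummy detection is only there so the snippet runs end-to-end, whereas the real evaluation feeds in the model's detections over all 5000 val2017 images.
```python
import os
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

ann_path = os.path.expanduser('~/data/coco/annotations/instances_val2017.json')
coco_gt = COCO(ann_path)

# In the real evaluation, 'detections' holds one dict per detected box,
# collected from the model over all 5000 val2017 images. A single dummy
# detection is used here so the sketch runs end-to-end (and scores ~0 mAP).
detections = [{
    'image_id': coco_gt.getImgIds()[0],
    'category_id': 1,                     # 1 = 'person' in COCO
    'bbox': [100.0, 100.0, 50.0, 80.0],   # [x, y, width, height] in pixels
    'score': 0.9,
}]

coco_dt = coco_gt.loadRes(detections)           # index the detections
coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')  # 'bbox'-style evaluation
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()   # prints the 12-line AP/AR table shown below

# The "overall mAP" is the first entry of that table:
# AP averaged over IoU thresholds 0.50:0.05:0.95, all areas, maxDets=100.
print('overall mAP:', coco_eval.stats[0])
```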
For example, I evaluated both 'ssd_mobilenet_v1_coco' and 'ssd_mobilenet_v2_coco' TensorRT engines on my x86_64 PC and got the results shown below. The overall mAP values are `0.232` and `0.248`, respectively.
```shell
$ python3 eval_ssd.py --mode trt ssd_mobilenet_v1_coco
......
100% (5000 of 5000) |####################| Elapsed Time: 0:00:26 Time: 0:00:26
loading annotations into memory...
Done (t=0.36s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.11s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=8.89s).
Accumulating evaluation results...
DONE (t=1.37s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.232
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.351
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.166
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.530
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.209
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.264
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.264
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.022
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.606
None
$
$ python3 eval_ssd.py --mode trt ssd_mobilenet_v2_coco
......
100% (5000 of 5000) |####################| Elapsed Time: 0:00:29 Time: 0:00:29
loading annotations into memory...
Done (t=0.37s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.12s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=9.47s).
Accumulating evaluation results...
DONE (t=1.42s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.248
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.375
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.273
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.176
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.221
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.278
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.279
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.027
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.202
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.643
None
```