
2D Object Detection with tkDNN

Supported Networks

  • Yolo4, Yolo4-csp, Yolo4x, Yolo4_berkeley, Yolo4tiny
  • Yolo3, Yolo3_berkeley, Yolo3_coco4, Yolo3_flir, Yolo3_512, Yolo3tiny, Yolo3tiny_512
  • Yolo2, Yolo2_voc, Yolo2tiny
  • Csresnext50-panet-spp, Csresnext50-panet-spp_berkeley
  • Resnet101_cnet, Dla34_cnet
  • Mobilenetv2ssd, Mobilenetv2ssd512, Bdd-mobilenetv2ssd

Index

  • 2D Object Detection
  • FP16 inference
  • INT8 inference
  • Batching

2D Object Detection

This is an example using yolov4.

To run 2D object detection, first create the .rt file by running:

rm yolo4_fp32.rt        # be sure to delete (or move) old TensorRT files
./test_yolo4            # run the yolo test (this is slow)

If you run into problems during creation, enable the TensorRT debug build to inspect the error:

cmake .. -DCMAKE_BUILD_TYPE=Debug -DDEBUG=True
make

Once you have successfully created your rt file, run the demo:

./demo <path-to-config>

In general the demo program takes one parameter, <path-to-config>, the path to the configuration file. The parameter is optional and its default value is "../demo/demoConfig.yaml".

The config file is a YAML file with the following attributes (a sketch of a complete file follows the list):

  • net is the rt file generated by a test
  • input is the path to a video file or a camera input (on Linux)
  • win_input is the path to a video file or a camera input (on Windows)
  • ntype is the type of network. Three types are currently supported: y (YOLO family), c (CenterNet family) and m (MobileNet-SSD family)
  • n_classes is the number of classes the network is trained on
  • n_batch is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE to the required batch size and recreate the rt file for the network).
  • conf_thresh is the confidence threshold for the detector. Only bounding boxes with confidence greater than conf_thresh will be displayed.
  • show if set to 0 the demo will not show the visualization (only when n_batch == 1)
  • save if set to 1 the demo will save the video of the demo into result.mp4 (only when n_batch == 1)
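
For reference, a minimal sketch of a config file using the attributes above could look like this (the paths and values below are placeholders, not the repository defaults):

net: yolo4_fp32.rt                 # rt file generated by ./test_yolo4
input: ../demo/yolo_test.mp4       # video file or camera input on Linux (placeholder path)
win_input: ../demo/yolo_test.mp4   # video file or camera input on Windows (placeholder path)
ntype: y                           # y (YOLO), c (CenterNet) or m (MobileNet-SSD)
n_classes: 80                      # number of classes the network is trained on
n_batch: 1                         # batch size used in inference
conf_thresh: 0.3                   # discard detections below this confidence (placeholder value)
show: 1                            # show the visualization
save: 0                            # do not save result.mp4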

N.B. FP32 inference is used by default.


FP16 inference

To run the demo with FP16 inference, follow these steps (example with yolov4):

export TKDNN_MODE=FP16  # set the half floating point optimization
rm yolo4_fp16.rt        # be sure to delete (or move) old TensorRT files
./test_yolo4            # run the yolo test (this is slow)
# set net: yolo4_fp16.rt in the config file
./demo

N.B. Using FP16 inference will introduce small numerical errors in the results (in the first or second decimal place).

INT8 inference

To run the demo with INT8 inference three environment variables need to be set:

  • export TKDNN_MODE=INT8: set the 8-bit integer optimization
  • export TKDNN_CALIB_IMG_PATH=/path/to/calibration/image_list.txt: image_list.txt contains, one per line, the absolute path to a calibration image (a hypothetical example follows this list)
  • export TKDNN_CALIB_LABEL_PATH=/path/to/calibration/label_list.txt: label_list.txt contains, one per line, the absolute path to a calibration label
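
For illustration only, both files are plain text with one absolute path per line; the paths below are hypothetical:

# image_list.txt (one absolute path per calibration image)
/home/user/calibration/images/000001.jpg
/home/user/calibration/images/000002.jpg

# label_list.txt (one absolute path per calibration label, in the same order)
/home/user/calibration/labels/000001.txt
/home/user/calibration/labels/000002.txt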

You should provide image_list.txt and label_list.txt yourself, using training images. However, if you want to quickly test INT8 inference, you can run (from the root folder of this repo)

bash scripts/download_validation.sh COCO

to automatically download the COCO2017 validation set (inside the demo folder) and create the needed files. Use BDD instead of COCO to download the BDD validation set.

Then a complete example using yolo4 and the COCO dataset would be:

export TKDNN_MODE=INT8
export TKDNN_CALIB_LABEL_PATH=../demo/COCO_val2017/all_labels.txt
export TKDNN_CALIB_IMG_PATH=../demo/COCO_val2017/all_images.txt
rm yolo4_int8.rt        # be sure to delete (or move) old TensorRT files
./test_yolo4            # run the yolo test (this is slow)
# set net: yolo4_int8.rt in the config file
./demo

N.B.

  • Using INT8 inference will lead to some errors in the results.
  • The test will be slower: this is due to the INT8 calibration, which may take some time to complete.
  • INT8 calibration requires TensorRT version greater than or equal to 6.0
  • Only 100 images are used to create the calibration table by default (set in the code).

Batching

BatchSize bigger than 1

export TKDNN_BATCHSIZE=2
# build tensorRT files

This will create a TensorRT file with the desired maximum batch size. The test will still run with a batch size of 1, but the created TensorRT file can handle the desired batch size.

Test batch Inference

This will test the network with random input and check if the output of each batch is the same.

./test_rtinference <network-rt-file> <number-of-batches>
# <number-of-batches> should be less than or equal to the max batch size of the <network-rt-file>

# example
export TKDNN_BATCHSIZE=4           # set max batch size
rm yolo3_fp32.rt                   # be sure to delete (or move) old TensorRT files
./test_yolo3                       # build RT file
./test_rtinference yolo3_fp32.rt 4 # test with a batch size of 4