2D Object Detection with tkDNN

Supported Networks

  • Yolo4, Yolo4-csp, Yolo4x, Yolo4_berkeley, Yolo4tiny
  • Yolo3, Yolo3_berkeley, Yolo3_coco4, Yolo3_flir, Yolo3_512, Yolo3tiny, Yolo3tiny_512
  • Yolo2, Yolo2_voc, Yolo2tiny
  • Csresnext50-panet-spp, Csresnext50-panet-spp_berkeley
  • Resnet101_cnet, Dla34_cnet
  • Mobilenetv2ssd, Mobilenetv2ssd512, Bdd-mobilenetv2ssd

Index

2D Object Detection

This is an example using yolov4.

To run object detection, first create the .rt file by running:

rm yolo4_fp32.rt        # delete (or move) any old TensorRT file first
./test_yolo4            # run the yolo4 test to build the .rt file (slow)

If the creation fails, enable TensorRT debug output to inspect the error by rebuilding with:

cmake .. -DDEBUG=True
make

Once you have successfully created your rt file, run the demo:

./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y

In general the demo program takes 7 parameters:

./demo <network-rt-file> <path-to-video> <kind-of-network> <number-of-classes> <n-batches> <show-flag> <conf-thresh>

where

  • <network-rt-file> is the rt file generated by a test
  • <path-to-video> is the path to a video file or a camera input
  • <kind-of-network> is the type of network. Three types are currently supported: y (YOLO family), c (CenterNet family) and m (MobileNet-SSD family)
  • <number-of-classes> is the number of classes the network is trained on
  • <n-batches> is the number of batches to use in inference (N.B. you should first export TKDNN_BATCHSIZE with the required n-batches value and recreate the rt file for the network)
  • <show-flag> controls the visualization: if set to 0, the demo does not show the visualization but saves the video to result.mp4 (if n-batches == 1)
  • <conf-thresh> is the confidence threshold for the detector. Only bounding boxes with a confidence greater than conf-thresh are displayed.
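
As a sketch of how the seven parameters fit together, the hypothetical helper below prints a full `./demo` command line and fills in any trailing arguments that are left out. The function name `demo_cmd` and the fallback values (80 classes, 1 batch, show flag 1, threshold 0.3) are assumptions chosen for illustration, not values stated in this document.

```shell
# Hypothetical helper: print a complete ./demo command line.
# The fallback values below are illustrative assumptions, not documented defaults.
demo_cmd() {
    rt_file=$1                  # <network-rt-file>
    video=$2                    # <path-to-video>
    kind=${3:-y}                # <kind-of-network>: y, c or m
    n_classes=${4:-80}          # <number-of-classes>
    n_batches=${5:-1}           # <n-batches>
    show=${6:-1}                # <show-flag>
    conf_thresh=${7:-0.3}       # <conf-thresh>

    # Reject anything other than the three supported network kinds.
    case $kind in
        y|c|m) ;;
        *) echo "unknown network kind: $kind" >&2; return 1 ;;
    esac

    echo "./demo $rt_file $video $kind $n_classes $n_batches $show $conf_thresh"
}

# Example: print the command for a YOLO network without on-screen visualization.
demo_cmd yolo4_fp32.rt ../demo/yolo_test.mp4 y 80 1 0 0.5
```

Printing the command instead of running it makes it easy to check the argument order before launching the demo.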

N.B. FP32 inference is used by default.

FP16 inference

To run the demo with FP16 inference follow these steps (example with yolov3):

export TKDNN_MODE=FP16  # enable half-precision floating point optimization
rm yolo3_fp16.rt        # delete (or move) any old TensorRT file first
./test_yolo3            # run the yolo3 test to build the .rt file (slow)
./demo yolo3_fp16.rt ../demo/yolo_test.mp4 y

N.B. FP16 inference introduces small numerical errors in the results (in the first or second decimal place).
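
The examples so far suggest that the generated file is named after the network and the inference mode (yolo4_fp32.rt, yolo3_fp16.rt). A minimal sketch of that naming, assuming the `<network>_<mode>.rt` pattern holds in general, is shown below; `rt_file_name` is a hypothetical helper, not part of tkDNN.

```shell
# Hypothetical helper: guess the .rt file name a test will produce.
# The "<network>_<mode>.rt" pattern is inferred from the examples in this
# document; treat it as an assumption rather than a documented rule.
rt_file_name() {
    network=$1
    # FP32 is the mode used when TKDNN_MODE is unset.
    mode=$(printf '%s' "${TKDNN_MODE:-FP32}" | tr '[:upper:]' '[:lower:]')
    echo "${network}_${mode}.rt"
}

# Example: run in a subshell so the exported mode does not leak out.
( export TKDNN_MODE=FP16; rt_file_name yolo3 )   # prints yolo3_fp16.rt
```

This can be handy for removing the right stale file before rebuilding, for example `rm -f "$(rt_file_name yolo3)"`.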

INT8 inference

To run the demo with INT8 inference, three environment variables need to be set:

  • export TKDNN_MODE=INT8: sets the 8-bit integer optimization
  • export TKDNN_CALIB_IMG_PATH=/path/to/calibration/image_list.txt: image_list.txt contains, on each line, the absolute path to a calibration image
  • export TKDNN_CALIB_LABEL_PATH=/path/to/calibration/label_list.txt: label_list.txt contains, on each line, the absolute path to a calibration label
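
The two list files described above could be generated along these lines. This is a minimal sketch: the helper name `make_calib_lists`, the `.jpg` and `.txt` extensions, and the assumption that images and labels live in two separate flat directories are all hypothetical and may not match your dataset.

```shell
# Hypothetical helper: write image_list.txt and label_list.txt with one
# absolute path per line. Extensions and directory layout are assumptions.
# The output directory should differ from the label directory, otherwise the
# generated .txt lists would be picked up as labels on a second run.
make_calib_lists() {
    img_dir=$(cd "$1" && pwd) || return 1     # absolute path of the image directory
    label_dir=$(cd "$2" && pwd) || return 1   # absolute path of the label directory
    out_dir=$3

    # Sort both lists so that identically named images and labels line up.
    find "$img_dir" -maxdepth 1 -type f -name '*.jpg' | sort > "$out_dir/image_list.txt"
    find "$label_dir" -maxdepth 1 -type f -name '*.txt' | sort > "$out_dir/label_list.txt"
}

# Example usage (placeholder paths):
# make_calib_lists /data/calib/images /data/calib/labels /data/calib
# export TKDNN_CALIB_IMG_PATH=/data/calib/image_list.txt
# export TKDNN_CALIB_LABEL_PATH=/data/calib/label_list.txt
```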

You should provide image_list.txt and label_list.txt, using training images. However, if you want to quickly test INT8 inference, you can run (from the root folder of this repo)

bash scripts/download_validation.sh COCO

to automatically download the COCO2017 validation set (inside the demo folder) and create the required files. Use BDD instead of COCO to download the BDD validation set.

A complete example using yolo3 and the COCO dataset would then be:

export TKDNN_MODE=INT8
export TKDNN_CALIB_LABEL_PATH=../demo/COCO_val2017/all_labels.txt
export TKDNN_CALIB_IMG_PATH=../demo/COCO_val2017/all_images.txt
rm yolo3_int8.rt        # delete (or move) any old TensorRT file first
./test_yolo3            # run the yolo3 test to build the .rt file (slow)
./demo yolo3_int8.rt ../demo/yolo_test.mp4 y

N.B.

  • INT8 inference introduces some errors in the results.
  • The test will be slower because of the INT8 calibration, which may take some time to complete.
  • INT8 calibration requires TensorRT version 6.0 or later.
  • By default, only 100 images are used to create the calibration table (set in the code).
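
Because the calibration step is slow, it can be worth checking the three environment variables before starting the test. The function below is a hedged sketch of such a pre-flight check; `check_int8_env` is a hypothetical name and the check only verifies that the list files exist and are non-empty, not that their contents are valid.

```shell
# Hypothetical pre-flight check for the INT8 environment variables.
check_int8_env() {
    if [ "${TKDNN_MODE:-}" != "INT8" ]; then
        echo "TKDNN_MODE is not set to INT8" >&2
        return 1
    fi
    # Both calibration lists must exist and contain at least one line.
    for list in "${TKDNN_CALIB_IMG_PATH:-}" "${TKDNN_CALIB_LABEL_PATH:-}"; do
        if [ ! -s "$list" ]; then
            echo "calibration list missing or empty: '$list'" >&2
            return 1
        fi
    done
    return 0
}

# Example: only start the slow build if the check passes.
# check_int8_env && ./test_yolo3
```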

Batching

Batch size greater than 1

export TKDNN_BATCHSIZE=2
# build the TensorRT files

This will create a TensorRT file with the desired max batch size. The test will still run with a batch size of 1, but the generated TensorRT file can handle the desired batch size.

Test batch Inference

This tests the network with random input and checks whether the output of each batch is the same.

./test_rtinference <network-rt-file> <number-of-batches>
# <number-of-batches> should be less than or equal to the max batch size of the <network-rt-file>

# example
export TKDNN_BATCHSIZE=4           # set max batch size
rm yolo3_fp32.rt                   # delete (or move) any old TensorRT file first
./test_yolo3                       # build RT file
./test_rtinference yolo3_fp32.rt 4 # test with a batch size of 4
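
The constraint on the number of batches can be checked before calling the test. The guard below is a small sketch under two assumptions: that TKDNN_BATCHSIZE still holds the value used when the .rt file was built, and that an unset variable corresponds to a max batch size of 1. `batch_ok` is a hypothetical name.

```shell
# Hypothetical guard: succeed only if the requested batch count fits the
# max batch size. Assumes TKDNN_BATCHSIZE matches the value used at build time
# and that an unset variable means a max batch size of 1.
batch_ok() {
    requested=$1
    max=${TKDNN_BATCHSIZE:-1}
    [ "$requested" -ge 1 ] && [ "$requested" -le "$max" ]
}

# Example: run the batch test only when the request is within bounds.
# batch_ok 4 && ./test_rtinference yolo3_fp32.rt 4
```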