YOLOv3 Practical Complete Example
This document uses the ENNP SDK to port YOLOv3 to the NPU hardware acceleration module of the EIC7700 and run neural network inference on it. Before following this document, make sure you have set up the required environment by following the ENNP SDK download, EsQuant installation, and EsAAC and EsSimulator tool installation instructions.
This document has been tested on x86 Ubuntu 22.04 with Linux 6.8.0-52-generic.
Model Conversion
Export ONNX Model
- Clone the official code from GitHub
git clone https://github.com/ultralytics/yolov3.git
- Modify dependency versions
Set the torch version to 1.12.0 and the torchvision version to 0.13.0, and uncomment the onnx and onnx-simplifier entries.
vim yolov3/requirements.txt
torch==1.12.0
torchvision==0.13.0
onnx>=1.10.0
onnx-simplifier>=0.4.1
- Install dependencies
cd yolov3
pip3 install -r requirements.txt
- Download the official model
wget https://github.com/ultralytics/yolov3/releases/download/v9.6.0/yolov3.pt
- Export to ONNX
python3 export.py --weights ./yolov3.pt --img-size 416 --simplify --opset 13 --include onnx
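As a quick sanity check on the export, the spatial size of each detection-head feature map follows directly from the input size. The sketch below assumes the standard YOLOv3 layout (strides 8, 16, and 32, 3 anchors per scale, 80 COCO classes), none of which is stated in this document. The helper name `expected_head_shapes` is illustrative only.

```python
def expected_head_shapes(img_size=416, num_classes=80, anchors_per_scale=3,
                         strides=(8, 16, 32)):
    """Return the (N, C, H, W) shape expected at each detection-head convolution."""
    # Each anchor predicts 4 box values, 1 objectness score and one score per class.
    channels = anchors_per_scale * (num_classes + 5)
    shapes = []
    for stride in strides:
        if img_size % stride != 0:
            raise ValueError(f"img_size {img_size} is not divisible by stride {stride}")
        grid = img_size // stride
        shapes.append((1, channels, grid, grid))
    return shapes


for shape in expected_head_shapes(416):
    print(shape)
```

Under these assumptions, the printed shapes can be compared with the shapes Netron reports for the convolution outputs of the exported model.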
Model Pruning
The exported model contains post-processing operations that the quantization tool does not currently support, so they must be pruned. The parts to be pruned are the operations after the convolution layers. Inspect the model in Netron to find the corresponding tensor names, then set input_names and output_names in the pruning script accordingly. In this example, the output tensors are:
- onnx::Shape_406
- onnx::Shape_461
- onnx::Reshape_516
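import onnx
import onnx.utils

# Exported model and the tensor names that bound the sub-graph to keep
input_onnx = "yolov3.onnx"
input_names = ["images"]
output_names = ["onnx::Shape_406", "onnx::Shape_461", "onnx::Reshape_516"]

# Output file name: yolov3.onnx -> yolov3_sim_extract_416_notranspose_noreshape.onnx
cut_suffix = "_sim_extract_416_notranspose_noreshape." + input_onnx.split('.')[-1]
new_output_onnx = input_onnx.replace(".onnx", cut_suffix)
print(new_output_onnx)

# Keep only the graph between input_names and output_names
onnx.utils.extract_model(input_onnx, new_output_onnx, input_names, output_names)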
After running this Python script, the pruned model yolov3_sim_extract_416_notranspose_noreshape.onnx will be generated in the current directory.
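The tensor names used as output_names can differ between exports, so it may help to cross-check what Netron shows with a small script. The sketch below is a heuristic, not part of the ENNP SDK: it assumes the post-processing begins where a Conv output feeds directly into a Shape or Reshape node, which matches the tensor names used in this example. The function names are illustrative.

```python
from types import SimpleNamespace

# Assumption: the detection post-processing starts with one of these operators.
POSTPROCESS_OPS = {"Shape", "Reshape"}


def find_cut_points(nodes):
    """Return Conv output tensor names consumed directly by a Shape or Reshape node.

    `nodes` is an iterable of objects exposing `op_type`, `input` and `output`,
    which is the interface of onnx.NodeProto.
    """
    nodes = list(nodes)
    conv_outputs = {name for n in nodes if n.op_type == "Conv" for name in n.output}
    cut_points = []
    for n in nodes:
        if n.op_type in POSTPROCESS_OPS:
            for name in n.input:
                if name in conv_outputs and name not in cut_points:
                    cut_points.append(name)
    return cut_points


def find_cut_points_in_file(path):
    """Load an ONNX file and apply the heuristic to its graph."""
    import onnx  # imported lazily so find_cut_points itself has no hard dependency
    return find_cut_points(onnx.load(path).graph.node)


# Tiny stand-in graph that mimics the NodeProto attributes used above.
demo = [
    SimpleNamespace(op_type="Conv", input=["x", "w0"], output=["feat"]),
    SimpleNamespace(op_type="Conv", input=["feat", "w1"], output=["head"]),
    SimpleNamespace(op_type="Reshape", input=["head", "shape"], output=["pred"]),
]
print(find_cut_points(demo))
```

Calling `find_cut_points_in_file("yolov3.onnx")` on the exported model should list candidate names to compare against Netron before editing output_names.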
Model Quantization
Use the EsQuant tool to quantize the model. This must be done inside the EsQuant Docker environment. For more details, refer to the EsQuant model quantization tool documentation.
- Configure config.json
The config.json file is provided in nn-tools/sample/yolov3/esquant. Below is an example; modify it according to your actual setup.
{
  "version": "1.3",
  "model": {
    "model_path": "/workspace/yolov3_sim_extract_416_notranspose_noreshape.onnx",
    "save_path": "/workspace/yolov3/",
    "images_list": "/workspace/img_list.txt",
    "analysis_list": "/workspace/alys_list.txt"
  },
  "quant": {
    "quantized_method": "per_channel",
    "quantized_dtype": "int8",
    "requant_mode": "mean",
    "quantized_algorithm": "at_eic",
    "optimization_option": "auto",
    "bias_option": "absmax",
    "nodes_option1": [],
    "nodes_option2": [],
    "nodes_i8": [],
    "nodes_i16": [],
    "mean": [0, 0, 0],
    "std": [1, 1, 1],
    "norm": true,
    "scale_path": "",
    "enable_analyse": true,
    "device": "cpu"
  },
  "preprocess": {
    "input_format": "RGB",
    "keep_ratio": false,
    "resize_shape": [416, 416],
    "crop_shape": [416, 416]
  }
}
- Download the coco2017 dataset in the EsQuant Docker environment
Model quantization requires a calibration set. Here we use the coco2017-1000 dataset for calibration. Please download it yourself, and refer to nn-tools/sample/yolov3/esquant to create the img_list.txt and alys_list.txt files.
- img_list.txt
/workspace/coco/val2017_1000/000000095069.jpg
/workspace/coco/val2017_1000/000000499313.jpg
/workspace/coco/val2017_1000/000000579893.jpg
/workspace/coco/val2017_1000/000000023230.jpg
/workspace/coco/val2017_1000/000000162035.jpg
...
- alys_list.txt
/workspace/coco/val2017_1000/000000095069.jpg
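Both list files contain one absolute image path per line, so they can be generated rather than written by hand. The following is a minimal sketch, assuming that the calibration images are .jpg files in a single directory, that the order of entries does not matter, and that one image is enough for the analysis list, as in the sample above. The function name and its parameters are illustrative and are not part of EsQuant.

```python
from pathlib import Path


def write_image_lists(image_dir, img_list="img_list.txt", alys_list="alys_list.txt",
                      analysis_count=1):
    """Write one absolute image path per line and return (calibration, analysis) lists."""
    images = sorted(str(p.resolve()) for p in Path(image_dir).glob("*.jpg"))
    if not images:
        raise FileNotFoundError(f"no .jpg files found in {image_dir}")
    # Assumption: the analysis list is a small subset of the calibration images.
    analysis = images[:analysis_count]
    Path(img_list).write_text("\n".join(images) + "\n")
    Path(alys_list).write_text("\n".join(analysis) + "\n")
    return images, analysis


# Example call inside the Docker container (paths taken from the sample above):
# write_image_lists("/workspace/coco/val2017_1000",
#                   "/workspace/img_list.txt", "/workspace/alys_list.txt")
```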
- Execute Quantization
python3 Example_with_config.py --config_path ./config.json --preprocess_name Yolo
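Before launching the command above, it can be worth confirming that the files referenced in config.json actually exist inside the container. The sketch below is a hypothetical pre-flight check, not an EsQuant feature. Which keys EsQuant treats as mandatory is an assumption based only on the sample configuration.

```python
import json
import os


def check_config(config):
    """Return a list of human-readable problems found in an EsQuant-style config dict."""
    problems = []
    model = config.get("model", {})
    # Assumption: these three entries must point at existing files.
    for key in ("model_path", "images_list", "analysis_list"):
        path = model.get(key)
        if not path:
            problems.append(f"model.{key} is missing")
        elif not os.path.isfile(path):
            problems.append(f"model.{key} does not exist: {path}")
    preprocess = config.get("preprocess", {})
    for key in ("resize_shape", "crop_shape"):
        shape = preprocess.get(key)
        if not (isinstance(shape, list) and len(shape) == 2):
            problems.append(f"preprocess.{key} should be a two-element list")
    return problems


def check_config_file(path="config.json"):
    """Load a JSON config file and run check_config on it."""
    with open(path) as f:
        return check_config(json.load(f))
```

An empty list returned by `check_config_file("./config.json")` means none of these basic problems were found. It does not validate the quantization options themselves.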
After quantization completes, navigate to the save_path directory defined in config.json. You should see the generated workspace_yolov3_sim_extract_416_notranspose_noreshape.json file, along with the ONNX file after graph fusion and the accuracy analysis results precision_accumulate_result.txt and precision_reset_result.txt.
- (Optional) Quantized Model Accuracy Analysis
The accuracy analysis relies on the precision_accumulate_result.txt and precision_reset_result.txt files generated during quantization. precision_accumulate_result.txt contains the per-layer cosine similarity between the quantized and floating-point models. precision_reset_result.txt contains the per-layer cosine similarity with reset inputs. Values closer to 1 indicate a smaller quantization error. Sample output:
Sigmoid_240: 0.987436056137085
Mul_241 : 0.9122292399406433
Conv_242 : 0.9125502705574036
Sigmoid_243: 0.9889860153198242
Mul_244 : 0.9188247919082642
Conv_245 : 0.8988267183303833
Sigmoid_246: 0.9783140420913696
Mul_247 : 0.9119919538497925
Conv_248 : 0.997031569480896
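Scanning these files by eye gets tedious for a model with hundreds of layers. The sketch below parses lines in the `name : value` form shown above and lists the layers whose similarity falls below a threshold. The 0.95 threshold is an arbitrary assumption for illustration, not an EsQuant recommendation, and the function names are hypothetical.

```python
def parse_precision_lines(lines):
    """Parse 'LayerName : value' lines into a list of (layer, similarity) tuples."""
    results = []
    for line in lines:
        # Split on the last colon so names that contain ':' are kept intact.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # blank or unrelated line
        try:
            results.append((name.strip(), float(value)))
        except ValueError:
            continue  # value part is not a number
    return results


def layers_below(results, threshold=0.95):
    """Return the (layer, similarity) pairs whose similarity is below the threshold."""
    return [(name, sim) for name, sim in results if sim < threshold]


sample = [
    "Sigmoid_246: 0.9783140420913696",
    "Mul_247    : 0.9119919538497925",
    "Conv_248   : 0.997031569480896",
]
parsed = parse_precision_lines(sample)
for name, sim in layers_below(parsed):
    print(f"{name}: {sim:.4f}")
```

To process a real result file, pass an open file object, for example `parse_precision_lines(open("precision_accumulate_result.txt"))`.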