I compiled a lane detection model with the edge-ai-tools tool. When I deploy it on the EVM, the accuracy drops significantly. What could be causing this?
It is a TI EVM board. I compiled the model by following the readme in edge-ai-tools. The log is below.
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['bestb']
Running_Model : bestb
Running shape inference on model ../../../models/public/culane_18.onnx
--------------------2---------------------------------
tidl_tools_path = /home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools
artifacts_folder = ../../../model-artifacts//bestb/
tidl_tensor_bits = 8
debug_level = 1
num_tidl_subgraphs = 16
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 2
tidl_calibration_options:bias_calibration_iterations = 5
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 3
output_feature_16bit_names_list =
m_params_16bit_names_list =
reserved_compile_constraints_flag = 1601
ti_internal_reserved_1 =
****** WARNING : Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options******
Supported TIDL layer type --- Conv -- Conv_0
Supported TIDL layer type --- Relu -- Relu_1
Supported TIDL layer type --- MaxPool -- MaxPool_2
Supported TIDL layer type --- Conv -- Conv_3
Supported TIDL layer type --- Relu -- Relu_4
Supported TIDL layer type --- Conv -- Conv_5
Supported TIDL layer type --- Add -- Add_6
Supported TIDL layer type --- Relu -- Relu_7
Supported TIDL layer type --- Conv -- Conv_8
Supported TIDL layer type --- Relu -- Relu_9
Supported TIDL layer type --- Conv -- Conv_10
Supported TIDL layer type --- Add -- Add_11
Supported TIDL layer type --- Relu -- Relu_12
Supported TIDL layer type --- Conv -- Conv_13
Supported TIDL layer type --- Relu -- Relu_14
Supported TIDL layer type --- Conv -- Conv_15
Supported TIDL layer type --- Conv -- Conv_16
Supported TIDL layer type --- Add -- Add_17
Supported TIDL layer type --- Relu -- Relu_18
Supported TIDL layer type --- Conv -- Conv_19
Supported TIDL layer type --- Relu -- Relu_20
Supported TIDL layer type --- Conv -- Conv_21
Supported TIDL layer type --- Add -- Add_22
Supported TIDL layer type --- Relu -- Relu_23
Supported TIDL layer type --- Conv -- Conv_24
Supported TIDL layer type --- Relu -- Relu_25
Supported TIDL layer type --- Conv -- Conv_26
Supported TIDL layer type --- Conv -- Conv_27
Supported TIDL layer type --- Add -- Add_28
Supported TIDL layer type --- Relu -- Relu_29
Supported TIDL layer type --- Conv -- Conv_30
Supported TIDL layer type --- Relu -- Relu_31
Supported TIDL layer type --- Conv -- Conv_32
Supported TIDL layer type --- Add -- Add_33
Supported TIDL layer type --- Relu -- Relu_34
Supported TIDL layer type --- Conv -- Conv_35
Supported TIDL layer type --- Relu -- Relu_36
Supported TIDL layer type --- Conv -- Conv_37
Supported TIDL layer type --- Conv -- Conv_38
Supported TIDL layer type --- Add -- Add_39
Supported TIDL layer type --- Relu -- Relu_40
Supported TIDL layer type --- Conv -- Conv_41
Supported TIDL layer type --- Relu -- Relu_42
Supported TIDL layer type --- Conv -- Conv_43
Supported TIDL layer type --- Add -- Add_44
Supported TIDL layer type --- Relu -- Relu_45
Supported TIDL layer type --- Conv -- Conv_46
Supported TIDL layer type --- Reshape -- Reshape_48
Supported TIDL layer type --- Gemm -- Gemm_49
Supported TIDL layer type --- Relu -- Relu_50
Supported TIDL layer type --- Gemm -- Gemm_51
Supported TIDL layer type --- Reshape -- Reshape_53
Preliminary subgraphs created = 1
Final number of subgraphs created are : 1, - Offloaded Nodes - 52, Total Nodes - 52
SUGGESTION -- [TIDL_InnerProductLayer] Size larger than 2048 * 2048 is not optimal.
Running runtimes graphviz - /home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools/tidl_graphVisualiser_runtimes.out ../../../model-artifacts//bestb//allowedNode.txt ../../../model-artifacts//bestb//tempDir/graphvizInfo.txt ../../../model-artifacts//bestb//tempDir/runtimes_visualization.svg
*** In TIDL_createStateImportFunc ***
Compute on node : TIDLExecutionProvider_TIDL_0_0
0, Conv, 3, 1, input, 201
1, Relu, 1, 1, 201, 129
2, MaxPool, 1, 1, 129, 130
3, Conv, 3, 1, 130, 204
4, Relu, 1, 1, 204, 133
5, Conv, 3, 1, 133, 207
6, Add, 2, 1, 207, 136
7, Relu, 1, 1, 136, 137
8, Conv, 3, 1, 137, 210
9, Relu, 1, 1, 210, 140
10, Conv, 3, 1, 140, 213
11, Add, 2, 1, 213, 143
12, Relu, 1, 1, 143, 144
13, Conv, 3, 1, 144, 216
14, Relu, 1, 1, 216, 147
15, Conv, 3, 1, 147, 219
16, Conv, 3, 1, 144, 222
17, Add, 2, 1, 219, 152
18, Relu, 1, 1, 152, 153
19, Conv, 3, 1, 153, 225
20, Relu, 1, 1, 225, 156
21, Conv, 3, 1, 156, 228
22, Add, 2, 1, 228, 159
23, Relu, 1, 1, 159, 160
24, Conv, 3, 1, 160, 231
25, Relu, 1, 1, 231, 163
26, Conv, 3, 1, 163, 234
27, Conv, 3, 1, 160, 237
28, Add, 2, 1, 234, 168
29, Relu, 1, 1, 168, 169
30, Conv, 3, 1, 169, 240
31, Relu, 1, 1, 240, 172
32, Conv, 3, 1, 172, 243
33, Add, 2, 1, 243, 175
34, Relu, 1, 1, 175, 176
35, Conv, 3, 1, 176, 246
36, Relu, 1, 1, 246, 179
37, Conv, 3, 1, 179, 249
38, Conv, 3, 1, 176, 252
39, Add, 2, 1, 249, 184
40, Relu, 1, 1, 184, 185
41, Conv, 3, 1, 185, 255
42, Relu, 1, 1, 255, 188
43, Conv, 3, 1, 188, 258
44, Add, 2, 1, 258, 191
45, Relu, 1, 1, 191, 192
46, Conv, 3, 1, 192, 193
47, Reshape, 2, 1, 193, 195
48, Gemm, 3, 1, 195, 196
49, Relu, 1, 1, 196, 197
50, Gemm, 3, 1, 197, 198
51, Reshape, 2, 1, 198, output
Input tensor name - input
Output tensor name - output
In TIDL_onnxRtImportInit subgraph_name=output
Layer 0, subgraph id output, name=output
Layer 1, subgraph id output, name=input
In TIDL_runtimesOptimizeNet: LayerIndex = 54, dataIndex = 53
************** Frame index 1 : Running float import *************
In TIDL_runtimesPostProcessNet
SUGGESTION: [TIDL_InnerProductLayer] Gemm_51 Size larger than 2048 * 2048 is not optimal.
****************************************************
** 1 WARNINGS 0 ERRORS **
****************************************************
************ in TIDL_subgraphRtCreate ************
The soft limit is 2048
The hard limit is 2048
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.5s: VX_ZONE_ERROR:Enabled
0.6s: VX_ZONE_WARNING:Enabled
0.1579s: VX_ZONE_INIT:[tivxInit:184] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
******* In TIDL_subgraphRtInvoke ********
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
27, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
28, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
37, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 237.000000 16919031.000000 186.000000
******* TIDL_subgraphRtInvoke done ********
********** Frame Index 1 : Running float inference **********
******* In TIDL_subgraphRtInvoke ********
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
27, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
28, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
37, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 319.000000 16811320.000000 131.000000
******* TIDL_subgraphRtInvoke done ********
********** Frame Index 2 : Running fixed point mode for calibration **********
In TIDL_runtimesPostProcessNet
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 16715.82 .... ..... ... .... .....
# 1 . .. T 16788.76 .... ..... ... .... .....
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10349.83 .... ..... ... .... .....
# 1 . .. T 10283.69 .... ..... ... .... .....
***************** Calibration iteration number 0 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10276.73 .... ..... ... .... .....
# 1 . .. T 10274.64 .... ..... ... .... .....
***************** Calibration iteration number 1 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10418.43 .... ..... ... .... .....
# 1 . .. T 10384.01 .... ..... ... .... .....
***************** Calibration iteration number 2 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10272.75 .... ..... ... .... .....
# 1 . .. T 10359.20 .... ..... ... .... .....
***************** Calibration iteration number 3 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10398.03 .... ..... ... .... .....
# 1 . .. T 10254.21 .... ..... ... .... .....
***************** Calibration iteration number 4 completed ************************
------------------ Network Compiler Traces -----------------------------
NC running for device: 1
Running with OTF buffer optimizations
successful Memory allocation
Rerunning network compiler for reshape
------------------ Network Compiler Traces -----------------------------
NC running for device: 1
Running with OTF buffer optimizations
successful Memory allocation
SUGGESTION: [TIDL_InnerProductLayer] Gemm_51 Size larger than 2048 * 2048 is not optimal.
****************************************************
** 1 WARNINGS 0 ERRORS **
****************************************************
Completed_Model : 1, Name : bestb , Total time : 91370.38, Offload Time : 16865.18 , DDR RW MBs : 0, Output File : py_out_bestb_ppp.jpg
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 26 alloc's of 743870520 bytes
MEM: Free's : 26 free's of 743870520 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!
Please see the reply from the E2E engineer below.
Please check out our extensive documentation on performance and accuracy here:
https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/tidl_osr_debug.md
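One way to put a number on the accuracy drop is to run the same input through the float model and through the compiled model, then compare the two raw output tensors. The following is a minimal sketch in plain Python, not a procedure taken from the linked document. The function name compare_outputs and the sample values are hypothetical and only illustrate the comparison; in practice the two lists would hold the flattened outputs of the float run and the TIDL run.

```python
import math


def compare_outputs(reference, candidate):
    """Compare two flattened output tensors of equal length.

    Returns the maximum absolute error, the mean absolute error and the
    cosine similarity between the reference (float) output and the
    candidate (quantized) output.
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("outputs must be non-empty and of equal length")

    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = math.sqrt(sum(r * r for r in reference))
    norm_cand = math.sqrt(sum(c * c for c in candidate))
    # Cosine similarity is undefined when either vector is all zeros.
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_cand)

    return {
        "max_abs_err": max(diffs),
        "mean_abs_err": sum(diffs) / len(diffs),
        "cosine": cosine,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, not measurements.
    float_out = [0.10, 0.80, -0.30, 0.55]
    tidl_out = [0.12, 0.75, -0.30, 0.50]
    print(compare_outputs(float_out, tidl_out))
```

A cosine similarity close to 1 with small absolute errors would suggest the compiled model tracks the float model and the loss comes from elsewhere (for example pre- or post-processing), while a large gap points at quantization.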
Please see the latest reply from the E2E engineer below.
Can you share the basis for the comparison you are doing?
The model running on the board is a quantized model, so there is a speed vs. accuracy trade-off to consider when running the model on the target.
If you are looking at the accuracy trade-off, which parameter specifically are you looking at?
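For reference, the log above shows the model was compiled with tidl_tensor_bits = 8, 2 calibration frames and 5 bias calibration iterations. A common experiment when 8-bit accuracy is too low is to recompile with more calibration frames, or with 16-bit tensors at the cost of speed. The sketch below only builds the options dictionary. The helper build_compile_options is hypothetical, and the option keys are the names I recall from the edgeai-tidl-tools OSRT examples, so treat them as assumptions and check them against the documentation for your SDK version.

```python
def build_compile_options(tidl_tools_path, artifacts_folder,
                          tensor_bits=8,
                          calibration_frames=2,
                          calibration_iterations=5):
    """Build a compile-options dictionary for a TIDL compilation run.

    Defaults mirror the values printed in the log (8 bits, 2 frames,
    5 iterations). The key names are assumed, not verified here.
    """
    if tensor_bits not in (8, 16):
        raise ValueError("tensor_bits is expected to be 8 or 16")
    if calibration_frames < 1 or calibration_iterations < 1:
        raise ValueError("calibration settings must be positive")

    return {
        "tidl_tools_path": tidl_tools_path,
        "artifacts_folder": artifacts_folder,
        "tensor_bits": tensor_bits,
        # Assumed key names for the calibration settings.
        "advanced_options:calibration_frames": calibration_frames,
        "advanced_options:calibration_iterations": calibration_iterations,
    }


if __name__ == "__main__":
    # Paths copied from the log; the 16-bit / 20-frame values are just
    # an example of a higher-accuracy setting to try.
    options = build_compile_options(
        "/home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools",
        "../../../model-artifacts//bestb/",
        tensor_bits=16,
        calibration_frames=20,
    )
    for key, value in options.items():
        print(key, "=", value)
```

In the OSRT Python examples a dictionary like this is passed as the provider options when the ONNX Runtime session is created with the TIDL compilation provider. Comparing the 8-bit and 16-bit results on the same images shows how much of the accuracy loss is due to quantization.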