TDA4VM: Encoding processing interferes with display frame rate

Part Number: TDA4VM

My English is not very good and my expression may not be accurate. I hope you can understand. Sorry.

Hardware: TDA4VM
SDK: 0806

My requirements are as follows:
The encoding thread runs at 10 fps
The display thread runs at 60 fps


My test results are as follows:
The encoding thread runs at 10 fps, but the display thread cannot reach 60 fps. What I found is this:
When the encoding thread is enabled, each iteration of the display thread takes longer.

The main code of the encoding thread:

static vx_status autox_codec_encode(AppCMS_Codec *obj, vx_int32 frame_id)
{
    vx_status status = VX_SUCCESS;
    AppGraphParamRefPool *enc_pool = &obj->enc_pool;

    if (frame_id >= APP_BUFFER_Q_DEPTH)
    {
        if (frame_id >= enc_pool->bufq_depth)
        {
            if (status == VX_SUCCESS && obj->encode == 1)
            {
                status = appCodecDeqAppSrc(obj->fifo_id_output);
                if (status != VX_SUCCESS)
                {
                    printf("# ERROR: func:%s Error appCodecDeqAppSrc return err. obj->fifo_id_output [%d]\n", __func__,obj->fifo_id_output);
                }
            }
            if (status == VX_SUCCESS)
            {
                status = unmap_vx_object_arr(enc_pool->arr[obj->fifo_id_output], enc_pool->map_id[obj->fifo_id_output], obj->sensorObj_num_cameras_enabled);
                if (status != VX_SUCCESS)
                {
                    printf("# ERROR: func:%s Error unmap_vx_object_arr[=][=] status:obj->fifo_id_output [%d] obj->appsrc_push_id [%d]\n", __func__,obj->fifo_id_output,obj->appsrc_push_id);
                }
                status = VX_SUCCESS;
            }
        }

        if (status == VX_SUCCESS)
        {
            status = map_vx_object_arr(enc_pool->arr[obj->appsrc_push_id], enc_pool->data_ptr[obj->appsrc_push_id], enc_pool->map_id[obj->appsrc_push_id], obj->sensorObj_num_cameras_enabled);
            if (status != VX_SUCCESS)
            {
                printf("# ERROR: func:%s Error map_vx_object_arr[*][*] status:obj->fifo_id_output [%d]\n", __func__,obj->fifo_id_output);
            }
            status = VX_SUCCESS;
        }
        if (status == VX_SUCCESS && obj->encode == 1)
        {
            status = appCodecEnqAppSrc(obj->appsrc_push_id);
        }
        if (status == VX_SUCCESS)
        {
            obj->appsrc_push_id++;
            obj->appsrc_push_id = (obj->appsrc_push_id >= enc_pool->bufq_depth) ? 0 : obj->appsrc_push_id;
        }
    }

    /* The function is declared to return vx_status, so return it explicitly. */
    return status;
}

The main code of the display thread:

if (DD_ENABLE_LEFT_Display)
{
    uint32_t num_refs;
    static int display_node_wait_loops=3;
    static int msc_out_node_wait_loops=2; 
    if (status == VX_SUCCESS)
    {
        if(msc_out_node_wait_loops==0)
        {
            status = vxGraphParameterDequeueDoneRef(obj->graph_left, obj->imgMosaicObj_left.output_graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);
            if (status != VX_SUCCESS)
            {
                printf("vxGraphParameterDequeueDoneRef L q Out error\n");
            }

            status = vxGraphParameterDequeueDoneRef(obj->graph_left, obj->imgMosaicObj_left.inputs[0].graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);
            if (status != VX_SUCCESS)
            {
                printf("vxGraphParameterDequeueDoneRef L q In error\n");
            }
        }
        else
        {
            msc_out_node_wait_loops--;
        }
    }
    if (status == VX_SUCCESS)
    {
        if(display_node_wait_loops==0)
        {
            status = vxGraphParameterDequeueDoneRef(obj->graph_left, obj->displayObj_csi_tx_vc0_left.output_graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);
            if (status != VX_SUCCESS)
            {
                printf("vxGraphParameterDequeueDoneRef L q error\n");
            }
        }
        else
        {
            display_node_wait_loops--;
        }
    }
    ...
}

In my tests, the following two calls affect each other:

If I comment out the following call in the encoding thread
appCodecEnqAppSrc(obj->appsrc_push_id);

Then the following call in the display thread takes 6~8 ms, and the display thread reaches 60 fps
vxGraphParameterDequeueDoneRef(obj->graph_left, obj->displayObj_csi_tx_vc0_left.output_graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);

If I enable this call again
appCodecEnqAppSrc(obj->appsrc_push_id);

Then the following call in the display thread takes 10~14 ms to complete, and the display thread can no longer reach 60 fps
vxGraphParameterDequeueDoneRef(obj->graph_left, obj->displayObj_csi_tx_vc0_left.output_graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);
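For reference, the timings above can be collected with something like the sketch below. It is only a minimal illustration of how the duration of a single call can be measured and summarized. The names now_us, LatencyStats, latency_add, latency_mean_us and latency_print are hypothetical helpers written for this post and are not part of the SDK. The budget of 1000000 / 60 microseconds is simply the 60 fps frame period, used here as an assumed threshold.

```c
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime under strict -std=c11 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not an SDK function): monotonic time in microseconds. */
static uint64_t now_us(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000ULL + (uint64_t)ts.tv_nsec / 1000ULL;
}

/* Hypothetical accumulator for the duration of one repeatedly called function. */
typedef struct
{
    uint64_t count;       /* number of samples recorded */
    uint64_t sum_us;      /* total elapsed time in microseconds */
    uint64_t max_us;      /* worst single sample in microseconds */
    uint64_t over_budget; /* samples that exceeded budget_us */
} LatencyStats;

/* Record one sample and count it if it exceeds the given budget. */
static void latency_add(LatencyStats *s, uint64_t elapsed_us, uint64_t budget_us)
{
    assert(s != NULL);
    s->count++;
    s->sum_us += elapsed_us;
    if (elapsed_us > s->max_us)
    {
        s->max_us = elapsed_us;
    }
    if (elapsed_us > budget_us)
    {
        s->over_budget++;
    }
}

/* Mean of all recorded samples, or 0 when nothing has been recorded. */
static uint64_t latency_mean_us(const LatencyStats *s)
{
    assert(s != NULL);
    return (s->count == 0) ? 0 : (s->sum_us / s->count);
}

/* Print a one-line summary of the recorded samples. */
static void latency_print(const char *name, const LatencyStats *s)
{
    assert(name != NULL && s != NULL);
    printf("%s: n=%llu mean=%llu us max=%llu us over_budget=%llu\n",
           name,
           (unsigned long long)s->count,
           (unsigned long long)latency_mean_us(s),
           (unsigned long long)s->max_us,
           (unsigned long long)s->over_budget);
}

/* Intended use in the display thread (display_stats is a hypothetical variable):
 *
 *   uint64_t t0 = now_us();
 *   status = vxGraphParameterDequeueDoneRef(...);   // call from the post
 *   latency_add(&display_stats, now_us() - t0, 1000000 / 60);
 */
```

Recording the maximum and the over-budget count in addition to the mean makes it easier to compare the two cases (encoding enqueue enabled or commented out) over many frames instead of from a few single readings.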

My question:
Why does this call in the encoding thread
appCodecEnqAppSrc(obj->appsrc_push_id);
increase the time taken by this call in the display thread?
vxGraphParameterDequeueDoneRef(obj->graph_left, obj->displayObj_csi_tx_vc0_left.output_graph_parameter_index, (vx_reference*)&output_image, 1, &num_refs);