
AM3352 SD card: "mmcblk0: retrying using single block read" leads to a system hang



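During development I found that deliberately cutting data[3] to simulate a poor SD card contact causes the system to hang.

The log shows the message mmcblk0: retrying using single block read.

I tried to work around the problem during SD initialization, as follows:

After switching the SD card to 4-bit operation, I used ACMD13 to read the SD status register again, but the system still hung.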

mmc_wait_for_req

   -> __mmc_start_req

   -> mmc_wait_for_req_done

The SD card interrupt does fire, but execution gets stuck in mmc_wait_for_req_done, and the wait_for_completion_timeout call inside mmc_wait_for_req_done is never even executed.

Please help analyze this.

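  • The problem has been solved.

    What looks like a system hang is actually the interrupt behind setup_dma_interrupt firing continuously, because the interrupt flag is never cleared.

    The if statement marked with the red box is never entered, so the interrupt flag is not cleared.

    The main reason the if statement is not entered is the following call path: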

    omap_hsmmc_irq
      -> omap_hsmmc_do_irq 
        -> omap_hsmmc_dma_cleanup //DMA clean up for command errors
          -> omap_free_dma 
            -> edma_alloc_channel
              -> setup_dma_interrupt(channel, NULL, NULL);
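    The code in the red box writes 1 to SH_IECR, which changes the value of SH_IER, i.e. the sh_ier value read in the interrupt handler,
    so the if condition is not satisfied.
    The code in the black box is what I added: it writes to SH_ICR to clear the interrupt flag directly, which prevents repeated entry.
    PS: Whenever omap_hsmmc_dma_cleanup is reached, something has basically gone wrong during the transfer.
        The hang mainly occurs with adtc-type commands.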


  • Thanks for sharing!

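  • Hello, we are currently seeing the same situation. Following your explanation we added the black-box code to setup_dma_interrupt, but the system still dies and CPU usage reaches 100%. What exactly does your change look like? Are there any other code changes? Thanks.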

  • Do you know the cause of the hang now? Is it still caused by the interrupt being entered over and over?

    My change is as follows:

      if (!callback)
    + {
          edma_shadow0_write_array(ctlr, SH_IECR, lch >> 5,
                      BIT(lch & 0x1f));
    +     edma_shadow0_write_array(ctlr, SH_ICR, lch >> 5,
    +                 BIT(lch & 0x1f));
    + }

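  • On my side the following is printed at system startup (a kernel task is blocked in the D state), and top shows very high I/O usage, which keeps CPU usage at 100%. Adding your code gives the same result.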

    warn kernel: [ 9.209438] mmc0: host does not support reading read-only switch. assuming write-enable.
    info kernel: [ 9.226550] mmc0: new high speed SDHC card at address 0001
    info kernel: [ 9.236568] mmcblk0: mmc0:0001 SD32G 29.1 GiB
    err kernel: [ 10.034716] mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0xb00
    warn kernel: [ 10.055611] mmcblk0: retrying using single block read
    err kernel: [ 10.857660] mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0x0
    err kernel: [ 10.868457] end_request: I/O error, dev mmcblk0, sector 0

    info kernel: [ 242.472380] [Hungtask UTC][2018.07.23 10:31:18-23889]
    info kernel: [ 242.472491] [Hungtask Clock][242472483110]
    err kernel: [ 242.472510] INFO: task kworker/u2:0:6 blocked for more than 120 seconds.
    err kernel: [ 242.520737] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    info kernel: [ 242.564536] kworker/u2:0 D c03ab11c 0 6 2 0x00000000
    info kernel: [ 242.564773] Workqueue: kmmcd mmc_rescan [mmc_core]
    warn kernel: [ 242.564791] locked:
    warn kernel: [ 242.595299] cd5fbc3c &dev->mutex 0 [<c0248990>] device_attach+0x1c/0x90
    warn kernel: [ 242.623075] cd812790 &bdev->bd_mutex 0 [<c00fcf30>] __blkdev_get+0x48/0x380
    warn kernel: [ 242.649881] Backtrace:
    warn kernel: [ 242.660133] [<c03aadb4>] (__schedule+0x0/0x4fc) from [<c03ab338>] (schedule+0x88/0x8c)
    warn kernel: [ 242.709721] [<c03ab2b0>] (schedule+0x0/0x8c) from [<c03ab3dc>] (io_schedule+0xa0/0x100)
    warn kernel: [ 242.726115] [<c03ab33c>] (io_schedule+0x0/0x100) from [<c0097f08>] (sleep_on_page+0x10/0x18)
    warn kernel: [ 242.738622] r7:00000002 r6:ce07b974 r5:c07f7c60 r4:ce07b96c
    warn kernel: [ 242.750401] [<c0097ef8>] (sleep_on_page+0x0/0x18) from [<c03a9efc>] (__wait_on_bit_lock+0x5c/0xa4)
    warn kernel: [ 242.765537] [<c03a9ea0>] (__wait_on_bit_lock+0x0/0xa4) from [<c0097ff8>] (__lock_page+0x6c/0x7c)
    warn kernel: [ 242.779872] r9:000200d0 r8:00000000 r7:cd812904 r6:00000000 r5:c00fc09c
    warn kernel: [ 242.779872] r4:c07a3b60
    warn kernel: [ 242.792313] [<c0097f8c>] (__lock_page+0x0/0x7c) from [<c00987c8>] (do_read_cache_page+0xa0/0xe8)
    warn kernel: [ 242.808547] r4:c07a3b60
    warn kernel: [ 242.816024] [<c0098728>] (do_read_cache_page+0x0/0xe8) from [<c0098830>] (read_cache_page_async+0x20/0x28)
    warn kernel: [ 242.841144] r9:000001ff r8:00000000 r7:00000200 r6:cd61bc00 r5:ce07ba24
    warn kernel: [ 242.841144] r4:00000000
    warn kernel: [ 242.882435] [<c0098810>] (read_cache_page_async+0x0/0x28) from [<c0098848>] (read_cache_page+0x10/0x18)
    warn kernel: [ 242.909401] [<c0098838>] (read_cache_page+0x0/0x18) from [<c01dc3a4>] (read_dev_sector+0x34/0x6c)
    warn kernel: [ 242.924621] [<c01dc370>] (read_dev_sector+0x0/0x6c) from [<c01dd7e4>] (read_lba+0xbc/0x110)
    warn kernel: [ 242.938916] r5:00000000 r4:03a3e000
    warn kernel: [ 242.952252] [<c01dd728>] (read_lba+0x0/0x110) from [<c01dde98>] (efi_partition+0xdc/0xdbc)
    warn kernel: [ 242.975295] [<c01dddbc>] (efi_partition+0x0/0xdbc) from [<c01dcc98>] (check_partition+0x108/0x1dc)
    warn kernel: [ 242.994068] [<c01dcb90>] (check_partition+0x0/0x1dc) from [<c01dc8d4>] (rescan_partitions+0x8c/0x2a4)
    warn kernel: [ 243.008588] r7:cd812790 r6:cd64a40c r5:00000000 r4:cd64a400
    warn kernel: [ 243.024545] [<c01dc848>] (rescan_partitions+0x0/0x2a4) from [<c00fd044>] (__blkdev_get+0x15c/0x380)
    warn kernel: [ 243.040215] [<c00fcee8>] (__blkdev_get+0x0/0x380) from [<c00fd47c>] (blkdev_get+0x214/0x32c)
    warn kernel: [ 243.054527] [<c00fd268>] (blkdev_get+0x0/0x32c) from [<c01da934>] (add_disk+0x3b4/0x438)
    warn kernel: [ 243.076843] [<c01da580>] (add_disk+0x0/0x438) from [<bf187e5c>] (mmc_add_disk+0x1c/0xf8 [mmc_block])
    warn kernel: [ 243.107194] [<bf187e40>] (mmc_add_disk+0x0/0xf8 [mmc_block]) from [<bf18839c>] (mmc_blk_probe+0x264/0x2b4 [mmc_block])
    warn kernel: [ 243.124563] r7:00000001 r6:ce07bd08 r5:cd64a000 r4:cd5fbc00
    warn kernel: [ 243.134449] [<bf188138>] (mmc_blk_probe+0x0/0x2b4 [mmc_block]) from [<bf15efa4>] (mmc_bus_probe+0x1c/0x20 [mmc_core])
    warn kernel: [ 243.152326] [<bf15ef88>] (mmc_bus_probe+0x0/0x20 [mmc_core]) from [<c0248b94>] (driver_probe_device+0x13c/0x348)
    warn kernel: [ 243.169169] [<c0248a58>] (driver_probe_device+0x0/0x348) from [<c0248dd0>] (__device_attach+0x30/0x4c)
    warn kernel: [ 243.184511] r9:c057a048 r8:cd5fb808 r7:00000000 r6:c0248da0 r5:cd5fbc08
    warn kernel: [ 243.184511] r4:bf18bcec
    warn kernel: [ 243.199957] [<c0248da0>] (__device_attach+0x0/0x4c) from [<c0246f20>] (bus_for_each_drv+0x8c/0x9c)
    warn kernel: [ 243.225984] r5:cd5fbc08 r4:00000000
    warn kernel: [ 243.258805] [<c0246e94>] (bus_for_each_drv+0x0/0x9c) from [<c02489e0>] (device_attach+0x6c/0x90)
    warn kernel: [ 243.276853] r6:cd5fbc3c r5:bf16e3b8 r4:cd5fbc08
    warn kernel: [ 243.285800] [<c0248974>] (device_attach+0x0/0x90) from [<c0247e64>] (bus_probe_device+0x30/0xa4)
    warn kernel: [ 243.302653] r7:00000000 r6:cd5fbc08 r5:bf16e3b8 r4:cd5fbc08
    warn kernel: [ 243.312879] [<c0247e34>] (bus_probe_device+0x0/0xa4) from [<c0245ec4>] (device_add+0x320/0x66c)
    warn kernel: [ 243.329121] r7:00000000 r6:cd5fbc10 r5:00000000 r4:cd5fbc08
    warn kernel: [ 243.338026] [<c0245ba4>] (device_add+0x0/0x66c) from [<bf15f3cc>] (mmc_add_card+0x188/0x1e4 [mmc_core])
    warn kernel: [ 243.357091] [<bf15f244>] (mmc_add_card+0x0/0x1e4 [mmc_core]) from [<bf163d64>] (mmc_attach_sd+0x164/0x1f0 [mmc_core])
    warn kernel: [ 243.383011] r7:bf16ec04 r6:00000000 r5:00000000 r4:cd5fb800
    warn kernel: [ 243.420163] [<bf163c00>] (mmc_attach_sd+0x0/0x1f0 [mmc_core]) from [<bf15eb40>] (mmc_rescan+0x2f4/0x360 [mmc_core])
    warn kernel: [ 243.447229] r5:cd5fb800 r4:cd5fba04
    warn kernel: [ 243.459557] [<bf15e84c>] (mmc_rescan+0x0/0x360 [mmc_core]) from [<c0055eb8>] (process_one_work+0x21c/0x344)
    warn kernel: [ 243.475499] r9:ce07a020 r8:cd473300 r7:00000000 r6:ce003600 r5:cd5fba04
    warn kernel: [ 243.475499] r4:ce051100
    warn kernel: [ 243.489256] [<c0055c9c>] (process_one_work+0x0/0x344) from [<c00570cc>] (worker_thread+0x250/0x394)
    warn kernel: [ 243.514599] [<c0056e7c>] (worker_thread+0x0/0x394) from [<c005c090>] (kthread+0xb4/0xc0)
    warn kernel: [ 243.542089] [<c005bfdc>] (kthread+0x0/0xc0) from [<c00142a0>] (ret_from_fork+0x14/0x34)
    warn kernel: [ 243.559510] r7:00000000 r6:00000000 r5:c005bfdc r4:ce05dd9c
    info kernel: [ 243.576918] task PC stack pid father
    info kernel: [ 243.576953] init S c03ab11c 0 1 0 0x00000000

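  • My situation is different from yours.

    Try this: when initializing the SD card, do not use 4-bit operation at first; use only one data line, data0.

    In my earlier problem, using only data0 worked.

    In the mmc_sd_init_card function,

    comment out the following code: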

    if ((host->caps & MMC_CAP_4_BIT_DATA) &&
        (card->scr.bus_widths & SD_SCR_BUS_WIDTH_4)) {
            err = mmc_app_set_bus_width(card, MMC_BUS_WIDTH_4);
            if (err)
                    goto free_card;

            mmc_set_bus_width(host, MMC_BUS_WIDTH_4);
    }

    This rules out the influence of the data lines first.

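  • With 1-bit mode everything works: after I changed bus-width = <4>; to bus-width = <1>; in the device tree, the card functions normally. However, this reduces the SD card write speed by 50%. When one of the upper data lines has a poor contact like this, is there a way to detect it and take some action so that the high CPU usage does not occur?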