In the Linux kernel, the following vulnerability has been resolved:

blk-mq: fix IO hang from sbitmap wakeup race

In blk_mq_mark_tag_wait(), __add_wait_queue() may be reordered with the following blk_mq_get_driver_tag() when getting the driver tag fails.

Then, in __sbitmap_queue_wake_up(), waitqueue_active() may not observe the waiter added in blk_mq_mark_tag_wait() and so wakes up nothing; meanwhile, blk_mq_mark_tag_wait() cannot get a driver tag either, and the IO hangs.

This issue can be reproduced by running the following test in a loop; an fio hang can be observed in under 30 minutes when running it on the author's test VM on a laptop.

    modprobe -r scsi_debug
    modprobe scsi_debug delay=0 dev_size_mb=4096 max_queue=1 host_max_queue=1 submit_queues=4
    dev=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
    fio --filename=/dev/"$dev" --direct=1 --rw=randrw --bs=4k --iodepth=1 \
        --runtime=100 --numjobs=40 --time_based --name=test \
        --ioengine=libaio

Fix the issue by adding one explicit barrier in blk_mq_mark_tag_wait(). The extra barrier is fine here because it is only reached when tags have run out.
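
The ordering requirement described above can be sketched in user-space C11 atomics. This is only an illustrative model, not the kernel code: `free_tags`, `waiter_queued`, `waiter_mark_tag_wait()` and `waker_put_tag()` are hypothetical stand-ins for the driver tag bitmap, the wait queue, blk_mq_mark_tag_wait() and the tag-free path ending in __sbitmap_queue_wake_up(). The sketch also assumes the added barrier behaves as a full memory barrier, modeled here with a sequentially consistent fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state (not the real blk-mq structures). */
static atomic_int  free_tags;      /* models the available driver tags */
static atomic_bool waiter_queued;  /* models a non-empty wait queue */

/* Models blk_mq_mark_tag_wait(): returns true if a tag was obtained. */
bool waiter_mark_tag_wait(void)
{
    /* Publish the waiter; models __add_wait_queue(). */
    atomic_store_explicit(&waiter_queued, true, memory_order_relaxed);

    /*
     * Assumed full barrier: the store above must be ordered before the
     * tag check below. Without it, the check may be performed first and
     * a concurrent waker can miss the waiter.
     */
    atomic_thread_fence(memory_order_seq_cst);

    /* Retry the allocation; models blk_mq_get_driver_tag(). */
    int avail = atomic_load_explicit(&free_tags, memory_order_relaxed);
    while (avail > 0) {
        if (atomic_compare_exchange_weak_explicit(&free_tags, &avail,
                                                  avail - 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
            /* Got a tag, so leave the wait queue. */
            atomic_store_explicit(&waiter_queued, false,
                                  memory_order_relaxed);
            return true;
        }
        /* On failure, avail was reloaded by the compare-exchange. */
    }
    return false; /* stay queued and rely on a later wakeup */
}

/* Models freeing a tag and waking: returns true if a waiter was woken. */
bool waker_put_tag(void)
{
    /* Release the tag; models clearing the sbitmap tag bit. */
    int prev = atomic_fetch_add_explicit(&free_tags, 1,
                                         memory_order_relaxed);
    assert(prev >= 0);

    /* Pairing barrier on the waker side, before the lockless check. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Models waitqueue_active() followed by the wakeup itself. */
    return atomic_exchange_explicit(&waiter_queued, false,
                                    memory_order_relaxed);
}
```

With both fences in place, at least one side must observe the other's store: either the waiter sees the freed tag, or the waker sees the queued waiter. Removing the fence in `waiter_mark_tag_wait()` permits the outcome where neither does, which corresponds to the hang described above.
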
https://git.kernel.org/stable/c/f1bc0d8163f8ee84a8d5affdf624cfad657df1d2
https://git.kernel.org/stable/c/ecd7744a1446eb02ccc63e493e2eb6ede4ef1e10
https://git.kernel.org/stable/c/9525b38180e2753f0daa1a522b7767a2aa969676
https://git.kernel.org/stable/c/89e0e66682e1538aeeaa3109503473663cd24c8b
https://git.kernel.org/stable/c/7610ba1319253225a9ba8a9d28d472fc883b4e2f
https://git.kernel.org/stable/c/6d8b01624a2540336a32be91f25187a433af53a0
https://git.kernel.org/stable/c/5266caaf5660529e3da53004b8b7174cab6374ed
https://git.kernel.org/stable/c/1d9c777d3e70bdc57dddf7a14a80059d65919e56