When using kcopyd to run callbacks through dm_kcopyd_do_callback() or
submitting copy jobs with a source size of 0, the jobs are pushed
directly to the complete_jobs list, which could be under processing by
the kcopyd thread. As a result, the kcopyd thread can continue running
completed jobs indefinitely, without releasing the CPU, as long as
someone keeps submitting new completed jobs through the aforementioned
paths. Processing of work items, queued for execution on the same CPU as
the currently running kcopyd thread, is thus stalled for excessive
amounts of time, hurting performance.
Running the following test, from the device mapper test suite [1],
dmtest run --suite snapshot -n parallel_io_to_many_snaps_N
, with 8 active snapshots, we get, in dmesg, messages like the
following:
Fix this by splitting the complete_jobs list into two parts: A user
facing part, named callback_jobs, and one used internally by kcopyd,
retaining the name complete_jobs. dm_kcopyd_do_callback() and
dispatch_job() now push their jobs to the callback_jobs list, which is
spliced to the complete_jobs list once, every time the kcopyd thread
wakes up. This prevents kcopyd from hogging the CPU indefinitely and
causing workqueue stalls.
Re-running the aforementioned test:
* Workqueue stalls are eliminated
* The maximum writing time among all targets is reduced from 09m37.10s
to 06m04.85s and the total run time of the test is reduced from
10m43.591s to 7m19.199s