The problem is that it will iterate all registered PMUs, most of which
will not have anything to do. Avoid this by keeping an explicit list
of PMUs that have requested the callback.
The perf_sched_cb_{inc,dec}() functions already takes the required pmu
argument, and now that these functions are no longer called from NMI
context we can use them to manage a list.
With this patch applied the function doesn't show up in the top 4
anymore (it dropped to 18th place).