From: Eric Dumazet Date: Tue, 26 Apr 2016 22:30:07 +0000 (-0700) Subject: net-rfs: fix false sharing accessing sd->input_queue_head X-Git-Tag: v4.7-rc1~154^2~183 X-Git-Url: https://asedeno.scripts.mit.edu/gitweb/?a=commitdiff_plain;h=501e7ef569f4ea2dc7e50773cf6a5d757c94f9b4;p=linux.git net-rfs: fix false sharing accessing sd->input_queue_head sd->input_queue_head is incremented for each processed packet in process_backlog(), and read from other cpus performing Out Of Order avoidance in get_rps_cpu() Moving this field in a separate cache line keeps it mostly hot for the cpu in process_backlog(), as other cpus will only read it. In a stress test, process_backlog() was consuming 6.80 % of cpu cycles, and the patch reduced the cost to 0.65 % Signed-off-by: Eric Dumazet Acked-by: Tom Herbert Signed-off-by: David S. Miller --- diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 18d8394f2e5d..934ca866562d 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2747,11 +2747,15 @@ struct softnet_data { struct sk_buff *completion_queue; #ifdef CONFIG_RPS - /* Elements below can be accessed between CPUs for RPS */ + /* input_queue_head should be written by cpu owning this struct, + * and only read by other cpus. Worth using a cache line. + */ + unsigned int input_queue_head ____cacheline_aligned_in_smp; + + /* Elements below can be accessed between CPUs for RPS/RFS */ struct call_single_data csd ____cacheline_aligned_in_smp; struct softnet_data *rps_ipi_next; unsigned int cpu; - unsigned int input_queue_head; unsigned int input_queue_tail; #endif unsigned int dropped;