aha/arch/x86/lib
Steven Rostedt 5c1ea08215 x86: enable preemption in delay
The RT team has been searching for a nasty latency. This latency shows
up out of the blue and has been seen to be as big as 5ms!

Using ftrace I found the cause of the latency.

   pcscd-2995  3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
   pcscd-2995  3dN.2 52360301us : idle_cpu (irq_exit)
   pcscd-2995  3dN.2 52360301us : rcu_irq_exit (irq_exit)
   pcscd-2995  3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt)
   pcscd-2995  3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)

Here's an example of a latency of more than 400us. pcscd took a timer
interrupt and returned with "need resched" set, but did not reschedule
until after the next interrupt came in at 52360771us, over 400us later!

At first I thought we somehow missed a preemption check in entry.S. But
I also noticed that this always seemed to happen during a __delay call.

   pcscd-2995  3dN.2 52360836us : rcu_irq_exit (irq_exit)
   pcscd-2995  3.N.. 52361265us : preempt_schedule (__delay)

Looking at the x86 delay, I found my problem.

In git commit 35d5d08a08, Andrew Morton placed preempt_disable around
the entire delay because TSCs do not work nicely on SMP. Unfortunately,
for those who care about latencies, this is devastating! Especially when
we have callers of mdelay(8).

Here I enable preemption during the loop and account for any time the task
migrates to a new CPU. The delay asked for may be extended a bit by
the migration, but delay only guarantees that it will delay for at least
the requested time. Delaying longer should not be an issue.
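
The loop described above can be sketched in user-space C as follows. This
is a simulation, not the kernel code: the sim_* names, the two-CPU counter
array and the migration trigger are hypothetical stand-ins (assumed here for
illustration only) for reading the TSC, querying the current CPU,
preempt_disable()/preempt_enable() and rep_nop().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulation state; none of these names exist in the kernel. */
static uint64_t sim_tsc[2];      /* per-CPU counters, deliberately unsynchronized */
static int sim_cpu;              /* CPU the simulated task currently runs on */
static int sim_preempt_count;    /* 0 means preemption is enabled */
static uint64_t sim_elapsed;     /* real cycles elapsed since sim_reset() */
static long sim_migrate_after;   /* preemption windows before migrating, -1 = never */

void sim_reset(int cpu, long migrate_after)
{
	sim_tsc[0] = 1000;
	sim_tsc[1] = 900000;
	sim_cpu = cpu;
	sim_preempt_count = 0;
	sim_elapsed = 0;
	sim_migrate_after = migrate_after;
}

/* Time passes on every CPU at the same rate. */
static void sim_tick(uint64_t cycles)
{
	sim_tsc[0] += cycles;
	sim_tsc[1] += cycles;
	sim_elapsed += cycles;
}

static uint64_t sim_rdtsc(void) { sim_tick(1); return sim_tsc[sim_cpu]; }
static void sim_preempt_disable(void) { sim_preempt_count++; }
static void sim_preempt_enable(void) { sim_preempt_count--; }

/* The only point where the task may be preempted and migrated. */
static void sim_rep_nop(void)
{
	assert(sim_preempt_count == 0);
	sim_tick(10);
	if (sim_migrate_after >= 0 && sim_migrate_after-- == 0)
		sim_cpu ^= 1;
}

void sim_delay_tsc(uint64_t loops)
{
	uint64_t bclock, now;
	int cpu;

	sim_preempt_disable();
	cpu = sim_cpu;
	bclock = sim_rdtsc();
	for (;;) {
		now = sim_rdtsc();
		if (now - bclock >= loops)
			break;

		/* Open a short window in which higher-priority tasks can run. */
		sim_preempt_enable();
		sim_rep_nop();
		sim_preempt_disable();

		/*
		 * If we migrated, the old base value is meaningless on the
		 * new CPU. Credit the cycles already observed and restart
		 * the measurement from the new CPU's counter.
		 */
		if (cpu != sim_cpu) {
			loops -= now - bclock;
			cpu = sim_cpu;
			bclock = sim_rdtsc();
		}
	}
	sim_preempt_enable();
}
```

In this sketch, calling sim_reset(0, 3) and then sim_delay_tsc(1000)
migrates the simulated task to CPU 1 partway through the wait. Only the
cycles already observed on the old CPU are subtracted from the remaining
count, so the total wait can only grow, never shrink.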

[
  Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
    and for suggesting that the rep_nop be placed between
    preempt_enable/disable.
]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: akpm@osdl.org
Cc: Clark Williams <clark.williams@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-04 13:11:46 +02:00
checksum_32.S
clear_page_64.S
copy_page_64.S
copy_user_64.S
copy_user_nocache_64.S x86: fence oostores on 64-bit 2007-10-12 18:41:21 -07:00
csum-copy_64.S
csum-partial_64.c x86: fix csum_partial() export 2008-05-13 19:38:47 +02:00
csum-wrappers_64.c x86: clean up csum-wrappers_64.c some more 2008-02-19 16:18:32 +01:00
delay_32.c x86: enable preemption in delay 2008-06-04 13:11:46 +02:00
delay_64.c x86: enable preemption in delay 2008-06-04 13:11:46 +02:00
getuser_32.S
getuser_64.S
io_64.c x86: coding style fixes in arch/x86/lib/io_64.c 2008-02-19 16:18:32 +01:00
iomap_copy_64.S
Makefile x86, UML: remove x86-specific implementations of find_first_bit 2008-04-26 19:21:17 +02:00
memcpy_32.c x86: coding style fixes to arch/x86/lib/memcpy_32.c 2008-04-17 17:40:49 +02:00
memcpy_64.S
memmove_64.c x86: coding style fixes to arch/x86/lib/memmove_64.c 2008-04-17 17:40:48 +02:00
memset_64.S
mmx_32.c x86: clean up mmx_32.c 2008-04-17 17:40:47 +02:00
msr-on-cpu.c i386: simplify smp_call_function_single() call sequence in msr-on-cpu 2007-10-17 20:16:20 +02:00
putuser_32.S
putuser_64.S
rwlock_64.S x86: rename .i assembler includes to .h 2007-10-17 20:16:29 +02:00
semaphore_32.S Generic semaphore implementation 2008-04-17 10:42:34 -04:00
string_32.c x86: coding style fixes to arch/x86/lib/string_32.c 2008-04-17 17:40:48 +02:00
strstr_32.c x86: coding style fixes to arch/x86/lib/strstr_3 2008-04-17 17:40:49 +02:00
thunk_64.S Generic semaphore implementation 2008-04-17 10:42:34 -04:00
usercopy_32.c x86: coding style fixes to arch/x86/lib/usercopy_32.c 2008-04-17 17:40:51 +02:00
usercopy_64.c x86: use _ASM_EXTABLE macro in arch/x86/lib/usercopy_64.c 2008-02-04 16:47:57 +01:00