mirror of https://github.com/adulau/aha.git (synced 2024-12-27 19:26:25 +00:00)
sched: Document memory barriers implied by sleep/wake-up primitives

Add a section to the memory barriers document to note the implied memory
barriers of sleep primitives (set_current_state() and wrappers) and wake-up
primitives (wake_up() and co.).

Also extend the in-code comments on the wake_up() functions to note these
implied barriers.

[ Impact: add documentation ]

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090428140138.1192.94723.stgit@warthog.procyon.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
parent 56a50adda4
commit 50fa610a3b
2 changed files with 151 additions and 1 deletions
@@ -31,6 +31,7 @@ Contents:
 
      - Locking functions.
      - Interrupt disabling functions.
+     - Sleep and wake-up functions.
      - Miscellaneous functions.
 
  (*) Inter-CPU locking barrier effects.
@@ -1217,6 +1218,132 @@ barriers are required in such a situation, they must be provided from some
 other means.
 
 
+SLEEP AND WAKE-UP FUNCTIONS
+---------------------------
+
+Sleeping and waking on an event flagged in global data can be viewed as an
+interaction between two pieces of data: the task state of the task waiting for
+the event and the global data used to indicate the event. To make sure that
+these appear to happen in the right order, the primitives to begin the process
+of going to sleep, and the primitives to initiate a wake up imply certain
+barriers.
+
+Firstly, the sleeper normally follows something like this sequence of events:
+
+	for (;;) {
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		if (event_indicated)
+			break;
+		schedule();
+	}
+
+A general memory barrier is interpolated automatically by set_current_state()
+after it has altered the task state:
+
+	CPU 1
+	===============================
+	set_current_state();
+	  set_mb();
+	    STORE current->state
+	    <general barrier>
+	LOAD event_indicated
+
+set_current_state() may be wrapped by:
+
+	prepare_to_wait();
+	prepare_to_wait_exclusive();
+
+which therefore also imply a general memory barrier after setting the state.
+The whole sequence above is available in various canned forms, all of which
+interpolate the memory barrier in the right place:
+
+	wait_event();
+	wait_event_interruptible();
+	wait_event_interruptible_exclusive();
+	wait_event_interruptible_timeout();
+	wait_event_killable();
+	wait_event_timeout();
+	wait_on_bit();
+	wait_on_bit_lock();
+
+
+Secondly, code that performs a wake up normally follows something like this:
+
+	event_indicated = 1;
+	wake_up(&event_wait_queue);
+
+or:
+
+	event_indicated = 1;
+	wake_up_process(event_daemon);
+
+A write memory barrier is implied by wake_up() and co. if and only if they wake
+something up. The barrier occurs before the task state is cleared, and so sits
+between the STORE to indicate the event and the STORE to set TASK_RUNNING:
+
+	CPU 1                           CPU 2
+	=============================== ===============================
+	set_current_state();            STORE event_indicated
+	  set_mb();                     wake_up();
+	    STORE current->state          <write barrier>
+	    <general barrier>             STORE current->state
+	LOAD event_indicated
+
+The available waker functions include:
+
+	complete();
+	wake_up();
+	wake_up_all();
+	wake_up_bit();
+	wake_up_interruptible();
+	wake_up_interruptible_all();
+	wake_up_interruptible_nr();
+	wake_up_interruptible_poll();
+	wake_up_interruptible_sync();
+	wake_up_interruptible_sync_poll();
+	wake_up_locked();
+	wake_up_locked_poll();
+	wake_up_nr();
+	wake_up_poll();
+	wake_up_process();
+
+
+[!] Note that the memory barriers implied by the sleeper and the waker do _not_
+order multiple stores before the wake-up with respect to loads of those stored
+values after the sleeper has called set_current_state(). For instance, if the
+sleeper does:
+
+	set_current_state(TASK_INTERRUPTIBLE);
+	if (event_indicated)
+		break;
+	__set_current_state(TASK_RUNNING);
+	do_something(my_data);
+
+and the waker does:
+
+	my_data = value;
+	event_indicated = 1;
+	wake_up(&event_wait_queue);
+
+there's no guarantee that the change to event_indicated will be perceived by
+the sleeper as coming after the change to my_data. In such a circumstance, the
+code on both sides must interpolate its own memory barriers between the
+separate data accesses. Thus the above sleeper ought to do:
+
+	set_current_state(TASK_INTERRUPTIBLE);
+	if (event_indicated) {
+		smp_rmb();
+		do_something(my_data);
+	}
+
+and the waker should do:
+
+	my_data = value;
+	smp_wmb();
+	event_indicated = 1;
+	wake_up(&event_wait_queue);
+
+
 MISCELLANEOUS FUNCTIONS
 -----------------------
 
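
The closing example of the new section pairs smp_wmb() in the waker with smp_rmb() in the sleeper so that my_data is seen once event_indicated is seen. As a rough userspace illustration only, the same pairing can be sketched with C11 fences. This is not kernel code: the release and acquire fences are assumed stand-ins that are at least as strong as smp_wmb() and smp_rmb(), and publish_event() and try_consume_event() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state mirroring my_data / event_indicated. */
static int my_data;                 /* plain payload written by the waker */
static atomic_int event_indicated;  /* flag checked by the sleeper */

/*
 * Waker side: store the payload, then the flag.  The release fence between
 * the two stores is an assumed userspace stand-in for smp_wmb().
 */
static void publish_event(int value)
{
	my_data = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&event_indicated, 1, memory_order_relaxed);
}

/*
 * Sleeper side: load the flag, then the payload.  The acquire fence between
 * the two loads is an assumed userspace stand-in for smp_rmb().
 * Returns true and fills *out only if the event has been indicated.
 */
static bool try_consume_event(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&event_indicated, memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);
	*out = my_data;
	return true;
}
```

If either fence is dropped, a reader that observes the flag is no longer guaranteed to observe the payload written before it, which is the hazard the [!] note describes.
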
@@ -1366,7 +1493,7 @@ WHERE ARE MEMORY BARRIERS NEEDED?
 
 Under normal operation, memory operation reordering is generally not going to
 be a problem as a single-threaded linear piece of code will still appear to
-work correctly, even if it's in an SMP kernel. There are, however, three
+work correctly, even if it's in an SMP kernel. There are, however, four
 circumstances in which reordering definitely _could_ be a problem:
 
  (*) Interprocessor interaction.
@@ -2458,6 +2458,17 @@ out:
 	return success;
 }
 
+/**
+ * wake_up_process - Wake up a specific process
+ * @p: The process to be woken up.
+ *
+ * Attempt to wake up the nominated process and move it to the set of runnable
+ * processes. Returns 1 if the process was woken up, 0 if it was already
+ * running.
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
+ */
 int wake_up_process(struct task_struct *p)
 {
 	return try_to_wake_up(p, TASK_ALL, 0);
@@ -5241,6 +5252,9 @@ void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
  * @mode: which threads
  * @nr_exclusive: how many wake-one or wake-many threads to wake up
  * @key: is directly passed to the wakeup function
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
  */
 void __wake_up(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5279,6 +5293,9 @@ void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key)
  * with each other. This can prevent needless bouncing between CPUs.
  *
  * On UP it can prevent extra preemption.
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
  */
 void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5315,6 +5332,9 @@ EXPORT_SYMBOL_GPL(__wake_up_sync); /* For internal use only */
  * awakened in the same order in which they were queued.
  *
  * See also complete_all(), wait_for_completion() and related routines.
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
  */
 void complete(struct completion *x)
 {
@@ -5332,6 +5352,9 @@ EXPORT_SYMBOL(complete);
  * @x: holds the state of this particular completion
  *
  * This will wake up all threads waiting on this particular completion event.
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
  */
 void complete_all(struct completion *x)
 {
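
The comments added above say that the waker orders its event store before it changes the task state, while the sleeper's set_current_state() places a general barrier between its state store and its event load. A minimal userspace sketch of that handshake follows, using C11 atomics. It is an illustration under stated assumptions rather than the scheduler's implementation: task_state, event_flag, sleeper_prepare() and waker_signal() are hypothetical names, and the waker here uses a full fence for simplicity, which is stronger than the write barrier the comments promise.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state values; the kernel's TASK_* constants are not used. */
enum { SKETCH_RUNNING = 0, SKETCH_SLEEPING = 1 };
static_assert(SKETCH_RUNNING != SKETCH_SLEEPING, "states must differ");

static atomic_int task_state;  /* stand-in for current->state */
static atomic_int event_flag;  /* stand-in for event_indicated */

/*
 * Sleeper: STORE state, <general barrier>, LOAD event.
 * Returns true if the event is already indicated, so the caller must not
 * go to sleep.
 */
static bool sleeper_prepare(void)
{
	atomic_store_explicit(&task_state, SKETCH_SLEEPING,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&event_flag, memory_order_relaxed) != 0;
}

/*
 * Waker: STORE event, barrier, then examine and clear the task state.
 * Returns true only if a sleeping task was found and marked runnable.
 * The full fence is an assumption of this sketch (see lead-in).
 */
static bool waker_signal(void)
{
	int expected = SKETCH_SLEEPING;

	atomic_store_explicit(&event_flag, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_compare_exchange_strong_explicit(&task_state, &expected,
						       SKETCH_RUNNING,
						       memory_order_relaxed,
						       memory_order_relaxed);
}
```

With a full fence on both sides, at least one party observes the other's store: either the sleeper sees the event and skips sleeping, or the waker sees the sleeping state and wakes it.
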