aha/init
Miao Xie 58568d2a82 cpuset,mm: update tasks' mems_allowed in time
Fix allocating page cache/slab object on the unallowed node when memory
spread is set by updating tasks' mems_allowed after its cpuset's mems is
changed.

In order to update tasks' mems_allowed in time, we must modify the code of
memory policy.  Because the memory policy is applied in the process's
context originally.  After applying this patch, one task directly
manipulates anothers mems_allowed, and we use alloc_lock in the
task_struct to protect mems_allowed and memory policy of the task.

But in the fast path, we didn't use lock to protect them, because adding a
lock may lead to performance regression.  But if we don't add a lock,the
task might see no nodes when changing cpuset's mems_allowed to some
non-overlapping set.  In order to avoid it, we set all new allowed nodes,
then clear newly disallowed ones.

[lee.schermerhorn@hp.com:
  The rework of mpol_new() to extract the adjusting of the node mask to
  apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
  with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
  allocation.  Fix this by adding the check for MPOL_PREFERRED and empty
  node mask to mpol_new_mpolicy().

  Remove the now unneeded 'nodes = NULL' from mpol_new().

  Note that mpol_new_mempolicy() is always called with a non-NULL
  'nodes' parameter now that it has been removed from mpol_new().
  Therefore, we don't need to test nodes for NULL before testing it for
  'empty'.  However, just to be extra paranoid, add a VM_BUG_ON() to
  verify this assumption.]
[lee.schermerhorn@hp.com:

  I don't think the function name 'mpol_new_mempolicy' is descriptive
  enough to differentiate it from mpol_new().

  This function applies cpuset set context, usually constraining nodes
  to those allowed by the cpuset.  However, when the 'RELATIVE_NODES flag
  is set, it also translates the nodes.  So I settled on
  'mpol_set_nodemask()', because the comment block for mpol_new() mentions
  that we need to call this function to "set nodes".

  Some additional minor line length, whitespace and typo cleanup.]
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:31 -07:00
..
calibrate.c x86: remove stray <6> in BogoMIPS printk 2008-07-28 14:22:26 +02:00
do_mounts.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
do_mounts.h md: move lots of #include lines out of .h files and into .c 2009-03-31 14:33:13 +11:00
do_mounts_initrd.c Freezer: Fix s2disk resume from initrd 2007-11-20 22:22:42 -05:00
do_mounts_md.c md: move lots of #include lines out of .h files and into .c 2009-03-31 14:33:13 +11:00
do_mounts_rd.c init: make initrd/initramfs decompression failure a KERN_EMERG event 2009-01-14 11:28:35 -08:00
initramfs.c initramfs: clean up messages related to initramfs unpacking 2009-05-06 16:36:10 -07:00
Kconfig Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next 2009-06-14 14:12:18 -07:00
main.c cpuset,mm: update tasks' mems_allowed in time 2009-06-16 19:47:31 -07:00
Makefile kbuild: fix make V=1 2008-02-11 17:43:54 +01:00
noinitramfs.c [PATCH] disable init/initramfs.c 2007-02-11 10:51:25 -08:00
version.c init/version.c: define version_string only if CONFIG_KALLSYMS is not defined 2008-07-25 10:53:29 -07:00