aha/kernel
Paul Mackerras ad3a37de81 perf_counter: Don't swap contexts containing locked mutex
Peter Zijlstra pointed out that under some circumstances, we can take
the mutex in a context or a counter and then swap that context or
counter to another task, potentially leading to lock order inversions
or the mutexes not protecting what they are supposed to protect.

This fixes the problem by making sure that we never take a mutex in a
context or counter which could get swapped to another task.  Most of
the cases where we take a mutex is on a top-level counter or context,
i.e. a counter which has an fd associated with it or a context that
contains such a counter.  This adds WARN_ON_ONCE statements to verify
that.

The two cases where we need to take the mutex on a context that is a
clone of another are in perf_counter_exit_task and
perf_counter_init_task.  The perf_counter_exit_task case is solved by
uncloning the context before starting to remove the counters from it.
The perf_counter_init_task is a little trickier; we temporarily
disable context swapping for the parent (forking) task by setting its
ctx->parent_gen to the all-1s value after locking the context, if it
is a cloned context, and restore the ctx->parent_gen value at the end
if the context didn't get uncloned in the meantime.

This also moves the increment of the context generation count to be
within the same critical section, protected by the context mutex, that
adds the new counter to the context.  That way, taking the mutex is
sufficient to ensure that both the counter list and the generation
count are stable.

[ Impact: fix hangs, races with inherited and PID counters ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18975.31580.520676.619896@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-29 11:02:46 +02:00
..
irq Revert "genirq: assert that irq handlers are indeed running in hardirq context" 2009-05-01 15:16:04 +02:00
power PM/Hibernate: Fix waiting for image device to appear on resume 2009-04-24 15:31:30 -07:00
time clockevents: prevent endless loop in tick_handle_periodic() 2009-05-02 10:22:27 +02:00
trace tracing: fix ref count in splice pages 2009-04-29 08:02:44 +02:00
.gitignore
acct.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
async.c async: remove the temporary (2.6.29) "async is off by default" code 2009-03-28 13:05:30 -07:00
audit.c Audit: remove spaces from audit_log_d_path 2009-04-05 13:49:04 -04:00
audit.h fixing audit rule ordering mess, part 1 2009-01-04 15:14:41 -05:00
audit_tree.c No need for crossing to mountpoint in audit_tag_tree() 2009-04-20 23:01:15 -04:00
auditfilter.c inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positive 2009-05-06 16:36:09 -07:00
auditsc.c Audit: remove spaces from audit_log_d_path 2009-04-05 13:49:04 -04:00
backtracetest.c
bounds.c
capability.c [CVE-2009-0029] System call wrappers part 04 2009-01-14 14:15:19 +01:00
cgroup.c Convert obvious places to deactivate_locked_super() 2009-05-09 10:49:40 -04:00
cgroup_debug.c debug cgroup: remove unneeded cgroup_lock 2009-04-02 19:04:54 -07:00
cgroup_freezer.c
compat.c signals: implement sys_rt_tgsigqueueinfo 2009-04-30 19:24:24 +02:00
configs.c
cpu.c cpumask: use set_cpu_active in init/main.c 2009-03-30 22:05:12 +10:30
cpuset.c cpusets: prevent PF_THREAD_BOUND tasks from attaching to non-root cpusets 2009-04-02 19:04:57 -07:00
cred-internals.h
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 2009-01-09 13:59:25 -08:00
delayacct.c
dma-coherent.c dma-coherent: Restore dma_alloc_from_coherent() large alloc fall back policy. 2009-01-21 18:51:53 +09:00
dma.c
exec_domain.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
exit.c perf_counter: Dynamically allocate tasks' perf_counter_context struct 2009-05-22 12:18:19 +02:00
extable.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
fork.c perf_counter: Propagate inheritance failures down the fork() path 2009-05-25 14:55:01 +02:00
freezer.c
futex.c futex: comment requeue key reference semantics 2009-04-02 23:39:53 +02:00
futex_compat.c
hrtimer.c hrtimer: fix rq->lock inversion (again) 2009-03-31 14:52:52 +02:00
hung_task.c softlockup: ensure the task has been switched out once 2009-02-11 11:04:16 +01:00
itimer.c timers: split process wide cpu clocks/timers 2009-02-05 13:04:33 +01:00
kallsyms.c Ksplice: Add functions for walking kallsyms symbols 2009-03-31 13:05:32 +10:30
Kconfig.freezer
Kconfig.hz
Kconfig.preempt
kexec.c kexec: vmcoreinfo_data[] can become static 2009-04-02 19:05:04 -07:00
kfifo.c
kgdb.c sysrq, intel_fb: fix sysrq g collision 2009-05-15 07:56:24 -05:00
kmod.c module: create a request_module_nowait() 2009-03-31 13:05:35 +10:30
kprobes.c kprobes: fix to use text_mutex around arm/disarm kprobe 2009-05-08 16:23:48 -07:00
ksysfs.c kernel/ksysfs.c:fix dependence on CONFIG_NET 2009-01-06 10:44:31 -08:00
kthread.c kthread: move sched-realeted initialization from kthreadd context 2009-04-09 09:50:37 +09:30
latencytop.c sched, latencytop: incorporate review feedback from Andrew Morton 2009-02-11 10:18:04 +01:00
lockdep.c lockdep: more robust lockdep_map init sequence 2009-04-17 18:00:00 +02:00
lockdep_internals.h lockdep: get_user_chars() redo 2009-02-14 23:28:22 +01:00
lockdep_proc.c lockstat: warn about disabled lock debugging 2009-02-14 23:28:28 +01:00
lockdep_states.h lockdep: move state bit definitions around 2009-02-14 23:27:59 +01:00
Makefile Merge commit 'v2.6.30-rc1' into perfcounters/core 2009-04-08 10:35:30 +02:00
marker.c
module.c async: Fix module loading async-work regression 2009-04-11 12:44:49 -07:00
mutex-debug.c mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
mutex-debug.h mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
mutex.c Merge branch 'core/locking' into perfcounters/core 2009-05-06 08:47:26 +02:00
mutex.h mutex: implement adaptive spinning 2009-01-14 18:09:02 +01:00
notifier.c
ns_cgroup.c cgroups: relax ns_can_attach checks to allow attaching to grandchild cgroups 2009-04-02 19:04:53 -07:00
nsproxy.c
panic.c Eliminate thousands of warnings with gcc 3.2 build 2009-05-06 16:36:09 -07:00
params.c param: fix charp parameters set via sysfs 2009-03-31 13:05:30 +10:30
perf_counter.c perf_counter: Don't swap contexts containing locked mutex 2009-05-29 11:02:46 +02:00
pid.c pids: refactor vnr/nr_ns helpers to make them safe 2009-04-02 19:05:02 -07:00
pid_namespace.c signals: zap_pid_ns_process() should use force_sig() 2009-04-02 19:04:58 -07:00
pm_qos_params.c
posix-cpu-timers.c kernel/posix-cpu-timers.c: fix sparse warning 2009-04-30 08:08:31 +02:00
posix-timers.c [CVE-2009-0029] System call wrappers part 05 2009-01-14 14:15:20 +01:00
printk.c Merge branch 'printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 10:23:25 -07:00
profile.c profiling: fix broken profiling regression 2009-02-10 00:50:37 +01:00
ptrace.c ptrace: ptrace_attach: fix the usage of ->cred_exec_mutex 2009-04-27 20:30:51 +10:00
rcuclassic.c kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies 2009-04-03 12:23:02 +02:00
rcupdate.c RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc 2009-04-15 13:55:14 -07:00
rcupreempt.c kmemtrace, rcu: fix rcupreempt.c data structure dependencies 2009-04-03 12:23:04 +02:00
rcupreempt_trace.c
rcutorture.c cpumask: convert rcutorture.c 2009-03-30 22:05:16 +10:30
rcutree.c rcu: Make hierarchical RCU less IPI-happy 2009-04-14 11:31:50 +02:00
rcutree.h kmemtrace, rcu: fix rcu_tree_trace.c data structure dependencies 2009-04-03 12:23:03 +02:00
rcutree_trace.c rcu: Make hierarchical RCU less IPI-happy 2009-04-14 11:31:50 +02:00
relay.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
res_counter.c memcg: memory cgroup resource counters for hierarchy 2009-01-08 08:31:05 -08:00
resource.c Remove 'recurse into child resources' logic from 'reserve_region_with_split()' 2009-04-18 21:44:24 -07:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c locking, rtmutex.c: Documentation cleanup 2009-04-29 23:20:17 +02:00
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c perf_counter: Fix dynamic irq_period logging 2009-05-23 19:37:44 +02:00
sched_clock.c Merge branch 'tracing/core-v2' into tracing-for-linus 2009-04-02 00:49:02 +02:00
sched_cpupri.c sched_rt: don't allocate cpumask in fastpath 2009-04-01 13:24:51 +02:00
sched_cpupri.h cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sched_debug.c sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
sched_fair.c Merge branch 'sched/urgent'; commit 'v2.6.29-rc5' into sched/core 2009-02-15 21:15:16 +01:00
sched_features.h Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-30 17:17:35 -07:00
sched_idletask.c
sched_rt.c Merge commit 'v2.6.30-rc1' into sched/urgent 2009-04-08 17:26:00 +02:00
sched_stats.h sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
seccomp.c x86-64: seccomp: fix 32/64 syscall hole 2009-03-02 15:41:30 -08:00
semaphore.c
signal.c signals: implement sys_rt_tgsigqueueinfo 2009-04-30 19:24:24 +02:00
slow-work.c Delete slow-work timers properly 2009-04-24 07:47:59 -07:00
smp.c generic-ipi: eliminate WARN_ON()s during oops/panic 2009-03-13 10:47:34 +01:00
softirq.c kernel/softirq.c: fix sparse warning 2009-04-17 01:57:54 +02:00
softlockup.c softlockup: decouple hung tasks check from softlockup detection 2009-01-16 14:06:04 +01:00
spinlock.c Allow rwlocks to re-enable interrupts 2009-04-02 19:05:11 -07:00
srcu.c
stacktrace.c
stop_machine.c cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sys.c Merge branch 'linus' into perfcounters/core 2009-04-29 14:47:05 +02:00
sys_ni.c Merge commit 'v2.6.29-rc2' into perfcounters/core 2009-01-21 16:37:27 +01:00
sysctl.c perf_counter: Generic per counter interrupt throttle 2009-05-25 21:41:12 +02:00
sysctl_check.c net: add ARP notify option for devices 2009-02-01 01:04:33 -08:00
taskstats.c cpumask: convert rest of files in kernel/ 2009-01-01 10:12:28 +10:30
test_kprobes.c kprobes: add tests for register_kprobes 2009-01-06 15:59:20 -08:00
time.c [CVE-2009-0029] System call wrappers part 01 2009-01-14 14:15:18 +01:00
timeconst.pl
timer.c Merge branch 'linus' into perfcounters/core 2009-04-29 14:47:05 +02:00
tracepoint.c tracepoints: dont update zero-sized tracepoint sections 2009-03-18 19:55:00 +01:00
tsacct.c Fix fixpoint divide exception in acct_update_integrals 2009-03-09 08:13:35 -07:00
uid16.c [CVE-2009-0029] System call wrappers part 19 2009-01-14 14:15:26 +01:00
up.c smp_call_function_single(): be slightly less stupid, fix #2 2009-01-12 16:04:37 +01:00
user.c Merge branch 'master' into next 2009-03-24 10:52:46 +11:00
user_namespace.c Fix recursive lock in free_uid()/free_user_ns() 2009-02-27 16:26:21 -08:00
utsname.c
utsname_sysctl.c proc_sysctl: use CONFIG_PROC_SYSCTL around ipc and utsname proc_handlers 2009-04-02 19:05:01 -07:00
wait.c wait: prevent exclusive waiter starvation 2009-02-05 12:56:48 -08:00
workqueue.c work_on_cpu(): rewrite it to create a kernel thread on demand 2009-04-09 09:50:37 +09:30