178155 commits

Author SHA1 Message Date
Peter Zijlstra
e4f4288842 sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE
We should skip !SD_LOAD_BALANCE domains.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091216170517.653578430@chello.nl>
CC: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 19:01:55 +01:00
Peter Zijlstra
e6c8fba777 sched: Fix task_hot() test order
Make sure not to access sched_fair fields before verifying it is
indeed a sched_fair task.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
CC: stable@kernel.org
LKML-Reference: <20091216170517.577998058@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 19:01:54 +01:00
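
The ordering issue described in the task_hot() commit above can be illustrated in isolation. This is a minimal sketch, not the kernel's code: `struct task`, `enum sched_kind`, `struct fair_data` and `task_recently_ran()` are hypothetical stand-ins, and the only point is that the class check comes before any access to class-specific fields.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real structures. */
enum sched_kind { KIND_FAIR, KIND_RT };

struct fair_data {
    unsigned long long exec_start;
};

struct task {
    enum sched_kind kind;
    struct fair_data *fair;   /* only valid when kind == KIND_FAIR */
};

/* Returns true if a fair task started running less than `window` ago. */
static bool task_recently_ran(const struct task *t,
                              unsigned long long now,
                              unsigned long long window)
{
    /* Verify the class first; only then touch fair-only fields. */
    if (t->kind != KIND_FAIR)
        return false;

    return now - t->fair->exec_start < window;
}
```

With the checks in the opposite order, a non-fair task (here one whose `fair` pointer is NULL) would be dereferenced before it is rejected.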
Xiaotian Feng
9ee349ad6d sched: Fix set_cpu_active() in cpu_down()
Sachin found cpu hotplug test failures on powerpc, which made
the kernel hang on his POWER box.

The problem is that we fail to re-activate a cpu when a
hot-unplug fails. Fix this by moving the de-activation into
_cpu_down after doing the initial checks.

Remove the synchronize_sched() calls and rely on those implied
by rebuilding the sched domains using the new mask.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Tested-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091216170517.500272612@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 19:01:53 +01:00
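
The failure mode in the cpu_down() commit above comes down to the order of de-activation and the initial checks. The following is a simplified model under stated assumptions: `struct cpu_state`, both function names, and the choice of `-EBUSY` as the error are illustrative and are not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's state; not the kernel's data structures. */
struct cpu_state {
    bool online;
    bool active;
};

/* Problematic ordering: de-activate first, then run the checks. */
static int cpu_down_deactivate_first(struct cpu_state *c, int online_cpus)
{
    c->active = false;
    if (online_cpus == 1 || !c->online)
        return -EBUSY;        /* the failure leaves the CPU inactive */
    c->online = false;
    return 0;
}

/* Ordering after the fix: checks first, de-activation afterwards. */
static int cpu_down_check_first(struct cpu_state *c, int online_cpus)
{
    if (online_cpus == 1 || !c->online)
        return -EBUSY;        /* the CPU is still marked active */
    c->active = false;
    c->online = false;
    return 0;
}
```

In the first variant a rejected hot-unplug leaves a CPU that is online but no longer active, which is the inconsistent state the commit message describes.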
Peter Zijlstra
933b0618d8 sched: Mark boot-cpu active before smp_init()
A UP machine has 1 active cpu, not having the boot-cpu in the
active map when starting the scheduler confuses things.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091216170517.423469527@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 19:01:53 +01:00
Anisse Astier
de078e5747 msi-wmi: depend on backlight and fix corner-cases problems
Now depends on BACKLIGHT_CLASS_DEVICE.
Driver will return an error if it can't get actual backlight value
Fix remapping of brightness keys when backlight is not controlled by ACPI.

Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:54 -05:00
Anisse Astier
c30116c6f0 msi-wmi: switch to using input sparse keymap library
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:54 -05:00
Anisse Astier
d607af9300 msi-wmi: replace one-condition switch-case with if statement
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:54 -05:00
Anisse Astier
977f9b921c msi-wmi: remove unused field 'instance' in key_entry structure
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:53 -05:00
Anisse Astier
822ddc042a msi-wmi: remove custom runtime debug implementation
Rely on DYNAMIC_DEBUG instead if needed

Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:53 -05:00
Anisse Astier
46b51eb9e1 msi-wmi: rework init
There should be less code duplication with usage of gotos
Driver won't load if there's no hardware to control
Safer error handling at input driver allocation

Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:53 -05:00
Anisse Astier
addd65aac7 msi-wmi: remove useless includes
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:53 -05:00
Thomas Renninger
d12d8baff9 X86 drivers: Introduce msi-wmi driver
This driver serves backlight (including switching) and volume up/down
keys for MSI machines providing a specific wmi interface:
551A1F84-FBDD-4125-91DB-3EA8F44F1D45
B6F3EEF2-3D2F-49DC-9DE3-85BCE18C62F2

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Carlos Corbacho <carlos@strangeworlds.co.uk>
CC: Matthew Garrett <mjg59@srcf.ucam.org>
Tested-by: Matt Chen <machen@novell.com>
Reviewed-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 12:40:53 -05:00
Ingo Molnar
ee1156c11a Merge branch 'linus' into sched/urgent
Conflicts:
	kernel/sched_idletask.c

Merge reason: resolve the conflicts, pick up latest changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:33:49 +01:00
Peter Zijlstra
60ab271617 perf record: Use per-task-per-cpu events for inherited events
Create events with a pid and cpu constraint for inherited events
so that we get a stream per cpu, instead of all cpus contending
on a single stream.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091216165904.987643843@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:30:13 +01:00
Peter Zijlstra
856e96608a perf record: Properly synchronize child creation
Remove that ugly usleep and provide proper serialization between
parent and child just like perf-stat does.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091216165904.908184135@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:30:12 +01:00
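
One common way to serialize a parent and a forked child without sleeping is a two-pipe handshake. The sketch below assumes that shape; the commit message only says the approach mirrors perf-stat, so the function name, the pipe names and the exit code 42 are illustrative rather than taken from the patch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, wait until it exists, then release it explicitly.
 * Returns the child's exit status, or -1 on error.
 */
static int run_child_synchronized(void)
{
    int ready_pipe[2], go_pipe[2];
    char buf = 0;
    int status;
    pid_t pid;

    if (pipe(ready_pipe) < 0)
        return -1;
    if (pipe(go_pipe) < 0) {
        close(ready_pipe[0]);
        close(ready_pipe[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(ready_pipe[0]);
        close(ready_pipe[1]);
        close(go_pipe[0]);
        close(go_pipe[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop the ends it does not use. */
        close(ready_pipe[0]);
        close(go_pipe[1]);
        /* Closing the write end gives the parent EOF: "I exist". */
        close(ready_pipe[1]);
        /* Block until the parent writes the go byte. */
        if (read(go_pipe[0], &buf, 1) != 1)
            _exit(1);
        _exit(42);            /* stand-in for exec'ing the workload */
    }

    /* Parent: drop the ends it does not use. */
    close(ready_pipe[1]);
    close(go_pipe[0]);

    /* read() returns 0 (EOF) once the child has closed its write end. */
    if (read(ready_pipe[0], &buf, 1) != 0) {
        close(ready_pipe[0]);
        close(go_pipe[1]);    /* child sees EOF and exits with 1 */
        waitpid(pid, &status, 0);
        return -1;
    }
    close(ready_pipe[0]);

    /* Parent-side setup that needs the child's pid would go here. */

    buf = 1;
    if (write(go_pipe[1], &buf, 1) != 1) {
        close(go_pipe[1]);
        waitpid(pid, &status, 0);
        return -1;
    }
    close(go_pipe[1]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike a fixed `usleep()`, the parent cannot proceed before the child exists, and the child cannot start its workload before the parent has finished its setup.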
Peter Zijlstra
f4c4176f21 perf events: Allow per-task-per-cpu counters
In order to allow for per-task-per-cpu counters, useful for
scalability when profiling task hierarchies, we allow installing
events with event->cpu != -1 in task contexts.

__perf_event_sched_in() already skips events where ->cpu
mis-matches the current cpu, fix up __perf_install_in_context()
and __perf_event_enable() to also respect this filter.

This does lead to very hard to interpret enabled/running times
for such counters, but I don't see a simple solution for that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091216165904.831451147@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:30:11 +01:00
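
The filter described in the commit above, where an event with `->cpu` set to -1 runs anywhere and any other value pins it to one CPU, can be written as a small predicate. This is a sketch only: `struct event_sketch` and both function names are hypothetical and do not correspond to kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical event descriptor: cpu == -1 means "any CPU". */
struct event_sketch {
    int cpu;
};

/* True if the event may be scheduled in on current_cpu. */
static bool event_runs_on(const struct event_sketch *e, int current_cpu)
{
    return e->cpu == -1 || e->cpu == current_cpu;
}

/* Counts how many of the n events pass the filter on current_cpu. */
static int count_schedulable(const struct event_sketch *events, int n,
                             int current_cpu)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if (event_runs_on(&events[i], current_cpu))
            count++;
    return count;
}
```

The commit's point is that every path that installs or enables an event in a task context has to apply the same predicate, not only the sched-in path.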
Arnaldo Carvalho de Melo
9b33827de6 perf diff: Percent calcs should use double values
Otherwise we do integer math and the delta values round up to
multiples of 1.0%.

Also, calculate absolute values. Things look precise now:

$ perf report -i perf.data.old --sort dso,symbol | head -13
     9.02%  libc-2.10.1.so               [.] _IO_vfprintf_internal
     4.88%  find                         [.] 0x00000000014af0
     2.91%  [kernel]                     [k] __kmalloc
     2.85%  [kernel]                     [k] ext4_htree_store_dirent
     2.50%  libc-2.10.1.so               [.] __GI_memmove
     2.44%  [kernel]                     [k] half_md4_transform
     2.43%  [kernel]                     [k] _spin_lock
     2.33%  [kernel]                     [k] system_call
$ perf report -i perf.data --sort dso,symbol | head -13
     8.55%  libc-2.10.1.so               [.] _IO_vfprintf_internal
     3.11%  [kernel]                     [k] __kmalloc
     3.07%  [kernel]                     [k] ext4_htree_store_dirent
     2.66%  find                         [.] 0x00000000016bcf
     2.61%  [kernel]                     [k] _atomic_dec_and_lock
     2.46%  [kernel]                     [k] half_md4_transform
     2.41%  libc-2.10.1.so               [.] __GI_memmove
     2.30%  find                         [.] 0x00000000009219
$ perf diff | head -13
     9.02%     -0.47%  libc-2.10.1.so               [.] _IO_vfprintf_internal
     2.91%     +0.20%  [kernel]                     [k] __kmalloc
     2.85%     +0.23%  [kernel]                     [k] ext4_htree_store_dirent
     1.99%     +0.62%  [kernel]                     [k] _atomic_dec_and_lock
     2.44%     +0.02%  [kernel]                     [k] half_md4_transform
     2.50%     -0.09%  libc-2.10.1.so               [.] __GI_memmove
     1.88%     +0.01%  [kernel]                     [k] __d_lookup
     2.43%     -0.75%  [kernel]                     [k] _spin_lock
     0.97%     +0.62%  [kernel]                     [k] path_get
     1.99%     -0.42%  libc-2.10.1.so               [.] _int_malloc
$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260981109-2621-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 18:29:10 +01:00
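
The difference between integer and double percentage math in the perf diff commit above is easy to reproduce. The helper names below are illustrative and are not the functions touched by the patch; the sample counts are made up so that they reproduce the 9.02% and 8.55% figures from the output above.

```c
#include <stdint.h>

/* Integer math: the fractional part of the percentage is discarded. */
static int percent_int(uint64_t part, uint64_t total)
{
    return (int)(part * 100 / total);
}

/* Floating-point math keeps the fraction. */
static double percent_double(uint64_t part, uint64_t total)
{
    return (double)part * 100.0 / (double)total;
}

/* Change in share between an old and a new profile, in percentage points. */
static double percent_delta(uint64_t old_part, uint64_t old_total,
                            uint64_t new_part, uint64_t new_total)
{
    return percent_double(new_part, new_total) -
           percent_double(old_part, old_total);
}
```

With 902 of 10000 samples in the old profile and 855 of 10000 in the new one, the integer version reports 9 and 8, so the delta can only be a whole number, while the double version yields roughly -0.47.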
Christoph Hellwig
c05c4edd87 direct I/O fallback sync simplification
In the case of direct I/O falling back to buffered I/O we sync data
twice currently: once at the end of generic_file_buffered_write using
filemap_write_and_wait_range and once a little later in
__generic_file_aio_write using do_sync_mapping_range with all flags set.

The wait before write of the do_sync_mapping_range call does not make
any sense, so just keep the filemap_write_and_wait_range call and move
it to the right spot.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:50 -05:00
Christoph Hellwig
2cfd30adf6 ocfs: stop using do_sync_mapping_range
do_sync_mapping_range(..., SYNC_FILE_RANGE_WRITE) is a very awkward way
to perform a filemap_fdatawrite_range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
Christoph Hellwig
1e431f5ce7 cleanup blockdev_direct_IO locking
Currently the locking in blockdev_direct_IO is a mess, we have three different
locking types and very confusing checks for some of them.  The most
complicated one is DIO_OWN_LOCKING for reads, which happens to not actually be
used.

This patch gets rid of the DIO_OWN_LOCKING - as mentioned above the read case
is unused anyway, and the write side is almost identical to DIO_NO_LOCKING.
The difference is that DIO_NO_LOCKING always sets the create argument for
the get_blocks callback to zero, but we can easily move that to the actual
get_blocks callbacks.  There are four users of the DIO_NO_LOCKING mode:
gfs already ignores the create argument and thus is fine with the new
version, ocfs2 only errors out if create were ever set, and we can remove
this dead code now, the block device code only ever uses create for an
error message if we are fully beyond the device which can never happen,
and last but not least XFS will need the new behaviour for writes.

Now we can replace the lock_type variable with a flags one, where no flag
means the DIO_NO_LOCKING behaviour and DIO_LOCKING is kept as the first
flag.  Separate out the check for not allowing to fill holes into a separate
flag, although for now both flags always get set at the same time.

Also revamp the documentation of the locking scheme to actually make sense.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
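
The move from a `lock_type` enumeration to a flags word, as described in the commit above, might look roughly like this. The `SKETCH_` names and values are assumptions made for illustration; the commit message names only DIO_LOCKING and does not name the second flag.

```c
#include <stdbool.h>

/* Hypothetical flag values; no flag set means the old DIO_NO_LOCKING mode. */
enum {
    SKETCH_DIO_LOCKING    = 0x01,  /* take the locks for the caller */
    SKETCH_DIO_SKIP_HOLES = 0x02,  /* do not fill holes */
};

/* True if the direct I/O path should do the locking itself. */
static bool dio_takes_locks(int flags)
{
    return (flags & SKETCH_DIO_LOCKING) != 0;
}

/* True if the direct I/O path is allowed to fill holes. */
static bool dio_may_fill_holes(int flags)
{
    return (flags & SKETCH_DIO_SKIP_HOLES) == 0;
}
```

Keeping the two concerns in separate bits lets a later caller set one without the other, even though, per the commit message, both are currently always set together.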
Christoph Hellwig
1c7c474c31 make generic_acl slightly more generic
Now that we cache the ACL pointers in the generic inode all the generic_acl
cruft can go away and generic_acl.c can directly implement xattr handlers
dealing with the full Posix ACL semantics for in-memory filesystems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
Christoph Hellwig
431547b3c4 sanitize xattr handler prototypes
Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods.  This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute.  With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechanism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
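
The idea of sharing one method between several handlers by passing a per-handler flags value can be sketched outside the kernel. Everything here is a simplified assumption: `struct handler`, `struct object`, the `ACL_KIND_*` constants and the function names are hypothetical and much smaller than the real `struct xattr_handler` interface.

```c
#include <stddef.h>

/* Hypothetical per-handler flag values. */
enum { ACL_KIND_ACCESS = 1, ACL_KIND_DEFAULT = 2 };

/* Hypothetical object carrying two ACL attributes. */
struct object {
    const char *access_acl;
    const char *default_acl;
};

struct handler {
    const char *prefix;
    int flags;   /* passed back to every method */
    const char *(*get)(const struct object *obj, int flags);
};

/* One method serves both handlers; flags selects the attribute. */
static const char *acl_get(const struct object *obj, int flags)
{
    switch (flags) {
    case ACL_KIND_ACCESS:
        return obj->access_acl;
    case ACL_KIND_DEFAULT:
        return obj->default_acl;
    default:
        return NULL;
    }
}

static const struct handler access_handler = {
    "system.posix_acl_access", ACL_KIND_ACCESS, acl_get
};

static const struct handler default_handler = {
    "system.posix_acl_default", ACL_KIND_DEFAULT, acl_get
};

/* The dispatcher hands the handler's own flags to its method. */
static const char *handler_get(const struct handler *h,
                               const struct object *obj)
{
    return h->get(obj, h->flags);
}
```

Without the flags argument, each handler would need its own near-identical `get` function that differs only in which attribute it reads.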
H Hartley Sweeten
ef26ca97e8 libfs: move EXPORT_SYMBOL for d_alloc_name
The EXPORT_SYMBOL for d_alloc_name is in fs/libfs.c but the function
is in fs/dcache.c.  Move the EXPORT_SYMBOL to the line immediately
after the closing function brace line in fs/dcache.c as mentioned
in Documentation/CodingStyle.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:48 -05:00
Jeff Layton
39159de2a0 vfs: force reval of target when following LAST_BIND symlinks (try #7)
procfs-style symlinks return a last_type of LAST_BIND without an actual
path string. This causes __follow_link to skip calling __vfs_follow_link
and so the dentry isn't revalidated.

This is a problem when the link target sits on NFSv4 as it depends on
the VFS to revalidate the dentry before using it on an open call. Ensure
that this occurs by forcing a revalidation of the target dentry of
LAST_BIND symlinks.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:48 -05:00
Mimi Zohar
d1625436b4 ima: limit imbalance msg
Limit the number of imbalance messages to once per filesystem type instead of
once per system boot.  (it's actually slightly racy and could give you a
couple per fs, but this isn't a real issue)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:48 -05:00
Al Viro
1429b3eca2 Untangling ima mess, part 3: kill dead code in ima
Kill the 'update' argument of ima_path_check(), kill
dead code in ima.

Current rules: ima counters are bumped at the same time
when the file switches from put_filp() fodder to fput()
one.  Which happens exactly in two places - alloc_file()
and __dentry_open().  Nothing else needs to do that at
all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
Al Viro
b65a9cfc2c Untangling ima mess, part 2: deal with counters
* do ima_get_count() in __dentry_open()
* stop doing that in followups
* move ima_path_check() to right after nameidata_to_filp()
* don't bump counters on it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
Al Viro
0552f879d4 Untangling ima mess, part 1: alloc_file()
There are 2 groups of alloc_file() callers:
	* ones that are followed by ima_counts_get
	* ones giving non-regular files
So let's pull that ima_counts_get() into alloc_file();
it's a no-op in case of non-regular files.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
Al Viro
7715b52122 O_TRUNC open shouldn't fail after file truncation
* take truncate logics into a helper (handle_truncate())
* rip it out of may_open()
* call it from the only caller of may_open() that might pass
O_TRUNC
* and do that after we'd finished with opening.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
Eric Paris
85a17f552d ima: call ima_inode_free ima_inode_free
ima_inode_free() has some funky #define just to confuse the crap out of me.

void ima_iint_delete(struct inode *inode)

and then things actually call ima_inode_free() and nothing calls
ima_iint_delete().

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:46 -05:00
Eric Paris
e0d5bd2aec IMA: clean up the IMA counts updating code
We currently have a lot of duplicated code around ima file counts.  Clean
that all up.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:46 -05:00
Eric Paris
9353384ec8 ima: only insert at inode creation time
iints are supposed to be allocated when an inode is allocated (during
security_inode_alloc())  But we have code which will attempt to allocate
an iint during measurement calls.  If we couldn't allocate the iint and we
cared, we should have died during security_inode_alloc().  Not make the
code more complex and less efficient.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:46 -05:00
Eric Paris
ec29ea544b ima: valid return code from ima_inode_alloc
ima_inode_alloc returns 0 and 1, but the LSM hooks expect an errno.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:46 -05:00
Eric Paris
e81e3f4dca fs: move get_empty_filp() definition to internal.h
All users outside of fs/ of get_empty_filp() have been removed.  This patch
moves the definition from the include/ directory to internal.h so no new
users crop up and removes the EXPORT_SYMBOL.  I'd love to see open intents
stop using it too, but that's a problem for another day and a smarter
developer!

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:45 -05:00
Al Viro
b75b5086be Sanitize exec_permission_lite()
Use the sucker in other places in pathname resolution
that check MAY_EXEC for directories; lose the _lite
from name, it's equivalent of full-blown inode_permission()
for its callers (albeit still lighter, since large parts
of generic_permission() do not apply for pure MAY_EXEC).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:45 -05:00
Al Viro
6e6b1bd1e7 Kill cached_lookup() and real_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:45 -05:00
Al Viro
2dd6d1f418 Kill path_lookup_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:45 -05:00
Al Viro
3cac260ad8 Take hash recalculation into do_lookup()
Both callers of do_lookup() do the same thing before it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:44 -05:00
Al Viro
e9496ff46a fix mismerge with Trond's stuff (create_mnt_ns() export is gone now)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:44 -05:00
Al Viro
b0446be4be switch cachefiles to kern_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:44 -05:00
Al Viro
306bb73d12 fix the crap in dst/dcore
* don't reinvent the wheels, please - open_bdev_exclusive() is there
  for purpose
* both open_by_devnum() and open_bdev_exclusive() return ERR_PTR(...)
  upon error, not NULL

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:44 -05:00
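
The second point in the commit above, that an error is reported as an encoded pointer and not as NULL, follows the kernel's ERR_PTR convention. Below is a userspace approximation of that convention; the lowercase helpers, `struct device_sketch` and `open_device_sketch()` are illustrative names, and the 4095 limit is assumed to match the kernel's MAX_ERRNO.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct device_sketch {
    int id;
};

/* Hypothetical opener: fails with -ENOENT for a missing or empty name. */
static struct device_sketch *open_device_sketch(const char *name)
{
    static struct device_sketch dev = { 1 };

    if (name == NULL || name[0] == '\0')
        return err_ptr(-ENOENT);
    return &dev;
}
```

A caller that tests only for NULL treats the failure as success, because the error pointer is non-NULL; the check has to be `is_err()`, followed by `ptr_err()` to obtain the error code.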
Al Viro
6de88d7292 kill __link_path_walk()/link_path_walk() distinction
put retry logics into path_walk() and do_filp_open()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:43 -05:00
Al Viro
258fa99905 lift path_put(path) to callers of __do_follow_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:43 -05:00
Al Viro
d231412db6 switch create_read_pipe() to alloc_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:43 -05:00
Al Viro
2c48b9c455 switch alloc_file() to passing struct path
... and have the caller grab both mnt and dentry; kill
leak in infiniband, while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
a95161aaa8 switch nilfs2 to deactivate_locked_super()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
3d1e463158 get rid of init_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
cc3808f8c3 switch sock_alloc_file() to alloc_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
6b18662e23 9p connect fixes
* if we fail in p9_conn_create(), we shouldn't leak references to struct file.
  Logics in ->close() doesn't help - ->trans is already gone by the time it's
  called.
* sock_create_kern() can fail.
* use of sock_map_fd() is all fscked up; I'd fixed most of that, but the
  rest will have to wait for a bit more work in net/socket.c (we still are
  violating the basic rule of working with descriptor table: "once the reference
  is installed there, don't rely on finding it there again").

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:41 -05:00
Al Viro
7cbe66b6b5 merge sock_alloc_fd/sock_attach_fd into a new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:41 -05:00