When objects are allocated by one CPU but freed by another CPU, we can
consume a lot of cycles doing divides in obj_to_index().
(Typical load on a dual processor machine where network interrupts are
handled by one particular CPU (allocating skbufs), and the other CPU is
running the application (consuming and freeing skbufs))
On one production server (dual-core AMD Opteron 285), I noticed this
divide took 1.20 % of CPU_CLK_UNHALTED events in the kernel. But Opterons
are quite modern CPUs, and the divide is much more expensive on older
architectures:
On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1
cycle for a multiply.
Doing some math, we can use a reciprocal multiplication instead of a divide.
If we want to compute V = (A / B) (A and B being u32 quantities),
we can instead use:
    V = ((u64)A * RECIPROCAL(B)) >> 32;
where RECIPROCAL(B) is precalculated to ((1LL << 32) + (B - 1)) / B.
Note:
I wrote pure C code for clarity. The gcc output for i386 is not optimal but
acceptable:
mull 0x14(%ebx)
mov %edx,%eax // part of the >> 32
xor %edx,%edx // useless
mov %eax,(%esp) // could be avoided
mov %edx,0x4(%esp) // useless
mov (%esp),%ebx
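The reciprocal trick described above can be sketched in plain user-space C
as follows. This is only an illustration, not the code from the patch: the
helper names reciprocal_value(), reciprocal_divide() and offset_to_index()
are assumptions, and the reciprocal is kept in 64 bits here so that B == 1
does not wrap. The result is not exact for every 32-bit dividend (it can
overshoot by one for large values), which is acceptable when the dividend
is a bounded offset inside a slab.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Precompute ceil(2^32 / b), i.e. ((1 << 32) + (b - 1)) / b.
 * Keeping the result in 64 bits is a choice of this sketch: for b == 1
 * the value is 2^32, which would not fit in 32 bits.
 */
static uint64_t reciprocal_value(uint32_t b)
{
	assert(b != 0);
	return ((1ULL << 32) + (b - 1)) / b;
}

/* Approximate a / b with one multiply and one shift. */
static uint32_t reciprocal_divide(uint32_t a, uint64_t r)
{
	/* a < 2^32 and r <= 2^32, so the product fits in 64 bits. */
	return (uint32_t)(((uint64_t)a * r) >> 32);
}

/*
 * Hypothetical slab-style use: map a byte offset inside a slab to an
 * object index, with r precomputed once from the object size.
 */
static uint32_t offset_to_index(uint32_t offset, uint64_t r)
{
	return reciprocal_divide(offset, r);
}
```

The reciprocal would be computed once when the object size is known, so
the hot path only pays for the multiply and the shift.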
[akpm@osdl.org: small cleanups]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Elaborate the API for calling cpuset_zone_allowed(), so that users have to
explicitly choose between the two variants:
cpuset_zone_allowed_hardwall()
cpuset_zone_allowed_softwall()
Until now, whether or not you got the hardwall flavor depended solely on
whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask
argument.
If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
version.
Unfortunately, this meant that users would end up with the softwall version
without thinking about it. Since only the softwall version might sleep,
this led to bugs with possible sleeping in interrupt context on more than
one occasion.
The hardwall version requires that the current task's mems_allowed allows
the node of the specified zone (or that you're in interrupt or that
__GFP_THISNODE is set or that you're on a one cpuset system.)
The softwall version, depending on the gfp_mask, might allow a node if it
was allowed in the nearest enclosing cpuset marked mem_exclusive (which
requires taking the cpuset lock 'callback_mutex' to evaluate.)
This patch removes the cpuset_zone_allowed() call, and forces the caller to
explicitly choose between the hardwall and the softwall case.
If the caller wants the gfp_mask to determine this choice, they should (1)
be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
cpuset_zone_allowed_softwall() routine.
This adds another 100 or 200 bytes to the kernel text space, due to the few
lines of nearly duplicate code at the top of both cpuset_zone_allowed_*
routines. It should save a few instructions executed for the calls that
turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to
set (before the call) then check (within the call) the __GFP_HARDWALL flag.
For the most critical call, from get_page_from_freelist(), the same
instructions are executed as before -- the old cpuset_zone_allowed()
routine it used to call is the same code as the
cpuset_zone_allowed_softwall() routine that it calls now.
Not a perfect win, but it seems worth it to reduce the chance of hitting a
sleeping-with-irqs-off complaint again.
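As a rough user-space model of the two variants described above (not the
kernel implementation), the decision could look like the sketch below.
Everything here is hypothetical: the struct, the MODEL_* flag values and the
model_* function names are invented for illustration, and any cases the
description above does not mention are left out. The may_sleep output
stands in for the need to take callback_mutex.

```c
#include <stdbool.h>

/* Flag values for this model only; they are not the kernel's values. */
#define MODEL_GFP_HARDWALL 0x1u
#define MODEL_GFP_THISNODE 0x2u

/* Hypothetical snapshot of everything the decision depends on. */
struct zone_query {
	bool in_interrupt;               /* caller is in interrupt context */
	bool single_cpuset;              /* only one cpuset on the system */
	bool node_in_mems_allowed;       /* current task's mems_allowed has the node */
	bool node_in_exclusive_ancestor; /* nearest mem_exclusive ancestor has it */
	unsigned int gfp_mask;
};

/* The nearly duplicate checks at the top of both routines. */
static bool common_allow(const struct zone_query *q)
{
	return q->in_interrupt || q->single_cpuset ||
	       (q->gfp_mask & MODEL_GFP_THISNODE) != 0 ||
	       q->node_in_mems_allowed;
}

/* Hardwall: never needs the lock, so it never sleeps. */
static bool model_zone_allowed_hardwall(const struct zone_query *q)
{
	return common_allow(q);
}

/* Softwall: may need the lock unless the hardwall flag is in gfp_mask. */
static bool model_zone_allowed_softwall(const struct zone_query *q,
					bool *may_sleep)
{
	*may_sleep = false;
	if (common_allow(q))
		return true;
	if (q->gfp_mask & MODEL_GFP_HARDWALL)
		return false;
	*may_sleep = true; /* models taking callback_mutex */
	return q->node_in_exclusive_ancestor;
}
```

In this model a caller that cannot sleep either uses the hardwall variant
or passes the hardwall flag, which is the rule the description above asks
callers to follow.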
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
More cleanups for slab.h
1. Remove tabs from weird locations as suggested by Pekka
2. Drop the check for NUMA and SLAB_DEBUG from the fallback section
as suggested by Pekka.
3. Use static inline for the fallback defs as also suggested by Pekka.
4. Make kmem_ptr_valid take a const * argument.
5. Separate the NUMA fallback definitions from the kmalloc_track fallback
definitions.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a response to an earlier discussion on linux-mm about splitting
slab.h components per allocator. Patch is against 2.6.19-git11. See
http://marc.theaimsgroup.com/?l=linux-mm&m=116469577431008&w=2
This patch cleans up the slab header definitions. We define the common
functions of slob and slab in slab.h and put the extra definitions needed
for slab's kmalloc implementations in <linux/slab_def.h>. In order to get
a greater set of common functions we add several empty functions to slob.c
and also rename slob's kmalloc to __kmalloc.
Slob does not need any special definitions since we introduce a fallback
case. If there is no need for a slab implementation to provide its own
kmalloc mess^H^H^Hacros then we simply fall back to __kmalloc functions.
That is sufficient for SLOB.
Sort the functions in slab.h according to their functionality. First the
functions operating on struct kmem_cache *, then the kmalloc related
functions, followed by special debug and fallback definitions.
Also redo a lot of comments.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove calls to pci_disable_device except in fail_all_cmds. The
pci_disable_device function does something nasty to Smart Array controllers
that pci_enable_device does not undo. So if the driver is unloaded it
cannot be reloaded.
Also, customers can disable any pci device via the ROM Based Setup Utility
(RBSU). If the customer has disabled the controller we should not try to
blindly enable the card from the driver. Please consider this for
inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Map out more memory for our config table. It's required to reach offset
0x214 to disable DMA on the P600. I'm not sure how I lost this hunk.
Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When CONFIG_PCI is not defined (i.e. PCI bus is disabled), the sx driver
fails to link, since some pci functions are not available. Fix this
behaviour to be able to compile this driver on machines with no PCI bus
(but with ISA bus support).
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When CONFIG_PCI is not defined (i.e. PCI bus is disabled), the mxser_new
driver fails to link, since some pci functions are not available. Fix this
behaviour to be able to compile this driver on machines with no PCI bus
(but with ISA bus support).
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CONFIG_PCI=n:
drivers/char/isicom.c: In function 'isicom_probe':
drivers/char/isicom.c:1793: warning: implicit declaration of function
'pci_request_region'
drivers/char/isicom.c:1827: warning: implicit declaration of function
'pci_release_region'
Let CONFIG_ISI depend on CONFIG_PCI.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Based on patch from Alexander Rigbo <alexander.rigbo@acgnystrom.se>
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It seems macbooks set bit 2 but not bit 0, which is an "enabled but vmxon will
fault" setting.
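A minimal sketch of the bit logic, assuming the bits in question are those
of the IA32_FEATURE_CONTROL MSR (bit 0 being the lock bit and bit 2 the
"VMXON enabled" bit). The macro and function names are hypothetical, and
the functions work on a plain value rather than reading or writing an MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit meanings; the names are made up for this sketch. */
#define FC_LOCKED        (1ULL << 0) /* bit 0 */
#define FC_VMXON_ENABLED (1ULL << 2) /* bit 2 */
#define FC_BOTH          (FC_LOCKED | FC_VMXON_ENABLED)

/* vmxon only works when both bits are set. */
static bool vmxon_usable(uint64_t msr)
{
	return (msr & FC_BOTH) == FC_BOTH;
}

/* Locked with the enable bit clear: firmware turned VMX off for good. */
static bool vmx_disabled_by_firmware(uint64_t msr)
{
	return (msr & FC_BOTH) == FC_LOCKED;
}

/* Value to write back when the register is not yet locked. */
static uint64_t feature_control_fixup(uint64_t msr)
{
	return msr | FC_BOTH;
}
```

With this model the macbook case (bit 2 set, bit 0 clear) is neither
usable nor firmware-disabled, so it needs the fixup before vmxon.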
Signed-off-by: Avi Kivity <avi@qumranet.com>
Tested-by: Alex Larsson (sometimes testing helps)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They're not on speaking terms.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Thanks Jens for alerting me to this.
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <raziebe@gmail.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The previous checkstack fix for UML, which needs to use the host's tools,
was wrong in the crossbuilding case. It would use the build host's, rather
than the target's, toolchain.
This patch removes the old fix and adds an explicit special case for UML,
leaving everyone else alone.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fallback_alloc() does not do the check for GFP_WAIT as done in
cache_grow(). Thus interrupts are disabled when we call kmem_getpages()
which results in the failure.
Duplicate the handling of GFP_WAIT in cache_grow().
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The fields of struct pipe_buf_operations do not have a carefully chosen
layout (i.e. they are neither optimized to fit cache lines nor to reduce
cache line ping-pong).
The bufs[] array is *large* and is placed near the beginning of the
structure, so all following fields have a large offset. This is
unfortunate because many archs have smaller instructions when using small
offsets relative to a base register. On x86, for example, offsets that fit
in 7 bits give shorter instructions.
Moving bufs[] to the end of pipe_buf_operations lets all other fields have
small offsets, which reduces text size and icache pressure.
# size vmlinux.pre vmlinux
   text    data     bss     dec     hex filename
3268989  664356  492196 4425541  438745 vmlinux.pre
3268765  664356  492196 4425317  438665 vmlinux
So this patch reduces text size by 224 bytes on my x86_64 machine. Similar
results on ia32.
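The effect on field offsets can be illustrated with a small stand-alone
sketch. The struct and field names below are made up for the demo and are
not the real kernel structures; the point is only that a large array placed
early pushes every later field past the range of a one-byte displacement,
while placing it last keeps the scalar fields near offset zero.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BUFFERS 16

/* Hypothetical element type, only here to make the array large. */
struct demo_buf {
	void *page;
	unsigned int offset, len;
};

/* Large array near the start: later fields get large offsets. */
struct demo_array_first {
	unsigned int nrbufs;
	struct demo_buf bufs[DEMO_BUFFERS];
	unsigned int readers;
	unsigned int writers;
};

/* Large array at the end: the scalar fields stay close to the base. */
struct demo_array_last {
	unsigned int nrbufs;
	unsigned int readers;
	unsigned int writers;
	struct demo_buf bufs[DEMO_BUFFERS];
};

/* x86 can encode displacements up to 127 in a single signed byte. */
static bool fits_short_displacement(size_t offset)
{
	return offset <= 127;
}
```

Checking offsetof() for the writers field in both layouts shows the
difference: it fits a short displacement only in the second layout.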
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up a little.
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a function that sets "void (*conf_changed_callback)(void)". The callback
is called whenever .config's changed state changes. qconf.cc uses it to set
the sensitivity of the GUI's save widget.
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Those two functions are
void sym_set_change_count(int count)
and
void sym_add_change_count(int count)
All write accesses to sym_change_count are replaced by calls to the above
functions.
The variable and the changer functions are moved to confdata.c. IMO that's
OK, as sym_change_count is an attribute of the .config's change state.
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Run "make xconfig" on a freshly untarred kernel tree. Look at the floppy disk
icon of the Qt application that has just started: it's in a normal, active
state.
Mouse click on it: .config is being saved.
This patch series changes things so that, after the mouse click on the
floppy disk icon, the icon is greyed out.
If you mouse click on it now, nothing happens.
If you change some CONFIG_*, the floppy disk icon returns to "active state",
that is, if you mouse click it now, .config is written.
This patch:
Returns sym_change_count to reflect the .config's change state.
All read-only accesses of sym_change_count are replaced by calls to
conf_get_changed().
mconfig.c is manipulated to ask for saving only when
conf_get_changed() returned true.
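The change-state tracking could be sketched in stand-alone C as below. This
is an illustration, not the code of the patches: the registration function
name conf_set_changed_callback() is an assumption, and demo_on_changed()
with its counter is a hypothetical listener standing in for the GUI code
that enables or greys out the save icon.

```c
#include <stdbool.h>

static int sym_change_count;
static void (*conf_changed_callback)(void);

/* Hypothetical listener; a GUI would update its save widget here. */
static int demo_toggle_count;
static void demo_on_changed(void)
{
	demo_toggle_count++;
}

/* Assumed name for the function that registers the callback. */
static void conf_set_changed_callback(void (*fn)(void))
{
	conf_changed_callback = fn;
}

/* Read-only view of the change state. */
static bool conf_get_changed(void)
{
	return sym_change_count != 0;
}

/* Single write path: notify only when changed/unchanged flips. */
static void sym_set_change_count(int count)
{
	bool was_changed = conf_get_changed();

	sym_change_count = count;
	if (conf_changed_callback && was_changed != conf_get_changed())
		conf_changed_callback();
}

static void sym_add_change_count(int count)
{
	sym_set_change_count(sym_change_count + count);
}
```

Routing every write through one function is what makes it possible to
notify a listener exactly when the state flips, rather than on every
individual change.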
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The code has been fixed to use kill_pid instead of kill_proc; fix the
comments as well.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit 373beb35cd.
No one is using this identifier yet. The purpose of this identifier is to
export nsproxy to user space which is wrong. nsproxy is an internal
implementation optimization, which should keep our fork times from getting
slower as we increase the number of global namespaces you don't have to
share.
Adding a global identifier like this is inappropriate because it makes
namespaces inherently non-recursive, greatly limiting what we can do with
them in the future.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- pipe/splice should use const pipe_buf_operations and file_operations
- struct pipe_inode_info has an unused field "start" : get rid of it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The pcd, pwt, and pat bits on page table entries affect the cpu cache. Since
the cache is a host resource, the guest should not be able to control it.
Moreover, the meaning of these bits changes depending on whether pat is
enabled or not.
So, force these bits to zero on shadow page table entries at all times.
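A minimal sketch of the masking step, using the architectural bit positions
of a 4 KiB x86 page table entry (PWT is bit 3, PCD is bit 4, PAT is bit 7).
The macro and function names are hypothetical and are not taken from the
patch.

```c
#include <stdint.h>

/* Cache-control bits of a 4 KiB x86 page table entry. */
#define DEMO_PTE_PWT (1ULL << 3) /* page-level write-through */
#define DEMO_PTE_PCD (1ULL << 4) /* page-level cache disable */
#define DEMO_PTE_PAT (1ULL << 7) /* page attribute table index bit */
#define DEMO_PTE_CACHE_BITS (DEMO_PTE_PWT | DEMO_PTE_PCD | DEMO_PTE_PAT)

/*
 * Derive the cache-related part of a shadow entry from a guest entry:
 * whatever the guest asked for, the three cache bits end up zero.
 */
static uint64_t shadow_pte_strip_cache_bits(uint64_t guest_pte)
{
	return guest_pte & ~DEMO_PTE_CACHE_BITS;
}
```

All other bits pass through unchanged in this sketch; a real shadow MMU
rewrites more of the entry than this.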
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The arch splitting patchset left an extra put_cpu() in core code, where it can
cause trouble for CONFIG_PREEMPT kernels.
Reported-by: Huihong Luo <huisinro@yahoo.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes the SET_SREGS ioctl behave symmetrically to the GET_SREGS ioctl wrt
the segment access rights flag.
Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Section .parainstructions should not warn about section mismatches.
WARNING: drivers/net/hamradio/scc.o - Section mismatch: reference to .exit.text: from .parainstructions after '' (at offset 0x0)
WARNING: drivers/net/hamradio/scc.o - Section mismatch: reference to .exit.text: from .parainstructions after '' (at offset 0x8)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i2o_exec_exit and i2o_driver_exit were marked as __exit which is a bug
because both are invoked from __init and __exit functions.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eliminate some possibilities for user processes writing to the Gigaset
character device to be left sleeping indefinitely, by adding wakeup calls
to error paths and properly disposing of pending write requests when the
device is disconnected.
It also removes unnecessary NULL checks before usb_free_urb() and
usb_kill_urb() calls.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix up the work on stack and exit scope trouble by placing the work_struct
in the uml_net_private data.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some of the header file rearrangements broke the build for board-osk.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <tony@atomide.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
More fixes to build breakage from the work_struct changes ... this updates
the tps65010 driver. Plus, fix some dependencies related to the way it's
used on the OMAP OSK: force static linking there, since the resulting
kernel can't link.
NOTE that until the i2c core gets fixed to work without SMBUS_QUICK,
kernels needing this driver must still use "tps65010.force=0,0x48" on the
command line.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As per akpm's request.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
By letting gcc choose the temporary register for us, we lose arch dependency
and some ugliness. Conceivably gcc will also generate marginally better code.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Instead of in the main drivers menu.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
load_TR_desc() lives in asm/desc.h, so #include that file.
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
#ifdef CONFIG_SMP in a file which isn't compiled in non-SMP kernels.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] kprobe clears qp bits for special instructions
[IA64] enable trap code on slot 1
[IA64] Take defensive stance on ia64_pal_get_brand_info()
[IA64] fix possible XPC deadlock when disconnecting
[IA64] - Reduce overhead of FP exception logging messages
[IA64] fix arch/ia64/mm/contig.c:235: warning: unused variable `nid'
[IA64] s/termios/ktermios/ in simserial.c
[IA64] kexec/kdump: tidy up declaration of relocate_new_kernel_t
[IA64] Kexec/Kdump: honour non-zero crashkernel offset.
[IA64] CONFIG_KEXEC/CONFIG_CRASH_DUMP permutations
[IA64] Do not call SN_SAL_SET_CPU_NUMBER twice on cpu 0
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
[AGPGART] VIA and SiS AGP chipsets are x86-only
[AGPGART] agp-amd64: section mismatches with HOTPLUG=n
[AGPGART] Fix up misprogrammed bridges with incorrect AGPv2 rates.
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Make sure struct ipoib_neigh.queue is always initialized
IB/iser: Use the new verbs DMA mapping functions
IB/srp: Use new verbs IB DMA mapping functions
IPoIB: Use the new verbs DMA mapping functions
IB/core: Use the new verbs DMA mapping functions
IB/ipath: Implement new verbs DMA mapping functions
IB: Add DMA mapping functions to allow device drivers to interpose
RDMA/cma: Export rdma cm interface to userspace
RDMA/cma: Add support for RDMA_PS_UDP
RDMA/cma: Allow early transition to RTS to handle lost CM messages
RDMA/cma: Report connect info with connect events
RDMA/cma: Remove unneeded qp_type parameter from rdma_cm
IB/ipath: Fix IRQ for PCI Express HCAs
RDMA/amso1100: Fix memory leak in c2_qp_modify()
IB/iser: Remove unused "write-only" variables
IB/ipath: Remove unused "write-only" variables
IB/fmr: ib_flush_fmr_pool() may wait too long
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
Fix inotify maintainers entry
Fix typo in new debug options.
Jon needs a new shift key.
fs: Convert kmalloc() + memset() to kzalloc() in fs/.
configfs.h: Remove dead macro definitions.
kconfig: Standardize "depends" -> "depends on" in Kconfig files
e100: replace kmalloc with kcalloc
um: replace kmalloc+memset with kzalloc
fix typo in net/ipv4/ip_fragment.c
include/linux/compiler.h: reject gcc 3 < gcc 3.2
Kconfig: fix spelling error in config KALLSYMS help text
Remove duplicate "have to" in comment
Fix small typo in drivers/serial/icom.c
Use consistent casing in help message
EXT{2,3,4}_FS: remove outdated part of the help text
There's no point in troubling the Alpha, IA-64, PowerPC and PARISC
people with SiS and VIA options. Andrew thinks it helps find bugs,
but there's no evidence of that.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Dave Jones <davej@redhat.com>
When CONFIG_HOTPLUG=n, agp_amd64_resume() calls nforce3_agp_init(),
which is __devinit == __init, so has been discarded and is not
usable for resume.
WARNING: drivers/char/agp/amd64-agp.o - Section mismatch: reference to .init.text: from .text between 'agp_amd64_resume' (at offset 0x249) and 'amd64_tlbflush'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Dave Jones <davej@redhat.com>