When running several kvm processes with lots of memory overcommitment,
we have seen an oops during process shutdown:
------------[ cut here ]------------
Kernel BUG at 0000000000193434 [verbose debug info unavailable]
addressing exception: 0005 [#1] PREEMPT SMP
Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8
Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650)
Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0
00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400
00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0
0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0
Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12
000000000019342c: a7180000 lhi %r1,0
0000000000193430: b2290012 iske %r1,%r2
>0000000000193434: a7110002 tmll %r1,2
0000000000193438: a7840006 brc 8,193444
000000000019343c: 9602c000 oi 0(%r12),2
0000000000193440: 96806000 oi 0(%r6),128
0000000000193444: a7110004 tmll %r1,4
Call Trace:
([<00000000001933f6>] unmap_vmas+0x846/0xf10)
[<0000000000199680>] exit_mmap+0x210/0x458
[<000000000012a8f8>] mmput+0x54/0xfc
[<000000000012f714>] exit_mm+0x134/0x144
[<000000000013120c>] do_exit+0x240/0x878
[<00000000001318dc>] do_group_exit+0x98/0xc8
[<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358
[<000000000010bee0>] do_signal+0xec/0x860
[<0000000000112e30>] sysc_sigpending+0xe/0x22
[<000002000013198a>] 0x2000013198a
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4
<4>---[ end trace bc19f1d51ac9db7c ]---
The faulting instruction is the storage key operation (iske) in
ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske
reads dirty and reference bit information for a physical page and
requires a valid physical address. Since we are in pte_clear, we
cannot rely on the pte containing a valid address. Fortunately we
dont need these information in pte_clear - after all there is no
mapping. The best fix is to remove the needless call to ptep_rcp_copy
that contains the iske.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot:
..
[ 0.000000] Movable zone: 0 pages used for memmap
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 775679
[ 0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[6920575.975232] console [ttyS0] enabled
[6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
..
The s390 implementation of sched_clock uses the store clock instruction and
subtracts jiffies_timer_cc.
jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used
for sched_clock and monotonic clock. For historical reasons there is an offset
on that value. With todays code this offset is unnecessary. By removing that
offset we can get a sched_clock which returns the nanoseconds after time_init.
This improves CONFIG_PRINTK_TIME.
Since sched_clock is the only user, I have also renamed jiffies_timer_cc to
sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant
and can be romved as well.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
syscall_get_nr() currently returns a valid result only if the call
chain of the traced process includes do_syscall_trace_enter(). But
collect_syscall() can be called for any sleeping task, the result of
syscall_get_nr() in general is completely bogus.
To make syscall_get_nr() work for any sleeping task the traps field
in pt_regs is replace with svcnr - the system call number the process
is executing. If svcnr == 0 the process is not on a system call path.
The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
for the first system call parameter. This is incorrect since gprs[2]
may have been overwritten with the system call number if the call
chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5330/1: mach-pxa: Fixup reset for systems using reboot=cold or other strings
[ARM] pxa: fix incorrect PCMCIA PSKTSEL pin configuration for spitz
[ARM] pxa: fix I2C controller device being registered twice on Akita
pxafb: only initialize the smart panel thread when dealing with a smartpanel
pxafb: introduce LCD_TYPE_MASK and use it.
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: ACE1001 patch for cp2101.c
USB: usbmon: fix read(2)
USB: gadget rndis: send notifications
USB: gadget rndis: stop windows self-immolation
USB: storage: update unusual_devs entries for Nokia 5300 and 5310
USB: storage: updates unusual_devs entry for the Nokia 6300
usb: musb: fix bug in musb_schedule
USB: fix SB700 usb subsystem hang bug
fix xen_get_eflags. It doesn't take any argument.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
pv_cpu_ops.getreg(_IA64_REG_IP) returned constant.
But the returned ip valued should be the one in the caller, not of the callee.
This patch fixes that.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/kernel/pci-dma.c only needs to include iommu once.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Using printk from MCA/INIT context is unsafe since it can cause deadlock.
The ia64_mca_modify_original_stack is called from both of mca handler and
init handler, so it should use mprintk instead of printk.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Itanium processors can handle some misaligned data accesses. They
also provide a mode where all such accesses are forced to trap. The
kernel was schizophrenic about use of this mode:
* Base kernel code ran in permissive mode where the only traps
generated were from those cases that the h/w could not handle.
* Interrupt, syscall and trap code ran in strict mode where all
unaligned accesses caused traps to the 0x5a00 unaligned reference
vector.
Use strict alignment checking throughout the kernel, but make
sure that we continue to let user mode use more relaxed mode
as the default.
Signed-off-by: Tony Luck <tony.luck@intel.com>
When we migrate an interrupt from one CPU to another, we set the
move_in_progress flag and clean up the vectors later once they're not
being used. If you're unlucky and call destroy_irq() before the vectors
become un-used, the move_in_progress flag is never cleared, which causes
the interrupt to become unusable.
This was discovered by Jesse Brandeburg for whom it manifested as an
MSI-X device refusing to use MSI-X mode when the driver was unloaded
and reloaded repeatedly.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a regression reported by Max Kellermann whereby kernel profiling
showed that his clients were spending 45% of their time in
rpcauth_lookup_credcache.
It turns out that although his processes had identical uid/gid/groups,
generic_match() was failing to detect this, because the task->group_info
pointers were not shared. This again lead to the creation of a huge number
of identical credentials at the RPC layer.
The regression is fixed by comparing the contents of task->group_info
if the actual pointers are not identical.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Do not attempt to close invalidated file handles
[CIFS] fix check for dead tcon in smb_init
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.
MIPS: csrc-r4k: Fix spelling mistake.
MIPS: RB532: Provide functions for gpio configuration
MIPS: IP22: Make indy_sc_ops variable static
MIPS: RB532: GPIO register offsets are relative to GPIOBASE
MIPS: Malta: Fix include paths in malta-amon.c
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
net: fix tiny output corruption of /proc/net/snmp6
atl2: don't request irq on resume if netif running
ipv6: use seq_release_private for ip6mr.c /proc entries
pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
smc911x: Fix printf format typo in smc911x driver.
asix: Fix asix-based cards connecting to 10/100Mbs LAN.
mv643xx_eth: fix recycle check bound
mv643xx_eth: fix the order of mdiobus_{unregister, free}() calls
sh: sh_eth: Update to change of mii_bus
TPROXY: supply a struct flowi->flags argument in inet_sk_rebuild_header()
TPROXY: fill struct flowi->flags in udp_sendmsg()
net: ipg.c fix bracing on endian swapping
phylib: Fix auto-negotiation restart avoidance
net: jme.c rxdesc.flags is __le16, other missing endian swaps
phylib: fix phy name example in documentation
net: Do not fire linkwatch events until the device is registered.
phonet: fix compilation with gcc-3.4
ixgbe: fix compilation with gcc-3.4
pktgen: fix multiple queue warning
net: fix ip_mr_init() error path
...
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: uaccess_64: fix return value in __copy_from_user()
x86: quirk for reboot stalls on a Dell Optiplex 330
Commit 81e192d6ce ("parisc: convert to
generic compat_sys_ptrace") introduced a bug which segfaults the parisc
64bit kernel when stracing 32bit applications:
Kernel Fault: Code=15 regs=00000000bafa42b0 (Addr=00000001baf5ab57)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001101111111100001011 Tainted: G W
r00-03 000000ff0806ff0b 000000004068edc0 00000000401203f8 00000000fb3e2508
r04-07 0000000040686dc0 00000000baf5a800 fffffffffffffffc fffffffffb3e2508
r08-11 00000000baf5a800 000000000004b068 00000000000402b0 0000000000040d68
r12-15 0000000000042a9c 0000000000040a9c 0000000000040d60 0000000000042e9c
r16-19 000000000004b060 000000000004b058 0000000000042d9c ffffffffffffffff
r20-23 000000000800000b 0000000000000000 000000000800000b fffffffffb3e2508
r24-27 00000000fffffffc 0000000000000003 00000000fffffffc 0000000040686dc0
r28-31 00000001baf5a7ff 00000000bafa4280 00000000bafa42b0 00000000000001d7
sr00-03 0000000000fca000 0000000000000000 0000000000000000 0000000000fca000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040120400 0000000040120404
IIR: 4b9a06b0 ISR: 0000000000000000 IOR: 00000001baf5ab57
CPU: 0 CR30: 00000000bafa4000 CR31: 00000000d22344e0
ORIG_R28: 00000000fb3e2248
IAOQ[0]: compat_arch_ptrace+0xb8/0x160
IAOQ[1]: compat_arch_ptrace+0xbc/0x160
RP(r2): compat_arch_ptrace+0xb0/0x160
Backtrace:
[<00000000401612ac>] compat_sys_ptrace+0x15c/0x180
[<0000000040104ef8>] syscall_exit+0x0/0x14
The problem is that compat_arch_ptrace() enters with an addr value of
type compat_ulong_t and calls translate_usr_offset() to translate the
address offset into a struct pt_regs offset like this:
addr = translate_usr_offset(addr)
this means that any return value of translate_usr_offset() is stored
back as compat_ulong_t type into the addr variable.
But since translate_usr_offset() returns -1 for invalid offsets, addr
can now get the value 0xffffffff which then fails the next return-value
sanity check and thus the kernel tries to access invalid memory:
if (addr < 0)
break;
Fix this bug by modifying translate_usr_offset() to take and return
values of type compat_ulong_t, and by returning the value
"sizeof(struct pt_regs)" as an error indicator.
Additionally change the sanity check to check for return values
for >= sizeof(struct pt_regs).
This patch survived my compile and run-tests.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a connection with open file handles has gone down
and come back up and reconnected without reopening
the file handle yet, do not attempt to send an SMB close
request for this handle in cifs_close. We were
checking for the connection being invalid in cifs_close
but since the connection may have been reconnected
we also need to check whether the file handle
was marked invalid (otherwise we could close the
wrong file handle by accident).
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
As gpiolib doesn't support pin multiplexing, it provides no way to
access the GPIOFUNC register. Also there is no support for setting
interrupt status and level. These functions provide access to them and
are needed by the CompactFlash driver.
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The indy_sc_ops variable in arch/mips/mm/sc-ip22.c is needlessly defined
global, and this patch makes it static.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
This patch fixes the wrong use of GPIO register offsets
in devices.c. To avoid further problems, use gpio_get_value
to return the NAND status instead of our own expanded code.
Also define the zero offset of the alternate function register to allow
consistent access.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On linux-queue, malta doesn't build after the include file relocation.
This should fix it.
There some occurrences of 'asm-mips' in the comments of quite a few
files, but this is the only place I found it in any code.
Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Because "name" is static, it can be occasionally be filled with
somewhat garbage if two processes read /proc/net/snmp6.
Also, remove useless casts and "-1" -- snprintf() correctly terminates it's
output.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ip6mr.c, /proc entries /proc/net/ip6_mr_cache and /proc/net/ip6_mr_vif
are opened with seq_open_private(), thus seq_release_private() should be
used to release them.
Should fix a small memory leak.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nla_nest_start() might return NULL, causing a NULL pointer dereference.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add AX_MEDIUM_ENCK also when speed = 10/100Mbps. This allows my belkin
f5d5055 to work with my 100Mbps switch and with an old 10Mbps ISA card.
Without this patch, the card is recognized and the interface is brought
up fine, but no packets actually flow through the interface.
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
Acked-by: David Hollis <dhollis@davehollis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mv643xx_eth allocates skbuffs, it adds
'dma_get_cache_alignment() - 1' to the length it needs, so that it can
align the skb's ->data pointer to a cache boundary. When checking
whether a transmitted skbuff can be reused as a receive buffer, these
bytes needs to be included into the minimum bound for the recycle check.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update to change of mii_bus interface and fix some warning.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_sk_rebuild_header() does a new route lookup if the dst_entry
associated with a socket becomes stale. However inet_sk_rebuild_header()
didn't use struct flowi->flags, causing the route lookup to
fail for foreign-bound IP_TRANSPARENT sockets, causing an error
state to be set for the sockets in question.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
udp_sendmsg() didn't fill struct flowi->flags, which means that
the route lookup would fail for non-local IPs even if the
IP_TRANSPARENT sockopt was set.
This prevents sendto() to work properly for UDP sockets, whereas
bind(foreign-ip) + connect() + send() worked fine.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch which adds IDs for AKTAKOM USB->RS232 cable
(http://www.aktakom.ru/product/kio/ace-1001.htm) is attached.
From: M Kondrin <mkondrin@hppi.troitsk.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There's a bug in the usbmon binary reader: When using read() to fetch
the packets and a packet's data is partially read, the next read call
will once again return up to len_cap bytes of data. The b_read counter
is not regarded when determining the remaining chunk size.
So, when dumping USB data with "cat /dev/usbmon0 > usbmon.trace" while
reading from a USB storage device and analyzing the dump file
afterwards it will get out of sync after a couple of packets.
Signed-off-by: Ingo van Lil <inguin@gmx.de>
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It turns out that atomic_inc_return() returns the *new* value
not the original one, so the logic in rndis_response_available()
kept the first RNDIS response notification from getting out.
This prevented interoperation with MS-Windows (but not Linux).
Fix this to make RNDIS behave again.
Signed-off-by: Richard Röjfors <richard.rojfors@endian.se>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Somewhere in the conversion of the RNDIS gadget code to the new
framework, the descriptor of its data interface seems to have
been copied from the CDC Ethernet driver. Unfortunately that
means it got a nonzero altsetting ... which is incorrect. Issue
uncovered by Richard Röjfors <richard.rojfors@endian.se>.
This patch fixes that problem, and resolves at least some cases
of Windows XP bluescreening itself.
Tested-by: Richard Röjfors <richard.rojfors@endian.se>.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1168) updates the unusual_devs entry for the Nokia 5300.
According to Jorge Lucangeli Obes <t4m5yn@gmail.com>, some existing
models have a revision number lower than the lower limit of the
current entry.
The patch also moves the entry for the Nokia 5310 to its correct place
in the file.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1169) modifies the unusual_devs entry for the Nokia
6300. According to Maciej Gierok <mgierok@gmail.com> and David
McBride <dwm@doc.ic.ac.uk>, the revision limits need to be wider.
This fixes Bugzilla #11768.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This bug was introduced recently. Fix it before bigger
problems appear.
Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is required for AMD SB700 south bridge revision A12 and A13 to avoid
USB subsystem hang symptom. The USB subsystem hang symptom is observed when the
system has multiple USB devices connected to it. In some cases a USB hub may be
required to observe this symptom.
This patch works around the problem by correcting the internal register setting
that will help by changing the behavior of the internal logic to avoid the
USB subsystem hang issue. The change in the behavior of the logic does not
impact the normal operation of the USB subsystem.
Reported-by: Volker Armin Hemmann <volker.armin.hemmann@tu-clausthal.de>
Tested-by: Volker Armin Hemmann <volker.armin.hemmann@tu-clausthal.de>
Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Libin Yang <libin.yang@amd.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'x86/numa' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: make NUMA on 32-bit depend on EXPERIMENTAL again
x86, hibernate: fix breakage on x86_32 with CONFIG_NUMA set