Startup code for i386 in arch/x86/kernel/head_32.S is using the
reference variable initial_code that is located in the .cpuinit.data
section. If CONFIG_HOTPLUG_CPU is enabled, startup code is not in an
init section and can be called later too. In this case the reference
initial_code must be kept too. This patch fixes this. See below for
the section mismatch warning.
WARNING: vmlinux.o(.cpuinit.data+0x0): Section mismatch in reference
from the variable initial_code to the function
.init.text:i386_start_kernel()
The variable __cpuinitdata initial_code references
a function __init i386_start_kernel().
If i386_start_kernel is only used by initial_code then
annotate i386_start_kernel with a matching annotation.
Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1248716632-26844-1-git-send-email-robert.richter@amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: geode: Mark mfgpt irq IRQF_TIMER to prevent resume failure
x86, amd: Don't probe for extended APIC ID if APICs are disabled
x86, mce: Rename incorrect macro name "CONFIG_X86_THRESHOLD"
x86-64: Fix bad_srat() to clear all state
x86, mce: Fix set_trigger() accessor
x86: Fix movq immediate operand constraints in uaccess.h
x86: Fix movq immediate operand constraints in uaccess_64.h
x86: Add reboot fixup for SBC-fitPC2
x86: Include all of .data.* sections in _edata on 64-bit
x86: Add quirk for Intel DG45ID board to avoid low memory corruption
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
Upcoming paches to support the new 64-bit "BookE" powerpc architecture
will need to have the virtual address corresponding to PTE page when
freeing it, due to the way the HW table walker works.
Basically, the TLB can be loaded with "large" pages that cover the whole
virtual space (well, sort-of, half of it actually) represented by a PTE
page, and which contain an "indirect" bit indicating that this TLB entry
RPN points to an array of PTEs from which the TLB can then create direct
entries. Thus, in order to invalidate those when PTE pages are deleted,
we need the virtual address to pass to tlbilx or tlbivax instructions.
The old trick of sticking it somewhere in the PTE page struct page sucks
too much, the address is almost readily available in all call sites and
almost everybody implemets these as macros, so we may as well add the
argument everywhere. I added it to the pmd and pud variants for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Timer interrupts are excluded from being disabled during suspend. The
clock events code manages the disabling of clock events on its own
because the timer interrupt needs to be functional before the resume
code reenables the device interrupts.
The mfgpt timer request its interrupt without setting the IRQF_TIMER
flag so suspend_device_irqs() disables it as well which results in a
fatal resume failure.
Adding IRQF_TIMER to the interupt flags when requesting the mrgpt
timer interrupt solves the problem.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Cc: Andres Salomon <dilinger@debian.org>
Cc: stable@kernel.org
* 'perf-counters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-perf: (31 commits)
perf_counter tools: Give perf top inherit option
perf_counter tools: Fix vmlinux symbol generation breakage
perf_counter: Detect debugfs location
perf_counter: Add tracepoint support to perf list, perf stat
perf symbol: C++ demangling
perf: avoid structure size confusion by using a fixed size
perf_counter: Fix throttle/unthrottle event logging
perf_counter: Improve perf stat and perf record option parsing
perf_counter: PERF_SAMPLE_ID and inherited counters
perf_counter: Plug more stack leaks
perf: Fix stack data leak
perf_counter: Remove unused variables
perf_counter: Make call graph option consistent
perf_counter: Add perf record option to log addresses
perf_counter: Log vfork as a fork event
perf_counter: Synthesize VDSO mmap event
perf_counter: Make sure we dont leak kernel memory to userspace
perf_counter tools: Fix index boundary check
perf_counter: Fix the tracepoint channel to perfcounters
perf_counter, x86: Extend perf_counter Pentium M support
...
If we've logically disabled apics, don't probe the PCI space for the
AMD extended APIC ID.
[ Impact: prevent boot crash under Xen. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reported-by: Bastian Blank <bastian@waldi.eu.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
CONFIG_X86_THRESHOLD used in arch/x86/kernel/irqinit.c is always
undefined. Rename it to the correct name "CONFIG_X86_MCE_THRESHOLD".
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <4A667FD4.3010509@hitachi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Need to clear both nodes and nodes_add state for start/end.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <20090718065657.GA2898@basil.fritz.box>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
Fix the condition checking the result of strchr() (which previously
could result in an oops), and make the function return the number of
bytes actively used.
[ Impact: fix oops ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <4A5F04B7020000780000AB59@vpn.id2.novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The movq instruction, generated by __put_user_asm() when used for
64-bit data, takes a sign-extended immediate ("e") not a zero-extended
immediate ("Z").
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: stable@kernel.org
arch/x86/include/asm/uaccess_64.h uses wrong asm operand constraint
("ir") for movq insn. Since movq sign-extends its immediate operand,
"er" constraint should be used instead.
Attached patch changes all uses of __put_user_asm in uaccess_64.h to use
"er" when "q" insn suffix is involved.
Patch was compile tested on x86_64 with defconfig.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
The CompuLab SBC-fitPC2 board needs to reboot via BIOS.
Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The .data.read_mostly and .data.cacheline_aligned sections
aren't covered by the _sdata .. _edata range on x86-64. This
affects kmemleak reporting leading to possible false
positives by not scanning the whole data section.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Alexey Fisher <bug-track@fisher-privat.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1247565175.28240.37.camel@pc1117.cambridge.arm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Sam Ravnborg <sam@ravnborg.org>
AMI BIOS with low memory corruption was found on Intel DG45ID
board (Bug 13710). Add this board to the blacklist - in the
(somewhat optimistic) hope of future boards/BIOSes from Intel
not having this bug.
Also see:
http://bugzilla.kernel.org/show_bug.cgi?id=13736
Signed-off-by: Alexey Fisher <bug-track@fisher-privat.net>
Cc: ykzhao <yakui.zhao@intel.com>
Cc: alan@lxorguk.ukuu.org.uk
Cc: <stable@kernel.org>
LKML-Reference: <1247660169-4503-1-git-send-email-bug-track@fisher-privat.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Avoid the following:
[ 0.012093] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_dummy+0x2f/0x40()
Rather than chase each new cpuid-detected feature, just lie about the highest
valid CPUID so this code is never run.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
when building 32-bit, I see this ..
arch/x86/kernel/pvclock.c:63:7: warning: "__x86_64__" is not defined
Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <20090713201437.GA12165@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The variable apic_numaq placed in noninit section references the
function wakeup_secondary_cpu_via_nmi(), which is in __cpuinit
section. Thus causes a section mismatch warning. To avoid such
mismatch we mark apic_numaq as __refdata.
We were warned by the following warning:
WARNING: arch/x86/kernel/built-in.o(.data+0x932c): Section mismatch in
reference from the variable apic_numaq to the function
.cpuinit.text:wakeup_secondary_cpu_via_nmi()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120407p6b4f67dtf4d563155488188a@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The variable apic_es7000_cluster references the function __cpuinit
wakeup_secondary_cpu_via_mip() from a noninit section. So we've been
warned by the following warning. To avoid possible collision between
init/noninit, its best to mark the variable as __refdata.
We were warned by the following warning:
LD arch/x86/kernel/apic/built-in.o
WARNING: arch/x86/kernel/apic/built-in.o(.data+0x198c): Section
mismatch in reference from the variable apic_es7000_cluster to the
function .cpuinit.text:wakeup_secondary_cpu_via_mip()
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <b9df5fa10907120404k6279a10ch5e9682432272706f@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I've attached a patch to remove the Pentium M special casing of
EMON and as noticed at least with my Pentium M the hardware PMU
now works:
Performance counter stats for '/bin/ls /var/tmp':
1.809988 task-clock-msecs # 0.125 CPUs
1 context-switches # 0.001 M/sec
0 CPU-migrations # 0.000 M/sec
224 page-faults # 0.124 M/sec
1425648 cycles # 787.656 M/sec
912755 instructions # 0.640 IPC
Vince suggested that this code was trying to address erratum
Y17 in Pentium-M's:
http://download.intel.com/support/processors/mobile/pm/sb/25266532.pdf
But that erratum (related to IA32_MISC_ENABLES.7) does not
affect perfcounters as we dont use this toggle to disable RDPMC
and WRMSR/RDMSR access to performance counters. We keep cr4's
bit 8 (X86_CR4_PCE) clear so unprivileged RDPMC access is not
allowed anyway.
Cc: Vince Weaver <vince@deater.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since commit 5fd29d6c ("printk: clean up handling of log-levels
and newlines"), the kernel logs segfaults like:
<6>gnome-power-man[24509]: segfault at 20 ip 00007f9d4950465a sp 00007fffbb50fc70 error 4 in libgobject-2.0.so.0.2103.0[7f9d494f7000+45000]
with the extra "<6>" being KERN_INFO. This happens because the
printk in show_signal_msg() started with KERN_CONT and then
used "%s" to pass in the real level; and KERN_CONT is no longer
an empty string, and printk only pays attention to the level at
the very beginning of the format string.
Therefore, remove the KERN_CONT from this printk, since it is
now actively causing problems (and never really made any
sense).
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <874otjitkj.fsf@shaolin.home.digitalvampire.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
dma-debug: Fix the overlap() function to be correct and readable
oprofile: reset bt_lost_no_mapping with other stats
x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
signals: declare sys_rt_tgsigqueueinfo in syscalls.h
rcu: Mark Hierarchical RCU no longer experimental
dma-debug: Put all hash-chain locks into the same lock class
dma-debug: fix off-by-one error in overlap function
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
perf report: Add "Fractal" mode output - support callchains with relative overhead rate
perf_counter tools: callchains: Manage the cumul hits on the fly
perf report: Change default callchain parameters
perf report: Use a modifiable string for default callchain options
perf report: Warn on callchain output request from non-callchain file
x86: atomic64: Inline atomic64_read() again
x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative()
x86: atomic64: Improve atomic64_xchg()
x86: atomic64: Export APIs to modules
x86: atomic64: Improve atomic64_read()
x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP
x86: atomic64: Fix unclean type use in atomic64_xchg()
x86: atomic64: Make atomic_read() type-safe
x86: atomic64: Reduce size of functions
x86: atomic64: Improve atomic64_add_return()
x86: atomic64: Improve cmpxchg8b()
x86: atomic64: Improve atomic64_read()
x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file
x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too
perf report: Annotate variable initialization
...
Pull the initial preempt_count value into a single
definition site.
Maintainers for: alpha, ia64 and m68k, please have a look,
your arch code is funny.
The header magic is a bit odd, but similar to the KERNEL_DS
one, CPP waits with expanding these macros until the
INIT_THREAD_INFO macro itself is expanded, which is in
arch/*/kernel/init_task.c where we've already included
sched.h so we're good.
Cc: tony.luck@intel.com
Cc: rth@twiddle.net
Cc: geert@linux-m68k.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?)
Dann bisected it down to:
commit 30a18d6c3f
Date: Tue Feb 19 03:21:20 2008 -0800
x86: multi pci root bus with different io resource range, on
64-bit
It turns out that:
1. that AMD-based systems have two HT chains.
2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc
3. that multi-peer-root patch will try to split root resources to peer
root resources according to PCI conf of NB
4. PCI core assigns unassigned resources, but they overlap with BARs
that are used by ioapic addr of io4 and 8132.
The reason: at that point ioapic address are not inserted yet. Solution
is to insert ioapic resources into the tree a bit earlier.
Reported-by: Stephen Frost <sfrost@snowman.net>
Reported-and-Tested-by: dann frazier <dannf@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@jbarnes-g45.(none)>
Commit 1faa16d228 accidentally broke
the bdi congestion wait queue logic, causing us to wait on congestion
for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Ingo noticed that both AMD and P6 call
x86_pmu_disable_counter() on *_pmu_enable_counter(). This is
because we rely on the side effect of that call to program
the event config but not touch the EN bit.
We change that for AMD by having enable_all() simply write
the full config in, and for P6 by explicitly coding it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The P6 doesn't seem to support cache ref/hit/miss counts, so
we extend the generic hardware event codes to have 0 and -1
mean the same thing as for the generic cache events.
Furthermore, it turns out the 0 event does not count
(that is, its reported that on PPro it actually does count
something), therefore use a event configuration that's
specified not to count to disable the counters.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add basic P6 PMU support. The P6 uses the EVNTSEL0 EN bit to
enable/disable both its counters. We use this for the
global enable/disable, and clear all config bits (except EN)
to disable individual counters.
Actual ia32 hardware doesn't support lfence, so use a locked
op without side-effect to implement a full barrier.
perf stat and perf record seem to function correctly.
[a.p.zijlstra@chello.nl: cleanups and complete the enable/disable code]
Signed-off-by: Vince Weaver <vince@deater.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <Pine.LNX.4.64.0907081718450.2715@pianoman.cluster.toy>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
cxgb3: Fix crash caused by stashing wrong netdev_queue
ixgbe: Fix coexistence of FCoE and Flow Director in 82599
memory barrier: adding smp_mb__after_lock
net: adding memory barrier to the poll and receive callbacks
netpoll: Fix carrier detection for drivers that are using phylib
includecheck fix: include/linux, rfkill.h
p54: tx refused but queue active
Atheros Kconfig needs to be dependent on WLAN_80211
mac80211: fix docbook
mac80211_hwsim: avoid NULL access
ssb: Add support for 4318E
b43: Add support for 4318E
zd1211rw: adding SONY IFU-WLM2 (054c:0257) as a zd1211b device
zd1211rw: 07b8:6001 is a ZD1211B
r6040: bump driver version to 0.24 and date to 08 July 2009
r6040: restore MIER register correctly when IRQ line is shared
ipv4: Fix fib_trie rebalancing, part 4 (root thresholds)
davinci_emac: fix kernel oops when changing MAC address while interface is down
igb: set lan id prior to configuring phy
mac80211: minstrel: avoid accessing negative indices in rix_to_ndx()
...
The short name of the achitecture is 'arch_perfmon'. This patch
changes the kernel parameter to use this name.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adding smp_mb__after_lock define to be used as a smp_mb call after
a lock.
Making it nop for x86, since {read|write|spin}_lock() on x86 are
full memory barriers.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex found that specjbb2005 still can not run with hugepages on an
x86-64 machine. This only happens when numa is not compiled in.
The root cause: node_set_state will not set it back for us in that case,
so don't clear that when numa is not select in config
[ v2: use node_clear_state instead ]
Reported-and-Tested-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics. printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.
<level> is now included in the output on each additional use.
Remove all uses of multiple KERN_<level>s in formats.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Powernow-k8: support family 0xf with 2 low p-states
[CPUFREQ] fix (utter) cpufreq_add_dev mess
[CPUFREQ] Cleanup locking in conservative governor
[CPUFREQ] Cleanup locking in ondemand governor
[CPUFREQ] Mark policy_rwsem as going static in cpufreq.c wont be exported
[CPUFREQ] Eliminate the recent lockdep warnings in cpufreq
Patch 08687aec71bc9134fe336e561f6did877bacf74fc0a (x86: unify
power/cpu_(32|64).c) renamed cpu_32.c to cpu.c, but did not update
the special compilation flags for the file for the new name.
This patch fixes the compilation flags, and therefore fixes resume
from suspend on my Acer Aspire One.
[rjw: The regression from 2.6.30 fixed by this patch is tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=13661]
Signed-off-by: Peter Chubb <peterc@nicta.com.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Provide support for family 0xf processors with 2 P-states
below the elevator voltage. Remove the checks that prevent
this configuration from being supported and increase the
transition voltage to prevent errors during the transition.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix usage of bios intcall()
x86: Remove unused function lapic_watchdog_ok()
x86: Remove unused variable disable_x2apic
x86, kvm: Fix section mismatches in kvm.c
x86: Add missing annotation to arch/x86/lib/copy_user_64.S::copy_to_user
x86: Fix fixmap page order for FIX_TEXT_POKE0,1
amd-iommu: set evt_buf_size correctly
amd-iommu: handle alias entries correctly in init code
x86: Fix printk call in print_local_apic()
x86: Declare check_efer() before it gets used
x86: Mark device_nb as static and fix NULL noise
x86: Remove double declaration of MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1
xen: Use kcalloc() in xen_init_IRQ()
x86: Fix fixmap ordering
x86: Fix symbol annotation for arch/x86/lib/clear_page_64.S::clear_page_c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Fix IRQ swizzling for ARI-enabled devices
ia64/PCI: adjust section annotation for pcibios_setup()
x86/PCI: get root CRS before scanning children
x86/PCI: fix boundary checking when using root CRS
PCI MSI: Fix restoration of MSI/MSI-X mask states in suspend/resume
PCI MSI: Unmask MSI if setup failed
PCI MSI: shorten PCI_MSIX_ENTRY_* symbol names
PCI: make pci_name() take const argument
PCI: More PATA quirks for not entering D3
PCI: fix kernel-doc warnings
PCI: check if bus has a proper bridge device before triggering SBR
PCI: remove pci_dac_dma_... APIs on mn10300
PCI ECRC: Remove unnecessary semicolons
PCI MSI: Return if alloc_msi_entry for MSI-X failed
* git://git.infradead.org/iommu-2.6:
intel-iommu: Don't use identity mapping for PCI devices behind bridges
intel-iommu: Use iommu_should_identity_map() at startup time too.
intel-iommu: No mapping for non-PCI devices
intel-iommu: Restore DMAR_BROKEN_GFX_WA option for broken graphics drivers
intel-iommu: Add iommu_should_identity_map() function
intel-iommu: Fix reattaching of devices to identity mapping domain
intel-iommu: Don't set identity mapping for bypassed graphics devices
intel-iommu: Fix dma vs. mm page confusion with aligned_nrpages()
Some intcall() misuses the input biosregs as output in
cf06de7b9c
This fixes the problem vga=ask boot option doesn't show enough modes.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
LKML-Reference: <20090701021307.GA3127@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
We need to give people a little more time to fix the broken drivers.
Re-introduce this, but tied in properly with the 'iommu=pt' support this
time. Change the config option name and make it default to 'no' too.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Now atomic64_read() is light weight (no register pressure and
small icache), we can inline it again.
Also use "=&A" constraint instead of "+A" to avoid warning
about unitialized 'res' variable. (gcc had to force 0 in eax/edx)
$ size vmlinux.prev vmlinux.after
text data bss dec hex filename
4908667 451676 1684868 7045211 6b805b vmlinux.prev
4908651 451676 1684868 7045195 6b804b vmlinux.after
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <4A4E1AA2.30002@gmail.com>
[ Also fix typo in atomic64_set() export ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that the variable name 'old_val' is
confusingly named in these functions - the correct
naming is 'new_val'.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907030942260.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the read-first logic from atomic64_xchg() and simplify
the loop.
This function was the last user of __atomic64_read() - remove it.
Also, change the 'real_val' assumption from the somewhat quirky
1ULL << 32 value to the (just as arbitrary, but simpler) value
of 0.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
atomic64_t primitives are used by a handful of drivers,
so export the APIs consistently. These were inlined
before.
Also mark atomic64_32.o a core object, so that the symbols
are available even if not linked to core kernel pieces.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Occasionally we get bugs where atomic_read or atomic_set are
used on atomic64_t variables or vice versa. These bugs don't
generate warnings on x86 because atomic_read and atomic_set are
coded as macros rather than C functions, so we don't get any
type-checking on their arguments; similarly for atomic64_read
and atomic64_set in 64-bit kernels.
This converts them to C functions so that the arguments are
type-checked and bugs like this will get caught more easily. It
also converts atomic_cmpxchg and atomic_xchg, and
atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get
type-checking on their arguments too.
Compiling a typical 64-bit x86 config, this generates no new
warnings, and the vmlinux text is 86 bytes smaller.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
lapic_watchdog_ok() is a global function but no one is using it.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
setup_nox2apic() is writing 1 to disable_x2apic but no one is reading it.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1246554239.2242.27.camel@jaswinder.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The function paravirt_ops_setup() has been refering the
variable no_timer_check, which is a __initdata. Thus generates
the following warning. paravirt_ops_setup() function is called
from kvm_guest_init() which is a __init function. So to fix
this we mark paravirt_ops_setup as __init.
The sections-check output that warned us about this was:
LD arch/x86/built-in.o
WARNING: arch/x86/built-in.o(.text+0x166ce): Section mismatch in
reference from the function paravirt_ops_setup() to the variable
.init.data:no_timer_check
The function paravirt_ops_setup() references
the variable __initdata no_timer_check.
This is often because paravirt_ops_setup lacks a __initdata
annotation or the annotation of no_timer_check is wrong.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Acked-by: Avi Kivity <avi@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <b9df5fa10907012240y356427b8ta4bd07f0efc6a049@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While examining symbol generation in perf_counter tools, I
noticed that copy_to_user() had no size in vmlinux's symtab.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <1246512440.13293.3.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Masami reported:
> Since the fixmap pages are assigned higher address to lower,
> text_poke() has to use it with inverted order (FIX_TEXT_POKE1
> to FIX_TEXT_POKE0).
I prefer to just invert the order of the fixmap declaration.
It's simpler and more straightforward.
Backward fixmaps seems to be used by both x86 32 and 64.
It's really rare but a nasty bug, because it only hurts when
instructions to patch are crossing a page boundary. If this
happens, the fixmap write accesses will spill on the following
fixmap, which may very well crash the system. And this does not
crash the system, it could leave illegal instructions in place.
Thanks Masami for finding this.
It seems to have crept into the 2.6.30-rc series, so this calls
for a -stable inclusion.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090701213722.GH19926@Krystal>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that atomic64_xchg() uses atomic_read(), which
happens to work because atomic_read() is a macro so the
.counter value gets u64-read on 32-bit too - but this is really
bogus and serious bugs are waiting to happen.
Fix atomic64_xchg() to use __atomic64_read() instead.
No code changed:
arch/x86/lib/atomic64_32.o:
text data bss dec hex filename
435 0 0 435 1b3 atomic64_32.o.before
435 0 0 435 1b3 atomic64_32.o.after
md5:
bd8ab95e69c93518578bfaf0ea3be4d9 atomic64_32.o.before.asm
bd8ab95e69c93518578bfaf0ea3be4d9 atomic64_32.o.after.asm
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that atomic64_xchg() uses atomic_read(), which
happens to work because atomic_read() is a macro so the
.counter value gets u64-read on 32-bit too - but this is really
bogus and serious bugs are waiting to happen.
Change atomic_read() to be a type-safe inline, and this exposes
the atomic64 bogosity as well:
arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’:
arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
cmpxchg8b is a huge instruction in terms of register footprint,
we almost never want to inline it, not even within the same
code module.
GCC 4.3 still messes up for two functions, under-judging the
true cost of this instruction - so annotate two key functions
to reduce the bloat:
arch/x86/lib/atomic64_32.o:
text data bss dec hex filename
1763 0 0 1763 6e3 atomic64_32.o.before
435 0 0 435 1b3 atomic64_32.o.after
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noted (based on Eric Dumazet's numbers) that we would
probably be better off not trying an atomic_read() in
atomic64_add_return() but intead intentionally let the first
cmpxchg8b fail - to get a cache-friendly 'give me ownership
of this cacheline' transaction. That can then be followed
by the real cmpxchg8b which sets the value local to the CPU.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rewrite cmpxchg8b() to not use %edi register but a generic "+m"
constraint, to increase compiler freedom in code generation and
possibly better code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that the 32-bit version of atomic64_read() was
being overly complex with re-reading the value and doing a
retry loop over that.
Instead we can just rely on cmpxchg8b returning either the new
value or returning the current value.
We can use any 'old' value, which will be faster as it can be
loaded via immediates. Using some value that is not equal to
the real value in memory the instruction gets faster.
This also has the advantage that the CPU could avoid dirtying
the cacheline.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noted that the atomic64_t primitives are all inlines
currently which is crazy because these functions have a large
register footprint anyway.
Move them to a separate file: arch/x86/lib/atomic64_32.c
Also, while at it, rename all uses of 'unsigned long long' to
the much shorter u64.
This makes the appearance of the prototypes a lot nicer - and
it also uncovered a few bugs where (yet unused) API variants
had 'long' as their return type instead of u64.
[ More intrusive changes are not yet done in this patch. ]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Locked instructions on two cache lines at once are painful. If
atomic64_t uses two cache lines, my test program is 10x slower.
The chance for that is significant: 4/32 or 12.5%.
Make sure an atomic64_t is 8 bytes aligned.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
[ changed it to __aligned(8) as per Andrew's suggestion ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.infradead.org/iommu-2.6: (38 commits)
intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()
intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loops
intel-iommu: Use cmpxchg64_local() for setting PTEs
intel-iommu: Warn about unmatched unmap requests
intel-iommu: Kill superfluous mapping_lock
intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386
intel-iommu: Make iommu=pt work on i386 too
intel-iommu: Performance improvement for dma_pte_free_pagetable()
intel-iommu: Don't free too much in dma_pte_free_pagetable()
intel-iommu: dump mappings but don't die on pte already set
intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()
intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()
intel-iommu: Simplify __intel_alloc_iova()
intel-iommu: Performance improvement for domain_pfn_mapping()
intel-iommu: Performance improvement for dma_pte_clear_range()
intel-iommu: Clean up iommu_domain_identity_map()
intel-iommu: Remove last use of PHYSICAL_PAGE_MASK, for reserving PCI BARs
intel-iommu: Make iommu_flush_iotlb_psi() take pfn as argument
intel-iommu: Change aligned_size() to aligned_nrpages()
intel-iommu: Clean up intel_map_sg(), remove domain_page_mapping()
...
fix hang with HIGHMEM_64G and 32bit resource. According to hpa and
Linus, use (resource_size_t)-1 to fend off big ranges.
Analyzed by hpa
Reported-and-tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These macros had two bugs:
- the type of the mask was not correctly expanded to the full size of
the argument being expanded, resulting in possible loss of high bits
when mixing types.
- the alignment argument was evaluated twice, despite the macro looking
like a fancy function (but it really does need to be a macro, since
it works on arbitrary integer types)
Noticed by Peter Anvin, and with a fix that is a modification of his
suggestion (bug noticed by Yinghai Lu).
Cc: Peter Anvin <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The setting of this variable got lost during the suspend/resume
implementation. But keeping this variable zero causes a divide-by-zero
error in the interrupt handler. This patch fixes this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
An alias entry in the ACPI table means that the device can send requests to the
IOMMU with both device ids, its own and the alias. This is not handled properly
in the ACPI init code. This patch fixes the issue.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
About every callchains recorded with perf record are filled up
including the internal perfcounter nmi frame:
perf_callchain
perf_counter_overflow
intel_pmu_handle_irq
perf_counter_nmi_handler
notifier_call_chain
atomic_notifier_call_chain
notify_die
do_nmi
nmi
We want ignore this frame as it's not interesting for
instrumentation. To solve this, we simply ignore every frames
from nmi context.
New example of "perf report -s sym -c" after this patch:
9.59% [k] search_by_key
4.88%
search_by_key
reiserfs_read_locked_inode
reiserfs_iget
reiserfs_lookup
do_lookup
__link_path_walk
path_walk
do_path_lookup
user_path_at
vfs_fstatat
vfs_lstat
sys_newlstat
system_call_fastpath
__lxstat
0x406fb1
3.19%
search_by_key
search_by_entry_key
reiserfs_find_entry
reiserfs_lookup
do_lookup
__link_path_walk
path_walk
do_path_lookup
user_path_at
vfs_fstatat
vfs_lstat
sys_newlstat
system_call_fastpath
__lxstat
0x406fb1
[...]
For now this patch only solves the problem in x86-64.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We can run a 32-bit kernel on boxes with an IOMMU, so we need
pci_unmap_addr() etc. to work -- without it, drivers will leak mappings.
To be honest, this whole thing looks like it's more pain than it's
worth; I'm half inclined to remove the no-op #else case altogether.
But this is the minimal fix, which just does the right thing if
CONFIG_DMAR is set.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org [ for 2.6.30 ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This sparse warning:
arch/x86/mm/init.c:83:16: warning: symbol 'check_efer' was not declared. Should it be static?
triggers because check_efer() is not decalared before using it.
asm/proto.h includes the declaration of check_efer(), so
including asm/proto.h to fix that - this also addresses the
sparse warning.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1246458263.6940.22.camel@hpdv5.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This sparse warning:
arch/x86/kernel/amd_iommu.c:1195:23: warning: symbol 'device_nb' was not declared. Should it be static?
triggers because device_nb is global but is only used in a
single .c file. change device_nb to static to fix that - this
also addresses the sparse warning.
This sparse warning:
arch/x86/kernel/amd_iommu.c:1766:10: warning: Using plain integer as NULL pointer
triggers because plain integer 0 is used in place of a NULL
pointer. change 0 to NULL to fix that - this also address the
sparse warning.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <1246458194.6940.20.camel@hpdv5.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (47 commits)
perf report: Add --symbols parameter
perf report: Add --comms parameter
perf report: Add --dsos parameter
perf_counter tools: Adjust only prelinked symbol's addresses
perf_counter: Provide a way to enable counters on exec
perf_counter tools: Reduce perf stat measurement overhead/skew
perf stat: Use percentages for scaling output
perf_counter, x86: Update x86_pmu after WARN()
perf stat: Micro-optimize the code: memcpy is only required if no event is selected and !null_run
perf stat: Improve output
perf stat: Fix multi-run stats
perf stat: Add -n/--null option to run without counters
perf_counter tools: Remove dead code
perf_counter: Complete counter swap
perf report: Print sorted callchains per histogram entries
perf_counter tools: Prepare a small callchain framework
perf record: Fix unhandled io return value
perf_counter tools: Add alias for 'l1d' and 'l1i'
perf-report: Add bare minimum PERF_EVENT_READ parsing
perf-report: Add modes for inherited stats and no-samples
...
Nathan reported that
| commit 73d60b7f74
| Author: Yinghai Lu <yinghai@kernel.org>
| Date: Tue Jun 16 15:33:00 2009 -0700
|
| page-allocator: clear N_HIGH_MEMORY map before we set it again
|
| SRAT tables may contains nodes of very small size. The arch code may
| decide to not activate such a node. However, currently the early boot
| code sets N_HIGH_MEMORY for such nodes. These nodes therefore seem to be
| active although these nodes have no present pages.
|
| For 64bit N_HIGH_MEMORY == N_NORMAL_MEMORY, so that works for 64 bit too
unintentionally and incorrectly clears the cpuset.mems cgroup attribute on
an i386 kvm guest, meaning that cpuset.mems can not be used.
Fix this by only clearing node_states[N_NORMAL_MEMORY] for 64bit only.
and need to do save/restore for that in find_zone_movable_pfn
Reported-by: Nathan Lynch <ntl@pobox.com>
Tested-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The merge of the 32- and 64-bit fixmap headers made a latent
bug on x86-64 a real one: with the right config settings
it is possible for FIX_OHCI1394_BASE to overlap the FIX_BTMAP_*
range.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: <stable@kernel.org> # for 2.6.30.x
LKML-Reference: <4A4A0A8702000078000082E8@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Noticed the zero-sized function symbol while looking at 'perf' profiles,
it causes the profiler to display those addresses in hexa.
Turns out that this was wrong/bogus for an eternity.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1246366820.6538.1.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows us to remove adjust_transparent_bridge_resources and give
x86_pci_root_bus_res_quirks a chance when _CRS is not used or not there.
Acked-by: Gary Hade <garyhade@us.ibm.com>
Tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Don't touch info->res_num if we are out of space.
Acked-by: Gary Hade <garyhade@us.ibm.com>
Tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "x86: cap iomem_resource to addressable physical memory"
There's no need for the GFX workaround now we have 'iommu=pt' for the
cases where people really care about performance. There's no need to
have a special case for just one type of device.
This also speeds up the iommu=pt path and reduces memory usage by
setting up the si_domain _once_ and then using it for all devices,
rather than giving each device its own private page tables.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The print out should read the value before changing the value.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A487017.4090007@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: shut up uninit compiler warning in paging_tmpl.h
KVM: Ignore reads to K7 EVNTSEL MSRs
KVM: VMX: Handle vmx instruction vmexits
KVM: s390: Allow stfle instruction in the guest
KVM: kvm/x86_emulate.c toggle_interruptibility() should be static
KVM: ia64: fix ia64 build due to missing kallsyms_lookup() and double export
KVM: protect concurrent make_all_cpus_request
KVM: MMU: Allow 4K ptes with bit 7 (PAT) set
KVM: Fix dirty bit tracking for slots with large pages
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, delay: tsc based udelay should have rdtsc_barrier
x86, setup: correct include file in <asm/boot.h>
x86, setup: Fix typo "CONFIG_x86_64" in <asm/boot.h>
x86, mce: percpu mcheck_timer should be pinned
x86: Add sysctl to allow panic on IOCK NMI error
x86: Fix uv bau sending buffer initialization
x86, mce: Fix mce resume on 32bit
x86: Move init_gbpages() to setup_arch()
x86: ensure percpu lpage doesn't consume too much vmalloc space
x86: implement percpu_alloc kernel parameter
x86: fix pageattr handling for lpage percpu allocator and re-enable it
x86: reorganize cpa_process_alias()
x86: prepare setup_pcpu_lpage() for pageattr fix
x86: rename remap percpu first chunk allocator to lpage
x86: fix duplicate free in setup_pcpu_remap() failure path
percpu: fix too lazy vunmap cache flushing
x86: Set cpu_llc_id on AMD CPUs
Dixes compilation warning:
CC arch/x86/kernel/io_delay.o
arch/x86/kvm/paging_tmpl.h: In function ‘paging64_fetch’:
arch/x86/kvm/paging_tmpl.h:279: warning: ‘sptep’ may be used uninitialized in this function
arch/x86/kvm/paging_tmpl.h: In function ‘paging32_fetch’:
arch/x86/kvm/paging_tmpl.h:279: warning: ‘sptep’ may be used uninitialized in this function
warning is bogus (always have a least one level), but need to shut the compiler
up.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
In commit 7fe29e0faa we ignored the
reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines.
Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD
hosts.
This fixes Kaspersky antivirus crashing Windows guests on AMD hosts.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
IF a guest tries to use vmx instructions, inject a #UD to let it know the
instruction is not implemented, rather than crashing.
This prevents guest userspace from crashing the guest kernel.
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
toggle_interruptibility() is used only by same file, it should be static.
Fixed following sparse warning :
arch/x86/kvm/x86_emulate.c:1364:6: warning: symbol 'toggle_interruptibility' was not declared. Should it be static?
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This reverts commit 95ee14e437.
Mikael Petterson <mikepe@it.uu.se> reported that at least one of his
systems will not boot as a result. We have ruled out the detection
algorithm malfunctioning, so it is not a matter of producing the
incorrect bitmasks; rather, something in the application of them
fails.
Revert the commit until we can root cause and correct this problem.
-stable team: this means the underlying commit should be rejected.
Reported-and-isolated-by: Mikael Petterson <mikpe@it.uu.se>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <200906261559.n5QFxJH8027336@pilspetsen.it.uu.se>
Cc: stable@kernel.org
Cc: Grant Grundler <grundler@parisc-linux.org>
<asm/boot.h> needs <asm/pgtable_types.h>, not <asm/page_types.h> in
order to resolve PMD_SHIFT. Also, correct a +1 which really should be
+ THREAD_ORDER.
This is a build error which was masked by a typoed #ifdef.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
CONFIG_X86_64 was misspelled (wrong case), which caused the x86-64
kernel to advertise itself as more relocatable than it really is.
This could in theory cause boot failures once bootloaders start
support the new relocation fields.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
If CONFIG_NO_HZ + CONFIG_SMP, timer added via add_timer() might
be migrated on other cpu. Use add_timer_on() instead.
Avoids the following failure:
Maciej Rutecki wrote:
> > After normal boot I try:
> >
> > echo 1 > /sys/devices/system/machinecheck/machinecheck0/check_interval
> >
> > I found this in dmesg:
> >
> > [ 141.704025] ------------[ cut here ]------------
> > [ 141.704039] WARNING: at arch/x86/kernel/cpu/mcheck/mce.c:1102
> > mcheck_timer+0xf5/0x100()
Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch introduces a new sysctl:
/proc/sys/kernel/panic_on_io_nmi
which defaults to 0 (off).
When enabled, the kernel panics when the kernel receives an NMI
caused by an IO error.
The IO error triggered NMI indicates a serious system
condition, which could result in IO data corruption. Rather
than contiuing, panicing and dumping might be a better choice,
so one can figure out what's causing the IO error.
This could be especially important to companies running IO
intensive applications where corruption must be avoided, e.g. a
bank's databases.
[ SuSE has been shipping it for a while, it was done at the
request of a large database vendor, for their users. ]
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: Roberto Angelino <robertangelino@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <20090624213211.GA11291@kroah.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>