| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adjustments to Xen's persistent clock via update_persistent_clock()
don't actually persist, as the Xen wallclock is a software only clock
and modifications to it do not modify the underlying CMOS RTC.
The x86_platform.set_wallclock hook is there to keep the hardware RTC
synchronized. On a guest this is pointless.
On Dom0 we can use the native implementaion which actually updates the
hardware RTC, but we still need to keep the software emulation of RTC
for the guests up to date. The subscription to the pvclock_notifier
allows us to emulate this easily. The notifier is called at every tick
and when the clock was set.
Right now we only use that notifier when the clock was set, but due to
the fact that it is called periodically from the timekeeping update
code, we can utilize it to emulate the NTP driven drift compensation
of update_persistant_clock() for the Xen wall (software) clock.
Add a 11 minutes periodic update to the pvclock_gtod notifier callback
to achieve that. The static variable 'next' which maintains that 11
minutes update cycle is protected by the core code serialization so
there is no need to add a Xen specific serialization mechanism.
[ tglx: Massaged changelog and added a few comments ]
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/1372329348-20841-6-git-send-email-david.vrabel@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the Xen wallclock is only updated every 11 minutes if NTP is
synchronized to its clock source (using the sync_cmos_clock() work).
If a guest is started before NTP is synchronized it may see an
incorrect wallclock time.
Use the pvclock_gtod notifier chain to receive a notification when the
system time has changed and update the wallclock to match.
This chain is called on every timer tick and we want to avoid an extra
(expensive) hypercall on every tick. Because dom0 has historically
never provided a very accurate wallclock and guests do not expect one,
we can do this simply: the wallclock is only updated if the clock was
set.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/1372329348-20841-5-git-send-email-david.vrabel@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some new users of the ARM sched_clock framework are going through
the arm-soc tree. Before 38ff87f (sched_clock: Make ARM's
sched_clock generic for all architectures, 2013-06-01) the header
file was in asm, but now it's in linux. One solution would be to
do an evil merge of the arm-soc tree and fix up the asm users,
but it's easier to add a temporary asm header that we can remove
along with the few stragglers after the merge window is over.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 55a68c23e0a675b2b8ac2656fd6edbf98b78e4c6.
In order to avoid a collision with dw_apb_timer changes in
the arm-soc tree, revert this change.
I'm leaving it to the arm-soc folks to sort out if they want
to keep the other side of the collision or if they're just going
to back it all out and try again during the next release cycle.
Reported-by: Dinh Nguyen <dinguyen@altera.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
| |
Nothing about the sched_clock implementation in the ARM port is
specific to the architecture. Generalize the code so that other
architectures can use it by selecting GENERIC_SCHED_CLOCK.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
[jstultz: Merge minor collisions with other patches in my tree]
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're suspended and sched_clock() is called we're going to
read the hardware one more time and throw away that value and
return back the cached value we saved during the suspend
callback. This is wasteful. Let's short circuit all that and
return the cached value as early as possible if we're suspended.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
| |
The needs_suspend member is unused now that we always do the
suspend/resume handling (see 6a4dae5 (ARM: 7565/1: sched: stop
sched_clock() during suspend, 2012-10-23)).
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch "x86: Increase precision of x86_platform.get/set_wallclock"
changed the x86 platform set_wallclock/get_wallclock interfaces to
use nsec granular timespecs instead of a second granular interface.
However, that patch missed converting the vrtc code, so this patch
converts those functions to use timespecs.
Many thanks to the kbuild test robot for finding this!
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All the virtualized platforms (KVM, lguest and Xen) have persistent
wallclocks that have more than one second of precision.
read_persistent_wallclock() and update_persistent_wallclock() allow
for nanosecond precision but their implementation on x86 with
x86_platform.get/set_wallclock() only allows for one second precision.
This means guests may see a wallclock time that is off by up to 1
second.
Make set_wallclock() and get_wallclock() take a struct timespec
parameter (which allows for nanosecond precision) so KVM and Xen
guests may start with a more accurate wallclock time and a Xen dom0
can maintain a more accurate wallclock for guests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems we made a mistake when creating dw_apb_timer_of.c:
picoxcell sched_clock had parts that were not related to
dw_apb_timer, yet we moved them to dw_apb_timer_of, and tried to
use them on socfpga.
This results in system where user/system time is not measured
properly, as demonstrated by
time dd if=/dev/urandom of=/dev/zero bs=100000 count=100
So this patch switches sched_clock to hardware that exists on both
platforms, and adds missing of_node_put() in dw_apb_timer_init().
Signed-off-by: Pavel Machek <pavel@denx.de>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Fix for a CPU hot-add deadlock in microcode update code
- Fix for idle consolidation fallout
- Documentation update for initial kernel direct mapping
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Add missing comments for initial kernel direct mapping
x86/microcode: Add local mutex to fix physical CPU hot-add deadlock
x86: Fix idle consolidation fallout
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Two sets of comments were lost during patch-series shuffling:
- comments for init_range_memory_mapping()
- comments in init_mem_mapping that is helpful for reminding people
that the pagetable is setup top-down
The comments were written by Yinghai in his patch in:
https://lkml.org/lkml/2012/11/28/620
This patch reintroduces them.
Originally-From: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/518BC776.7010506@gmail.com
[ Tidied it all up a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This can easily be triggered if a new CPU is added (via
ACPI hotplug mechanism) and from user-space you do:
echo 1 > /sys/devices/system/cpu/cpu3/online
(or wait for UDEV to do it) on a newly appeared physical CPU.
The deadlock is that the "store_online" in drivers/base/cpu.c
takes the cpu_hotplug_driver_lock() lock, then calls "cpu_up".
"cpu_up" eventually ends up calling "save_mc_for_early"
which also takes the cpu_hotplug_driver_lock() lock.
And here is that lockdep thinks of it:
smpboot: Stack at about ffff880075c39f44
smpboot: CPU3: has booted.
microcode: CPU3 sig=0x206a7, pf=0x2, revision=0x25
=============================================
[ INFO: possible recursive locking detected ]
3.9.0upstream-10129-g167af0e #1 Not tainted
---------------------------------------------
sh/2487 is trying to acquire lock:
(x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
but task is already holding lock:
(x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(x86_cpu_hotplug_driver_mutex);
lock(x86_cpu_hotplug_driver_mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
6 locks held by sh/2487:
#0: (sb_writers#5){.+.+.+}, at: [<ffffffff811ca48d>] vfs_write+0x17d/0x190
#1: (&buffer->mutex){+.+.+.}, at: [<ffffffff812464ef>] sysfs_write_file+0x3f/0x160
#2: (s_active#20){.+.+.+}, at: [<ffffffff81246578>] sysfs_write_file+0xc8/0x160
#3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
#4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810961c2>] cpu_maps_update_begin+0x12/0x20
#5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810962a7>] cpu_hotplug_begin+0x27/0x60
Suggested-and-Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: fenghua.yu@intel.com
Cc: xen-devel@lists.xensource.com
Cc: stable@vger.kernel.org # for v3.9
Link: http://lkml.kernel.org/r/1368029583-23337-1-git-send-email-konrad.wilk@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The core code expects the arch idle code to return with interrupts
enabled. The conversion missed two x86 cases which fail to do that.
Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305021557030.3972@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
- Cure for not using zalloc in the first place, which leads to random
crashes with CPUMASK_OFF_STACK.
- Revert a user space visible change which broke udev
- Add a missing cpu_online early return introduced by the new full
dyntick conversions
- Plug a long standing race in the timer wheel cpu hotplug code.
Sigh...
- Cleanup NOHZ per cpu data on cpu down to prevent stale data on cpu
up.
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons
timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
tick: Cleanup NOHZ per cpu data on cpu down
tick: Use zalloc_cpumask_var for allocating offstack cpumasks
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
which enables some minor compile time optimization to avoid
uncessary code in mostly the suspend/resume path could cause
problems for userland.
In particular, the dependency for RTC_HCTOSYS on
!ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
twice and simplifies suspend/resume, has the side effect
of causing the /sys/class/rtc/rtcN/hctosys flag to always be
zero, and this flag is commonly used by udev to setup the
/dev/rtc symlink to /dev/rtcN, which can cause pain for
older applications.
While the udev rules could use some work to be less fragile,
breaking userland should strongly be avoided. Additionally
the compile time optimizations are fairly minor, and the code
being optimized is likely to be reworked in the future, so
lets revert this change.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: stable <stable@vger.kernel.org> #3.9
Cc: Feng Tang <feng.tang@intel.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
- Two fixlets for the fallout of the generic idle task conversion
- Documentation update
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit
idle: Fix hlt/nohlt command-line handling in new generic idle
kthread: Document ways of reducing OS jitter due to per-CPU kthreads
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
commit d1669912 (idle: Implement generic idle function) added a new
generic idle along with support for hlt/nohlt command line options to
override default idle loop behavior. However, the command-line
processing is never compiled.
The command-line handling is wrapped by CONFIG_GENERIC_IDLE_POLL_SETUP
and arches that use this feature select it in their Kconfigs.
However, no Kconfig definition was created for this option, so it is
never enabled, and therefore command-line override of the idle-loop
behavior is broken after migrating to the generic idle loop.
To fix, add a Kconfig definition for GENERIC_IDLE_POLL_SETUP.
Tested on ARM (OMAP4/Panda) which enables the command-line overrides
by default.
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/1366849153-25564-1-git-send-email-khilman@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull ARM fixes from Russell King:
"A small number of fixes for stuff from the last merge window, and in
one case (IRQ time accounting) the previous merge window."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value
ARM: 7715/1: MCPM: adapt to GIC changes after upstream merge
ARM: 7714/1: mmc: mmci: Ensure return value of regulator_enable() is checked
ARM: 7712/1: Remove trailing whitespace in arch/arm/Makefile
ARM: 7711/1: dove: fix Dove cpu type from V7 to PJ4
ARM: finally enable IRQ time accounting config
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
old/new value
The implementation of cmpxchg64() for the ARM v6 and v7 architecture
casts parameter 2 and 3 (the old and new 64bit values) to an unsigned
long before calling the atomic_cmpxchg64() function. This clears
the top 32 bits of the old and new values, resulting in the wrong
values being compare-exchanged. Luckily, this only appears to be used
for 64-bit sched_clock, which we don't (yet) have on ARM.
This bug was introduced by commit 3e0f5a15f500 ("ARM: 7404/1: cmpxchg64:
use atomic64 and local64 routines for cmpxchg64").
Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Since commit c0114709ed85 ("irqchip: gic: Perform the gic_secondary_init()
call via CPU notifier") it is no longer required nor possible to call
gic_secondary_init() from platform code.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Clean up some trailing whitespace issues in arch/arm/Makefile.
Signed-off-by: Daniel Tang <dt.tangr@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The CPU used in Marvell Dove SoCs is a PJ4 Sheeva core. Using
CONFIG_CPU_PJ4 instead of CONFIG_CPU_V7 will enable iWMMXt
extensions on Dove.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We've had IRQ time accounting for the last six months, except for the
Kconfig symbol. This somehow got missed out of the original patch.
Add this now.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"This is mostly bug fixes (some of them regressions, some of them I
deemed worth merging now) along with some patches from Li Zhong
hooking up the new context tracking stuff (for the new full NO_HZ)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
powerpc: Set show_unhandled_signals to 1 by default
powerpc/perf: Fix setting of "to" addresses for BHRB
powerpc/pmu: Fix order of interpreting BHRB target entries
powerpc/perf: Move BHRB code into CONFIG_PPC64 region
powerpc: select HAVE_CONTEXT_TRACKING for pSeries
powerpc: Use the new schedule_user API on userspace preemption
powerpc: Exit user context on notify resume
powerpc: Exception hooks for context tracking subsystem
powerpc: Syscall hooks for context tracking subsystem
powerpc/booke64: Fix kernel hangs at kernel_dbg_exc
powerpc: Fix irq_set_affinity() return values
powerpc: Provide __bswapdi2
powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3
powerpc/powernv: Detect OPAL v3 API version
powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again
powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS
powerpc: Bring all threads online prior to migration/hibernation
powerpc/rtas_flash: Fix validate_flash buffer overflow issue
powerpc/kexec: Fix kexec when using VMX optimised memcpy
powerpc: Fix build errors STRICT_MM_TYPECHECKS
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Just like other architectures
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Currently we only set the "to" address in the branch stack when the CPU
explicitly gives us a value. Unfortunately it only does this for XL form
branches (eg blr, bctr, bctar) and not I and B form branches (eg b, bc).
Fortunately if we read the instruction from memory we can extract the offset of
a branch and calculate the target address.
This adds a function power_pmu_bhrb_to() to calculate the target/to address of
the corresponding I and B form branches. It handles branches in both user and
kernel spaces. It also plumbs this into the perf brhb reading code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The current Branch History Rolling Buffer (BHRB) code misinterprets the order
of entries in the hardware buffer. It assumes that a branch target address
will be read _after_ its corresponding branch. In reality the branch target
comes before (lower mfbhrb entry) it's corresponding branch.
This is a rewrite of the code to take this into account.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The new Branch History Rolling buffer (BHRB) code is only useful on 64bit
processors, so move it into the #ifdef CONFIG_PPC64 region.
This avoids code bloat on 32bit systems.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Start context tracking support from pSeries.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch corresponds to
[PATCH] x86: Use the new schedule_user API on userspace preemption
commit 0430499ce9d78691f3985962021b16bf8f8a8048
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch allows RCU usage in do_notify_resume, e.g. signal handling.
It corresponds to
[PATCH] x86: Exit RCU extended QS on notify resume
commit edf55fda35c7dc7f2d9241c3abaddaf759b457c6
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This is the exception hooks for context tracking subsystem, including
data access, program check, single step, instruction breakpoint, machine check,
alignment, fp unavailable, altivec assist, unknown exception, whose handlers
might use RCU.
This patch corresponds to
[PATCH] x86: Exception hooks for userspace RCU extended QS
commit 6ba3c97a38803883c2eee489505796cb0a727122
But after the exception handling moved to generic code, and some changes in
following two commits:
56dd9470d7c8734f055da2a6bac553caf4a468eb
context_tracking: Move exception handling to generic code
6c1e0256fad84a843d915414e4b5973b7443d48d
context_tracking: Restore correct previous context state on exception exit
it is able for exception hooks to use the generic code above instead of a
redundant arch implementation.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This is the syscall slow path hooks for context tracking subsystem,
corresponding to
[PATCH] x86: Syscall hooks for userspace RCU extended QS
commit bf5a3c13b939813d28ce26c01425054c740d6731
TIF_MEMDIE is moved to the second 16-bits (with value 17), as it seems there
is no asm code using it. TIF_NOHZ is added to _TIF_SYCALL_T_OR_A, so it is
better for it to be in the same 16 bits with others in the group, so in the
asm code, andi. with this group could work.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
MSR_DE is not cleared on entry to the kernel, and we don't clear it
explicitly outside of debug code. If we have MSR_DE set in
prime_debug_regs(), and the new thread has events enabled in DBCR0
(e.g. ICMP is set in thread->dbsr0, even though it was cleared in the
real DBCR0 when the thread got scheduled out), we'll end up taking a
debug exception in the kernel when DBCR0 is loaded. DSRR0 will not
point to an exception vector, and the kernel ends up hanging at
kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new
debug state.
Another observed source of kernel_dbg_exc hangs is with the branch
taken event. If this event is active, but we take a non-debug trap
(e.g. a TLB miss or an asynchronous interrupt) before the next branch.
We end up taking a branch-taken debug exception on the initial branch
instruction of the exception vector, but because the debug exception is
DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even
checking the DSRR0 address. Fix this by checking for DBSR_BT as well
as DBSR_IC, which is what 32-bit does and what the comments suggest was
intended in the 64-bit code as well.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Some versions of GCC apparently expect this to be provided by libgcc.
Updates from Mikey to fix 32 bit version and adding "r" to registers.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The current code fails to handle kexec on OPALv2. This fixes it
and adds code to improve the situation on OPALv3 where we can
query the CPU status from the firmware and decide what to do
based on that.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Future firmwares will support that new version
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Saw this warning again, and this time from the ret_from_fork path.
It seems we could clear the back chain earlier in copy_thread(), which
could cover both path, and also fix potential lockdep usage in
schedule_tail(), or exception occurred before we clear the back chain.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We are getting build errors with CONFIG_PROC_FS=n:
arch/powerpc/kernel/rtas_flash.c
In function 'rtas_flash_init':
745:33: error: unused variable 'f' [-Werror=unused-variable]
But rtas_flash.c should not be built when CONFIG_PROC_FS=n, beacause all
it does is provide a /proc interface to the RTAS flash routines.
CONFIG_RTAS_FLASH already depends on CONFIG_RTAS_PROC, to indicate that
it depends on the RTAS proc support, but CONFIG_RTAS_PROC does not
depend on CONFIG_PROC_FS. So fix that.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch brings online all threads which are present but not online
prior to migration/hibernation. After migration/hibernation those
threads are taken back offline.
During migration/hibernation all online CPUs must call H_JOIN, this is
required by the hypervisor. Without this patch, threads that are offline
(H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be
deadlocked (all threads either JOIN'd or CEDE'd).
Cc: <stable@kernel.org>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
ibm,validate-flash-image RTAS call output buffer contains 150 - 200
bytes of data on latest system. Presently we have output
buffer size as 64 bytes and we use sprintf to copy data from
RTAS buffer to local buffer. This causes kernel oops (see below
call trace).
This patch increases local buffer size to 256 and also uses
snprintf instead of sprintf to copy data from RTAS buffer.
Kernel call trace :
-------------------
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod
Supported: Yes
NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000
REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64)
MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018
TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36
GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b
GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031
GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4
GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90
GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000
GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000
GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031
GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130
NIP [4520323031333130] 0x4520323031333130
LR [4520323031333130] 0x4520323031333130
Call Trace:
[c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable)
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
commit b3f271e86e5a (powerpc: POWER7 optimised memcpy using VMX and
enhanced prefetch) uses VMX when it is safe to do so (ie not in
interrupt). It also looks at the task struct to decide if we have to
save the current tasks' VMX state.
kexec calls memcpy() at a point where the task struct may have been
overwritten by the new kexec segments. If it has been overwritten
then when memcpy -> enable_altivec looks up current->thread.regs->msr
we get a cryptic oops or lockup.
I also notice we aren't initialising thread_info->cpu, which means
smp_processor_id is broken. Fix that too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Our pgtable are 2*sizeof(pte_t)*PTRS_PER_PTE which is PTE_FRAG_SIZE.
Instead of depending on frag size, mask with PMD_MASKED_BITS.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
lockdep.c has this:
/*
* So we're supposed to get called after you mask local IRQs,
* but for some reason the hardware doesn't quite think you did
* a proper job.
*/
if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
return;
Since irqs_disabled() is based on soft_enabled(), that (not just the
hard EE bit) needs to be 0 before we call trace_hardirqs_off.
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We add a machine_shutdown hook that frees the OPAL interrupts
(so they get masked at the source and don't fire while kexec'ing)
and which triggers an IODA reset on all the PCIe host bridges
which will have the effect of blocking all DMAs and subsequent
PCIs interrupts.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If the firmware returns an error such as "closed" (or hardware
error), we should drop characters.
Currently we only do that when a firmware compatible with OPAL v2
APIs is detected, in the code that calls opal_console_write_buffer_space(),
which didn't exist with OPAL v1 (or didn't work).
However, when enabling early debug consoles, the flag indicating
that v2 is supported isn't set yet, causing us, in case of errors
or closed console, to spin forever.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch adds a new udbg early debug console which utilises
statically defined input and output buffers stored within the kernel
BSS. It is primarily designed to assist with bring up of new hardware
which may not have a working console but which has a method of
reading/writing kernel memory.
This version incorporates comments made by Ben H (thanks!).
Changes from v1:
- Add memory barriers.
- Ensure updating of read/write positions is atomic.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|