kernel-ark

Michal Hocko 53a59fc67f mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT

Since commit `e303297e6c` ("mm: extended batches for generic mmu_gather") we are batching pages to be freed until either tlb_next_batch cannot allocate a new batch or we are done.

This works just fine most of the time, but we can get into trouble with a non-preemptible kernel (CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY) on large machines, where too aggressive batching might lead to soft lockups during the process exit path (exit_mmap). There are no scheduling points down the free_pages_and_swap_cache path, so the freeing can take long enough to trigger the soft lockup.

The lockup is harmless except when the system is set up to panic on softlockup, which is not that unusual.

The simplest way to work around this issue is to limit the maximum number of batches in a single mmu_gather. 10k of collected pages should be safe to prevent soft lockups (we would have 2ms for one) even if they are all freed without an explicit scheduling point.

This patch doesn't add any new explicit scheduling points because it relies on zap_pmd_range during page tables zapping, which calls cond_resched per PMD.

The following lockup has been reported for a 3.0 kernel with a huge process (in the order of hundreds of gigs, but I don't know any more details).

BUG: soft lockup - CPU#56 stuck for 22s! [kernel:31053]
Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc mptctl mptbase autofs4 binfmt_misc dm_round_robin dm_multipath bonding cpufreq_conservative cpufreq_userspace cpufreq_powersave pcc_cpufreq mperf microcode fuse loop osst sg sd_mod crc_t10dif st qla2xxx scsi_transport_fc scsi_tgt netxen_nic i7core_edac iTCO_wdt joydev e1000e serio_raw pcspkr edac_core iTCO_vendor_support acpi_power_meter rtc_cmos hpwdt hpilo button container usbhid hid dm_mirror dm_region_hash dm_log linear uhci_hcd ehci_hcd usbcore usb_common scsi_dh_emc scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh dm_snapshot pcnet32 mii edd dm_mod raid1 ext3 mbcache jbd fan thermal processor thermal_sys hwmon cciss scsi_mod
Supported: Yes
CPU 56
Pid: 31053, comm: kernel Not tainted 3.0.31-0.9-default #1 HP ProLiant DL580 G7
RIP: 0010: _raw_spin_unlock_irqrestore+0x8/0x10
RSP: 0018:ffff883ec1037af0 EFLAGS: 00000206
RAX: 0000000000000e00 RBX: ffffea01a0817e28 RCX: ffff88803ffd9e80
RDX: 0000000000000200 RSI: 0000000000000206 RDI: 0000000000000206
RBP: 0000000000000002 R08: 0000000000000001 R09: ffff887ec724a400
R10: 0000000000000000 R11: dead000000200200 R12: ffffffff8144c26e
R13: 0000000000000030 R14: 0000000000000297 R15: 000000000000000e
FS: 00007ed834282700(0000) GS:ffff88c03f200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000068b240 CR3: 0000003ec13c5000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kernel (pid: 31053, threadinfo ffff883ec1036000, task ffff883ebd5d4100)
Call Trace:
 release_pages+0xc5/0x260
 free_pages_and_swap_cache+0x9d/0xc0
 tlb_flush_mmu+0x5c/0x80
 tlb_finish_mmu+0xe/0x50
 exit_mmap+0xbd/0x120
 mmput+0x49/0x120
 exit_mm+0x122/0x160
 do_exit+0x17a/0x430
 do_group_exit+0x3d/0xb0
 get_signal_to_deliver+0x247/0x480
 do_signal+0x71/0x1b0
 do_notify_resume+0x98/0xb0
 int_signal+0x12/0x17
DWARF2 unwinder stuck at int_signal+0x12/0x17

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org> [3.0+]
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2013-01-04 16:11:46 -08:00
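
The batch limit described in the commit message can be illustrated with a small user-space sketch. This is not the kernel's code: the names `struct gather`, `gather_add`, `PAGES_PER_BATCH` and the value 500 are hypothetical, chosen only so that the cap works out to the 10k pages mentioned above. The idea shown is that once the cap on batches is reached, the gather is flushed instead of growing further, which bounds the work done between scheduling points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes for illustration; not taken from the kernel source. */
#define PAGES_PER_BATCH 500UL
#define MAX_PAGES       10000UL
#define MAX_BATCH_COUNT (MAX_PAGES / PAGES_PER_BATCH)

struct batch {
    struct batch *next;
    size_t nr;                      /* pages queued in this batch */
    void *pages[PAGES_PER_BATCH];
};

struct gather {
    struct batch first;             /* embedded first batch */
    struct batch *active;           /* batch currently being filled */
    size_t batch_count;             /* batches in the chain, including first */
    size_t flushes;                 /* flushes forced by a full gather */
};

void gather_init(struct gather *g)
{
    memset(g, 0, sizeof(*g));
    g->active = &g->first;
    g->batch_count = 1;
}

/* Move to the next batch; refuse to allocate beyond the cap. */
bool gather_next_batch(struct gather *g)
{
    struct batch *b;

    if (g->active->next) {          /* reuse a batch emptied by a flush */
        g->active = g->active->next;
        return true;
    }
    if (g->batch_count == MAX_BATCH_COUNT)
        return false;               /* cap reached: caller must flush */
    b = calloc(1, sizeof(*b));
    if (!b)
        return false;               /* allocation failure: also flush */
    g->batch_count++;
    g->active->next = b;
    g->active = b;
    return true;
}

/* Stand-in for freeing the queued pages: empty every batch. */
void gather_flush(struct gather *g)
{
    for (struct batch *b = &g->first; b; b = b->next)
        b->nr = 0;
    g->active = &g->first;
    g->flushes++;
}

void gather_add(struct gather *g, void *page)
{
    if (g->active->nr == PAGES_PER_BATCH && !gather_next_batch(g))
        gather_flush(g);
    assert(g->active->nr < PAGES_PER_BATCH);
    g->active->pages[g->active->nr++] = page;
}

void gather_destroy(struct gather *g)
{
    struct batch *b = g->first.next;

    while (b) {
        struct batch *next = b->next;
        free(b);
        b = next;
    }
}
```

In this sketch, queuing up to `MAX_PAGES` pages never forces a flush, while the next page after that does. Without the `batch_count` check, the chain would keep growing until allocation failed, which is the behaviour the commit message identifies as the source of the soft lockup.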
..
bitops	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux	2012-10-14 13:39:34 -07:00
4level-fixup.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
atomic64.h	lib: Provide generic atomic64_t implementation	2009-06-15 13:27:38 +10:00
atomic-long.h	asm-generic: merge branch 'master' of torvalds/linux-2.6	2009-06-12 11:32:58 +02:00
atomic.h	Remove all #inclusions of asm/system.h	2012-03-28 18:30:03 +01:00
audit_change_attr.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_dir_write.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_read.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_signal.h
audit_write.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
barrier.h	Create asm-generic/barrier.h	2012-03-28 18:30:03 +01:00
bitops.h	bitops: remove minix bitops from asm/bitops.h	2011-03-23 19:46:22 -07:00
bitsperlong.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
bug.h	bug.h: Fix up CONFIG_BUG=n implicit function declarations.	2012-06-25 10:32:49 -07:00
bugs.h
cache.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
cacheflush.h	asm-generic/cacheflush.h: flush icache when copying to user pages	2011-05-25 08:39:37 -07:00
checksum.h	Add extra arch overrides to asm-generic/checksum.h	2011-11-01 07:34:21 -07:00
clkdev.h	asm-generic: Add default clkdev.h	2012-10-03 21:33:53 +02:00
cmpxchg-local.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
cmpxchg.h	asm-generic: add linux/types.h to cmpxchg.h	2012-04-02 14:41:27 -07:00
cputime.h	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2012-01-06 08:44:54 -08:00
current.h
delay.h	asm-generic: delay.h fix udelay and ndelay for 8 bit args	2011-07-22 18:45:33 +02:00
device.h	Driver Core: Add platform device arch data V3	2009-07-22 00:28:38 +02:00
div64.h
dma-coherent.h	common: dma-mapping: add support for generic dma_mmap_* calls	2012-07-30 12:25:46 +02:00
dma-contiguous.h	mm: cma: fix condition check when setting global cma area	2012-07-06 12:02:04 +02:00
dma-mapping-broken.h	dma-mapping: remove dma_is_consistent API	2010-08-11 08:59:21 -07:00
dma-mapping-common.h	common: dma-mapping: introduce dma_get_sgtable() function	2012-07-30 12:25:46 +02:00
dma.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
emergency-restart.h
exec.h	Split arch_align_stack() out from asm-generic/system.h	2012-03-28 18:30:03 +01:00
fb.h
ftrace.h	asm-generic headers: add ftrace.h	2011-03-17 09:19:04 +08:00
futex.h	futex: Sanitize futex ops argument types	2011-03-11 12:23:31 +01:00
getorder.h	bitops: Add missing parentheses to new get_order macro	2012-02-24 10:39:27 -08:00
gpio.h	GPIO follow up patch and type change for v3.5 merge window	2012-12-11 13:00:56 -08:00
hardirq.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
hw_irq.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
ide_iops.h
int-l64.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
int-ll64.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
io-64-nonatomic-hi-lo.h	asm-generic: architecture independent readq/writeq for 32bit environment	2012-02-21 16:47:28 -08:00
io-64-nonatomic-lo-hi.h	asm-generic: architecture independent readq/writeq for 32bit environment	2012-02-21 16:47:28 -08:00
io.h	These are a few cleanups for asm-generic:	2012-12-21 16:39:08 -08:00
ioctl.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
iomap.h	[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional	2012-02-27 09:43:30 -06:00
irq_regs.h	core: Replace __get_cpu_var with __this_cpu_read if not used for an address.	2010-12-17 15:07:19 +01:00
irq.h
irqflags.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
Kbuild.asm	UAPI: Set up uapi/asm/Kbuild.asm	2012-10-02 18:01:56 +01:00
kdebug.h	asm-generic: kdebug.h: Checkpatch cleanup	2010-10-09 21:51:44 +02:00
kmap_types.h	asm-generic: remove km_type definitions	2012-07-24 15:27:30 +08:00
kvm_para.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
libata-portmap.h
linkage.h
local64.h	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
local.h	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
memory_model.h	mm: fix __page_to_pfn for a const struct page argument	2011-08-17 13:00:20 -07:00
mm_hooks.h
mmu_context.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
mmu.h	asm-generic/mmu.h: Add support for FDPIC	2012-12-09 23:14:14 +01:00
module.h	Make most arch asm/module.h files use asm-generic/module.h	2012-09-28 14:31:03 +09:30
mutex-dec.h
mutex-null.h
mutex-xchg.h	mutex: Place lock in contended state after fastpath_lock failure	2012-08-13 18:46:54 +02:00
mutex.h
page.h	The following changes since commit `3ee72ca992`	2012-01-10 17:39:40 -08:00
param.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
parport.h	include: remove __dev* attributes.	2013-01-03 15:57:16 -08:00
pci_iomap.h	[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional	2012-02-27 09:43:30 -06:00
pci-bridge.h	PCI: work around Stratus ftServer broken PCIe hierarchy	2012-04-30 15:21:02 -06:00
pci-dma-compat.h	dma-mapping: pci: move pci_set_dma_mask and pci_set_consistent_dma_mask to pci-dma-compat.h	2010-03-12 15:52:42 -08:00
pci.h	PCI: collapse pcibios_resource_to_bus	2012-02-23 20:19:04 -07:00
percpu.h	percpu: Optimize __get_cpu_var()	2010-09-10 10:56:51 +02:00
pgalloc.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
pgtable-nopmd.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
pgtable-nopud.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
pgtable.h	Automatic NUMA Balancing V11	2012-12-16 15:18:08 -08:00
ptrace.h	asm-generic/ptrace.h: start a common low level ptrace helper	2011-05-26 17:12:36 -07:00
resource.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
rtc.h
rwsem.h	Hexagon: Add locking types and functions	2011-11-01 07:34:20 -07:00
scatterlist.h	asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h	2010-05-27 09:12:54 -07:00
sections.h	x86: Separate out entry text section	2011-03-08 17:22:11 +01:00
segment.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
serial.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
siginfo.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
signal.h	unify default ptrace_signal_deliver	2012-11-29 00:01:23 -05:00
sizes.h	ARM: 7430/1: sizes.h: move from asm-generic to <linux/sizes.h>	2012-06-28 17:14:34 +01:00
spinlock.h
statfs.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
string.h
switch_to.h	Split the switch_to() wrapper out of asm-generic/system.h	2012-03-28 18:30:03 +01:00
syscall.h	asm/syscall.h: add syscall_get_arch	2012-04-14 11:13:19 +10:00
syscalls.h	take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h	2012-11-28 23:43:27 -05:00
termios-base.h
termios.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
timex.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
tlb.h	mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT	2013-01-04 16:11:46 -08:00
tlbflush.h	BUG: headers with BUG/BUG_ON etc. need linux/bug.h	2012-03-04 17:54:34 -05:00
topology.h	topology: alternate fix for ia64 tiger_defconfig build breakage	2010-08-09 20:44:57 -07:00
trace_clock.h	tracing,x86: Add a TSC trace_clock	2012-11-13 15:48:27 -05:00
uaccess-unaligned.h
uaccess.h	fix default __strnlen_user macro	2011-10-06 19:47:12 -04:00
unaligned.h
unistd.h	UAPI: (Scripted) Disintegrate include/asm-generic	2012-10-04 18:20:15 +01:00
user.h	asm-generic/user.h: Fix spelling in comment	2011-03-01 15:49:39 +01:00
vga.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
vmlinux.lds.h	vmlinux.lds.h: Allow architectures to add sections to the front of .bss	2012-10-11 11:02:37 +02:00
word-at-a-time.h	word-at-a-time: make the interfaces truly generic	2012-05-26 11:33:40 -07:00
xor.h	asm-generic: xor: mark static functions as __maybe_unused	2012-10-03 21:21:06 +02:00