kernel-ark

Author	SHA1	Message	Date
Paolo 'Blaisorblade' Giarrusso	efbbdce94f	[PATCH] x86_64: Use common sys_time64 Keeping this function does not makes sense because it's a copied (and buggy) copy of sys_time. The only difference is that now.tv_sec (which is a time_t, i.e. a 64-bit long) is copied (and truncated) into a int (32-bit). The prototype is the same (they both take a long __user *), so let's drop this and redirect it to sys_time (and make sure it exists by defining __ARCH_WANT_SYS_TIME). Only disadvantage is that the sys_stime definition is also compiled (may be fixed if needed by adding a separate __ARCH_WANT_SYS_STIME macro, and defining it for all arch's defining __ARCH_WANT_SYS_TIME except x86_64). Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:17 -08:00
Paolo 'Blaisorblade' Giarrusso	bf0f2e2383	[PATCH] x86_64: Set ____cacheline_maxaligned_in_smp alignment to 128 bytes The current value was correct before the introduction of Intel EM64T support - but now L1_CACHE_SHIFT_MAX can be less than L1_CACHE_SHIFT, which _is_ funny! Between the few users of ____cacheline_maxaligned_in_smp, we also have (for example) rcu_ctrlblk, and struct zone, with zone->{lru_,}lock. I.e. we have a lot of excess cacheline bouncing on them. No correctness issues, obviously. So this could even be merged for 2.6.14 (I'm not a fan of this idea, though). CC: Andi Kleen <ak@suse.de> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:17 -08:00
Andi Kleen	8e0d4f4e91	[PATCH] x86_64: Remove asm-x86_64/rwsem.h Not needed since x86-64 always uses the spinlock based rwsems. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:17 -08:00
Siddha, Suresh B	94605eff57	[PATCH] x86-64/i386: Intel HT, Multi core detection fixes Fields obtained through cpuid vector 0x1(ebx[16:23]) and vector 0x4(eax[14:25], eax[26:31]) indicate the maximum values and might not always be the same as what is available and what OS sees. So make sure "siblings" and "cpu cores" values in /proc/cpuinfo reflect the values as seen by OS instead of what cpuid instruction says. This will also fix the buggy BIOS cases (for example where cpuid on a single core cpu says there are "2" siblings, even when HT is disabled in the BIOS. http://bugzilla.kernel.org/show_bug.cgi?id=4359) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:16 -08:00
Andi Kleen	e90f22edf4	[PATCH] x86_64: Fix NUMA node lookup debug code which had bitrotted Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:16 -08:00
Andi Kleen	a88cde13ba	[PATCH] x86_64: Formatting fixes for arch/x86_64/kernel/process.c No functional changes. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:16 -08:00
Andi Kleen	ea0be473a1	[PATCH] x86_64: Allow modular build of ia32 aout loader Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:16 -08:00
Andi Kleen	420f8f68c9	[PATCH] x86_64: New heuristics to find out hotpluggable CPUs. With a NR_CPUS==128 kernel with CPU hotplug enabled we would waste 4MB on per CPU data of all possible CPUs. The reason was that HOTPLUG always set up possible map to NR_CPUS cpus and then we need to allocate that much (each per CPU data is roughly ~32k now) The underlying problem is that ACPI didn't tell us how many hotplug CPUs the platform supports. So the old code just assumed all, which would lead to this memory wastage. This implements some new heuristics: - If the BIOS specified disabled CPUs in the ACPI/mptables assume they can be enabled later (this is bending the ACPI specification a bit, but seems like a obvious extension) - The user can overwrite it with a new additionals_cpus=NUM option - Otherwise use half of the available CPUs or 2, whatever is more. Cc: ashok.raj@intel.com Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:15 -08:00
Andi Kleen	485832a5d9	[PATCH] x86_64: Use int operations in spinlocks to support more than 128 CPUs spinning. Pointed out by Eric Dumazet Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:15 -08:00
Andi Kleen	6b75aeedde	[PATCH] x86_64: Don't apply __PHYSICAL_MASK to page frame numbers It is for physical addresses, not for PFNs. Pointed out by Tejun Heo. Cc: htejun@gmail.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Siddha, Suresh B	f6c2e3330d	[PATCH] x86_64: Unmap NULL during early bootup We should zap the low mappings, as soon as possible, so that we can catch kernel bugs more effectively. Previously early boot had NULL mapped and didn't trap on NULL references. This patch introduces boot_level4_pgt, which will always have low identity addresses mapped. Druing boot, all the processors will use this as their level4 pgt. On BP, we will switch to init_level4_pgt as soon as we enter C code and zap the low mappings as soon as we are done with the usage of identity low mapped addresses. On AP's we will zap the low mappings as soon as we jump to C code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Andi Kleen	69d81fcde7	[PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA Not go from the CPU number to an mapping array. Mode number is often used now in fast paths. This also adds a generic numa_node_id to all the topology includes Suggested by Eric Dumazet Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:14 -08:00
Andi Kleen	1dff7f3db5	[PATCH] x86_64: Fix up outdated pfn_to_page comment pfn_to_page really requires pfn_valid to be true now, no question. Some people stumbled over it, but it was misleading and wrong. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
James Cleverdon	6004e1b7ef	[PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources Here's a patch that builds on Natalie Protasevich's IRQ compression patch and tries to work for MPS boots as well as ACPI. It is meant for a 4-node IBM x460 NUMA box, which was dying because it had interrupt pins with GSI numbers > NR_IRQS and thus overflowed irq_desc. The problem is that this system has 270 GSIs (which are 1:1 mapped with I/O APIC RTEs) and an 8-node box would have 540. This is much bigger than NR_IRQS (224 for both i386 and x86_64). Also, there aren't enough vectors to go around. There are about 190 usable vectors, not counting the reserved ones and the unused vectors at 0x20 to 0x2F. So, my patch attempts to compress the GSI range and share vectors by sharing IRQs. Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Jacob Shin	89b831ef8b	[PATCH] x86_64: Support for AMD specific MCE Threshold. MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F. This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations. The user may interface through sysfs files in order to change the threshold configuration. bank%d/error_count - reads current error count, write to clear. bank%d/interrupt_enable - set/clear interrupt enable. bank%d/threshold_limit - read/write the threshold limit. APIC vector 0xF9 in hw_irq.h. 5 software defined bank ids in mce.h. new apic.c function to setup threshold apic lvt. defaults to interrupt off, count enabled, and threshold limit max. sysfs interface created on /sys/devices/system/threshold. AK: added some ifdefs to make it compile on UP Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Jan Beulich	979edfadba	[PATCH] x86_64: Adjust, correct, and complete the HPET definitions for x86-64. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Andi Kleen	a2f1b42490	[PATCH] x86_64: Add 4GB DMA32 zone Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones. As a bit of historical background: when the x86-64 port was originally designed we had some discussion if we should use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or both. Both was ruled out at this point because it was in early 2.4 when VM is still quite shakey and had bad troubles even dealing with one DMA zone. We settled on the 16MB DMA zone mainly because we worried about older soundcards and the floppy. But this has always caused problems since then because device drivers had trouble getting enough DMA able memory. These days the VM works much better and the wide use of NUMA has proven it can deal with many zones successfully. So this patch adds both zones. This helps drivers who need a lot of memory below 4GB because their hardware is not accessing more (graphic drivers - proprietary and free ones, video frame buffer drivers, sound drivers etc.). Previously they could only use IOMMU+16MB GFP_DMA, which was not enough memory. Another common problem is that hardware who has full memory addressing for >4GB misses it for some control structures in memory (like transmit rings or other metadata). They tended to allocate memory in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent, but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory (even on AMD systems the IOMMU tends to be quite small) especially if you have many devices. With the new zone pci_alloc_consistent can just put this stuff into memory below 4GB which works better. One argument was still if the zone should be 4GB or 2GB. The main motivation for 2GB would be an unnamed not so unpopular hardware raid controller (mostly found in older machines from a particular four letter company) who has a strange 2GB restriction in firmware. But that one works ok with swiotlb/IOMMU anyways, so it doesn't really need GFP_DMA32. I chose 4GB to be compatible with IA64 and because it seems to be the most common restriction. The new zone is so far added only for x86-64. For other architectures who don't set up this new zone nothing changes. Architectures can set a compatibility define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32 as GFP_DMA. Otherwise it's a nop because on 32bit architectures it's normally not needed because GFP_NORMAL (=0) is DMA able enough. One problem is still that GFP_DMA means different things on different architectures. e.g. some drivers used to have #ifdef ia64 use GFP_DMA (trusting it to be 4GB) #elif __x86_64__ (use other hacks like the swiotlb because 16MB is not enough) ... . This was quite ugly and is now obsolete. These should be now converted to use GFP_DMA32 unconditionally. I haven't done this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent which will use GFP_DMA32 transparently. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-14 19:55:13 -08:00
Nick Piggin	8426e1f6af	[PATCH] atomic: inc_not_zero Introduce an atomic_inc_not_zero operation. Make this a special case of atomic_add_unless because lockless pagecache actually wants atomic_inc_not_negativeone due to its offset refcount. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-13 18:14:16 -08:00
Nick Piggin	4a6dae6d38	[PATCH] atomic: cmpxchg Introduce an atomic_cmpxchg operation. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-13 18:14:16 -08:00
Siddha, Suresh B	47936357c0	[PATCH] x86_64: fix tss limit Fix the x86_64 TSS limit in TSS descriptor. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-13 18:14:09 -08:00
Ashok Raj	b4033c1715	[PATCH] PCI: Change MSI to use physical delivery mode always MSI hardcoded delivery mode to use logical delivery mode. Recently x86_64 moved to use physical mode addressing to support physflat mode. With this mode enabled noticed that my eth with MSI werent working. msi_address_init() was hardcoded to use logical mode for i386 and x86_64. So when we switch to use physical mode, things stopped working. Since anyway we dont use lowest priority delivery with MSI, its always directed to just a single CPU. Its safe and simpler to use physical mode always, even when we use logical delivery mode for IPI's or other ioapic RTE's. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2005-11-10 16:09:18 -08:00
Ananth N Mavinakayanahalli	e7a510f92c	[PATCH] Kprobes: Track kprobe on a per_cpu basis - x86_64 changes x86_64 changes to track kprobe execution on a per-cpu basis. We now track the kprobe state machine independently on each cpu using a arch specific kprobe control block. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:53:46 -08:00
Tim Schmielau	8c65b4a604	[PATCH] fix remaining missing includes Fix more include file problems that surfaced since I submitted the previous fix-missing-includes.patch. This should now allow not to include sched.h from module.h, which is done by a followup patch. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:53:41 -08:00
Tony Luck	c7fb577e2a	manual update from upstream: Applied Al's change `06a544971f` to new location of swiotlb.c Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-10-31 10:51:57 -08:00
Arthur Othieno	727a53bd53	[PATCH] semaphore: Remove __MUTEX_INITIALIZER() __MUTEX_INITIALIZER() has no users, and equates to the more commonly used DECLARE_MUTEX(), thus making it pretty much redundant. Remove it for good. Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 17:37:27 -08:00
Tejun Heo	1426d7a81d	[PATCH] vm: remove unused/broken page_pte[_prot] macros This patch removes page_pte_prot and page_pte macros from all architectures. Some architectures define both, some only page_pte (broken) and others none. These macros are not used anywhere. page_pte_prot(page, prot) is identical to mk_pte(page, prot) and page_pte(page) is identical to page_pte_prot(page, __pgprot(0)). * The following architectures define both page_pte_prot and page_pte arm, arm26, ia64, sh64, sparc, sparc64 * The following architectures define only page_pte (broken) frv, i386, m32r, mips, sh, x86-64 * All other architectures define neither Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 17:37:22 -08:00
Christoph Hellwig	dfb7dac3af	[PATCH] unify sys_ptrace prototype Make sure we always return, as all syscalls should. Also move the common prototype to <linux/syscalls.h> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 17:37:20 -08:00
Brian Gerst	c531178157	[PATCH] Clean up mtrr compat ioctl code Handle 32-bit mtrr ioctls in the mtrr driver instead of the ia32 compatability layer. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 17:37:13 -08:00
Rik Van Riel	eb92f4ef32	[PATCH] add sem_is_read/write_locked() Add sem_is_read/write_locked functions to the read/write semaphores, along the same lines of the *_is_locked spinlock functions. The swap token tuning patch uses sem_is_read_locked; sem_is_write_locked is added for completeness. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-29 21:40:35 -07:00
Al Viro	f80aabb03a	[PATCH] gfp_t: dma-mapping (amd64) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-28 08:16:48 -07:00
Al Viro	06a544971f	[PATCH] gfp_t: dma-mapping (ia64) ... and related annotations for amd64 - swiotlb code is shared, but prototypes are not. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-28 08:16:47 -07:00
Linus Torvalds	79b95a454b	Revert "x86-64: Avoid unnecessary double bouncing for swiotlb" Commit id `6142891a0c` Andi Kleen reports that it seems to break things for some people, and since it's purely a small optimization, revert it for now. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-27 16:28:39 -07:00
Tony Luck	9cec58dc13	Update from upstream with manual merge of Yasunori Goto's changes to swiotlb.c made in commit `281dd25cdc` since this file has been moved from arch/ia64/lib/swiotlb.c to lib/swiotlb.c Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-10-20 10:41:44 -07:00
Andi Kleen	421c7ce6d0	[PATCH] x86_64: Allocate cpu local data for all possible CPUs CPU hotplug fills up the possible map to NR_CPUs, but it did that after setting up per CPU data. This lead to CPU data not getting allocated for all possible CPUs, which lead to various side effects. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-10 16:33:25 -07:00
Kirill Korotaev	1294b118cb	[PATCH] x86_64: Add missing () around arguments of pte_index macro x86-64: Add missing () around arguments of pte_index macro Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-30 08:54:38 -07:00
Andi Kleen	7d318d7747	[PATCH] Fix up TLB flush filter disabling I checked with AMD and they requested to only disable it for family 15. Also disable it for i386 too. And some style fixes. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-29 15:41:42 -07:00
John W. Linville	8d15d19e44	[PATCH] x86_64: implement dma_sync_single_range_for_{cpu,device} Re-implement dma_sync_single_range_for_{cpu,device} for x86_64 using swiotlb_sync_single_range_for_{cpu,device}. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-29 14:45:52 -07:00
John W. Linville	878a97cfd7	[PATCH] swiotlb: support syncing sub-ranges of mappings This patch implements swiotlb_sync_single_range_for_{cpu,device}. This is intended to support an x86_64 implementation of dma_sync_single_range_for_{cpu,device}. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-09-29 14:44:23 -07:00
Andrew Morton	73a0b538ee	[PATCH] x86_64: desc.h-needs smp.h include/asm/desc.h: In function `load_LDT': include/asm/desc.h:209: warning: implicit declaration of function `get_cpu' include/asm/desc.h:211: warning: implicit declaration of function `put_cpu' Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-17 11:50:01 -07:00
Randy Dunlap	33bf56106d	[PATCH] feature removal of io_remap_page_range() As written in Documentation/feature-removal-schedule.txt, remove the io_remap_page_range() kernel API. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-13 08:22:33 -07:00
Andi Kleen	2ade814736	[PATCH] x86-64: clean up local_add/sub arguments Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:59 -07:00
Chuck Ebbert	66759a01ad	[PATCH] x86-64: i386/x86-64: Fix time going twice as fast problem on ATI Xpress chipsets Original patch from Bertro Simul This is probably still not quite correct, but seems to be the best solution so far. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:58 -07:00
Jan Beulich	049cdefe19	[PATCH] x86-64: reduce x86-64 bug frame by 4 bytes As mentioned before, the size of the bug frame can be further reduced while continuing to use instructions to encode the information. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:58 -07:00
Jan Beulich	a2d236b3ac	[PATCH] x86-64: Lose constraints on cmpxchg While only cosmetic for x86-64, this adjusts the cmpxchg code appearantly inherited from i386 to use more generic constraints. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Jan Beulich	1a426cb764	[PATCH] x86-64: Declare NMI_VECTOR and handle it in the IPI sending code. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Andi Kleen	a2a0c992e9	[PATCH] x86-64: Remove unused vxtime.hz field Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Andi Kleen	a0d58c9741	[PATCH] x86-64: Set the stack pointer correctly in init_thread and init_tss Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Jan Beulich	1209140c3c	[PATCH] x86-64: Safe interrupts in oops_begin/end Rather than blindly re-enabling interrupts in oops_end(), save their state in oope_begin() and then restore that state. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Andi Kleen	059bf0f6c3	[PATCH] x86-64: Merge msr.c with i386 version The only difference was the inline assembly, so move that into asm/msr.h and merge with the i386 version. This adds some missing sysfs support code to x86-64. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:57 -07:00
Jan Beulich	7effaa882a	[PATCH] x86-64: Fix CFI information Being the foundation for reliable stack unwinding, this fixes CFI unwind annotations in many low-level x86_64 routines, plus a config option (available to all architectures, and also present in the previously sent patch adding such annotations to i386 code) to enable them separatly rather than only along with adding full debug information. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:56 -07:00
Andi Kleen	27183ebd33	[PATCH] x86-64: Add dma_sync_single_range_for_{cpu,device} Currently just defined to their non range parts. Pointed out by John Linville Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:56 -07:00
Adrian Bunk	9c0aa0f9a1	[PATCH] Replace extern inline with static inline in asm-x86_64/* They should be identical in the kernel now, but this makes it consistent with other code. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:56 -07:00
Nakul Saraiya	f297e4e5e4	[PATCH] Increase nodemap hash. Needed for some newer Opteron systems with E stepping and memory relocation enabled. The node addresses are different in lower bits now so the nodemap hash function needs to be enlarged. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:55 -07:00
Jim Paradis	fb048927ad	[PATCH] x86-64: Fix off by one in pfn_valid When I gave proposed the fix to pfn_valid() for RHEL4, Stephen Tweedie's sharp eyes caught this: Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:50:55 -07:00
Andi Kleen	2b4a08150e	[PATCH] x86-64: Increase TLB flush array size The generic TLB flush functions kept upto 506 pages per CPU to avoid too frequent IPIs. This value was done for the L1 cache of older x86 CPUs, but with modern CPUs it does not make much sense anymore. TLB flushing is slow enough that using the L2 cache is fine. This patch increases the flush array on x86-64 to cache 5350 pages. That is roughly 20MB with 4K pages. It speeds up large munmaps in multithreaded processes on SMP considerably. The cost is roughly 42k of memory per CPU, which is reasonable. I only increased it on x86-64 for now, but it would probably make sense to increase it everywhere. Embedded architectures with SMP may keep it smaller to save some memory per CPU. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:58 -07:00
Andi Kleen	165aeb8284	[PATCH] x86-64: Don't include config.h in asm/timex.h asm-x86-64/timex.h does not reference CONFIG constants. Do not need to include config.h. Signed-off-by: Grant Grundler <iod00d@hp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:58 -07:00
Andi Kleen	3f74478b5f	[PATCH] x86-64: Some cleanup and optimization to the processor data area. - Remove unused irqrsp field - Remove pda->me - Optimize set_softirq_pending slightly Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:58 -07:00
Andi Kleen	e5bc8b6baf	[PATCH] x86-64: Make remote TLB flush more scalable Instead of using a global spinlock to protect the state of the remote TLB flush use a lock and state for each sending CPU. To tell the receiver where to look for the state use 8 different call vectors. Each CPU uses a specific vector to trigger flushes on other CPUs. Depending on the received vector the target CPUs look into the right per cpu variable for the flush data. When the system has more than 8 CPUs they are hashed to the 8 available vectors. The limited global vector space forces us to this right now. In future when interrupts are split into per CPU domains this could be fixed, at the cost of needing more IPIs in flat mode. Also some minor cleanup in the smp flush code and remove some outdated debug code. Requires patch to move cpu_possible_map setup earlier. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:58 -07:00
Andi Kleen	69e1a33f62	[PATCH] x86-64: Use ACPI PXM to parse PCI<->node assignments Since this is shared code I had to implement it for i386 too Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:57 -07:00
Andi Kleen	b9aac10ddd	[PATCH] x86-64: Remove redundant max_mapnr and replace with end_pfn The FLATMEM people added it, but there doesn't seem a good reason because end_pfn is identical. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:57 -07:00
Andi Kleen	6142891a0c	[PATCH] x86-64: Avoid unnecessary double bouncing for swiotlb PCI_DMA_BUS_IS_PHYS has to be zero even when the GART IOMMU is disabled and the swiotlb is used. Otherwise the block layer does unnecessary double bouncing. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:56 -07:00
Andi Kleen	3f098c2605	[PATCH] x86-64: Support dualcore and 8 socket systems in k8 fallback node parsing In particular on systems where the local APIC space and node space is very different from the Linux CPU number space. Previously the older NUMA setup code directly parsing the K8 northbridge registers had some issues on 8 socket or dual core systems. This patch fixes them. This is mainly done by fixing some confusion between Linux CPU numbers and local APIC ids. We now pass the local APIC IDs to later code, which avoids mismatches. Also add some heuristics to detect cases where the Hypertransport nodeids and the local APIC IDs don't match, but are shifted by a constant offset. This is still all quite hackish, hopefully BIOS writers fill in correct SRATs instead. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:56 -07:00
Andi Kleen	b91691164b	[PATCH] x86-64: Don't cache align PDA on UP builds Suggested by someone I forgot who sorry. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:56 -07:00
Andi Kleen	0b07e984fc	[PATCH] x86-64: Don't assign CPU numbers in SRAT parsing Do that later when the CPU boots. SRAT just stores the APIC<->Node mapping node. This fixes problems on systems where the order of SRAT entries does not match the MADT. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:55 -07:00
Andi Kleen	61c11341ed	[PATCH] x86-64: Remove esr disable hack in APIC code This was just needed for the Numasaurus, which fortunately doesn't support x86-64 CPUs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:55 -07:00
Andi Kleen	eddfb4ed29	[PATCH] x86-64: Remove obsolete APIC "write around" bug workaround No x86-64 chipset has this bug Generated code doesn't change because it was always disabled. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-12 10:49:55 -07:00
Adrian Bunk	672289e9fa	[PATCH] i386/x86_64: make get_cpu_vendor() static get_cpu_vendor() no longer has any users in other files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:35 -07:00
Ingo Molnar	fb1c8f93d8	[PATCH] spinlock consolidation This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h \| 16 include/asm-x86_64/spinlock_types.h \| 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP \| UP ----------------------------\|----------------------------------- asm/spinlock_types_smp.h \| linux/spinlock_types_up.h linux/spinlock_types.h \| linux/spinlock_types.h asm/spinlock_smp.h \| linux/spinlock_up.h linux/spinlock_api_smp.h \| linux/spinlock_api_up.h linux/spinlock.h \| linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_()/etc. lowlevel implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_()/etc. version of UP builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. / All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/.h and asm/.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:21 -07:00
Linus Torvalds	486a153f0e	Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild	2005-09-09 15:46:49 -07:00
Kenji Kaneshige	24b20ac6e1	[PATCH] remove unnecessary handle_IRQ_event() prototypes The function prototype for handle_IRQ_event() in a few architctures is not needed because they use GENERIC_HARDIRQ. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 13:57:33 -07:00
Sam Ravnborg	e2d5df935d	kbuild: alpha,x86_64 use generic asm-offsets.h support Delete obsolete stuff from arch makefiles Rename .h file to asm-offsets.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2005-09-09 21:28:48 +02:00
Len Brown	64e47488c9	Merge linux-2.6 with linux-acpi-2.6	2005-09-08 01:45:47 -04:00
Stephen Rothwell	5ac353f9ba	[PATCH] Clean up struct flock definitions This patch just gathers together all the struct flock definitions except xtensa into asm-generic/fcntl.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:38 -07:00
Stephen Rothwell	1abf62afb6	[PATCH] Clean up the fcntl operations This patch puts the most popular of each fcntl operation/flag into asm-generic/fcntl.h and cleans up the arch files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:38 -07:00
Stephen Rothwell	e64ca97fd8	[PATCH] Clean up the open flags This patch puts the most popular of each open flag into asm-generic/fcntl.h and cleans up the arch files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:38 -07:00
Stephen Rothwell	9317259ead	[PATCH] Create asm-generic/fcntl.h This set of patches creates asm-generic/fcntl.h and consolidates as much as possible from the asm-/fcntl.h files into it. This patch just gathers all the identical bits of the asm-/fcntl.h files into asm-generic/fcntl.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:37 -07:00
Jesper Juhl	97de50c0ad	[PATCH] remove verify_area(): remove verify_area() from various uaccess.h headers Remove the deprecated (and unused) verify_area() from various uaccess.h headers. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:35 -07:00
Christoph Hellwig	c8d127418d	[PATCH] remove asm-*/hdreg.h unused and useless.. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:30 -07:00
H. J. Lu	36d57ac4a8	[PATCH] auxiliary vector cleanups The size of auxiliary vector is fixed at 42 in linux/sched.h. But it isn't very obvious when looking at linux/elf.h. This patch adds AT_VECTOR_SIZE so that we can change it if necessary when a new vector is added. Because of include file ordering problems, doing this necessitated the extraction of the AT_* symbols into a standalone header file. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:21 -07:00
Stephen Rothwell	202e5979af	[PATCH] compat: be more consistent about [ug]id_t When I first wrote the compat layer patches, I was somewhat cavalier about the definition of compat_uid_t and compat_gid_t (or maybe I just misunderstood :-)). This patch makes the compat types much more consistent with the types we are being compatible with and hopefully will fix a few bugs along the way. compat type type in compat arch __compat_[ug]id_t __kernel_[ug]id_t __compat_[ug]id32_t __kernel_[ug]id32_t compat_[ug]id_t [ug]id_t The difference is that compat_uid_t is always 32 bits (for the archs we care about) but __compat_uid_t may be 16 bits on some. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:19 -07:00
Jakub Jelinek	4732efbeb9	[PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter (which at least on UP usually means an immediate context switch to one of the waiter threads). This waiter wakes up and after a few instructions it attempts to acquire the cv internal lock, but that lock is still held by the thread calling pthread_cond_signal. So it goes to sleep and eventually the signalling thread is scheduled in, unlocks the internal lock and wakes the waiter again. Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal to avoid this performance issue, but it was removed when locks were redesigned to the 3 state scheme (unlocked, locked uncontended, locked contended). Following scenario shows why simply using FUTEX_REQUEUE in pthread_cond_signal together with using lll_mutex_unlock_force in place of lll_mutex_unlock is not enough and probably why it has been disabled at that time: The number is value in cv->__data.__lock. thr1 thr2 thr3 0 pthread_cond_wait 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) 0 lll_futex_wait (&cv->__data.__futex, futexval) 0 pthread_cond_signal 1 lll_mutex_lock (cv->__data.__lock) 1 pthread_cond_signal 2 lll_mutex_lock (cv->__data.__lock) 2 lll_futex_wait (&cv->__data.__lock, 2) 2 lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock) # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE 2 lll_mutex_unlock_force (cv->__data.__lock) 0 cv->__data.__lock = 0 0 lll_futex_wake (&cv->__data.__lock, 1) 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) # Here, lll_mutex_unlock doesn't know there are threads waiting # on the internal cv's lock Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal, but it will cost us not one, but 2 extra syscalls and, what's worse, one of these extra syscalls will be done for every single waiting loop in pthread_cond_wait. We would need to use lll_mutex_unlock_force in pthread_cond_signal after requeue and lll_mutex_cond_lock in pthread_cond_wait after lll_futex_wait. Another alternative is to do the unlocking pthread_cond_signal needs to do (the lock can't be unlocked before lll_futex_wake, as that is racy) in the kernel. I have implemented both variants, futex-requeue-glibc.patch is the first one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel. The kernel interface allows userland to specify how exactly an unlocking operation should look like (some atomic arithmetic operation with optional constant argument and comparison of the previous futex value with another constant). It has been implemented just for ppc*, x86_64 and i?86, for other architectures I'm including just a stub header which can be used as a starting point by maintainers to write support for their arches and ATM will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been (lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running 32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL. With the following benchmark on UP x86-64 I get: for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \ for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done time elf/ld.so --library-path .:nptl-orig /tmp/bench real 0m0.655s user 0m0.253s sys 0m0.403s real 0m0.657s user 0m0.269s sys 0m0.388s time elf/ld.so --library-path .:nptl-requeue /tmp/bench real 0m0.496s user 0m0.225s sys 0m0.271s real 0m0.531s user 0m0.242s sys 0m0.288s time elf/ld.so --library-path .:nptl-wake_op /tmp/bench real 0m0.380s user 0m0.176s sys 0m0.204s real 0m0.382s user 0m0.175s sys 0m0.207s The benchmark is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt Older futex-requeue-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt Older futex-wake_op-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt Will post a new version (just x86-64 fixes so that the patch applies against pthread_cond_signal.S) to libc-hacker ml soon. Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded testcase that will not test the atomicity of the operation, but at least check if the threads that should have been woken up are woken up and whether the arithmetic operation in the kernel gave the expected results. Acked-by: Ingo Molnar <mingo@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Jamie Lokier <jamie@shareable.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:17 -07:00
Eric Dumazet	19aaabb584	[PATCH] x86_64: prefetchw() can fall back to prefetch() if !3DNOW This is a multi-part message in MIME format. If the cpu lacks 3DNOW feature, we can use a normal prefetcht0 instruction instead of NOP5. "prefetchw (%rxx)" and "prefetcht0 (%rxx)" have the same length, ranging from 3 to 5 bytes depending on the register. So this patch even helps AMD64, shortening the length of the code. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:15 -07:00
Zachary Amsden	245067d167	[PATCH] i386: cleanup serialize msr i386 arch cleanup. Introduce the serialize macro to serialize processor state. Why the microcode update needs it I am not quite sure, since wrmsr() is already a serializing instruction, but it is a microcode update, so I will keep the semantic the same, since this could be a timing workaround. As far as I can tell, this has always been there since the original microcode update source. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:11 -07:00
Zachary Amsden	61e06037e7	[PATCH] x86_64: avoid some atomic operations during address space destruction Any architecture that has hardware updated A/D bits that require synchronization against other processors during PTE operations can benefit from doing non-atomic PTE updates during address space destruction. Originally done on i386, now ported to x86_64. Doing a read/write pair instead of an xchg() operation saves the implicit lock, which turns out to be a big win on 32-bit (esp w PAE). Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:48 -07:00
Kyle Moffett	fa5b08d5f8	[PATCH] sab: consolidate kmem_bufctl_t This is used only in slab.c and each architecture gets to define whcih underlying type is to be used. Seems a bit silly - move it to slab.c and use the same type for all architectures: unsigned int. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:48 -07:00
Chen, Kenneth W	0e5c9f39f6	[PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc() I don't think we need to call hugetlb_clean_stale_pgtable() anymore in 2.6.13 because of the rework with free_pgtables(). It now collect all the pte page at the time of munmap. It used to only collect page table pages when entire one pgd can be freed and left with staled pte pages. Not anymore with 2.6.13. This function will never be called and We should turn it into a BUG_ON. I also spotted two problems here, not Adam's fault :-) (1) in huge_pte_alloc(), it looks like a bug to me that pud is not checked before calling pmd_alloc() (2) in hugetlb_clean_stale_pgtable(), it also missed a call to pmd_free_tlb. I think a tlb flush is required to flush the mapping for the page table itself when we clear out the pmd pointing to a pte page. However, since hugetlb_clean_stale_pgtable() is never called, so it won't trigger the bug. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:46 -07:00
Adam Litke	32e51a8c97	[PATCH] hugetlb: add pte_huge() macro This patch adds a macro pte_huge(pte) for i386/x86_64 which is needed by a patch later in the series. Instead of repeating (_PAGE_PRESENT \| _PAGE_PSE), I've added __LARGE_PTE to i386 to match x86_64. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: <linux-mm@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:46 -07:00
Paolo 'Blaisorblade' Giarrusso	9b4ee40ebb	[PATCH] mm: correct _PAGE_FILE comment _PAGE_FILE does not indicate whether a file is in page / swap cache, it is set just for non-linear PTE's. Correct the comment for i386, x86_64, UML. Also clearify _PAGE_NONE. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:45 -07:00
Stephen Rothwell	fd4fd5aac1	[PATCH] mm: consolidate get_order Someone mentioned that almost all the architectures used basically the same implementation of get_order. This patch consolidates them into asm-generic/page.h and includes that in the appropriate places. The exceptions are ia64 and ppc which have their own (presumably optimised) versions. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:39 -07:00
Len Brown	129521dcc9	Merge linux-2.6 into linux-acpi-2.6 test	2005-09-03 02:44:09 -04:00
Thomas Graf	2c656491e9	[NET]: Fix ipl=>ihl typo in ip_fast_csum Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:02:48 -07:00
Patrick McHardy	b0573dea1f	[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options Allows overriding of sysctl_{wmem,rmrm}_max Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:31:35 -07:00
Len Brown	27a639a92d	Auto-update from upstream	2005-08-29 17:02:17 -04:00
Andi Kleen	485761bd6a	[PATCH] x86_64: Tell VM about holes in nodes Some nodes can have large holes on x86-64. This fixes problems with the VM allowing too many dirty pages because it overestimates the number of available RAM in a node. In extreme cases you can end up with all RAM filled with dirty pages which can lead to deadlocks and other nasty behaviour. This patch just tells the VM about the known holes from e820. Reserved (like the kernel text or mem_map) is still not taken into account, but that should be only a few percent error now. Small detail is that the flat setup uses the NUMA free_area_init_node() now too because it offers more flexibility. (akpm: lotsa thanks to Martin for working this problem out) Cc: Martin Bligh <mbligh@mbligh.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-26 19:37:12 -07:00
Len Brown	6153df7b2f	[ACPI] delete CONFIG_ACPI_PCI Delete the ability to build an ACPI kernel that does not include PCI support. When such a machine is created and it requires a tuned kernel, send a patch. http://bugzilla.kernel.org/show_bug.cgi?id=1364 Signed-off-by: Len Brown <len.brown@intel.com>	2005-08-25 12:40:44 -04:00
Len Brown	888ba6c62b	[ACPI] delete CONFIG_ACPI_BOOT it has been a synonym for CONFIG_ACPI since 2.6.12 Signed-off-by: Len Brown <len.brown@intel.com>	2005-08-24 12:08:54 -04:00
Zachary Amsden	12aaa0855b	[PATCH] i386 / desc_empty macro is incorrect Chuck Ebbert noticed that the desc_empty macro is incorrect. Fix it. Thankfully, this is not used as a security check, but it can falsely overwrite TLS segments with carefully chosen base / limits. I do not believe this is an issue in practice, but it is a kernel bug. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Chris Wright <chrisw@osdl.org> [ x86-64 had the same problem, and the same fix. Linus ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-16 12:18:01 -07:00
Linus Torvalds	2ba84684e8	Revert PCIBIOS_MIN_IO changes for 2.6.13 This reverts commits `71db63acff` [PATCH] increase PCIBIOS_MIN_IO on x86 and `0b2bfb4e7f` ACPI: increase PCIBIOS_MIN_IO on x86 since Lukas Sandströ<lukass@etek.chalmers.se> reports that this breaks his on-board nvidia audio. We should re-visit this later. For now we revert the change Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-14 18:21:30 -07:00
Ivan Kokshaysky	71db63acff	[PATCH] increase PCIBIOS_MIN_IO on x86 There is a number of x86 laptops that have some non-PCI IO ports in the 0x1000-0x1fff range, and it's quite hard to control the correct order of resource allocation between PCI and other subsystems controlling these ports. Especially with modular kernel. So just increase PCIBIOS_MIN_IO to 0x4000 to prevent any new PCI resource allocations in the problematic range (this limitation must apply _only_ to the root bus resources - see Linus' change in pci_bus_alloc_resource). As PCIBIOS_MIN_IO and PCIBIOS_MIN_CARDBUS_IO are the same now on i386 and x86-64, we can remove the latter. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-02 18:21:25 -07:00
Eric W. Biederman	3d483f4757	[PATCH] Fix sync_tsc hang sync_tsc was using smp_call_function to ask the boot processor to report it's tsc value. smp_call_function performs an IPI_send_allbutself which is a broadcast ipi. There is a window during processor startup during which the target cpu has started and before it has initialized it's interrupt vectors so it can properly process an interrupt. Receveing an interrupt during that window will triple fault the cpu and do other nasty things. Why cli does not protect us from that is beyond me. The simple fix is to match ia64 and provide a smp_call_function_single. Which avoids the broadcast and is more efficient. This certainly fixes the problem of getting stuck on boot which was very easy to trigger on my SMP Hyperthreaded Xeon, and I think it fixes it for the right reasons. Minor changes by AK Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 15:01:13 -07:00
Eric W. Biederman	8bf2755664	[PATCH] x86_64 machine_kexec: Use standard pagetable helpers Use the standard hardware page table manipulation macros. This is possible now that linux works with all 4 levels of the page tables. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 13:12:49 -07:00
Jesse Millan	8d224d32c2	[PATCH] x86_64: Fix gcc 4 warning in sched_find_first_bit This patch eliminates the GCC4 warning on the x86_64 platform: kernel/sched.c:1824: warning: control may reach end of non-void function 'sched_find_first_bit' being inlined. The change follows the lead of others, i.e. it is guaranteed that at least one of b[0], b[1], or b[2] will have a bit set and evaluate to true. That being said, GCC4.0.0 notices that the code flow does not return anything if b[0], b[1] and b[2] are not true. Since we know better, if it's not b[0] or b[1], it has to be b[2]. Signed-off-by: Jesse Millan <jessem@cs.pdx.edu> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:02 -07:00
Andi Kleen	ed6b676ca8	[PATCH] x86_64: Switch to the interrupt stack when running a softirq in local_bh_enable() This avoids some potential stack overflows with very deep softirq callchains. i386 does this too. TOADD CFI annotation Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:02 -07:00
Andi Kleen	b6a68a16dc	[PATCH] x86_64: Turn BUG data into valid instruction This avoids confusing the disassembler. Costs 2 bytes per BUG. Thanks to Suresh Siddha and Jan Beulich for suggesting suitable instructions. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:01 -07:00
Andi Kleen	17158d17aa	[PATCH] x86_64: Fix incorrectly defined MSR_K8_SYSCFG Harmless because the kernel didn't use it. Noticed by Travis Betak Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:00 -07:00
Andi Kleen	ca4e6b7402	[PATCH] x86_64: Fix some typos in system.h comments Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:00 -07:00
Andi Kleen	6391ad0aa4	[PATCH] x86_64: Remove obsolete eat_key prototype Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:45:59 -07:00
Andi Kleen	d970a52180	[PATCH] x86_64: Fix some comments in tlbflush.h Were either outdated or misleading. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:45:59 -07:00
Andi Kleen	a940199f20	[PATCH] x86_64: Some cleanup in setup64.c Minor cleanup. Move things into their include files, remove obsolete includes, fix indentation, remove obsolete special cases etc. I also added the per cpu section to asm-generic/sections.h and fixed init/main.c to use it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:45:58 -07:00
Andi Kleen	5b943fbfaf	[PATCH] x86_64: i386/x86_64: remove prototypes for not existing functions in smp.h Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:45:58 -07:00
Andi Kleen	74f0629397	[PATCH] x86_64: Use for_each_cpu_mask for clustered IPI flush Makes it slightly more efficient. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:45:57 -07:00
Eric W. Biederman	62b3a04d75	[PATCH] x86_64: Implemenent machine_emergency_restart It is not safe to call set_cpus_allowed() in interrupt context and disabling the apics is complicated code. So unconditionally skip machine_shutdown in machine_emergency_reboot on x86_64. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:42 -07:00
Eric W. Biederman	7c9034735e	[PATCH] Add emergency_restart() When the kernel is working well and we want to restart cleanly kernel_restart is the function to use. But in many instances the kernel wants to reboot when thing are expected to be working very badly such as from panic or a software watchdog handler. This patch adds the function emergency_restart() so that callers can be clear what semantics they expect when calling restart. emergency_restart() is expected to be callable from interrupt context and possibly reliable in even more trying circumstances. This is an initial generic implementation for all architectures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 14:35:41 -07:00
Robert Love	725b38ab54	[PATCH] inotify: add x86-64 syscall entries Add inotify syscall entries to x86-64. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-26 13:37:22 -07:00
Len Brown	5028770a42	[ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc... Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 17:21:56 -04:00
Venkatesh Pallipadi	02df8b9385	[ACPI] enable C2 and C3 idle power states on SMP http://bugzilla.kernel.org/show_bug.cgi?id=4401 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 00:14:36 -04:00
David Shaohua Li	c9c3e457de	[ACPI] PNPACPI vs sound IRQ http://bugme.osdl.org/show_bug.cgi?id=4016 Written-by: David Shaohua Li <shaohua.li@intel.com> Acked-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-12 00:03:30 -04:00
Shaohua Li	3b520b238e	[PATCH] MTRR suspend/resume cleanup There has been some discuss about solving the SMP MTRR suspend/resume breakage, but I didn't find a patch for it. This is an intent for it. The basic idea is moving mtrr initializing into cpu_identify for all APs (so it works for cpu hotplug). For BP, restore_processor_state is responsible for restoring MTRR. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:42 -07:00
Ingo Molnar	306e440daf	[PATCH] x86: i8253/i8259A lock cleanup Introduce proper declarations for i8253_lock and i8259A_lock. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-30 08:45:10 -07:00
Russell King	026d02a236	[PATCH] Serial: Split 8250 port table (part 2) Remove legacy ISA serial ports for Accent, Boca, Fourport, Hub6 and MCA from the architecture specific serial.h include. The only ports which remain in asm-*/serial.h are the platform specific entries. These should really be converted by platform maintainers to use a platform device, such as can be found in arch/arm/mach-footbridge/isa.c Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2005-06-29 18:45:19 +01:00
Greg KH	8644d2a42b	Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2005-06-27 22:07:56 -07:00
Andrew Morton	bb4a61b6ea	[PATCH] PCI: fix up errors after dma bursting patch and CONFIG_PCI=n With CONFIG_PCI=n: In file included from include/linux/pci.h:917, from lib/iomap.c:6: include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want. include/asm/pci.h: In function `pci_dma_burst_advice': include/asm/pci.h:106: dereferencing pointer to incomplete type include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function) include/asm/pci.h:106: (Each undeclared identifier is reported only once include/asm/pci.h:106: for each function it appears in.) make[1]: *** [lib/iomap.o] Error 1 Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2005-06-27 21:52:46 -07:00
David S. Miller	e24c2d963a	[PATCH] PCI: DMA bursting advice After seeing, at best, "guesses" as to the following kind of information in several drivers, I decided that we really need a way for platforms to specifically give advice in this area for what works best with their PCI controller implementation. Basically, this new interface gives DMA bursting advice on PCI. There are three forms of the advice: 1) Burst as much as possible, it is not necessary to end bursts on some particular boundary for best performance. 2) Burst on some byte count multiple. A DMA burst to some multiple of number of bytes may be done, but it is important to end the burst on an exact multiple for best performance. The best example of this I am aware of are the PPC64 PCI controllers, where if you end a burst mid-cacheline then chip has to refetch the data and the IOMMU translations which hurts performance a lot. 3) Burst on a single byte count multiple. Bursts shall end exactly on the next multiple boundary for best performance. Sparc64 and Alpha's PCI controllers operate this way. They disconnect any device which tries to burst across a cacheline boundary. Actually, newer sparc64 PCI controllers do not have this behavior. That is why the "pdev" is passed into the interface, so I can add code later to check which PCI controller the system is using and give advice accordingly. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2005-06-27 21:52:45 -07:00
Andrea Arcangeli	ffaa8bd6c9	[PATCH] seccomp: tsc disable I believe at least for seccomp it's worth to turn off the tsc, not just for HT but for the L2 cache too. So it's up to you, either you turn it off completely (which isn't very nice IMHO) or I recommend to apply this below patch. This has been tested successfully on x86-64 against current cogito repository (i686 compiles so I didn't bother testing ;). People selling the cpu through cpushare may appreciate this bit for a peace of mind. There's no way to get any timing info anymore with this applied (gettimeofday is forbidden of course). The seccomp environment is completely deterministic so it can't be allowed to get timing info, it has to be deterministic so in the future I can enable a computing mode that does a parallel computing for each task with server side transparent checkpointing and verification that the output is the same from all the 2/3 seller computers for each task, without the buyer even noticing (for now the verification is left to the buyer client side and there's no checkpointing, since that would require more kernel changes to track the dirty bits but it'll be easy to extend once the basic mode is finished). Eliminating a cold-cache read of the cr4 global variable will save one cacheline during the tlb flush while making the code per-cpu-safe at the same time. Thanks to Mikael Pettersson for noticing the tlb flush wasn't per-cpu-safe. The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but it'll be transparent to the switch_to code since the IPI won't make any change to the cr4 contents from the point of view of the interrupted code and since it's now all per-cpu stuff, it will not race. So no need to disable irqs in switch_to slow path. Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:11:44 -07:00
Jens Axboe	22e2c507c3	[PATCH] Update cfq io scheduler to time sliced design This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 14:33:29 -07:00
Vivek Goyal	625f1c8219	[PATCH] Kdump: Export crash notes section address through sysfs o Following patch exports kexec global variable "crash_notes" to user space through sysfs as kernel attribute in /sys/kernel. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:51 -07:00
Eric W. Biederman	5234f5eb04	[PATCH] kexec: x86_64 kexec implementation This is the x86_64 implementation of machine kexec. 32bit compatibility support has been implemented, and machine_kexec has been enhanced to not care about the changing internal kernel paget table structures. From: Alexander Nyberg <alexn@dsv.su.se> build fix Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:50 -07:00
Eric W. Biederman	d0537508a9	[PATCH] kexec: x86_64: add CONFIG_PHYSICAL_START For one kernel to report a crash another kernel has created we need to have 2 kernels loaded simultaneously in memory. To accomplish this the two kernels need to built to run at different physical addresses. This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel so we can do just that. You need to know what you are doing and the ramifications are before changing this value, and most users won't care so I have made it depend on CONFIG_EMBEDDED bzImage kernels will work and run at a different address when compiled with this option but they will still load at 1MB. If you need a kernel loaded at a different address as well you need to boot a vmlinux. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:48 -07:00
Eric W. Biederman	208fb93162	[PATCH] kexec: x86_64: restore apic virtual wire mode on shutdown When coming out of apic mode attempt to set the appropriate apic back into virtual wire mode. This improves on previous versions of this patch by by never setting bot the local apic and the ioapic into veritual wire mode. This code looks at data from the mptable to see if an ioapic has an ExtInt input to make this decision. A future improvement is to figure out which apic or ioapic was in virtual wire mode at boot time and to remember it. That is potentially a more accurate method, of selecting which apic to place in virutal wire mode. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:47 -07:00
Eric W. Biederman	8f43d03fe2	[PATCH] kexec: x86: rename APIC_MODE_EXINT From: "Maciej W. Rozycki" <macro@linux-mips.org> Rename APIC_MODE_EXINT to APIC_MODE_EXTINT - I think it should be named after what the mode is called in documentation. From: "Eric W. Biederman" <ebiederm@lnxi.com> I have reduced this patch to just the name change in the header. And integrated the changes into the patches that add those lines. Otherwise I ran into some ugly dependencies. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:46 -07:00
Nick Piggin	687f1661d3	[PATCH] sched: sched tuning Do some basic initial tuning. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:42 -07:00
Nick Piggin	147cbb4bbe	[PATCH] sched: balance on fork Reimplement the balance on exec balancing to be sched-domains aware. Use this to also do balance on fork balancing. Make x86_64 do balance on fork over the NUMA domain. The problem that the non sched domains aware blancing became apparent on dual core, multi socket opterons. What we want is for the new tasks to be sent to a different socket, but more often than not, we would first load up our sibling core, or fill two cores of a single remote socket before selecting a new one. This gives large improvements to STREAM on such systems. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:42 -07:00
Nick Piggin	cafb20c1f9	[PATCH] sched: no aggressive idle balancing Remove the very aggressive idle stuff that has recently gone into 2.6 - it is going against the direction we are trying to go. Hopefully we can regain performance through other methods. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:42 -07:00
Nick Piggin	7897986bad	[PATCH] sched: balance timers Do CPU load averaging over a number of different intervals. Allow each interval to be chosen by sending a parameter to source_load and target_load. 0 is instantaneous, idx > 0 returns a decaying average with the most recent sample weighted at 2^(idx-1). To a maximum of 3 (could be easily increased). So generally a higher number will result in more conservative balancing. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:41 -07:00
Paul E. McKenney	b2b1866006	[PATCH] RCU: clean up a few remaining synchronize_kernel() calls 2.6.12-rc6-mm1 has a few remaining synchronize_kernel()s, some (but not all) in comments. This patch changes these synchronize_kernel() calls (and comments) to synchronize_rcu() or synchronize_sched() as follows: - arch/x86_64/kernel/mce.c mce_read(): change to synchronize_sched() to handle races with machine-check exceptions (synchronize_rcu() would not cut it given RCU implementations intended for hardcore realtime use. - drivers/input/serio/i8042.c i8042_stop(): change to synchronize_sched() to handle races with i8042_interrupt() interrupt handler. Again, synchronize_rcu() would not cut it given RCU implementations intended for hardcore realtime use. - include/*/kdebug.h comments: change to synchronize_sched() to handle races with NMIs. As before, synchronize_rcu() would not cut it... - include/linux/list.h comment: change to synchronize_rcu(), since this comment is for list_del_rcu(). - security/keys/key.c unregister_key_type(): change to synchronize_rcu(), since this is interacting with RCU read side. - security/keys/process_keys.c install_session_keyring(): change to synchronize_rcu(), since this is interacting with RCU read side. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:38 -07:00
Pavel Machek	8d783b3e02	[PATCH] swsusp: clean assembly parts This patch fixes register saving so that each register is only saved once, and adds missing saving of %cr8 on x86-64. Some reordering so that save/restore is more logical/safer (segment registers should be restored after gdt). Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:33 -07:00
Ashok Raj	884d9e40b4	[PATCH] x86_64: Dont use broadcast shortcut to make it cpu hotplug safe. Broadcast IPI's provide un-expected behaviour for cpu hotplug. CPU's in offline state also end up receiving the IPI. Once the cpus become online they receive these stale IPI's which are bad and introduce unexpected behaviour. This is easily avoided by not sending a broadcast and addressing just the CPU's in online map. Doing prelim cycle counts it appears there is no big overhead and numbers seem around 0x3000-0x3900 on an average on x86 and x86_64 systems with CPUS running 3G, both for broadcast and mask version of the API's. The shortcuts are useful only for flat mode (where the perf shows no degradation), and in cluster mode, its unicast anyway. Its simpler to just not use broadcast anymore. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:31 -07:00
Ashok Raj	76e4f660d9	[PATCH] x86_64: CPU hotplug support Experimental CPU hotplug patch for x86_64 ----------------------------------------- This supports logical CPU online and offline. - Test with maxcpus=1, and then kick other cpu's off to test if init code is all cleaned up. CONFIG_SCHED_SMT works as well. - idle threads are forked on demand from keventd threads for clean startup TBD: 1. Not tested on a real NUMA machine (tested with numa=fake=2) 2. Handle ACPI pieces for physical hotplug support. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Shaohua.li<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:30 -07:00
Stephen Rothwell	0d77e5a2c2	[PATCH] compat: introduce compat_time_t This patch is based on work by Carlos O'Donell and Matthew Wilcox. It introduces/updates the compat_time_t type and uses it for compat siginfo structures. I have built this on ppc64 and x86_64. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:32 -07:00
Jan Beulich	11c80c8367	[PATCH] adjust per_cpu definition in non-SMP case Fix (in the architectures I'm actually building for) the UP definition of per_cpu so that the cpu specified may be any expression, not just an identifier or a suffix expression. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:28 -07:00
Rusty Lynch	73649dab0f	[PATCH] x86_64 specific function return probes The following patch adds the x86_64 architecture specific implementation for function return probes. Function return probes is a mechanism built on top of kprobes that allows a caller to register a handler to be called when a given function exits. For example, to instrument the return path of sys_mkdir: static int sys_mkdir_exit(struct kretprobe_instance i, struct pt_regs regs) { printk("sys_mkdir exited\n"); return 0; } static struct kretprobe return_probe = { .handler = sys_mkdir_exit, }; <inside setup function> return_probe.kp.addr = (kprobe_opcode_t ) kallsyms_lookup_name("sys_mkdir"); if (register_kretprobe(&return_probe)) { printk(KERN_DEBUG "Unable to register return probe!\n"); / do error path / } <inside cleanup function> unregister_kretprobe(&return_probe); The way this works is that: At system initialization time, kernel/kprobes.c installs a kprobe on a function called kretprobe_trampoline() that is implemented in the arch/x86_64/kernel/kprobes.c (More on this later) * When a return probe is registered using register_kretprobe(), kernel/kprobes.c will install a kprobe on the first instruction of the targeted function with the pre handler set to arch_prepare_kretprobe() which is implemented in arch/x86_64/kernel/kprobes.c. * arch_prepare_kretprobe() will prepare a kretprobe instance that stores: - nodes for hanging this instance in an empty or free list - a pointer to the return probe - the original return address - a pointer to the stack address With all this stowed away, arch_prepare_kretprobe() then sets the return address for the targeted function to a special trampoline function called kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c * The kprobe completes as normal, with control passing back to the target function that executes as normal, and eventually returns to our trampoline function. * Since a kprobe was installed on kretprobe_trampoline() during system initialization, control passes back to kprobes via the architecture specific function trampoline_probe_handler() which will lookup the instance in an hlist maintained by kernel/kprobes.c, and then call the handler function. * When trampoline_probe_handler() is done, the kprobes infrastructure single steps the original instruction (in this case just a top), and then calls trampoline_post_handler(). trampoline_post_handler() then looks up the instance again, puts the instance back on the free list, and then makes a long jump back to the original return instruction. So to recap, to instrument the exit path of a function this implementation will cause four interruptions: - A breakpoint at the very beginning of the function allowing us to switch out the return address - A single step interruption to execute the original instruction that we replaced with the break instruction (normal kprobe flow) - A breakpoint in the trampoline function where our instrumented function returned to - A single step interruption to execute the original instruction that we replaced with the break instruction (normal kprobe flow) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:21 -07:00
Jesper Juhl	dcd497f99a	[PATCH] streamline preempt_count type across archs The preempt_count member of struct thread_info is currently either defined as int, unsigned int or __s32 depending on arch. This patch makes the type of preempt_count an int on all archs. Having preempt_count be an unsigned type prevents the catching of preempt_count < 0 bugs, and using int on some archs and __s32 on others is not exactely "neat" - much nicer when it's just int all over. A previous version of this patch was already ACK'ed by Robert Love, and the only change in this version of the patch compared to the one he ACK'ed is that this one also makes sure the preempt_count member is consistently commented. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:19 -07:00
Vincent Hanquez	e9129e56e9	[PATCH] xen: x86_64: Add macro for debugreg Add 2 macros to set and get debugreg on x86_64. This is useful for Xen because it will need only to redefine each macro to a hypervisor call. Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:14 -07:00
Vincent Hanquez	fa1e1bdf78	[PATCH] xen: x86: Rename usermode macro Rename user_mode to user_mode_vm and add a user_mode macro similar to the x86-64 one. This is useful for Xen because the linux xen kernel does not runs on the same priviledge that a vanilla linux kernel, and with this we just need to redefine user_mode(). Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:14 -07:00
Jan Beulich	32ecd42b6f	[PATCH] eliminate duplicate rdpmc definition Eliminate duplicate definition of rdpmc in x86-64's mtrr.h. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:13 -07:00
Christoph Lameter	5912100372	[PATCH] i386: Selectable Frequency of the Timer Interrupt Make the timer frequency selectable. The timer interrupt may cause bus and memory contention in large NUMA systems since the interrupt occurs on each processor HZ times per second. Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:10 -07:00
Christoph Lameter	8c5a09082f	[PATCH] x86/x86_64: pcibus_to_node Define pcibus_to_node to be able to figure out which NUMA node contains a given PCI device. This defines pcibus_to_node(bus) in include/linux/topology.h and adjusts the macros for i386 and x86_64 that already provided a way to determine the cpumask of a pci device. x86_64 was changed to not build an array of cpumasks anymore. Instead an array of nodes is build which can be used to generate the cpumask via node_to_cpumask. Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:08 -07:00
Venkatesh Pallipadi	8a9e1b0f56	[PATCH] Platform SMIs and their interferance with tsc based delay calibration Issue: Current tsc based delay_calibration can result in significant errors in loops_per_jiffy count when the platform events like SMIs (System Management Interrupts that are non-maskable) are present. This could lead to potential kernel panic(). This issue is becoming more visible with 2.6 kernel (as default HZ is 1000) and on platforms with higher SMI handling latencies. During the boot time, SMIs are mostly used by BIOS (for things like legacy keyboard emulation). Description: The psuedocode for current delay calibration with tsc based delay looks like (0) Estimate a value for loops_per_jiffy (1) While (loops_per_jiffy estimate is accurate enough) (2) wait for jiffy transition (jiffy1) (3) Note down current tsc (tsc1) (4) loop until tsc becomes tsc1 + loops_per_jiffy (5) check whether jiffy changed since jiffy1 or not and refine loops_per_jiffy estimate Consider the following cases Case 1: If SMIs happen between (2) and (3) above, we can end up with a loops_per_jiffy value that is too low. This results in shorted delays and kernel can panic () during boot (Mostly at IOAPIC timer initialization timer_irq_works() as we don't have enough timer interrupts in a specified interval). Case 2: If SMIs happen between (3) and (4) above, then we can end up with a loops_per_jiffy value that is too high. And with current i386 code, too high lpj value (greater than 17M) can result in a overflow in delay.c:__const_udelay() again resulting in shorter delay and panic(). Solution: The patch below makes the calibration routine aware of asynchronous events like SMIs. We increase the delay calibration time and also identify any significant errors (greater than 12.5%) in the calibration and notify it to user. Patch below changes both i386 and x86-64 architectures to use this new and improved calibrate_delay_direct() routine. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:08 -07:00
Matt Tolentino	bbfceef47f	[PATCH] add x86-64 specific support for sparsemem This patch adds in the necessary support for sparsemem such that x86-64 kernels may use sparsemem as an alternative to discontigmem for NUMA kernels. Note that this does no preclude one from continuing to build NUMA kernels using discontigmem, but merely allows the option to build NUMA kernels with sparsemem. Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels results in reduced text size for otherwise equivalent kernels as shown in the example builds below: text data bss dec hex filename 2371036 765884 1237108 4374028 42be0c vmlinux.discontig 2366549 776484 1302772 4445805 43d66d vmlinux.sparse Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:07 -07:00
Matt Tolentino	2b97690f4c	[PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options In order to use the alternative sparsemem implmentation for NUMA kernels, we need to reorganize the config options. This patch effectively abstracts out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases. Thus, the discontigmem implementation may be employed as always, but the sparsemem implementation may be used alternatively. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:06 -07:00

1 2 3 4 5 ...

286 Commits