kernel-ark

Author	SHA1	Message	Date
David Gibson	8e561e7eda	[POWERPC] Kill typedef-ed structs for hash PTEs and BATs Using typedefs to rename structure types if frowned on by CodingStyle. However, we do so for the hash PTE structure on both ppc32 (where it's called "PTE") and ppc64 (where it's called "hpte_t"). On ppc32 we also have such a typedef for the BATs ("BAT"). This removes this unhelpful use of typedefs, in the process bringing ppc32 and ppc64 closer together, by using the name "struct hash_pte" in both cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:16 +10:00
David Gibson	c0770f686c	[POWERPC] Remove a couple of unused definitions from pgtable_32.c In arch/powerpc/mm/pgtable_32.c, the variable io_bat_index and the macro is_power_of_4() no longer have any users. This removes them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
David Gibson	f21f49ea63	[POWERPC] Remove the dregs of APUS support from arch/powerpc APUS (the Amiga Power-Up System) is not supported under arch/powerpc and it's unlikely it ever will be. Therefore, this patch removes the fragments of APUS support code from arch/powerpc which have been copied from arch/ppc. A few APUS references are left in asm-powerpc in .h files which are still used from arch/ppc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
David Gibson	90ac19a8b2	[POWERPC] Abolish iopa(), mm_ptov(), io_block_mapping() from arch/powerpc These old-fashioned IO mapping functions no longer have any callers in code which remains relevant on arch/powerpc. Therefore, this removes them from arch/powerpc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:30:15 +10:00
will schmidt	effe24bdd4	[POWERPC] During VM oom condition, kill all threads in process group We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. lightly tested on powerpc Signed-off-by: Will <will_schmidt@vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:59 +10:00
Benjamin Herrenschmidt	3d5134ee83	[POWERPC] Rewrite IO allocation & mapping on powerpc64 This rewrites pretty much from scratch the handling of MMIO and PIO space allocations on powerpc64. The main goals are: - Get rid of imalloc and use more common code where possible - Simplify the current mess so that PIO space is allocated and mapped in a single place for PCI bridges - Handle allocation constraints of PIO for all bridges including hot plugged ones within the 2GB space reserved for IO ports, so that devices on hotplugged busses will now work with drivers that assume IO ports fit in an int. - Cleanup and separate tracking of the ISA space in the reserved low 64K of IO space. No ISA -> Nothing mapped there. I booted a cell blade with IDE on PIO and MMIO and a dual G5 so far, that's it :-) With this patch, all allocations are done using the code in mm/vmalloc.c, though we use the low level __get_vm_area with explicit start/stop constraints in order to manage separate areas for vmalloc/vmap, ioremap, and PCI IOs. This greatly simplifies a lot of things, as you can see in the diffstat of that patch :-) A new pair of functions pcibios_map/unmap_io_space() now replace all of the previous code that used to manipulate PCI IOs space. The allocation is done at mapping time, which is now called from scan_phb's, just before the devices are probed (instead of after, which is by itself a bug fix). The only other caller is the PCI hotplug code for hot adding PCI-PCI bridges (slots). imalloc is gone, as is the "sub-allocation" thing, but I do beleive that hotplug should still work in the sense that the space allocation is always done by the PHB, but if you unmap a child bus of this PHB (which seems to be possible), then the code should properly tear down all the HPTE mappings for that area of the PHB allocated IO space. I now always reserve the first 64K of IO space for the bridge with the ISA bus on it. I have moved the code for tracking ISA in a separate file which should also make it smarter if we ever are capable of hot unplugging or re-plugging an ISA bridge. This should have a side effect on platforms like powermac where VGA IOs will no longer work. This is done on purpose though as they would have worked semi-randomly before. The idea at this point is to isolate drivers that might need to access those and fix them by providing a proper function to obtain an offset to the legacy IOs of a given bus. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:56 +10:00
Benjamin Herrenschmidt	c19c03fc74	[POWERPC] unmap_vm_area becomes unmap_kernel_range for the public This makes unmap_vm_area static and a wrapper around a new exported unmap_kernel_range that takes an explicit range instead of a vm_area struct. This makes it more versatile for code that wants to play with kernel page tables outside of the standard vmalloc area. (One example is some rework of the PowerPC PCI IO space mapping code that depends on that patch and removes some code duplication and horrible abuse of forged struct vm_struct). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:56 +10:00
Jon Tollefson	3f1df7a260	[POWERPC] Move common code out of if/else Move common code out of if/else. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> ---- hash_native_64.c \| 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-06-14 22:29:55 +10:00
Kumar Gala	f1aed92464	[POWERPC] Fix modpost warning Mark pte_alloc_one_kernel as __init_refok to fix the following warning: WARNING: arch/powerpc/mm/built-in.o(.text+0x1068): Section mismatch: reference to .init.text:early_get_page (between 'pte_alloc_one_kernel' and 'steal_context') Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-05-23 07:49:37 -05:00
Benjamin Herrenschmidt	5453e7723b	[POWERPC] Fix warning in 32-bit builds with CONFIG_HIGHMEM Some missing fixup for the removal of 4 level fixup header. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-22 20:20:57 +10:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Jon Tollefson	5b82583185	[POWERPC] Correct #endif comment Fix up comment on two #endifs to match their #ifs. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> ---- hash_utils_64.c \| 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-17 21:11:19 +10:00
Benjamin Herrenschmidt	017e3c53f1	[POWERPC] Add spinlock to request_phb_iospace() request_phb_iospace() can be called from different CPUs at init time (at least with my next patch) and thus needs a spinlock. As for the next patch, this is a temporary workaround for 2.6.22 issues until my rewrite of IO mappings is ready (for 2.6.23) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-17 21:11:14 +10:00
Kumar Gala	991eb43af9	[POWERPC] Fix COMMON symbol warnings We get the following warnings in various ARCH=powerpc builds: WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: "htab_hash_searches" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "next_slot" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "mmu_hash_lock" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "primary_pteg_full" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "global_dbcr0" [arch/powerpc/kernel/built-in] is COMMON symbol Switch to moving local symbols (except mmu_hash_lock which is global) and space directive instead. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-05-17 21:10:15 +10:00
Stephen Rothwell	8980ae8677	[POWERPC] Remove unused variable in hpte_decode() Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-12 11:32:47 +10:00
Stephen Rothwell	0c12fe5697	[POWERPC] Assign correct variable in hpte_decode() This case will never be hit, but it should be corrected anyway. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-12 11:32:47 +10:00
Paul Mackerras	2454c7e98c	[POWERPC] Fix warning in hpte_decode(), and generalize it This adds the necessary support to hpte_decode() to handle 1TB segments and 16GB pages, and removes an uninitialized value warning on avpn. We don't have any code to generate HPTEs for 1TB segments or 16GB pages yet, so this is mostly for completeness, and to fix the warning. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2007-05-10 21:28:13 +10:00
Linus Torvalds	aabded9c3a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 [POWERPC] EEH: log all PCI-X and PCI-E AER registers [POWERPC] EEH: capture and log pci state on error [POWERPC] EEH: Split up long error msg [POWERPC] EEH: log error only after driver notification. [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init(). [POWERPC] Don't use SLAB/SLUB for PTE pages [POWERPC] Spufs support for 64K LS mappings on 4K kernels [POWERPC] Add ability to 4K kernel to hash in 64K pages [POWERPC] Introduce address space "slices" [POWERPC] Small fixes & cleanups in segment page size demotion [POWERPC] iSeries: Make HVC_ISERIES the default [POWERPC] iSeries: suppress build warning in lparmap.c [POWERPC] Mark pages that don't exist as nosave [POWERPC] swsusp: Introduce register_nosave_region_late	2007-05-09 12:56:01 -07:00
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
David Gibson	f1a1eb299a	[POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 Commit `d1953c8888` removed the use of 4level-fixup.h for 32-bit systems under arch/powerpc. However, I missed a few things activated on some configurations, resulting in some warnings (at least with STRICT_MM_TYPECHECKS enabled) and build errors in some circumstances. This fixes it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:01 +10:00
Hugh Dickins	517e22638c	[POWERPC] Don't use SLAB/SLUB for PTE pages The SLUB allocator relies on struct page fields first_page and slab, overwritten by ptl when SPLIT_PTLOCK: so the SLUB allocator cannot then be used for the lowest level of pagetable pages. This was obstructing SLUB on PowerPC, which uses kmem_caches for its pagetables. So convert its pte level to use normal gfp pages (whereas pmd, pud and 64k-page pgd want partpages, so continue to use kmem_caches for pmd, pud and pgd). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	16c2d47623	[POWERPC] Add ability to 4K kernel to hash in 64K pages This adds the ability for a kernel compiled with 4K page size to have special slices containing 64K pages and hash the right type of hash PTEs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	d0f13e3c20	[POWERPC] Introduce address space "slices" The basic issue is to be able to do what hugetlbfs does but with different page sizes for some other special filesystems; more specifically, my need is: - Huge pages - SPE local store mappings using 64K pages on a 4K base page size kernel on Cell - Some special 4K segments in 64K-page kernels for mapping a dodgy type of powerpc-specific infiniband hardware that requires 4K MMU mappings for various reasons I won't explain here. The main issues are: - To maintain/keep track of the page size per "segment" (as we can only have one page size per segment on powerpc, which are 256MB divisions of the address space). - To make sure special mappings stay within their allotted "segments" (including MAP_FIXED crap) - To make sure everybody else doesn't mmap/brk/grow_stack into a "segment" that is used for a special mapping Some of the necessary mechanisms to handle that were present in the hugetlbfs code, but mostly in ways not suitable for anything else. The patch relies on some changes to the generic get_unmapped_area() that just got merged. It still hijacks hugetlb callbacks here or there as the generic code hasn't been entirely cleaned up yet but that shouldn't be a problem. So what is a slice ? Well, I re-used the mechanism used formerly by our hugetlbfs implementation which divides the address space in "meta-segments" which I called "slices". The division is done using 256MB slices below 4G, and 1T slices above. Thus the address space is divided currently into 16 "low" slices and 16 "high" slices. (Special case: high slice 0 is the area between 4G and 1T). Doing so simplifies significantly the tracking of segments and avoids having to keep track of all the 256MB segments in the address space. While I used the "concepts" of hugetlbfs, I mostly re-implemented everything in a more generic way and "ported" hugetlbfs to it. Slices can have an associated page size, which is encoded in the mmu context and used by the SLB miss handler to set the segment sizes. The hash code currently doesn't care, it has a specific check for hugepages, though I might add a mechanism to provide per-slice hash mapping functions in the future. The slice code provide a pair of "generic" get_unmapped_area() (bottomup and topdown) functions that should work with any slice size. There is some trickiness here so I would appreciate people to have a look at the implementation of these and let me know if I got something wrong. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt	16f1c74675	[POWERPC] Small fixes & cleanups in segment page size demotion The code for demoting segments to 4K had some issues, like for example, when using _PAGE_4K_PFN flag, the first CPU to hit it would do the demotion, but other CPUs hitting the same page wouldn't properly flush their SLBs if mmu_ci_restriction isn't set. There are also potential issues with hash_preload not handling _PAGE_4K_PFN. All of these are non issues on current hardware but might bite us in the future. This patch thus fixes it by: - Taking the test comparing the mm and current CPU context page sizes to decide to flush SLBs out of the mmu_ci_restrictions test since that can also be triggered by _PAGE_4K_PFN pages - Due to the above being done all the time, demote_segment_4k doesn't need update the context and flush the SLB - demote_segment_4k can be static and doesn't need an EXPORT_SYMBOL - Making hash_preload ignore anything that has either _PAGE_4K_PFN or _PAGE_NO_CACHE set, thus avoiding duplication of the complicated logic in hash_page() (and possibly making hash_preload a little bit faster for the normal case). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Johannes Berg	4e8ad3e816	[POWERPC] Mark pages that don't exist as nosave On some powerpc architectures (notably 64-bit powermac) there is a memory hole, for example on powermacs between 2G and 4G. Since we use the flat memory model regardless, these pages must be marked as nosave (for suspend to disk.) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:34:56 +10:00
Linus Torvalds	df6d3916f3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (77 commits) [POWERPC] Abolish powerpc_flash_init() [POWERPC] Early serial debug support for PPC44x [POWERPC] Support for the Ebony 440GP reference board in arch/powerpc [POWERPC] Add device tree for Ebony [POWERPC] Add powerpc/platforms/44x, disable platforms/4xx for now [POWERPC] MPIC U3/U4 MSI backend [POWERPC] MPIC MSI allocator [POWERPC] Enable MSI mappings for MPIC [POWERPC] Tell Phyp we support MSI [POWERPC] RTAS MSI implementation [POWERPC] PowerPC MSI infrastructure [POWERPC] Rip out the existing powerpc msi stubs [POWERPC] Remove use of 4level-fixup.h for ppc32 [POWERPC] Add powerpc PCI-E reset API implementation [POWERPC] Holly bootwrapper [POWERPC] Holly DTS [POWERPC] Holly defconfig [POWERPC] Add support for 750CL Holly board [POWERPC] Generalize tsi108 PCI setup [POWERPC] Generalize tsi108 PHY types ... Fixed conflict in include/asm-powerpc/kdebug.h manually Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:50:19 -07:00
Randy Dunlap	e63340ae6b	header cleaning: don't include smp_lock.h when not used Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Christoph Hellwig	1eeb66a1bb	move die notifier handling to common code This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Akinobu Mita	0e6b9c98be	use SLAB_PANIC flag cleanup Use SLAB_PANIC and delete duplicated panic(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Ian Molton <spyro@f2s.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
David Gibson	d1953c8888	[POWERPC] Remove use of 4level-fixup.h for ppc32 For 32-bit systems, powerpc still relies on the 4level-fixup.h hack, to pretend that the generic pagetable handling stuff is 3-levels rather than 4. This patch removes this, instead using the newer pgtable-nopmd.h to handle the elision of both the pud and pmd pagetable levels (ppc32 pagetables are actually 2 levels). This removes a little extraneous code, and makes it more easily compared to the 64-bit pagetable code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-08 13:40:31 +10:00
Paul Mackerras	02bbc0f09c	Merge branch 'linux-2.6'	2007-05-08 13:37:51 +10:00
Michael Ellerman	0108d3fe3c	[POWERPC] Add __init annotations to reserve_mem() and stabs_alloc() reserve_mem() and stabs_alloc() are both called only from other __init routines, so can be marked __init. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-08 11:54:19 +10:00
Benjamin Herrenschmidt	d506a77251	get_unmapped_area handles MAP_FIXED on powerpc The current get_unmapped_area code calls the f_ops->get_unmapped_area or the arch one (via the mm) only when MAP_FIXED is not passed. That makes it impossible for archs to impose proper constraints on regions of the virtual address space. To work around that, get_unmapped_area() then calls some hugetlbfs specific hacks. This cause several problems, among others: - It makes it impossible for a driver or filesystem to do the same thing that hugetlbfs does (for example, to allow a driver to use larger page sizes to map external hardware) if that requires applying a constraint on the addresses (constraining that mapping in certain regions and other mappings out of those regions). - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want MAP_FIXED to be passed down in order to deal with aliasing issues. The code is there to handle it... but is never called. This series of patches moves the logic to handle MAP_FIXED down to the various arch/driver get_unmapped_area() implementations, and then changes the generic code to always call them. The hugetlbfs hacks then disappear from the generic code. Since I need to do some special 64K pages mappings for SPEs on cell, I need to work around the first problem at least. I have further patches thus implementing a "slices" layer that handles multiple page sizes through slices of the address space for use by hugetlbfs, the SPE code, and possibly others, but it requires that serie of patches first/ There is still a potential (but not practical) issue due to the fact that filesystems/drivers implemeting g_u_a will effectively bypass all arch checks. This is not an issue in practice as the only filesystems/drivers using that hook are doing so for arch specific purposes in the first place. There is also a problem with mremap that will completely bypass all arch checks. I'll try to address that separately, I'm not 100% certain yet how, possibly by making it not work when the vma has a file whose f_ops has a get_unmapped_area callback, and by making it use is_hugepage_only_range() before expanding into a new area. Also, I want to turn is_hugepage_only_range() into a more generic is_normal_page_range() as that's really what it will end up meaning when used in stack grow, brk grow and mremap. None of the above "issues" however are introduced by this patch, they are already there, so I think the patch can go ini for 2.6.22. This patch: Handle MAP_FIXED in powerpc's arch_get_unmapped_area() in all 3 implementations of it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	f0f3980b21	slab allocators: remove multiple alignment specifications It is not necessary to tell the slab allocators to align to a cacheline if an explicit alignment was already specified. It is rather confusing to specify multiple alignments. Make sure that the call sites only use one form of alignment. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	5af6083990	slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN This patch was recently posted to lkml and acked by Pekka. The flag SLAB_MUST_HWCACHE_ALIGN is 1. Never checked by SLAB at all. 2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB 3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB. The only remaining use is in sparc64 and ppc64 and their use there reflects some earlier role that the slab flag once may have had. If its specified then SLAB_HWCACHE_ALIGN is also specified. The flag is confusing, inconsistent and has no purpose. Remove it. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Luke Browning	71bf08b6c0	[POWERPC] 64K page support for kexec This fixes a couple of kexec problems related to 64K page support in the kernel. kexec issues a tlbie for each pte. The parameters for the tlbie are the page size and the virtual address. Support was missing for the computation of these two parameters for 64K pages. This adds that support. Signed-off-by: Luke Browning <lukebrowning@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-07 20:31:12 +10:00
Christoph Hellwig	9f90b997de	[POWERPC] Minor fault path optimization Call the kprobes pagefault handler directly instead of going through the complex notifier chain. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:57:39 +10:00
David Gibson	57d7909e0d	[POWERPC] Revise PPC44x MMU code for arch/powerpc This patch takes the definitions for the PPC44x MMU (a software loaded TLB) from asm-ppc/mmu.h, cleans them up of things no longer necessary in arch/powerpc and puts them in a new asm-powerpc/mmu_44x.h file. It also substantially simplifies arch/powerpc/mm/44x_mmu.c and makes a couple of small fixes necessary for the 44x MMU code to build and work properly in arch/powerpc. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:04:29 +10:00
Michael Ellerman	ed16669298	[POWERPC] Initialise spinlock in the DEBUG_PAGEALLOC code Fixes: BUG: spinlock bad magic on CPU#0, swapper/0 lock: c00000000064ec30, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 Call Trace: [c00000000062b980] [c00000000000f920] .show_stack+0x6c/0x1a0 (unreliable) [c00000000062ba20] [c0000000001c2b40] .spin_bug+0xb0/0xd4 [c00000000062bab0] [c0000000001c2ed0] ._raw_spin_lock+0x44/0x184 [c00000000062bb50] [c0000000003a42b4] ._spin_lock+0x10/0x24 [c00000000062bbd0] [c00000000002b4dc] .kernel_map_pages+0x198/0x278 [c00000000062bc90] [c000000000079720] .free_hot_cold_page+0x124/0x418 [c00000000062bd70] [c000000000530278] .free_all_bootmem_core+0x14c/0x224 [c00000000062be50] [c00000000052a178] .mem_init+0x68/0x170 [c00000000062bee0] [c00000000051d874] .start_kernel+0x2a0/0x37c [c00000000062bf90] [c0000000000084c8] .start_here_common+0x54/0x8c Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 20:04:29 +10:00
Johannes Berg	a3cf4bdef0	[POWERPC] Remove unneeded page_is_ram export arch/powerpc/mm/mem.c exports page_is_ram, which is not used anywhere that could be modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-02 16:40:57 +10:00
David Gibson	37f01d64d8	[POWERPC] Abolish PHYS_FMT macro from arch/powerpc 32-bit powerpc systems define a macro, PHYS_FMT, giving a printf format string fragment for displaying physical addresses, since most 32-bit powerpc platforms use 32-bit physical addresses but a few use 64-bit physical addresses. This macro is used in exactly one place, a rare error message, where we can solve the problem more simply by just unconditionally casting the address up to 64-bit quantity before formatting it. This patch does so, meaning that as we bring MMU definitions from asm-ppc over to asm-powerpc, cleaning them up in the process, we don't need to implement this ugly macro (which additionally has a very bad name for something global). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-24 22:11:16 +10:00
David Gibson	6210230725	[POWERPC] Cleanup and fix breakage in tlbflush.h BenH's commit `a741e67969` in powerpc.git, although (AFAICT) only intended to affect ppc64, also has side-effects which break 44x. I think 40x, 8xx and Freescale Book E are also affected, though I haven't tested them. The problem lies in unconditionally removing flush_tlb_pending() from the versions of flush_tlb_mm(), flush_tlb_range() and flush_tlb_kernel_range() used on ppc64 - which are also used the embedded platforms mentioned above. The patch below cleans up the convoluted #ifdef logic in tlbflush.h, in the process restoring the necessary flushes for the software TLB platforms. There are three sets of definitions for the flushing hooks: the software TLB versions (revised to avoid using names which appear to related to TLB batching), the 32-bit hash based versions (external functions) amd the 64-bit hash based versions (which implement batching). It also moves the declaration of update_mmu_cache() to always be in tlbflush.h (previously it was in tlbflush.h except for PPC64, where it was in pgtable.h). Booted on Ebony (440GP) and compiled for 64-bit and 32-bit multiplatform. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-24 22:08:56 +10:00
Benjamin Herrenschmidt	370a908db1	[POWERPC] DEBUG_PAGEALLOC for 64-bit Here's an implementation of DEBUG_PAGEALLOC for 64 bits powerpc. It applies on top of the 32 bits patch. Unlike Anton's previous attempt, I'm not using updatepp. I'm removing the hash entries from the bolted mapping (using a map in RAM of all the slots). Expensive but it doesn't really matter, does it ? :-) Memory hot-added doesn't benefit from this unless it's added at an address that is below end_of_DRAM() as calculated at boot time. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/Kconfig.debug \| 2 arch/powerpc/mm/hash_utils_64.c \| 84 ++++++++++++++++++++++++++++++++++++++-- 2 files changed, 82 insertions(+), 4 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	88df6e90fa	[POWERPC] DEBUG_PAGEALLOC for 32-bit Here's an implementation of DEBUG_PAGEALLOC for ppc32. It disables BAT mapping and is only tested with Hash table based processor though it shouldn't be too hard to adapt it to others. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/Kconfig.debug \| 9 ++++++ arch/powerpc/mm/init_32.c \| 4 +++ arch/powerpc/mm/pgtable_32.c \| 52 +++++++++++++++++++++++++++++++++++++++ arch/powerpc/mm/ppc_mmu_32.c \| 4 ++- include/asm-powerpc/cacheflush.h \| 6 ++++ 5 files changed, 74 insertions(+), 1 deletion(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	ee4f2ea486	[POWERPC] Fix 32-bit mm operations when not using BATs On hash table based 32 bits powerpc's, the hash management code runs with a big spinlock. It's thus important that it never causes itself a hash fault. That code is generally safe (it does memory accesses in real mode among other things) with the exception of the actual access to the code itself. That is, the kernel text needs to be accessible without taking a hash miss exceptions. This is currently guaranteed by having a BAT register mapping part of the linear mapping permanently, which includes the kernel text. But this is not true if using the "nobats" kernel command line option (which can be useful for debugging) and will not be true when using DEBUG_PAGEALLOC implemented in a subsequent patch. This patch fixes this by pre-faulting in the hash table pages that hit the kernel text, and making sure we never evict such a page under hash pressure. Signed-off-by: Benjamin Herrenchmidt <benh@kernel.crashing.org> arch/powerpc/mm/hash_low_32.S \| 22 ++++++++++++++++++++-- arch/powerpc/mm/mem.c \| 3 --- arch/powerpc/mm/mmu_decl.h \| 4 ++++ arch/powerpc/mm/pgtable_32.c \| 11 +++++++---- 4 files changed, 31 insertions(+), 9 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	3be4e6990e	[POWERPC] Cleanup 32-bit map_page The 32 bits map_page() function is used internally by the mm code for early mmu mappings and for ioremap. It should never be called for an address that already has a valid PTE or hash entry, so we add a BUG_ON for that and remove the useless flush_HPTE call. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> arch/powerpc/mm/pgtable_32.c \| 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:39 +10:00
Benjamin Herrenschmidt	a741e67969	[POWERPC] Make tlb flush batch use lazy MMU mode The current tlb flush code on powerpc 64 bits has a subtle race since we lost the page table lock due to the possible faulting in of new PTEs after a previous one has been removed but before the corresponding hash entry has been evicted, which can leads to all sort of fatal problems. This patch reworks the batch code completely. It doesn't use the mmu_gather stuff anymore. Instead, we use the lazy mmu hooks that were added by the paravirt code. They have the nice property that the enter/leave lazy mmu mode pair is always fully contained by the PTE lock for a given range of PTEs. Thus we can guarantee that all batches are flushed on a given CPU before it drops that lock. We also generalize batching for any PTE update that require a flush. Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and disabled by arch_leave_lazy_mmu_mode(). The code epects that this is always contained within a PTE lock section so no preemption can happen and no PTE insertion in that range from another CPU. When batching is enabled on a CPU, every PTE updates that need a hash flush will use the batch for that flush. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 04:09:38 +10:00
Stephen Rothwell	e2eb63927b	[POWERPC] Rename get_property to of_get_property: arch/powerpc Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:19 +10:00
Paul Mackerras	721151d004	[POWERPC] Allow drivers to map individual 4k pages to userspace Some drivers have resources that they want to be able to map into userspace that are 4k in size. On a kernel configured with 64k pages we currently end up mapping the 4k we want plus another 60k of physical address space, which could contain anything. This can introduce security problems, for example in the case of an infiniband adaptor where the other 60k could contain registers that some other program is using for its communications. This patch adds a new function, remap_4k_pfn, which drivers can use to map a single 4k page to userspace regardless of whether the kernel is using a 4k or a 64k page size. Like remap_pfn_range, it would typically be called in a driver's mmap function. It only maps a single 4k page, which on a 64k page kernel appears replicated 16 times throughout a 64k page. On a 4k page kernel it reduces to a call to remap_pfn_range. The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN, gets set on the linux PTE. This alters the way that __hash_page_4K computes the real address to put in the HPTE. The RPN field of the linux PTE becomes the 4k RPN directly rather than being interpreted as a 64k RPN. Since the RPN field is 32 bits, this means that physical addresses being mapped with remap_4k_pfn have to be below 2^44, i.e. 0x100000000000. The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c that deals with demoting a process to use 4k pages into one function that gets called in the various different places where we need to do that. There were some discrepancies between exactly what was done in the various places, such as a call to spu_flush_all_slbs in one case but not in others. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:18 +10:00
Stephen Rothwell	9213feea6e	[POWERPC] Rename prom_n_size_cells to of_n_size_cells This is more consistent and gets us closer to the Sparc code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-13 03:55:18 +10:00

1 2 3 4 5

231 Commits