kernel-ark

Author	SHA1	Message	Date
David S. Miller	a4aa2e867c	[SPARC64]: Don't use in/local regs for ldx/stx data in N1 memcpy. It doesn't matter for use in 64-bit objects, but when used in 32-bit environments the top 32-bits of the local and in registers will get chopped off on the next register window spill/restore which leads to difficult to track down and subtle bugs. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-02 16:17:17 -07:00
David S. Miller	25e5566ed3	[SPARC64]: Fix missing load-twin usage in Niagara-1 memcpy. For the case where the source is not aligned modulo 8 we don't use load-twins to suck the data in and this kills performance since normal loads allocate in the L1 cache (unlike load-twin) and thus big memcpys swipe the entire L1 D-cache. We need to allocate a register window to implement this properly, but that actually simplifies a lot of things as a nice side-effect. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-02 01:03:09 -07:00
David S. Miller	cf5adce117	[SPARC64]: Niagara-2 optimized copies. The bzero/memset implementation stays the same as Niagara-1. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-16 01:47:25 -07:00
David S. Miller	6c70b6fc7b	[SPARC64]: Do not assume sun4v chips have load-twin/store-init support. Check the cpu type in the OBP device tree before committing to using the optimized Niagara memcpy and memset implementation. If we don't recognize the cpu type, use a completely generic version. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-08 17:33:45 -07:00
David S. Miller	8b99cfb8cc	[SPARC64]: More sensible udelay implementation. Take a page from the powerpc folks and just calculate the delay factor directly. Since frequency scaling chips use a system-tick register, the value is going to be the same system-wide. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-16 04:05:02 -07:00
David S. Miller	24d559cac4	[SPARC64]: store-init needs trailing membar. The manual says that it is required and we actually have crash reports where loads see stale data due to not having membars here. In one case the networking does: memset(skb, 0, offsetof(struct sk_buff, truesize)); and then some code later checks skb->nohdr for zero, but it's still the value that was there before the memset(). Note that arch/sparc64/lib/xor.S already got this right. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-19 13:27:33 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
David S. Miller	ae5de0ff0b	[SPARC64]: Fix missing fold at end of checksums. Both csum_partial() and the csum_partial_copy*() family of routines forget to do a final fold on the computed checksum value on sparc64. So do the standard Sparc "add + set condition codes, add carry" sequence, then make sure the high 32-bits of the return value are clear. Based upon some excellent detective work and debugging done by Richard Braun and Samuel Thibault. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-04 21:32:01 -07:00
Akinobu Mita	2d78d4beb6	[PATCH] bitops: sparc64: use generic bitops - remove __{,test_and_}{set,clear,change}_bit() and test_bit() - remove ffz() - remove __ffs() - remove generic_fls() - remove generic_fls64() - remove sched_find_first_bit() - remove ffs() - unless defined(ULTRA_HAS_POPULATION_COUNT) - remove generic_hweight{64,32,16,8}() - remove find_{next,first}{,_zero}_bit() - remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit() - remove minix_{test,set,test_and_clear,test,find_first_zero}_bit() Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:57:14 -08:00
David S. Miller	bb8646d834	[SPARC64]: Optimized TSB table initialization. We only need to write an invalid tag every 16 bytes, so taking advantage of this can save many instructions compared to the simple memset() call we make now. A prefetching implementation is implemented for sun4u and a block-init store version is implemented for Niagara. The next trick is to be able to perform an init and a copy_tsb() in parallel when growing a TSB table. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:16:41 -08:00
David S. Miller	3634476239	[SPARC64]: Niagara optimized XOR functions for RAID. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:14:03 -08:00
David S. Miller	8ca2557c48	[SPARC64]: Niagara optimized memset/bzero/clear_user. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:13:50 -08:00
David S. Miller	3763be32d5	[SPARC64]: Define ARCH_HAS_READ_CURRENT_TIMER. This gives more consistent bogomips and delay() semantics, especially on sun4v. It gives weird looking values though... Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:13:29 -08:00
David S. Miller	c857e3fdbc	[SPARC64]: __bzero_noasi --> __clear_user Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:13:28 -08:00
David S. Miller	6241e5cc6a	[SPARC64]: Fix branch signedness bug in all code patching. The bug that hit SUN4V TLB patching exists elsewhere. Make sure we cure all such cases. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:12:29 -08:00
David S. Miller	c4bce90ea2	[SPARC64]: Deal with PTE layout differences in SUN4V. Yes, you heard it right, they changed the PTE layout for SUN4V. Ho hum... This is the simple and inefficient way to support this. It'll get optimized, don't worry. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:12:25 -08:00
David S. Miller	0d4bc95b9c	[SPARC64]: Fix some Niagara memcpy() bugs. We need to restore the %asi register properly. For the kernel this means get_fs(), for user this means ASI_PNF. Also, NGcopy_to_user.S was including U3memcpy.S instead of NGmemcpy.S, oops :-) Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:12:20 -08:00
David S. Miller	8591e30272	[SPARC64]: Niagara copy/clear page. Happily we have no D-cache aliasing issues on these chips, so the implementation is very straightforward. Add a stub in bootup which will be where the patching calls will be made for niagara/sun4v/hypervisor. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:11:54 -08:00
David S. Miller	398d108308	[SPARC64]: Niagara optimized memcpy() and copy_{to,from}_user(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:11:42 -08:00
David S. Miller	4da808c352	[SPARC64]: Fix bogus flush instruction usage. Some of the trap code was still assuming that alternate global %g6 was hard coded with current_thread_info(). Let's just consistently flush at KERNBASE when we need a pipeline synchronization. That's locked into the TLB and will always work. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 01:11:22 -08:00
David S. Miller	4d000d5b96	[SPARC64]: Mark __ex_table section correctly. We must use the "a" (allocate) attribute every time we emit an entry into the __ex_table section. For consistency, use "a" instead of #alloc which is some Solaris compat cruft GNU as provides on Sparc. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-04 23:23:56 -08:00
David S. Miller	ba6399334d	[SPARC64]: Fix userland FPU state corruption. We need to use stricter memory barriers around the block load and store instructions we use to save and restore the FPU register file. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-07 13:30:49 -07:00
David S. Miller	efdc1e2083	[SPARC64]: Simplify user fault fixup handling. Instead of doing byte-at-a-time user accesses to figure out where the fault occurred, read the saved fault_address from the current thread structure. For the sake of defensive programming, if the fault_address does not fall into the user buffer range, simply assume the whole area faulted. This will cause the fixup for copy_from_user() to clear the entire kernel side buffer. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-28 21:06:47 -07:00
David S. Miller	5fd29752f0	[SPARC64]: Fix fault handling in unaligned trap handler. We were not calling kernel_mna_trap_fault() correctly. Instead of being fancy, just return 0 vs. -EFAULT from the assembler stubs, and handle that return value as appropriate. Create an "__retl_efault" stub for assembler exception table entries and use it where possible. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-28 20:41:45 -07:00
David S. Miller	4db2ce0199	[LIB]: Consolidate _atomic_dec_and_lock() Several implementations were essentially a common piece of C code using the cmpxchg() macro. Put the implementation in one spot that everyone can share, and convert sparc64 over to using this. Alpha is the lone arch-specific implementation, which codes up a special fast path for the common case in order to avoid GP reloading which a pure C version would require. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-14 21:47:01 -07:00
Ingo Molnar	fb1c8f93d8	[PATCH] spinlock consolidation This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h | 16 include/asm-x86_64/spinlock_types.h | 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP | UP ----------------------------|----------------------------------- asm/spinlock_types_smp.h | linux/spinlock_types_up.h linux/spinlock_types.h | linux/spinlock_types.h asm/spinlock_smp.h | linux/spinlock_up.h linux/spinlock_api_smp.h | linux/spinlock_api_up.h linux/spinlock.h | linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_()/etc. lowlevel implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_()/etc. version of UP builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. / All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/.h and asm/.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:21 -07:00
David S. Miller	4d803fcdcd	[SPARC64]: Inline membar()'s again. Since GCC has to emit a call and a delay slot to the out-of-line "membar" routines in arch/sparc64/lib/mb.S it is much better to just do the necessary predicted branch inline instead as: ba,pt %xcc, 1f membar #whatever 1: instead of the current: call membar_foo dslot because this way GCC is not required to allocate a stack frame if the function can be a leaf function. This also makes this bug fix easier to backport to 2.4.x Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 14:37:53 -07:00
David S. Miller	8a36895c0d	[SPARC64]: Use 'unsigned long' for port argument to I/O string ops. This kills warnings when building drivers/ide/ide-iops.c and puts us in-line with what other platforms do here. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-31 15:01:33 -07:00
David S. Miller	dbd2fdf549	[SPARC64]: Kill BRANCH_IF_ANY_CHEETAH() from copy page. Just patch the branch at boot time instead. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-30 11:26:15 -07:00
David S. Miller	4f07118f65	[SPARC64]: More fully work around Spitfire Errata 51. It appears that a memory barrier soon after a mispredicted branch, not just in the delay slot, can cause the hang condition of this cpu errata. So move them out-of-line, and explicitly put them into a "branch always, predict taken" delay slot which should fully kill this problem. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 12:46:22 -07:00
David S. Miller	442464a500	[SPARC64]: Make debugging spinlocks usable again. When the spinlock routines were moved out of line into kernel/spinlock.c this made it so that the debugging spinlocks record lock acquisition program counts in the kernel/spinlock.c functions not in their callers. This makes the debugging info kind of useless. So record the correct caller's program counter and now this feature is useful once more. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 12:46:07 -07:00
David S. Miller	b445e26cbf	[SPARC64]: Avoid membar instructions in delay slots. In particular, avoid membar instructions in the delay slot of a jmpl instruction. UltraSPARC-I, II, IIi, and IIe have a bug, documented in the UltraSPARC-IIi User's Manual, Appendix K, Erratum 51. The long and short of it is that if the IMU unit misses on a branch or jmpl, and there is a store buffer synchronizing membar in the delay slot, the chip can stop fetching instructions. If interrupts are enabled or some other trap is enabled, the chip will unwedge itself, but performance will suffer. We already had a workaround for this bug in a few spots, but it's better to have the entire tree sanitized for this rule. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-27 15:42:04 -07:00
Ingo Molnar	39c715b717	[PATCH] smp_processor_id() cleanup This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:13 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

34 Commits
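
Commit ae5de0ff0b above describes a final fold at the end of the checksum routines: add, set condition codes, add the carry back in, then make sure the high 32 bits of the result are clear. The kernel does this in SPARC assembly. The C below is only an illustrative sketch of the same arithmetic, assuming the running sum is held in a 64-bit accumulator; the function name `fold64_to_32` is hypothetical and does not appear in the kernel tree.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): fold a 64-bit ones'-complement
 * accumulator down to 32 bits using an end-around carry.
 */
static uint32_t fold64_to_32(uint64_t sum)
{
    uint32_t lo = (uint32_t)sum;          /* low 32 bits of the accumulator */
    uint32_t hi = (uint32_t)(sum >> 32);  /* high 32 bits of the accumulator */
    uint32_t res = lo + hi;               /* "add + set condition codes" */

    /*
     * "add carry": if the 32-bit add wrapped, res is smaller than lo.
     * Adding the carry back cannot wrap a second time, because a wrapped
     * sum of two 32-bit values is at most 0xFFFFFFFE.
     */
    res += (res < lo);

    /* The return type is 32 bits wide, so the high bits are clear. */
    return res;
}
```

For instance, an accumulator of `0x00000001FFFFFFFF` has halves `1` and `0xFFFFFFFF`; their 32-bit sum wraps to `0`, and the end-around carry makes the folded result `1`.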