f2db633d30
Similar to x86/sparc/powerpc implementations except: 1) we implement an extremely efficient has_zero()/find_zero() sequence with both prep_zero_mask() and create_zero_mask() as no-operations. 2) Our output from prep_zero_mask() differs in that only the lowest eight bits are used to represent the zero bytes; nevertheless, it can be safely ORed with other similar masks from prep_zero_mask() and forms input to create_zero_mask(), the two fundamental properties prep_zero_mask() must satisfy. Tests on EV67 and EV68 CPUs revealed that the generic code is essentially as fast as the old Alpha-specific code (to within 0.5% of CPU cycles) for large quadword-aligned strings, despite the 30% extra CPU instructions executed. In contrast, the generic code for unaligned strings is substantially slower (by more than a factor of 3) than the old Alpha-specific code. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
---|---|---|
callback_srm.S | ||
checksum.c | ||
clear_page.S | ||
clear_user.S | ||
copy_page.S | ||
copy_user.S | ||
csum_ipv6_magic.S | ||
csum_partial_copy.c | ||
dbg_current.S | ||
dbg_stackcheck.S | ||
dbg_stackkill.S | ||
dec_and_lock.c | ||
divide.S | ||
ev6-clear_page.S | ||
ev6-clear_user.S | ||
ev6-copy_page.S | ||
ev6-copy_user.S | ||
ev6-csum_ipv6_magic.S | ||
ev6-divide.S | ||
ev6-memchr.S | ||
ev6-memcpy.S | ||
ev6-memset.S | ||
ev6-stxcpy.S | ||
ev6-stxncpy.S | ||
ev67-strcat.S | ||
ev67-strchr.S | ||
ev67-strlen.S | ||
ev67-strncat.S | ||
ev67-strrchr.S | ||
fls.c | ||
fpreg.c | ||
Makefile | ||
memchr.S | ||
memcpy.c | ||
memmove.S | ||
memset.S | ||
srm_printk.c | ||
srm_puts.c | ||
stacktrace.c | ||
strcat.S | ||
strchr.S | ||
strcpy.S | ||
strlen.S | ||
strncat.S | ||
strncpy.S | ||
strrchr.S | ||
stxcpy.S | ||
stxncpy.S | ||
udelay.c |