kernel-ark/lib
Mike Travis 41df0d61c2 x86: Add performance variants of cpumask operators
* Increase performance for systems with large count NR_CPUS by limiting
    the range of the cpumask operators that loop over the bits in a cpumask_t
    variable.  This removes a large amount of wasted cpu cycles.

  * Add performance variants of the cpumask operators:

    int cpus_weight_nr(mask)	     Same using nr_cpu_ids instead of NR_CPUS
    int first_cpu_nr(mask)	     Number lowest set bit, or nr_cpu_ids
    int next_cpu_nr(cpu, mask)	     Next cpu past 'cpu', or nr_cpu_ids
    for_each_cpu_mask_nr(cpu, mask)  for-loop cpu over mask using nr_cpu_ids

  * Modify following to use performance variants:

    #define num_online_cpus()	cpus_weight_nr(cpu_online_map)
    #define num_possible_cpus()	cpus_weight_nr(cpu_possible_map)
    #define num_present_cpus()	cpus_weight_nr(cpu_present_map)

    #define for_each_possible_cpu(cpu) for_each_cpu_mask_nr((cpu), ...)
    #define for_each_online_cpu(cpu)   for_each_cpu_mask_nr((cpu), ...)
    #define for_each_present_cpu(cpu)  for_each_cpu_mask_nr((cpu), ...)

  * Comment added to include/linux/cpumask.h:

    Note: The alternate operations with the suffix "_nr" are used
	  to limit the range of the loop to nr_cpu_ids instead of
	  NR_CPUS when NR_CPUS > 64 for performance reasons.
	  If NR_CPUS is <= 64 then most assembler bitmask
	  operators execute faster with a constant range, so
	  the operator will continue to use NR_CPUS.

	  Another consideration is that nr_cpu_ids is initialized
	  to NR_CPUS and isn't lowered until the possible cpus are
	  discovered (including any disabled cpus).  So early uses
	  will span the entire range of NR_CPUS.

    (The net effect is that for systems with 64 or less CPU's there are no
     functional changes.)

For inclusion into sched-devel/latest tree.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-23 18:23:38 +02:00
..
lzo lzo: fix typo in decompressor 2008-04-10 15:34:05 -07:00
reed_solomon lib: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:17:17 -04:00
zlib_deflate lib/: Spelling fixes 2008-02-03 17:48:52 +02:00
zlib_inflate [ZLIB]: Fix external builds of zlib_inflate code. 2007-10-11 22:17:20 -07:00
.gitignore
argv_split.c LIB: Replace inappropriate include of <linux/bug.h> 2007-10-20 00:26:10 +02:00
audit.c
bitmap.c cpumask: remove bitmap_scnprintf_len and cpumask_scnprintf_len 2008-05-13 08:02:25 -07:00
bitrev.c
bug.c generic bug: use show_regs() instead of dump_stack() 2007-07-16 09:05:51 -07:00
bust_spinlocks.c handle recursive calls to bust_spinlocks() 2007-10-17 08:42:56 -07:00
check_signature.c uninline check_signature() 2007-07-16 09:05:50 -07:00
cmdline.c
cpumask.c x86: Add performance variants of cpumask operators 2008-05-23 18:23:38 +02:00
crc7.c CRC7 support 2007-07-17 10:23:04 -07:00
crc16.c
crc32.c lib/: Spelling fixes 2008-02-03 17:48:52 +02:00
crc32defs.h
crc-ccitt.c
crc-itu-t.c
ctype.c
debug_locks.c
debugobjects.c infrastructure to debug (dynamic) objects 2008-04-30 08:29:53 -07:00
dec_and_lock.c
devres.c [POWERPC] devres: Add devm_ioremap_prot() 2008-05-05 16:47:14 +10:00
div64.c rename div64_64 to div64_u64 2008-05-01 08:03:58 -07:00
dump_stack.c
extable.c lib/extable.c: remove an expensive integer divide in search_extable() 2008-02-06 10:41:08 -08:00
fault-inject.c libfs: allow error return from simple attributes 2008-02-08 09:22:34 -08:00
find_next_bit.c bitops: remove "optimizations" 2008-04-29 08:11:16 -07:00
gen_crc32table.c
genalloc.c Slab allocators: Replace explicit zeroing with __GFP_ZERO 2007-07-17 10:23:02 -07:00
halfmd4.c
hexdump.c lib: create common ascii hex array 2008-05-14 19:11:14 -07:00
hweight.c remove asm/bitops.h includes 2007-10-19 11:53:41 -07:00
idr.c idr: fix idr_remove() 2008-05-01 08:04:00 -07:00
inflate.c lib/inflate.c: handle failed malloc() 2008-04-29 08:06:02 -07:00
int_sqrt.c
iomap_copy.c
iomap.c iomap: fix 64 bits resources on 32 bits 2008-04-29 08:06:02 -07:00
iommu-helper.c iommu: export iommu_is_span_boundary helper function 2008-03-04 16:35:17 -08:00
ioremap.c lib/ioremap.c should #include <linux/io.h> 2007-10-17 08:42:50 -07:00
irq_regs.c
kasprintf.c lib: move kasprintf to a separate file 2007-07-31 15:39:39 -07:00
Kconfig x86, bitops: select the generic bitmap search functions 2008-04-26 19:21:17 +02:00
Kconfig.debug debugobjects: add timer specific object debugging code 2008-04-30 08:29:53 -07:00
Kconfig.kgdb kgdb: kconfig fix xconfig/menuconfig element 2008-05-05 07:13:21 -05:00
kernel_lock.c BKL: revert back to the old spinlock implementation 2008-05-10 20:58:02 -07:00
klist.c klist: fix coding style errors in klist.h and klist.c 2008-04-30 16:52:58 -07:00
kobject_uevent.c lib: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
kobject.c kobject: do not copy vargs, just pass them around 2008-04-30 16:52:48 -07:00
kref.c kref: add kref_set() 2008-01-24 20:40:05 -08:00
libcrc32c.c [LIB] crc32c: Keep intermediate crc state in cpu order 2007-11-08 21:34:09 +08:00
list_debug.c
lmb.c lmb: Fix compile warning 2008-05-18 23:35:43 -05:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
Makefile infrastructure to debug (dynamic) objects 2008-04-30 08:29:53 -07:00
parser.c add match_strlcpy() us it to make v9fs make uname and remotename parsing more robust 2008-05-14 19:23:25 -05:00
percpu_counter.c mm: bdi: export BDI attributes in sysfs 2008-04-30 08:29:49 -07:00
plist.c
prio_heap.c Fix cpusets update_cpumask 2007-10-19 11:53:41 -07:00
prio_tree.c
proportions.c mm: bdi: allow setting a maximum for the bdi dirty limit 2008-04-30 08:29:50 -07:00
radix-tree.c Remove set_migrateflags() 2008-04-28 08:58:17 -07:00
random32.c [NET]: srandom32 fixes for networking v2 2008-04-03 14:07:02 -07:00
ratelimit.c isolate ratelimit from printk.c for other use 2008-04-29 08:06:06 -07:00
rbtree.c
reciprocal_div.c
rwsem-spinlock.c lib: remove fastcall from lib/* 2008-02-08 09:22:31 -08:00
rwsem.c x86: fix UML and -regparm=3 2008-01-30 13:33:00 +01:00
scatterlist.c [SCSI] block: add sg buffer copy helper functions 2008-04-07 12:15:45 -05:00
sha1.c
smp_processor_id.c debug_smp_processor_id() fixlets 2008-02-06 10:41:09 -08:00
sort.c lib/sort.c optimization 2007-10-17 08:42:52 -07:00
spinlock_debug.c Use helpers to obtain task pid in printks 2007-10-19 11:53:43 -07:00
string.c Add a new sysfs_streq() string comparison function 2008-05-01 08:03:59 -07:00
swiotlb.c dma/ia64: update ia64 machvecs, swiotlb.c 2008-04-29 08:06:12 -07:00
textsearch.c [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure 2007-12-01 00:03:52 +11:00
ts_bm.c
ts_fsm.c
ts_kmp.c
vsprintf.c lib/vsprintf.c: fix bug omitting minus sign of numbers (module_param) 2008-02-23 17:12:14 -08:00