This was using module_init, but there is no way this code can
be modular. In the non-modular case, a module_init becomes a
device_initcall, but this really isn't a device. So we should
choose a more appropriate initcall bucket to put it in.
Assuming boot time self tests need to be observed over a console
to be useful, and that the console device could possibly not be
fully functional until after device_initcall, we move this to the
late_initcall bucket, which is immediately after device_initcall.
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Memory allocation only happens in the self test, just as random numbers
are only used there. So move the inclusion of slab.h inside the
CONFIG_TEST_LIST_SORT.
We don't need module.h and all of the stuff it carries with it, so replace
with export.h and compiler.h. Unfortunately, the ARRAY_SIZE macro from
kernel.h requires the user to ensure bug.h is also included (for
BUILD_BUG_ON_ZERO, used by __must_be_array). We used to get that through
some maze of nested includes, but just include it explicitly.
linux/string.h is then only included implicitly through
kernel.h->printk.h->dynamic_debug.h, but only if !CONFIG_DYNAMIC_DEBUG, so
just include it explicitly (for memset).
objdump -d says the generated code is the same, and wc -l says that
lib/.list_sort.o.cmd went from 579 to 165 lines.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The helper merge_and_restore_back_links() makes sure to call the
caller's cmp function during the final ->prev pointer fixup, so that the
cmp function may call cond_resched(). However, if the cmp function does
not call cond_resched() at all, this is entirely redundant. If it does,
doing at least two function calls for every two pointer assignments is a
bit excessive. This patch limits the calls to once for every 256
iterations.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no reason to maintain the list structure while freeing the
debug elements. Aside from the redundant pointer manipulations, it is
also inefficient from a locality-of-reference viewpoint, since they are
visited in a random order (wrt. the order they were allocated).
Furthermore, if we jumped to exit: after detecting list corruption, it
is actually dangerous.
So just free the elements in the order they were allocated, using the
backing array elts. Allocate that using kcalloc(), so that if
allocation of one of the debug element fails, we just end up calling
kfree(NULL) for the trailing elements.
Minor details: Use sizeof(*elts) instead of sizeof(void *), and return
err immediately when allocation of elts fails, to avoid introducing
another label just before the final return statement.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a check to make sure that the prev pointer of the list head points
to the last element on the list.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use preferable function name which implies using a pseudo-random
number generator.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improve 'lib_sort()' test and check that:
o 'cmp()' is called only for elements which were present in the original list,
i.e., the 'a' and 'b' parameters are valid
o the resulted (sorted) list consists onlly of the original elements
o intdoruce "poison" fields to make sure data around 'struc list_head' field
are not corrupted.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch unifies 'list_sort_test()' messages a bit and makes sure all of
them start with the "list_sort_test:" prefix to make it obvious for users
where the messages come from.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 'lib_sort()' test does not free memory if it fails, and it makes the
kernel panic if it cannot allocate memory. This patch fixes the problem.
This patch also changes several small things:
o use 'list_add()' helper instead of adding manually
o introduce temporary 'el1' variable to avoid ugly and unreadalbe
"if" statement
o make 'head' to be stack variable instead of 'kmalloc()'ed, which
simplifies code a bit
Overall, this patch is of clean-up type.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using own pseudo-random generator, use generic linux
'random32()' function. Presumably, this should improve test coverage.
At the same time, do the following changes:
o Use shorter macro name for test list length
o Do not use strange 'l_h' name for 'struct list_head' element,
use 'list', because it is traditional name and thus, makes the
code more obvious and readable.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I do not see any reason to use KERN_WARN for normal messages and
KERN_EMERG for error messages in the lib_sort testing routine. Let's use
more reasonable KERN_NORM and KERN_ERR levels.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While hunting a non-existing bug in 'list_sort()', I've improved the
'list_sort_test()' function which tests the 'list_sort()' library call.
Although at the end I found a bug in my code, but not in 'list_sort()', I
think my clean-ups and improvements are worth merging because they make
the test function better.
This patch:
Make the self-tests selectable via Kconfig rather than by manual enabling
of DEBUG_LIST_SORT.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the original list is a POT in length, the first callback from line 73
will pass a==b both pointing to the original list_head. This is dangerous
because the 'list_sort()' user can use 'container_of()' and accesses the
"containing" object, which does not necessary exist for the list head. So
the user can access RAM which does not belong to him. If this is a write
access, we can end up with memory corruption.
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
XFS and UBIFS can pass long lists to list_sort(); this alternative
implementation scales better, reaching ~3x performance gain when list
length exceeds the L2 cache size.
Stand-alone program timings were run on a Core 2 duo L1=32KB L2=4MB,
gcc-4.4, with flags extracted from an Ubuntu kernel build. Object size is
581 bytes compared to 455 for Mark J. Roberts' code.
Worst case for either implementation is a list length just over a power of
two, and to roughly the same degree, so here are timing results for a
range of 2^N+1 lengths. List elements were 16 bytes each including malloc
overhead; initial order was random.
time (msec)
Tatham-Roberts
| generic-Mullis-v2
loop_count length | | ratio
4000000 2 206 294 1.427
2000000 3 176 227 1.289
1000000 5 199 172 0.864
500000 9 235 178 0.757
250000 17 243 182 0.748
125000 33 261 196 0.750
62500 65 277 209 0.754
31250 129 292 219 0.75
15625 257 317 235 0.741
7812 513 340 252 0.741
3906 1025 362 267 0.737
1953 2049 388 283 0.729 ~ L1 size
976 4097 556 323 0.580
488 8193 678 361 0.532
244 16385 773 395 0.510
122 32769 844 418 0.495
61 65537 917 454 0.495
30 131073 1128 543 0.481
15 262145 2355 869 0.369 ~ L2 size
7 524289 5597 1714 0.306
3 1048577 6218 2022 0.325
Mark's code does not actually implement the usual or generic mergesort,
but rather a variant from Simon Tatham described here:
http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html
Simon's algorithm performs O(log N) passes over the entire input list,
doing merges of sublists that double in size on each pass. The generic
algorithm instead merges pairs of equal length lists as early as possible,
in recursive order. For either algorithm, the elements that extend the
list beyond power-of-two length are a special case, handled as nearly as
possible as a "rounding-up" to a full POT.
Some intuition for the locality of reference implications of merge order
may be gotten by watching this animation:
http://www.sorting-algorithms.com/merge-sort
Simon's algorithm requires only O(1) extra space rather than the generic
algorithm's O(log N), but in my non-recursive implementation the actual
O(log N) data is merely a vector of ~20 pointers, which I've put on the
stack.
Long-running list_sort() calls: If the list passed in may be long, or the
client's cmp() callback function is slow, the client's cmp() may
periodically invoke cond_resched() to voluntarily yield the CPU. All
inner loops of list_sort() call back to cmp().
Stability of the sort: distinct elements that compare equal emerge from
the sort in the same order as with Mark's code, for simple test cases. A
boot-time test is provided to verify this and other correctness
requirements.
A kernel that uses drm.ko appears to run normally with this change; I have
no suitable hardware to similarly test the use by UBIFS.
[akpm@linux-foundation.org: style tweaks, fix comment, make list_sort_test __init]
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two copies of list_sort() in the tree already, one in the DRM
code, another in ubifs. Now XFS needs this as well. Create a generic
list_sort() function from the ubifs version and convert existing users
to it so we don't end up with yet another copy in the tree.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>