java.lang.Object
org.apache.lucene.util.Sorter
org.apache.lucene.util.MSBRadixSorter
- Direct Known Subclasses:
StableMSBRadixSorter
,StringMSBRadixSorter
Radix sorter for variable-length strings. This class sorts based on the most significant byte
first and falls back to
IntroSorter
when the size of the buckets to sort becomes small.
This algorithm is NOT stable. Worst-case memory usage is about 2.3 KB
.
-
Field Summary
FieldsModifier and TypeFieldDescriptionprivate final int[]
private final int[]
protected static final int
private final int[][]
private static final int
private static final int
protected final int
Fields inherited from class org.apache.lucene.util.Sorter
BINARY_SORT_THRESHOLD, INSERTION_SORT_THRESHOLD
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprivate boolean
assertHistogram
(int commonPrefixLength, int[] histogram) private void
buildHistogram
(int from, int to, int k, int[] histogram) Build an histogram of the k-th characters of values occurring between offsetsfrom
andto
, usinggetBucket(int, int)
.protected abstract int
byteAt
(int i, int k) Return the k-th byte of the entry at indexi
, or-1
if its length is less than or equal tok
.protected final int
compare
(int i, int j) Compare entries found in slotsi
andj
.private int
computeCommonPrefixLengthAndBuildHistogram
(int from, int to, int k, int[] histogram) Build a histogram of the number of values perbucket
and return a common prefix length for all visited values.protected int
getBucket
(int i, int k) Return a number for the k-th character between 0 andHISTOGRAM_SIZE
.protected Sorter
getFallbackSorter
(int k) Get a fall-back sorter which may assume that the first k bytes of all compared strings are equal.private void
introSort
(int from, int to, int k) private void
radixSort
(int from, int to, int k, int l) protected void
reorder
(int from, int to, int[] startOffsets, int[] endOffsets, int k) Reorder based on start/end offsets for each bucket.void
sort
(int from, int to) Sort the slice which starts atfrom
(inclusive) and ends atto
(exclusive).protected void
sort
(int from, int to, int k, int l) private static void
sumHistogram
(int[] histogram, int[] endOffsets) Accumulate values of the histogram so that it does not store counts but start offsets.Methods inherited from class org.apache.lucene.util.Sorter
binarySort, binarySort, checkRange, comparePivot, doRotate, heapChild, heapify, heapParent, heapSort, insertionSort, lower, lower2, mergeInPlace, reverse, rotate, setPivot, siftDown, swap, upper, upper2
-
Field Details
-
LEVEL_THRESHOLD
private static final int LEVEL_THRESHOLD- See Also:
-
HISTOGRAM_SIZE
protected static final int HISTOGRAM_SIZE- See Also:
-
LENGTH_THRESHOLD
private static final int LENGTH_THRESHOLD- See Also:
-
histograms
private final int[][] histograms -
endOffsets
private final int[] endOffsets -
commonPrefix
private final int[] commonPrefix -
maxLength
protected final int maxLength
-
-
Constructor Details
-
MSBRadixSorter
protected MSBRadixSorter(int maxLength) Sole constructor.- Parameters:
maxLength
- the maximum length of keys, passInteger.MAX_VALUE
if unknown.
-
-
Method Details
-
byteAt
protected abstract int byteAt(int i, int k) Return the k-th byte of the entry at indexi
, or-1
if its length is less than or equal tok
. This may only be called with a value ofi
between0
included andmaxLength
excluded. -
getFallbackSorter
Get a fall-back sorter which may assume that the first k bytes of all compared strings are equal. -
compare
protected final int compare(int i, int j) Description copied from class:Sorter
Compare entries found in slotsi
andj
. The contract for the returned value is the same asComparator.compare(Object, Object)
. -
sort
public void sort(int from, int to) Description copied from class:Sorter
Sort the slice which starts atfrom
(inclusive) and ends atto
(exclusive). -
sort
protected void sort(int from, int to, int k, int l) -
introSort
private void introSort(int from, int to, int k) -
radixSort
private void radixSort(int from, int to, int k, int l) - Parameters:
k
- the character number to comparel
- the level of recursion
-
assertHistogram
private boolean assertHistogram(int commonPrefixLength, int[] histogram) -
getBucket
protected int getBucket(int i, int k) Return a number for the k-th character between 0 andHISTOGRAM_SIZE
. -
computeCommonPrefixLengthAndBuildHistogram
private int computeCommonPrefixLengthAndBuildHistogram(int from, int to, int k, int[] histogram) Build a histogram of the number of values perbucket
and return a common prefix length for all visited values.- See Also:
-
buildHistogram
private void buildHistogram(int from, int to, int k, int[] histogram) Build an histogram of the k-th characters of values occurring between offsetsfrom
andto
, usinggetBucket(int, int)
. -
sumHistogram
private static void sumHistogram(int[] histogram, int[] endOffsets) Accumulate values of the histogram so that it does not store counts but start offsets.endOffsets
will store the end offsets. -
reorder
protected void reorder(int from, int to, int[] startOffsets, int[] endOffsets, int k) Reorder based on start/end offsets for each bucket. When this method returns, startOffsets and endOffsets are equal.- Parameters:
startOffsets
- start offsets per bucketendOffsets
- end offsets per bucket
-