Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.compound | A filter that decomposes compound words found in many Germanic languages into their word parts. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.analysis.tokenattributes | General-purpose attributes for text analysis. |
org.apache.lucene.collation.tokenattributes | Custom AttributeImpl for indexing collation keys as index terms. |
Modifier and Type | Class and Description |
---|---|
class | MockUTF16TermAttributeImpl: Extension of CharTermAttributeImpl that encodes the term text as UTF-16 bytes instead of as UTF-8 bytes. |
class | Token: A Token is an occurrence of a term from the text of a field. |
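
The difference between the two encodings mentioned for MockUTF16TermAttributeImpl can be seen with the JDK alone. The following is a minimal sketch, not Lucene code: the class name `TermEncodingSketch` and its helper methods are hypothetical, and the choice of little-endian UTF-16 without a byte order mark is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration; not part of Lucene. */
final class TermEncodingSketch {

    /** Term text encoded as UTF-8 bytes. */
    static byte[] utf8(String term) {
        return term.getBytes(StandardCharsets.UTF_8);
    }

    /** Term text encoded as UTF-16 bytes (little-endian, no BOM: an assumption). */
    static byte[] utf16le(String term) {
        return term.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String term = "na\u00EFve"; // five characters, one of them outside ASCII
        System.out.println(utf8(term).length);    // prints 6
        System.out.println(utf16le(term).length); // prints 10
    }
}
```

The same term text therefore produces different index bytes depending on the encoding, which is what makes an alternative term attribute implementation observable to the rest of the indexing chain.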
Modifier and Type | Field and Description |
---|---|
protected CharTermAttribute | CompoundWordTokenFilterBase.termAtt |
Modifier and Type | Method and Description |
---|---|
void | UAX29URLEmailTokenizerImpl.getText(CharTermAttribute t): Fills CharTermAttribute with the current token text. |
void | StandardTokenizerImpl.getText(CharTermAttribute t): Fills CharTermAttribute with the current token text. |
Modifier and Type | Class and Description |
---|---|
class | CharTermAttributeImpl: Default implementation of CharTermAttribute. |
class | PackedTokenAttributeImpl: Default implementation of the common attributes used by Lucene: CharTermAttribute, TypeAttribute, PositionIncrementAttribute, PositionLengthAttribute, OffsetAttribute, TermFrequencyAttribute. |
Modifier and Type | Method and Description |
---|---|
CharTermAttribute | CharTermAttributeImpl.append(char c) |
CharTermAttribute | CharTermAttribute.append(char c) |
CharTermAttribute | CharTermAttributeImpl.append(java.lang.CharSequence csq) |
CharTermAttribute | CharTermAttribute.append(java.lang.CharSequence csq) |
CharTermAttribute | CharTermAttributeImpl.append(java.lang.CharSequence csq, int start, int end) |
CharTermAttribute | CharTermAttribute.append(java.lang.CharSequence csq, int start, int end) |
CharTermAttribute | CharTermAttributeImpl.append(CharTermAttribute ta) |
CharTermAttribute | CharTermAttribute.append(CharTermAttribute termAtt): Appends the contents of the other CharTermAttribute to this character sequence. |
CharTermAttribute | CharTermAttributeImpl.append(java.lang.String s) |
CharTermAttribute | CharTermAttribute.append(java.lang.String s): Appends the specified String to this character sequence. |
CharTermAttribute | CharTermAttributeImpl.append(java.lang.StringBuilder s) |
CharTermAttribute | CharTermAttribute.append(java.lang.StringBuilder sb): Appends the specified StringBuilder to this character sequence. |
CharTermAttribute | CharTermAttributeImpl.setEmpty() |
CharTermAttribute | CharTermAttribute.setEmpty(): Sets the length of the termBuffer to zero. |
CharTermAttribute | CharTermAttributeImpl.setLength(int length) |
CharTermAttribute | CharTermAttribute.setLength(int length): Set number of valid characters (length of the term) in the termBuffer array. |
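
Because every method listed above returns CharTermAttribute, calls can be chained in the style of StringBuilder. The sketch below shows that pattern using only the JDK. `TermBufferSketch` is a hypothetical stand-in that mirrors the shape of some of the listed signatures; it is not Lucene's implementation, and its `setLength` only supports truncation, which is a simplification made here.

```java
/** Hypothetical stand-in for a chainable term buffer; not part of Lucene. */
final class TermBufferSketch implements CharSequence, Appendable {
    private final StringBuilder buf = new StringBuilder();

    /** Mirrors setEmpty(): resets the term to length zero and returns this. */
    TermBufferSketch setEmpty() {
        buf.setLength(0);
        return this;
    }

    /** Mirrors setLength(int): keeps only the first {@code length} characters. */
    TermBufferSketch setLength(int length) {
        if (length < 0 || length > buf.length()) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        buf.setLength(length);
        return this;
    }

    @Override public TermBufferSketch append(char c) {
        buf.append(c);
        return this;
    }

    @Override public TermBufferSketch append(CharSequence csq) {
        buf.append(csq);
        return this;
    }

    @Override public TermBufferSketch append(CharSequence csq, int start, int end) {
        buf.append(csq, start, end);
        return this;
    }

    @Override public int length() { return buf.length(); }
    @Override public char charAt(int index) { return buf.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) { return buf.subSequence(start, end); }
    @Override public String toString() { return buf.toString(); }

    public static void main(String[] args) {
        TermBufferSketch term = new TermBufferSketch();
        // Reset, then build the term text in one chained expression.
        term.setEmpty().append("wi").append('-').append("fishing", 0, 2);
        System.out.println(term);              // prints wi-fi
        System.out.println(term.setLength(2)); // prints wi
    }
}
```

In a token filter the same chain would typically be applied to the filter's own term attribute, for example the `termAtt` field listed for CompoundWordTokenFilterBase.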
Modifier and Type | Method and Description |
---|---|
CharTermAttribute | CharTermAttributeImpl.append(CharTermAttribute ta) |
CharTermAttribute | CharTermAttribute.append(CharTermAttribute termAtt): Appends the contents of the other CharTermAttribute to this character sequence. |
Modifier and Type | Class and Description |
---|---|
class | CollatedTermAttributeImpl: Extension of CharTermAttributeImpl that encodes the term text as a binary Unicode collation key instead of as UTF-8 bytes. |
class | ICUCollatedTermAttributeImpl: Extension of CharTermAttributeImpl that encodes the term text as a binary Unicode collation key instead of as UTF-8 bytes. |
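
To illustrate why a term might be indexed as a binary collation key rather than as UTF-8 bytes, the sketch below compares the two byte orderings using the JDK's `java.text.Collator`. It is an illustration under assumptions, not the code of either class above: `CollationKeySketch` and its helpers are hypothetical names, and the English locale is chosen arbitrarily.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

/** Hypothetical illustration; not part of Lucene. */
final class CollationKeySketch {

    /** Binary collation key for a term, as produced by the JDK collator. */
    static byte[] collationKey(Collator collator, String term) {
        return collator.getCollationKey(term).toByteArray();
    }

    /** Compares two byte arrays as unsigned bytes; a shorter prefix sorts first. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        String a = "apple";
        String b = "Banana";

        // Raw UTF-8 order: uppercase 'B' (0x42) sorts before lowercase 'a' (0x61).
        int byUtf8 = compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8), b.getBytes(StandardCharsets.UTF_8));

        // Collation key order follows the collator's language rules instead.
        int byKey = compareUnsigned(collationKey(collator, a), collationKey(collator, b));

        System.out.println("UTF-8 bytes put apple first: " + (byUtf8 < 0));
        System.out.println("Collation keys put apple first: " + (byKey < 0));
    }
}
```

Since the keys compare correctly as plain bytes, index terms built from them sort in collation order without consulting the collator at search time.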
Copyright © 2000–2019 The Apache Software Foundation. All rights reserved.